[Management and Operations] Lineage Analysis Database Table

vividime-Club Show all floors Published on 2024-12-20 16:19:56 |Reading mode print Previous Topic Next Topic
1
Unresolved
Knowledge related to Lineage Analysis Database Table
reply

Using props report

Exciting comments1

vividime-Club Show all floors Published on 2024-12-20 16:20:14
1. Location of storage tables
The location for storing the lineage analysis database tables is related to whether or not the lineage analysis database is configured in [System Settings > Database Connection Configuration]. If it is configured, the lineage analysis database tables will be stored in that repository. If it is not configured, but the database is configured at [Data Space Configuration > Global File Configuration > Database Connection Configuration], it will be stored in that database. If neither database is configured, it is stored to the system's built-in database.
2. Storing table structure information
•LINEAGE_RESOURCE
Resource node table, which records the details of each node in the pedigree, the node corresponds to either a physical resource (dashboard, dataset, connection, etc.) or a virtual resource (component, datasheet, etc.).
[td]
Field Name
Data Type
Field English
Field Description
id
String varchar(128)
ID
The ID of the lineage node, UUID, uniquely identifies a resource node.
name
String varchar(128)
name (of a thing)
The name of the dataset, component, connection, dashboard, and data table.
path
String varchar(512)
trails
This field stores the physical path relative to bihome if the current node is a resource such as a dataset, connection, dashboard, etc.; database name/{schema name}/table name if the current node is a data table; dashboard ID if the current node is a component; and dataset ID if the current node is a custom SQL.
type
Integer int(11)
typology
Type: dataset, component, connection, dashboard, database table, or SQL statement.
detail_type
Integer int(11)
Detailed type
Resource specific type: specific type of dataset, component, connection, dashboard.
extra_data
String varchar(128)
Additional data
Additional attributes that hold information that will not be used as query criteria.
•For example, for the markType of the Chart component.

•LINAGE_DASHBOARD_REALTION
•LINEAGE_QUERY_RELATION
•LINEAGE_EXPERIMENT_RELATION
The node relationship table, which records the dependencies between individual nodes, where:
Dependencies between dashboards to datasets (connections) are stored in LINEAGE_DASHBOARD_RELATION;
Dependencies between datasets to connections are stored in LINEAGE_QUERY_RELATION;
Experiment-to-dataset dependencies are stored in LINEAGE_EXPERIMENT_RELATION.
[td]
Field Name
Data Type
Field English
Field Description
id
String varchar(128)
ID
Resource Node ID
depend_id
String varchar(128)
ID of the dependent resource node
The ID of the dependent resource node, which may be an indirect dependency, is stored for A->B->C->D, for A->B, A->C, A->D, B->C, B->D, C->D.
depend_type
Integer int(11)
Types of dependent resource nodes
The type of the dependent resource node.
upper_id
String varchar(128)
ID of the parent resource node
The ID of the direct parent resource node of the dependent resource as opposed to depend_id.
For A->B->C->D, the upper_id stores the id of C when storing the relationship A->D.
When child_id and upper_id are the same, the relationship is as follows:
A -> B -> C
When storing the dependencies of A and C, both child_id and upper_id are B.
When storing dependencies for A and B, both child_id and upper_id are null values.
child_id
String varchar(128)
ID of the subordinate resource node
Relative to id, through which direct child node depends on depend_id.
For A -> B -> C -> D, child_id stores the id of B when storing the A -> D relationship.
When child_id and upper_id are the same, the relationship is as follows:
A -> B -> C
When storing the dependencies of A and C, both child_id and upper_id are B.
When storing dependencies for A and B, both child_id and upper_id are null values.
type
String varchar(128)
Resource type
The type of resource node to which the id field corresponds.
update_time
Integer bigint
update time
The last update time of the data in this row, mainly used for deleting old records.

•LINEAGE_RESOURCE_COLUMN
The Column Information table, which primarily records information about columns created on the dashboard and columns created on the dataset.
[td]
Field Name
Data Type
Field English
Field Description
id
String varchar(128)
ID
The column ID, UUID, uniquely identifies a resource node.
resource_id
String varchar(128)
Resource ID
The ID of the dashboard or dataset in the LINEAGE_RESOURCE table.
owner_id
String varchar(128)
Resource ID
In the LINEAGE_RESOURCE table, the ID of the dashboard or dataset, for dashboards, a column needs to be identified piece by piece with the dataset ID. For datasets, resource_id and owner_id are equal.
col_name
String varchar(128)
column name
The name of the metadata region column for the corresponding dataset page.
col_view
String varchar(128)
column alias
Corresponds to the dataset page metadata region column alias.
data_type
Integer int(11)
data type
Column data types: integer, string, date, etc.
col_type
Integer int(11)
Column Type
Column types: normal columns, calculated columns, date expressions, split columns, space removal, value mapping, etc.
dim
Integer smallint(6)
Is it a dimension
Whether it is a dimensional column.

•LINEAGE_COLUMN_RELATION
Column Dependencies table that stores the dependencies of columns between Ulead dashboards and datasets. Column dependencies between self-service datasets.
[td]
Field Name
Data Type
Field English
Field Description
resource_id
String varchar(128)
Resource ID
The ID of the dashboard or dataset in the LINEAGE_RESOURCE table.
column_id
String varchar(128)
Column ID
Corresponds to the ID in the LINEAGE_RESOURCE_COLUMN table.
depend_type
Short integer tinyint
Types of dependencies
dependType has three values, indicating the three forms of dependency.
•1: Indicates a column that is directly dependent on a resource of a complex type (self-service, combination).
•2: Indicates an indirect dependency on a column of a dataset of a simple type.
•3: Indicates a column that depends directly on a dataset of a simple type.

•LINEAGE_SCRIPT_META_COLUMN
Used to describe the database columns that are used, database columns are the columns defined in the database table. The data is derived from SQL datasets and stored as metadata columns (BCol) in the case of physical tables and database columns parsed based on SQL in the case of SQL.
[td]
Field Name
Data Type
Field English
Field Description
id
String varchar(128)
ID
The column ID, UUID, uniquely identifies a resource node.
resource_id
String varchar(128)
Resource ID
The ID of the dashboard or dataset in the LINEAGE_RESOURCE table.
conn_id
String varchar(128)
Types of dependencies
dependType has three values, indicating the three forms of dependency.
•1: Indicates a column that is directly dependent on a resource of a complex type (self-service, combination).
•2: Indicates an indirect dependency on a column of a dataset of a simple type.
•3: Indicates a column that depends directly on a dataset of a simple type.
catalog_name
String varchar(1000)
Database name
Database name
schema_name
String varchar(1000)
Table Structure Schema Name
Table Structure Schema Name
table_name
String varchar(1000)
Table (view) name
Table (view) name
column_name
String varchar(1000)
column name
column name
variant
Integer int(11)
Is the column a parameter
Whether the column is a column containing parameters
cause
Integer int(11)
scope of action
Which clause of the SQL is used for the preceding column?

•LINEAGE_META_COLUMN_RELATION
Used to describe the dependency between metadata columns, which are columns defined on the dataset by refreshing the data, and database columns, which are columns defined in the database table.
Field Name
Data Type
Field English
Field Description
meta_col_id
String varchar(128)
Metadata Column ID
The id of the metadata column on the dataset, corresponding to the id of LINEAGE_RESOURCE_COLUMN.
sql_col_id
String varchar(128)
Data Table Column ID
The id of the column defined on the database table that corresponds to the id of LINEAGE_SCRIPT_META_COLUMN.

•LINEAGE_LOCK
When the lineage is initialized, the node that performed the initialization operation and the result of the initialization are recorded.
➢Description:
LINEAGE_LOCK is only available in the database system, not in the file system.
Field Name
Data Type
Field English
Field Description
node
string (computer science)
Node IP
C node IP address.
finished
short integer
Whether the initialization is completed
Whether the initialization is completed.
create_time
integers
Initialization start time
The time when initialization starts.

•LINEAGE_SCRIPT
SQL scripts used to record inputs in SQL datasets.
Field Name
Data Type
Field English
Field Description
resource_id
string (computer science)
Resource ID
LINEAGE_RESOURCE table, the ID of the dataset, currently only SQL datasets are supported.
type
short integer
Script type
Currently, only SQL datasets are supported, so type has only one value, 1, which means it is a SQL script.
script
character array
Script content
Contents of the SQL script entered in the SQL editor

3. Primary keys and indexes for the lineage table
[td]
Table Name
Field
Primary Key & Index Type
Index name
LINEAGE_RESOURCE
id
primary key

Type
detail_type
Name
path
unique index
resource_name_index
path
ordinary index
resource_path_index
LINE-AGE_RELATION
Id
primary key

LINE-AGE_DASHBORD_RELATION
LINE-AGE_QUERY_RELATION
LINE-AGE_CONNEC-TION_RELATION
Id
depend_id
depend_type
upper_id
child_id
primary key
db_unique_index
query_unique_index connection_unique_index
depend_id
ordinary index
{table-name}_depend_id_index
id
{tablename}_id_index
upper_id
{table-name}_upper_id_index
update_time
{tablename}_update_time_index ??
LINE-AGE_SCRIPT_META_COLUMN
resource_id
conn_id
catalog_name
schema_name
table_name
column_name
primary key

resource_id
ordinary index
resource_id_index
table_name
column_name
table_column_index
conn_id
conn_id_index
LINE-AGE_META_COLUMN_RELATION
meta_col_id
sql_col_id
primary key

LINEAGE_RESOURCE_COLUMN
id
primary key

resource_id
owner_id
col_name
unique index
column_name_index
LINEAGE_COLUMN_RELATION
resource_id
column_id
primary key

LINEAGE_LOCK
node
primary key
unique_index

4. Description of table field values
4.1 Type and detail_type field values in the LINEAGE_RESOURCE table
[td]
Type
Name
DetailType
Name
1
Dashboard
22016
Favorites
21248
Shared from others
16440
Dashboard
16642
AD-hoc analysis dashboard
16643
Vivid Dashboard
2
subassemblies
1
Chart&Table
2
Table
3
Pivot
4
Image
5
Calendar
6
Dropdown list Filter
7
Range Filter
8
Component
9
Text
10
Gauge
11
Text Parameter
14
Dropdown List Parameter
15
Tab
16
Freestyle Table
17
Form Parameter
18
Group
20
Filter Component
21
Tree Filtering
22
List Parameter
23
Commit
24
Groups (for merging components in a free  layout)
26
Web Page
27
Customized Component
28
Carousel
3
dataset
16897
SQL dataset
16905
Excel dataset
16904
Composite dataset
16910
Self-service dataset
16902
Embedded dataset
16900
Data Mart dataset
16907
Mongo dataset
16899
Custom dataset
16912
Multi-Dimension dataset
156
Neo4j dataset
2021
Streaming dataset
2023
RESTful datasets
4
physical table
18435
database table
18436
view
18438
Multidimensional dataset Cube
18442
Kylin's Cube
18439
Property View
18440
Analyze View
18441
Calculated view
0
Hana's database tables
5
customizable statement
1
SQL script
5
2
SQL Stored Procedures
3
Mongo Pipeline Statements
4
RESTful JSON Query Statements
6
connections
0
Generic
1
Oracle
2
DB2
3
SQL Server
4
MySQL
5
Derby
6
Informix
7
Sybase
8
Access
9
Vertical
10
vividime
11
Hive
12
Mongo
13
Spark
14
Postgresql
15
Hana
16
Kylin
17
Max Compute
18
Kingbase
19
Impala
20
HBase
21
Presto
22
AnalyticDB
23
Essbase
24
SAP BW
25
SSAS
27
Teradata
28
GBase
29
Transwarp
30
Sparking
32
HuaWeiCloud DWS
33
GaussDB 200
34
FusionInsight ELK
35
FusionInsight HD
36
ELASTIC SEARCH
37
GREENPLUM
40
ThunderEngine
41
GaussDB 100
42
DaMeng
43
Thunder Engine
44
Neo4j
45
Clickhouse
46
TBDS
47
TBDS OLAP
48
CTSDB
49
RESTful
8
Market Folder
8
Market Folder
9
Timed task
17154
Timed task
10
Test
20480
Test
11
Complex Dashboard
16644
Complex Dashboard
12
Web Page
17696
Web Component

4.2 Data_type field values in the LINEAGE_RESOURCE_COLUMN table
[td]
Data_type
Type
English Description
2
STRING
string (computer science)
3
BOOLEAN
boolean
4
FLOAT
floating point
5
DOUBLE
floating point
7
CHAR
character
8
BYTE
byte
9
SHORT
short integer
10
INTEGER
integer
11
LONG
long integer
12
DATE_TIME
timestamp
13
DATE
date
14
TIME
time

4.3 Value of the col_type field in the LINEAGE_RESOURCE_COLUMN table
[td]
Col_type
English Description
9
Date column
10
Calculated column
11
Dimension column (script objects)
12
Metric column (script objects)
43
Simple dynamically calculated column (script objects)
45
Customized dynamically calculated column (script objects)
90
Data binning column (range)
178
Dependent column
531
Non-radar map marker fields (script objects)
545
Dimension columns (scripted objects, chart-specific)
596
Parameter column
614
Filter column
1500
Grouping columns (script object for chart grouping)
1501
Metrics group
1502
Map column
1503
Data binning column (grouping)
1507
Date conversion column
1508
Number conversion column
1512
value mapping column
1513
Missing value populated column
1514
Space bar column
1515
Disaggregate column
1517
MDX dimension column
1518
MDX metric column
1527
Hana, Kylin metric column
1802
Dynamically calculated column
1803

reply

Using props report

Advanced mode
You need to log in before you can reply to the post login | Free registration

© 2024 VIVIDATA PTE. LTD. All Rights Reserved. Privacy Statement