1. Location of storage tables The location for storing the lineage analysis database tables is related to whether or not the lineage analysis database is configured in [System Settings > Database Connection Configuration]. If it is configured, the lineage analysis database tables will be stored in that repository. If it is not configured, but the database is configured at [Data Space Configuration > Global File Configuration > Database Connection Configuration], it will be stored in that database. If neither database is configured, it is stored to the system's built-in database. 2. Storing table structure information •LINEAGE_RESOURCE Resource node table, which records the details of each node in the pedigree, the node corresponds to either a physical resource (dashboard, dataset, connection, etc.) or a virtual resource (component, datasheet, etc.). [td] Field Name | Data Type | Field English | Field Description | id | String varchar(128) | ID | The ID of the lineage node, UUID, uniquely identifies a resource node. | name | String varchar(128) | name (of a thing) | The name of the dataset, component, connection, dashboard, and data table. | path | String varchar(512) | trails | This field stores the physical path relative to bihome if the current node is a resource such as a dataset, connection, dashboard, etc.; database name/{schema name}/table name if the current node is a data table; dashboard ID if the current node is a component; and dataset ID if the current node is a custom SQL. | type | Integer int(11) | typology | Type: dataset, component, connection, dashboard, database table, or SQL statement. | detail_type | Integer int(11) | Detailed type | Resource specific type: specific type of dataset, component, connection, dashboard. | extra_data | String varchar(128) | Additional data | Additional attributes that hold information that will not be used as query criteria. •For example, for the markType of the Chart component. |
•LINAGE_DASHBOARD_REALTION •LINEAGE_QUERY_RELATION •LINEAGE_EXPERIMENT_RELATION The node relationship table, which records the dependencies between individual nodes, where: Dependencies between dashboards to datasets (connections) are stored in LINEAGE_DASHBOARD_RELATION; Dependencies between datasets to connections are stored in LINEAGE_QUERY_RELATION; Experiment-to-dataset dependencies are stored in LINEAGE_EXPERIMENT_RELATION. [td] Field Name | Data Type | Field English | Field Description | id | String varchar(128) | ID | Resource Node ID | depend_id | String varchar(128) | ID of the dependent resource node | The ID of the dependent resource node, which may be an indirect dependency, is stored for A->B->C->D, for A->B, A->C, A->D, B->C, B->D, C->D. | depend_type | Integer int(11) | Types of dependent resource nodes | The type of the dependent resource node. | upper_id | String varchar(128) | ID of the parent resource node | The ID of the direct parent resource node of the dependent resource as opposed to depend_id. For A->B->C->D, the upper_id stores the id of C when storing the relationship A->D. When child_id and upper_id are the same, the relationship is as follows: A -> B -> C When storing the dependencies of A and C, both child_id and upper_id are B. When storing dependencies for A and B, both child_id and upper_id are null values. | child_id | String varchar(128) | ID of the subordinate resource node | Relative to id, through which direct child node depends on depend_id. For A -> B -> C -> D, child_id stores the id of B when storing the A -> D relationship. When child_id and upper_id are the same, the relationship is as follows: A -> B -> C When storing the dependencies of A and C, both child_id and upper_id are B. When storing dependencies for A and B, both child_id and upper_id are null values. | type | String varchar(128) | Resource type | The type of resource node to which the id field corresponds. | update_time | Integer bigint | update time | The last update time of the data in this row, mainly used for deleting old records. |
•LINEAGE_RESOURCE_COLUMN The Column Information table, which primarily records information about columns created on the dashboard and columns created on the dataset. [td] Field Name | Data Type | Field English | Field Description | id | String varchar(128) | ID | The column ID, UUID, uniquely identifies a resource node. | resource_id | String varchar(128) | Resource ID | The ID of the dashboard or dataset in the LINEAGE_RESOURCE table. | owner_id | String varchar(128) | Resource ID | In the LINEAGE_RESOURCE table, the ID of the dashboard or dataset, for dashboards, a column needs to be identified piece by piece with the dataset ID. For datasets, resource_id and owner_id are equal. | col_name | String varchar(128) | column name | The name of the metadata region column for the corresponding dataset page. | col_view | String varchar(128) | column alias | Corresponds to the dataset page metadata region column alias. | data_type | Integer int(11) | data type | Column data types: integer, string, date, etc. | col_type | Integer int(11) | Column Type | Column types: normal columns, calculated columns, date expressions, split columns, space removal, value mapping, etc. | dim | Integer smallint(6) | Is it a dimension | Whether it is a dimensional column. |
•LINEAGE_COLUMN_RELATION Column Dependencies table that stores the dependencies of columns between Ulead dashboards and datasets. Column dependencies between self-service datasets. [td] Field Name | Data Type | Field English | Field Description | resource_id | String varchar(128) | Resource ID | The ID of the dashboard or dataset in the LINEAGE_RESOURCE table. | column_id | String varchar(128) | Column ID | Corresponds to the ID in the LINEAGE_RESOURCE_COLUMN table. | depend_type | Short integer tinyint | Types of dependencies | dependType has three values, indicating the three forms of dependency. •1: Indicates a column that is directly dependent on a resource of a complex type (self-service, combination). •2: Indicates an indirect dependency on a column of a dataset of a simple type. •3: Indicates a column that depends directly on a dataset of a simple type.
|
•LINEAGE_SCRIPT_META_COLUMN Used to describe the database columns that are used, database columns are the columns defined in the database table. The data is derived from SQL datasets and stored as metadata columns (BCol) in the case of physical tables and database columns parsed based on SQL in the case of SQL. [td] Field Name | Data Type | Field English | Field Description | id | String varchar(128) | ID | The column ID, UUID, uniquely identifies a resource node. | resource_id | String varchar(128) | Resource ID | The ID of the dashboard or dataset in the LINEAGE_RESOURCE table. | conn_id | String varchar(128) | Types of dependencies | dependType has three values, indicating the three forms of dependency. •1: Indicates a column that is directly dependent on a resource of a complex type (self-service, combination). •2: Indicates an indirect dependency on a column of a dataset of a simple type. •3: Indicates a column that depends directly on a dataset of a simple type. | catalog_name | String varchar(1000) | Database name | Database name | schema_name | String varchar(1000) | Table Structure Schema Name | Table Structure Schema Name | table_name | String varchar(1000) | Table (view) name | Table (view) name | column_name | String varchar(1000) | column name | column name | variant | Integer int(11) | Is the column a parameter | Whether the column is a column containing parameters | cause | Integer int(11) | scope of action | Which clause of the SQL is used for the preceding column? |
•LINEAGE_META_COLUMN_RELATION Used to describe the dependency between metadata columns, which are columns defined on the dataset by refreshing the data, and database columns, which are columns defined in the database table. Field Name | Data Type | Field English | Field Description | meta_col_id | String varchar(128) | Metadata Column ID | The id of the metadata column on the dataset, corresponding to the id of LINEAGE_RESOURCE_COLUMN. | sql_col_id | String varchar(128) | Data Table Column ID | The id of the column defined on the database table that corresponds to the id of LINEAGE_SCRIPT_META_COLUMN. |
•LINEAGE_LOCK When the lineage is initialized, the node that performed the initialization operation and the result of the initialization are recorded. ➢Description: LINEAGE_LOCK is only available in the database system, not in the file system. Field Name | Data Type | Field English | Field Description | node | string (computer science) | Node IP | C node IP address. | finished | short integer | Whether the initialization is completed | Whether the initialization is completed. | create_time | integers | Initialization start time | The time when initialization starts. |
•LINEAGE_SCRIPT SQL scripts used to record inputs in SQL datasets. Field Name | Data Type | Field English | Field Description | resource_id | string (computer science) | Resource ID | LINEAGE_RESOURCE table, the ID of the dataset, currently only SQL datasets are supported. | type | short integer | Script type | Currently, only SQL datasets are supported, so type has only one value, 1, which means it is a SQL script. | script | character array | Script content | Contents of the SQL script entered in the SQL editor |
3. Primary keys and indexes for the lineage table [td] Table Name | Field | Primary Key & Index Type | Index name | LINEAGE_RESOURCE | id | primary key |
| Type detail_type Name path | unique index | resource_name_index | path | ordinary index | resource_path_index | LINE-AGE_RELATION | Id | primary key |
| LINE-AGE_DASHBORD_RELATION LINE-AGE_QUERY_RELATION LINE-AGE_CONNEC-TION_RELATION | Id depend_id depend_type upper_id child_id | primary key | db_unique_index query_unique_index connection_unique_index | depend_id | ordinary index | {table-name}_depend_id_index | id | {tablename}_id_index | upper_id | {table-name}_upper_id_index | update_time | {tablename}_update_time_index ?? | LINE-AGE_SCRIPT_META_COLUMN | resource_id conn_id catalog_name schema_name table_name column_name | primary key |
| resource_id | ordinary index | resource_id_index | table_name column_name | table_column_index | conn_id | conn_id_index | LINE-AGE_META_COLUMN_RELATION | meta_col_id sql_col_id | primary key |
|
LINEAGE_RESOURCE_COLUMN | id | primary key |
| resource_id owner_id col_name | unique index | column_name_index | LINEAGE_COLUMN_RELATION | resource_id column_id | primary key |
| LINEAGE_LOCK | node | primary key | unique_index |
4. Description of table field values 4.1 Type and detail_type field values in the LINEAGE_RESOURCE table [td] Type | Name | DetailType | Name | 1 | Dashboard | 22016 | Favorites | 21248 | Shared from others | 16440 | Dashboard | 16642 | AD-hoc analysis dashboard | 16643 | Vivid Dashboard | 2 | subassemblies | 1 | Chart&Table | 2 | Table | 3 | Pivot | 4 | Image | 5 | Calendar | 6 | Dropdown list Filter | 7 | Range Filter | 8 | Component | 9 | Text | 10 | Gauge | 11 | Text Parameter | 14 | Dropdown List Parameter | 15 | Tab | 16 | Freestyle Table | 17 | Form Parameter | 18 | Group | 20 | Filter Component | 21 | Tree Filtering | 22 | List Parameter | 23 | Commit | 24 | Groups (for merging components in a free layout) | 26 | Web Page | 27 | Customized Component | 28 | Carousel | 3 | dataset | 16897 | SQL dataset | 16905 | Excel dataset | 16904 | Composite dataset | 16910 | Self-service dataset | 16902 | Embedded dataset | 16900 | Data Mart dataset | 16907 | Mongo dataset | 16899 | Custom dataset | 16912 | Multi-Dimension dataset | 156 | Neo4j dataset | 2021 | Streaming dataset | 2023 | RESTful datasets | 4 | physical table | 18435 | database table | 18436 | view | 18438 | Multidimensional dataset Cube | 18442 | Kylin's Cube | 18439 | Property View | 18440 | Analyze View | 18441 | Calculated view | 0 | Hana's database tables | 5 | customizable statement | 1 | SQL script | 5 | 2 | SQL Stored Procedures | 3 | Mongo Pipeline Statements | 4 | RESTful JSON Query Statements | 6 | connections | 0 | Generic | 1 | Oracle | 2 | DB2 | 3 | SQL Server | 4 | MySQL | 5 | Derby | 6 | Informix | 7 | Sybase | 8 | Access | 9 | Vertical | 10 | vividime | 11 | Hive | 12 | Mongo | 13 | Spark | 14 | Postgresql | 15 | Hana | 16 | Kylin | 17 | Max Compute | 18 | Kingbase | 19 | Impala | 20 | HBase | 21 | Presto | 22 | AnalyticDB | 23 | Essbase | 24 | SAP BW | 25 | SSAS | 27 | Teradata | 28 | GBase | 29 | Transwarp | 30 | Sparking | 32 | HuaWeiCloud DWS | 33 | GaussDB 200 | 34 | FusionInsight ELK | 35 | FusionInsight HD | 36 | ELASTIC SEARCH | 37 | GREENPLUM | 40 | ThunderEngine | 41 | GaussDB 100 | 42 | DaMeng | 43 | Thunder Engine | 44 | Neo4j | 45 | Clickhouse | 46 | TBDS | 47 | TBDS OLAP | 48 | CTSDB | 49 | RESTful | 8 | Market Folder | 8 | Market Folder | 9 | Timed task | 17154 | Timed task | 10 | Test | 20480 | Test | 11 | Complex Dashboard | 16644 | Complex Dashboard | 12 | Web Page | 17696 | Web Component |
4.2 Data_type field values in the LINEAGE_RESOURCE_COLUMN table [td] Data_type | Type | English Description | 2 | STRING | string (computer science) | 3 | BOOLEAN | boolean | 4 | FLOAT | floating point | 5 | DOUBLE | floating point | 7 | CHAR | character | 8 | BYTE | byte | 9 | SHORT | short integer | 10 | INTEGER | integer | 11 | LONG | long integer | 12 | DATE_TIME | timestamp | 13 | DATE | date | 14 | TIME | time |
4.3 Value of the col_type field in the LINEAGE_RESOURCE_COLUMN table [td] Col_type | English Description | 9 | Date column | 10 | Calculated column | 11 | Dimension column (script objects) | 12 | Metric column (script objects) | 43 | Simple dynamically calculated column (script objects) | 45 | Customized dynamically calculated column (script objects) | 90 | Data binning column (range) | 178 | Dependent column | 531 | Non-radar map marker fields (script objects) | 545 | Dimension columns (scripted objects, chart-specific) | 596 | Parameter column | 614 | Filter column | 1500 | Grouping columns (script object for chart grouping) | 1501 | Metrics group | 1502 | Map column | 1503 | Data binning column (grouping) | 1507 | Date conversion column | 1508 | Number conversion column | 1512 | value mapping column | 1513 | Missing value populated column | 1514 | Space bar column | 1515 | Disaggregate column | 1517 | MDX dimension column | 1518 | MDX metric column | 1527 | Hana, Kylin metric column | 1802 | Dynamically calculated column | 1803 |
|