Service Metadata Entity Types
Metadata entities represent the data assets stored in, and the operations generated by, services running on the cluster. Users with the appropriate permissions (Metadata and Lineage Viewer, Full Administrator) can view metadata entities in the Cloudera Navigator console or by using the APIs.
The Cloudera Navigator console includes entities extracted from services that are enabled for integration with Navigator Metadata Service.
Navigator creates the following entities from metadata extracted from these services:
Service | Navigator Entity Types Created from Metadata |
---|---|
HDFS | directory, file |
HiveServer2 | database, table, field, operation, operation_execution, sub_operation, partition, view |
Impala | operation, operation_execution, sub_operation |
MapReduce (v1 and v2) | operation, operation_execution |
Oozie | operation, operation_execution |
Pig | operation, operation_execution |
Spark (v1 and v2) | operation, operation_execution |
Sqoop (v1) | operation, operation_execution, sub_operation |
YARN | operation, operation_execution |
Cluster | cluster_template, cluster_instance |
S3 | file, s3bucket |
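
For scripting against this mapping, the table above can be transcribed into a small lookup structure. The following is a minimal sketch, not part of any Navigator API: the names `ENTITY_TYPES_BY_SERVICE` and `services_creating` are hypothetical, and the service keys drop the version qualifiers shown in the table.

```python
# Entity types per service, transcribed from the table above.
# The dictionary and function names are illustrative, not Navigator identifiers.
ENTITY_TYPES_BY_SERVICE = {
    "HDFS": {"directory", "file"},
    "HiveServer2": {
        "database", "table", "field", "operation",
        "operation_execution", "sub_operation", "partition", "view",
    },
    "Impala": {"operation", "operation_execution", "sub_operation"},
    "MapReduce": {"operation", "operation_execution"},  # v1 and v2
    "Oozie": {"operation", "operation_execution"},
    "Pig": {"operation", "operation_execution"},
    "Spark": {"operation", "operation_execution"},  # v1 and v2
    "Sqoop": {"operation", "operation_execution", "sub_operation"},  # v1
    "YARN": {"operation", "operation_execution"},
    "Cluster": {"cluster_template", "cluster_instance"},
    "S3": {"file", "s3bucket"},
}


def services_creating(entity_type: str) -> list[str]:
    """Return the services (sorted by name) that create the given entity type."""
    return sorted(
        service
        for service, entity_types in ENTITY_TYPES_BY_SERVICE.items()
        if entity_type in entity_types
    )


if __name__ == "__main__":
    print(services_creating("file"))
    print(services_creating("sub_operation"))
```

A reverse lookup like this makes it easy to answer questions such as which services can be the source of a `sub_operation` entity.
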
Hive Operations and Cloudera Navigator Support Matrix
The table below lists Hive DDL and DML statements and whether Cloudera Navigator supports the operation with metadata and lineage extraction.
Hive operations | Cloudera Navigator Metadata Support | Comment |
---|---|---|
Abort | | Operation does not generate data flow. |
Alter Table/Partition/Column | | ALTER TABLE RENAME TO does not create a query entity. ALTER TABLE CHANGE column name does not create a query entity. ALTER commands do not result in lineage relationships. |
Create Table | | |
Create/Drop Macro | | Operation does not generate data flow. |
Create/Drop/Alter Index | | Operation does not generate data flow. |
Create/Drop/Alter View | | Operation does not generate data flow. |
Create/Drop/Alter/Use Database | | Operation does not generate data flow. |
Create/Drop/Grant/Revoke Roles and Privileges | | Operation does not generate data flow. |
Create/Drop/Reload Function | | Operation does not generate data flow. |
DELETE | | Requires ACID support. Hive ACID not supported. |
Describe | | Operation does not generate data flow. |
Drop/Truncate Table | | |
EXPORT | | Known Issue. External tables can be exported to HDFS but Cloudera Navigator does not create a query entity for the EXPORT. |
IMPORT | | Known Issue. External tables can be imported from HDFS but Cloudera Navigator does not create a query entity for the IMPORT. |
INSERT data into Hive Tables from queries | | |
INSERT data into the filesystem from queries | | |
INSERT values into tables from SQL | | |
LOAD | | Known Issue. Loading a CSV file from HDFS into an existing Hive table does not generate lineage. |
MERGE | | Requires ACID support. Hive ACID not supported. |
MSCK REPAIR | | Tables track their respective partitions. Queries to create or repair partitions using MSCK are not captured as query entities. |
Show | | Operation does not generate data flow. |
UPDATE | | Requires ACID support. Hive ACID not supported. |
The Cloudera Navigator extractor for Hive does not support all Hive statements. In particular, it does not support:
- Table-generating functions
- Lateral views
- Transform clauses
- Regular expressions (regex) in SELECT clause
Any Hive query that includes one of these constructs prevents its lineage diagram from completing successfully.
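
To flag such queries before relying on their lineage, a simple text scan can help. The following is a rough heuristic sketch, not how the Navigator extractor works: the function name, the pattern set, and the list of table-generating functions (Hive built-ins such as `explode` and `json_tuple`) are assumptions chosen for illustration. It matches on raw query text, so it checks the whole query rather than only the SELECT clause and can report false positives, for example when a keyword appears inside a string literal.

```python
import re

# Heuristic patterns for the unsupported constructs listed above.
# These are illustrative assumptions; a real check would need a Hive SQL parser.
UNSUPPORTED_PATTERNS = {
    # Built-in Hive table-generating functions (user-defined UDTFs are not covered).
    "table-generating function": re.compile(
        r"\b(?:explode|posexplode|inline|stack|json_tuple|parse_url_tuple)\s*\(",
        re.IGNORECASE,
    ),
    "lateral view": re.compile(r"\bLATERAL\s+VIEW\b", re.IGNORECASE),
    "transform clause": re.compile(r"\bTRANSFORM\s*\(", re.IGNORECASE),
    # Backtick-quoted column specification containing regex metacharacters.
    "regex column specification": re.compile(r"`[^`]*[.*+?()|\[\]][^`]*`"),
}


def find_unsupported_constructs(query: str) -> list[str]:
    """Return the names of unsupported constructs that appear in the query text."""
    return [
        name
        for name, pattern in UNSUPPORTED_PATTERNS.items()
        if pattern.search(query)
    ]


if __name__ == "__main__":
    sample = "SELECT id, item FROM orders LATERAL VIEW explode(items) t AS item"
    for construct in find_unsupported_constructs(sample):
        print(f"lineage may not complete: query uses a {construct}")
```

An empty result does not guarantee that lineage completes; it only means that none of these textual patterns matched.
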