Cloudera Navigator Metadata Architecture

Cloudera Navigator metadata provides data discovery and data lineage functions. The following figure depicts the Cloudera Navigator metadata architecture.



The Navigator Metadata Server performs the following functions:
  • Obtains connection information about CDH services from the Cloudera Manager Server
  • At periodic intervals, extracts metadata for the entities managed by those services
  • Manages and applies metadata extraction policies during metadata extraction
  • Indexes and stores entity metadata
  • Manages authorization data for Cloudera Navigator users
  • Manages audit report metadata
  • Generates metadata and audit analytics
  • Implements the Navigator UI and API

The Navigator database stores policies, user authorization and audit report metadata, and analytic data. The storage directory stores the extraction state and extracted metadata.

The Cloudera Navigator Metadata Server manages metadata about the entities in a CDH cluster and the relations between those entities. The metadata schema defines the types of metadata that are available for each supported entity type.

For example, the following figure shows the entity details of a file entity:



Three classes of metadata are defined for all entities:
  • Technical Metadata - Metadata defined when entities are extracted. Such metadata includes:
    • Name of an entity
    • Service that manages or uses the entity
    • Type
    • Path to the entity
    • Date and time of creation
    • Access permissions
    • Date and time of modification
    • Size
    • Owner
    • Purpose
    • Relations between entities: parent-child, data flow, and instance of
    You cannot modify technical metadata.
  • Custom Metadata - Key-value pairs that can be added to entities. You can add and modify custom metadata before and after entities are extracted.
  • Managed Metadata - Descriptions, key-value pairs, and tags that can be added to entities. Managed metadata key-value pairs are similar to custom metadata key-value pairs, but their keys can be defined within a namespace and their values can be required to conform to constraints (for example, that the value be a date). You can add and modify managed metadata after entities are extracted.
In addition, for Hive entities, Cloudera Navigator supports extended attributes, which are added by Hive clients before entities are extracted.
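
The three metadata classes can be pictured with a small data model. The following Python sketch is illustrative only and does not use the Cloudera Navigator API: the names Entity, ManagedProperty, set_custom, and set_managed, along with the sample keys and values, are assumptions made for this example. It shows technical metadata as read-only, custom metadata as free-form key-value pairs, and managed metadata as namespaced keys whose values are checked against a constraint.

```python
from dataclasses import dataclass, field
from datetime import date
from types import MappingProxyType
from typing import Any, Callable, Dict, Mapping, Set, Tuple


@dataclass(frozen=True)
class ManagedProperty:
    """Hypothetical definition of a managed metadata key within a namespace."""
    namespace: str
    key: str
    validate: Callable[[Any], bool]  # value constraint, e.g. "must be a date"


@dataclass
class Entity:
    """Hypothetical entity carrying the three metadata classes."""
    technical: Mapping[str, Any]                           # set at extraction
    custom: Dict[str, str] = field(default_factory=dict)   # free-form pairs
    managed: Dict[Tuple[str, str], Any] = field(default_factory=dict)
    tags: Set[str] = field(default_factory=set)
    description: str = ""

    def __post_init__(self) -> None:
        # Store a read-only view of a copy: technical metadata cannot be modified.
        self.technical = MappingProxyType(dict(self.technical))

    def set_custom(self, key: str, value: str) -> None:
        # Custom metadata accepts any key and value.
        self.custom[key] = value

    def set_managed(self, prop: ManagedProperty, value: Any) -> None:
        # Managed metadata must satisfy the constraint declared for its key.
        if not prop.validate(value):
            raise ValueError(f"invalid value for {prop.namespace}.{prop.key}")
        self.managed[(prop.namespace, prop.key)] = value


# Sample values below are made up for illustration.
entity = Entity(technical={"name": "sales.csv", "type": "FILE", "owner": "etl"})
entity.set_custom("project", "q3-report")

retain_until = ManagedProperty(
    "governance", "retain_until", lambda v: isinstance(v, date)
)
entity.set_managed(retain_until, date(2030, 1, 1))
```

In this sketch, assigning to entity.technical["name"] raises a TypeError, and passing a non-date value to set_managed for retain_until raises a ValueError, which mirrors the read-only and value-constraint behavior described above.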