Transitioning Navigator content to Atlas

During the transition from CDH to CDP Private Cloud Base, you can transition the metadata from Navigator to Apache Atlas, for a scalable and robust infrastructure that supports enhanced metadata searchability by collecting of metadata directly from your cluster.

Cloudera Runtime 7 includes Apache Atlas to collect technical metadata from cluster services. Atlas replaces Cloudera Navigator Data Management for these clusters. Cloudera has incorporated many features from Navigator into Apache Atlas to make sure that the rich metadata collected in Navigator can be represented in Atlas. Atlas provides scalable and robust infrastructure that supports metadata searches and lineage across enterprise production clusters.

You may choose not to transition Navigator content to Atlas at all: this document describes how to think about archiving your Navigator audits and metadata.

Whether you choose to transition Navigator contents to Atlas or not, this document describes how to use Atlas to accomplish the tasks you are accustomed to performing in Navigator.

What's transitioned?

Business metadata is transitioned into Atlas, including:

  • Tags
  • Custom properties (definitions and entity assignments)
  • Managed metadata properties (definitions and entity assignments)
  • Original and updated entity names and descriptions

Technical metadata from the following sources are transitioned into Atlas:

  • Hive
  • Impala
  • Spark
  • Referenced HDFS / S3

What's NOT transitioned?

  • Audits. In CDP, Ranger collects audit information for successful and failed access to objects under its control. This audit system is focused and powerful, but it's enough different from how Navigator collected audits that transition isn't appropriate. This document includes information on how to transition your auditing to Ranger and how to archive your existing Navigator audit information.
  • Entity Metadata. The following metadata entities in Navigator are not transitioned to Atlas:
    • Unreferenced S3 and HDFS entities. Files in HDFS and S3 that are not included in lineage from Hive, Spark, or Impala entities are not transitioned.
    • Metadata for Sqoop, Pig, Map-Reduce v1 and v2, Oozie, and YARN.
  • Policies. Navigator policies are not transitioned to Atlas.
  • Configuration settings. Configuration properties you've set in Cloudera Manager that determine Navigator behavior are not transitioned to the new environment. If you have properties that may apply in Atlas, such as authentication credentials, you'll need to reset them in the new environment.

Will Navigator still run in Cloudera Manager?

After upgrading Cloudera Manager to CDP, Navigator continues to collect metadata and audit information from CDH cluster services. There are no changes to Navigator functionality; all Navigator data is retained in the Cloudera Manager upgrade.

After upgrading a CDH cluster, services that previously sent metadata and audit information to Navigator, such as Hive, Impala, Spark, and HBase, are configured to pass metadata to Atlas. Navigator audit collection for those services is disabled. You can still access audits and metadata through Navigator; however, Navigator will not collect new information from cluster services. When you determine that you have maximized the value of the Navigator audits and successfully converted Navigator metadata to Atlas content, you can proceed to disable Navigator servers. Afterward, you may proceed with removing the role from the cluster.