Before migrating from Navigator to Apache Atlas, review the migration paths. You must
extract, transform, and import the content from Navigator to Apache Atlas. After the
migration is completed, services start producing metadata for Atlas and audit information
for Ranger.
There are two main paths that describe a Navigator-to-Atlas migration
scenario:
Upgrading Cloudera Manager to CDP 7.0.0 and upgrading all your CDH clusters to
CDP Runtime. In this case, you can stop Cloudera Navigator after migrating its
content to Atlas.
Upgrading Cloudera Manager to CDP 7.0.0 but continuing to manage some or all of
your existing CDH clusters as CDH 5.x or 6.x. In this case, the CDP cluster
running Cloudera Navigator continues to extract metadata and audit information
from the existing CDH clusters, and runs Atlas and Ranger to support metadata and
audit extraction from new or potential CDP Runtime clusters.
In both scenarios, you must complete the upgrade of Cloudera Manager
first. While Cloudera Manager is upgrading, Navigator pauses collection of metadata and
audit information from cluster activities. After the upgrade is complete, Navigator
processes the queued metadata and audit information.
In the timeline diagrams that follow, blue indicates steps that you trigger
manually, so you can control their timing.
The migration of Navigator content to Atlas occurs during the upgrade from CDH
to CDP. The migration involves three phases:
Extracting metadata from Navigator.
The Atlas installation includes a script (cnav.sh) that calls Navigator APIs to
extract all technical and business metadata from Navigator. The process takes about
4 minutes per million Navigator entities. The script compresses the result and
writes it to the local file system on the host where the Atlas server is
installed. Plan for about 100 MB of disk space for every million Navigator entities;
the requirement per entity is lower for larger numbers of entities.
Transforming the Navigator metadata into a form that Atlas can consume.
The Atlas installation includes a script (nav2atlas.sh) that converts the
extracted content. This process takes about 1.5 minutes per million Navigator
entities. The script compresses the result and writes it to the local file system
on the host where the Atlas server is installed. Plan for about 100 to 150 MB of
disk space for every million Navigator entities; use the higher end of the range
for larger numbers of entities.
Importing the transformed metadata into Atlas.
After the CDP upgrade completes, Atlas starts in "migration mode," where it waits
to find the transformed data file and does not collect metadata from cluster
services. When the transformation is complete, Atlas begins importing the content,
creating equivalent Atlas entities for each Navigator entity. This process takes
about 35 minutes per million Navigator entities, counting only the entities that
are migrated into Atlas.
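The per-million figures quoted for the three phases can be combined into a rough planning estimate. The following is a minimal sketch, assuming the figures scale linearly with the entity count. The text notes that disk requirements per entity shift at larger counts, so treat the output as a ballpark only. The names estimate_migration and MigrationEstimate are hypothetical and are not part of any Cloudera tooling.

```python
from dataclasses import dataclass
from typing import Optional

# Approximate figures quoted in the text, per one million Navigator entities.
EXTRACT_MIN_PER_MILLION = 4.0
TRANSFORM_MIN_PER_MILLION = 1.5
IMPORT_MIN_PER_MILLION = 35.0
EXTRACT_MB_PER_MILLION = 100.0
TRANSFORM_MB_PER_MILLION_LOW = 100.0
TRANSFORM_MB_PER_MILLION_HIGH = 150.0


@dataclass
class MigrationEstimate:
    extract_minutes: float
    transform_minutes: float
    import_minutes: float
    extract_mb: float
    transform_mb_low: float
    transform_mb_high: float

    @property
    def total_minutes(self) -> float:
        """Sum of the three phases, assuming they run one after another."""
        return self.extract_minutes + self.transform_minutes + self.import_minutes


def estimate_migration(navigator_entities: int,
                       migrated_entities: Optional[int] = None) -> MigrationEstimate:
    """Estimate time and disk space for a Navigator-to-Atlas migration.

    navigator_entities: total entities in Navigator (drives extraction
        and transformation).
    migrated_entities: entities actually imported into Atlas. The import
        figure counts only these. Defaults to navigator_entities
        (an assumption for the worst case).
    """
    if migrated_entities is None:
        migrated_entities = navigator_entities
    if navigator_entities < 0 or migrated_entities < 0:
        raise ValueError("entity counts must be non-negative")

    nav_millions = navigator_entities / 1_000_000
    migrated_millions = migrated_entities / 1_000_000

    return MigrationEstimate(
        extract_minutes=nav_millions * EXTRACT_MIN_PER_MILLION,
        transform_minutes=nav_millions * TRANSFORM_MIN_PER_MILLION,
        import_minutes=migrated_millions * IMPORT_MIN_PER_MILLION,
        extract_mb=nav_millions * EXTRACT_MB_PER_MILLION,
        transform_mb_low=nav_millions * TRANSFORM_MB_PER_MILLION_LOW,
        transform_mb_high=nav_millions * TRANSFORM_MB_PER_MILLION_HIGH,
    )
```

For example, under these assumptions estimate_migration(2_000_000) yields 8 minutes of extraction, 3 minutes of transformation, and 70 minutes of import, with about 200 MB needed for the extracted file and 200 to 300 MB for the transformed file.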
To make sure you do not miss metadata for cluster operations, allow time
after the Cloudera Manager upgrade and before the CDH upgrade for Navigator to process
all the metadata produced by CDH service operations. See Navigator Extraction Timing for more
information.
You can start extracting metadata from Navigator as soon as the CDP parcel is deployed on
the cluster. After CDP is started, Navigator no longer collects metadata or audit
information from the services on that cluster; instead services produce metadata for
Atlas and audit information for Ranger.
Migration from Navigator to Atlas can be run only in non-HA mode
Migration import works only with a single Atlas instance.
If Atlas was set up in HA mode before the migration, you must remove the
additional instances of Atlas so that the Atlas service has only one instance.
Then start Atlas in migration mode and complete the migration.
Perform the necessary checks to verify that the data has been imported correctly.
Restart Atlas in non-migration mode.
If you have Atlas set up in HA mode, retain only one instance and
remove the others.
Ensure that the ZIP files generated as output by the Nav2Atlas
conversion are placed on the same host as the remaining Atlas instance.
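Before starting Atlas in migration mode, it can help to confirm that the ZIP files are actually present on the Atlas host and are readable. The following is a minimal sketch, assuming you pass in the directory where the conversion output was written; the function name find_valid_zips is hypothetical. It checks archive integrity only and makes no claim about the content that Atlas expects inside the files.

```python
import zipfile
from pathlib import Path
from typing import List


def find_valid_zips(directory: str) -> List[Path]:
    """Return the *.zip files in `directory` that pass an integrity check.

    A file is kept only if it is a real ZIP archive and every member
    passes the CRC check performed by ZipFile.testzip().
    """
    valid = []
    for path in sorted(Path(directory).glob("*.zip")):
        # Skip files that merely carry a .zip extension.
        if not zipfile.is_zipfile(path):
            continue
        with zipfile.ZipFile(path) as archive:
            # testzip() returns the name of the first corrupt member,
            # or None when all members are intact.
            if archive.testzip() is None:
                valid.append(path)
    return valid
```

An empty result means that no usable archive was found in the directory you checked, which is worth resolving before the import step.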