Installing Apache Atlas

Migrate Atlas metadata when upgrading to HDP-3.0+

Perform the following steps to migrate the Atlas metadata from Titan to JanusGraph when upgrading from HDP-2.x to HDP-3.0 and higher versions.

  1. Before upgrading HDP and Atlas, use one of the following methods to determine the size of the Atlas metadata on the HDP-2.x cluster.
    • Click SEARCH on the Atlas web UI, then slide the green toggle button from Basic to Advanced. Enter the following query in the Search by Query box, then click Search.
      Asset select count() 
    • Run the following Atlas metrics REST API query:
      curl -g -X GET -u admin:admin -H "Content-Type: application/json" \
      -H "Cache-Control: no-cache" "http://<atlas_server>:21000/api/atlas/admin/metrics"

    Either method returns the number of Atlas entities, which you can use to estimate the time required to export the Atlas metadata from HDP-2.x and import it into HDP-3.x. This time varies depending on the cluster configuration. The following estimates are for a node with a quad-core processor and 4 GB of RAM, with both the Atlas and Solr servers on the same node:

    • Estimated duration for export from HDP-2.x: 2 million entities per hour.
    • Estimated duration for import into HDP-3.x: 0.75 million entities per hour.
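As a rough planning aid, the two rates above can be turned into a duration estimate. The following Python 3 sketch is illustrative only: the function name estimate_migration_hours is hypothetical, and the rates apply to the reference node described above, so treat the result as an order-of-magnitude figure for other hardware.

```python
# Throughput figures quoted above for a quad-core node with 4 GB of RAM.
# Other cluster configurations will differ.
EXPORT_ENTITIES_PER_HOUR = 2_000_000
IMPORT_ENTITIES_PER_HOUR = 750_000


def estimate_migration_hours(entity_count):
    """Return (export_hours, import_hours) for the given number of entities."""
    if entity_count < 0:
        raise ValueError("entity_count must be non-negative")
    export_hours = entity_count / EXPORT_ENTITIES_PER_HOUR
    import_hours = entity_count / IMPORT_ENTITIES_PER_HOUR
    return export_hours, import_hours


if __name__ == "__main__":
    # Example: a cluster reporting 3 million entities.
    export_h, import_h = estimate_migration_hours(3_000_000)
    print(f"export: {export_h:.1f} h, import: {import_h:.1f} h")
```

For 3 million entities this gives 1.5 hours for the export and 4 hours for the import, so the import usually dominates the maintenance window.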
    The Atlas migration exporter utility is used to migrate Atlas from HDP-2.x to HDP-3.x and higher versions. Download it from the following location: https://archive.cloudera.com/am2cm/hdp2/atlas-migration-exporter-0.8.0.2.6.6.0-332.tar.gz
  2. Before upgrading HDP and Atlas, perform the following steps on the HDP-2.x cluster.
    1. Replace the contents of /usr/hdp/2.6.<current version>/atlas/tools/migration-exporter/ with the contents of the downloaded exporter archive.
    2. Change the ownership of the directory using chown -R atlas:atlas <directory above>
    3. Execute the tool from the location above: atlas_migration.py -d <output directory>
    4. On the Ambari dashboard, click Atlas, then select Actions > Stop.
    5. Use the HDP-2.6 exporter tool to run the export. Typically the tool is located at /usr/hdp/2.6.<current version>/atlas/tools/migration-exporter/. Use the following command format to start exporting the Atlas metadata:
      python /usr/hdp/2.6.<current version>/atlas/tools/migration-exporter/atlas_migration.py -d <output directory>

      While running, the Atlas migration tool prevents Atlas use, and blocks all REST APIs and Atlas hook notification processing.

      As described previously, the time it takes to export the Atlas metadata depends on the number of entities and your cluster configuration. You can use the following command to display the export status:

      tail -f /var/log/atlas/atlas-migration-exporter.log

      When the export is complete, the data is placed in the specified output directory.

    6. On the Ambari dashboard, select Atlas > Configs > Advanced > Custom application-properties. Click Add Property, then add an atlas.migration.data.filename property and set its value to the full path of the atlas-migration-data.json file in the output directory you specified when you exported the HDP-2.x data.
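Before setting the property, it can help to confirm that the export file exists and to print the exact path to paste into Ambari. The following Python 3 sketch is a convenience check, not part of the Atlas tooling: the helper names migration_data_path and check_export are hypothetical, and the only detail taken from the step above is the atlas-migration-data.json file name.

```python
import os

# File name given in the step above.
MIGRATION_DATA_FILE = "atlas-migration-data.json"


def migration_data_path(output_dir):
    """Return the absolute path to use for atlas.migration.data.filename."""
    return os.path.join(os.path.abspath(output_dir), MIGRATION_DATA_FILE)


def check_export(output_dir):
    """Return the path if the export file exists and is non-empty, else raise."""
    path = migration_data_path(output_dir)
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    if os.path.getsize(path) == 0:
        raise ValueError("export file is empty: " + path)
    return path
```

Calling check_export with the directory passed to the -d option either returns the value for the property or fails early, which is cheaper than discovering a wrong path after the upgrade has started.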
  3. Upgrade HDP and Atlas.
  4. The upgrade starts Atlas automatically, which initiates the migration of the uploaded HDP-2.x Atlas metadata into HDP-3.x. During the migration import process, Atlas blocks all REST API calls and Atlas hook notification processing.
    You can use the following Atlas API URL to display the migration status:
    http://<atlas_server>:21000/api/atlas/admin/status

    The migration status is displayed in the browser window:

    {"Status":"Migration","currentIndex":139,"percent":67,"startTimeUTC":"2018-04-06T00:54:53.399Z"}
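If you prefer to watch the import from a terminal rather than a browser, the status response can be polled from a script. The following Python 3 sketch makes several assumptions that are not confirmed by this procedure: that any Status value other than "Migration" means the import is no longer running, that the status URL is reachable without authentication, and that Atlas is already accepting connections. The function names parse_status and wait_for_migration are hypothetical.

```python
import json
import time
import urllib.request


def parse_status(body):
    """Return (status, percent) from a status response body.

    percent is None when the field is absent from the response.
    """
    data = json.loads(body)
    return data.get("Status"), data.get("percent")


def wait_for_migration(url, interval_seconds=60):
    """Poll the status URL until Atlas stops reporting "Migration".

    Assumption: any other Status value means the import has ended.
    No credentials are sent; add them if your cluster requires them.
    """
    while True:
        with urllib.request.urlopen(url) as response:
            status, percent = parse_status(response.read().decode("utf-8"))
        if status != "Migration":
            return status
        print(f"migration in progress: {percent}%")
        time.sleep(interval_seconds)
```

For example, wait_for_migration("http://<atlas_server>:21000/api/atlas/admin/status") prints the percent value once per minute and returns the first status that is not "Migration".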
  5. When the migration is complete, select Atlas > Configs > Advanced > Custom application-properties, then click the red Remove button to remove the atlas.migration.data.filename property.