Migrating Atlas data

After Atlas is available on the CDP Private Cloud Base cluster, you must import the Atlas data that the Atlas migration exporter utility exported from the HDP 3.1.5.x cluster.

For a node with 4 cores and 8 GB of heap space, the estimated import rate is 0.75 million entities per hour.

Use these properties to improve the speed of the Atlas data import:

  1. Navigate to Atlas > Configs > Advanced > Custom application-properties.
  2. Configure atlas.migration.mode.batch.size. The recommended value is 3000.
  3. Configure atlas.migration.mode.workers. The value depends on the number of cores on the node on which Atlas runs. Typically, set the value to (number of cores - 1) * 2. For example, for an 8-core node, set this property to (8 - 1) * 2 = 14.
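The sizing rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the floor of one worker for single-core nodes is an assumption, and the duration estimate assumes the rate quoted for a 4-core, 8 GB node holds for the whole import, which may not be true on other hardware.

```python
import os

BATCH_SIZE = 3000            # recommended batch size from the steps above
ENTITIES_PER_HOUR = 750_000  # estimate quoted for 4 cores and 8 GB of heap


def recommended_workers(cores: int) -> int:
    """Apply the (number of cores - 1) * 2 rule of thumb."""
    # The floor of 1 for single-core nodes is an assumption, not part of the guidance.
    return max(1, (cores - 1) * 2)


def estimated_import_hours(entity_count: int, rate: float = ENTITIES_PER_HOUR) -> float:
    """Rough duration in hours, assuming the quoted rate stays constant."""
    return entity_count / rate


def migration_properties(cores: int) -> list[str]:
    """Render the two import-speed properties as key=value lines."""
    return [
        f"atlas.migration.mode.batch.size={BATCH_SIZE}",
        f"atlas.migration.mode.workers={recommended_workers(cores)}",
    ]


if __name__ == "__main__":
    # os.cpu_count() reports the cores of the machine running this script,
    # so the result is only meaningful when run on the Atlas node itself.
    cores = os.cpu_count() or 1
    print("\n".join(migration_properties(cores)))
```

For an 8-core node, `recommended_workers(8)` gives 14, matching the worked example in step 3.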

Additional patches are applied after the migration completes. The atlas.patch.batchSize and atlas.patch.numWorkers properties help speed up these patches.

  1. Configure Atlas in CDP Private Cloud Base with the location of the exported data.
  2. Configure the atlas.migration.data.filename property.
  3. In Cloudera Manager, navigate to Clusters and select Atlas.
  4. From Atlas configuration, set the Advanced Configuration Snippet (Safety Valve) value to the location which contains exported Atlas data. For example,
  5. Set additional properties:
    • atlas.migration.mode.batch.size=3000
    • atlas.migration.mode.workers=<use the value from calculation above>
    • atlas.patch.batchSize=3000
    • atlas.patch.numWorkers=<use the value from calculation above>
    For example:
    • atlas.migration.data.filename=/var/lib/atlas-data
    • atlas.migration.mode.batch.size=3000
    • atlas.migration.mode.workers=14
    • atlas.patch.batchSize=3000
    • atlas.patch.numWorkers=14
  6. Save the configuration.
  7. Restart Atlas from Cloudera Manager. Go to Atlas > Actions > Restart.
  8. Atlas starts in migration mode and the data import begins. During the migration process, Atlas blocks all REST API calls and Atlas Hook notification processing.
  9. To check the migration status, open the following URL in a browser:
    http://[atlas_server]:21000/api/atlas/admin/status
    The migration status is displayed in the browser window:
    {"Status":"Migration","currentIndex":139,"percent":67,"startTimeUTC":"2018-04-06T00:54:53.399Z"}
  10. Monitor the progress of the import in the Atlas logs on the node where the migration is running. When the migration completes, the log contains this entry: Done! loadLegacyGraphSON. (GraphDBGraphSONMigrator:76)
  11. After the migration is complete, change the status of Atlas from migration mode to normal operation by removing the atlas.migration.data.filename property and restarting Atlas in Cloudera Manager.
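The status check in step 9 can also be scripted rather than refreshed by hand in a browser. The following is a minimal sketch, not a supported tool. The host name is a placeholder, the helper names are hypothetical, and it assumes the status endpoint is reachable without authentication and that any "Status" value other than "Migration" means the import has finished. Secured clusters will likely need credentials that this sketch does not handle.

```python
import json
import time
import urllib.request

# Placeholder host; substitute the Atlas server for your cluster.
STATUS_URL = "http://atlas-host.example.com:21000/api/atlas/admin/status"


def parse_status(body: str) -> dict:
    """Decode the JSON document returned by the status endpoint."""
    return json.loads(body)


def describe(status: dict) -> str:
    """Summarize a status document in one line."""
    if status.get("Status") == "Migration":
        percent = status.get("percent", "?")
        index = status.get("currentIndex", "?")
        return f"migrating: {percent}% (index {index})"
    return f"status: {status.get('Status')}"


def fetch_status(url: str = STATUS_URL, timeout: float = 10.0) -> dict:
    """Fetch and decode the current status (assumes no authentication)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_status(resp.read().decode("utf-8"))


def wait_while_migrating(url: str = STATUS_URL, interval_s: float = 60.0) -> dict:
    """Poll until the reported status is no longer "Migration"."""
    while True:
        status = fetch_status(url)
        print(describe(status))
        if status.get("Status") != "Migration":
            return status
        time.sleep(interval_s)
```

Calling `wait_while_migrating()` prints one progress line per poll. The completion log entry from step 10 remains the confirmation named in this procedure, so treat the script as a convenience only.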