Estimating migration time and resources required

Before you migrate, estimate the time and resources that the migration requires.

While the cluster is starting up, you can plan for and start the migration process.

  1. Inspect the Navigator installation to determine the number of Navigator entities that will be migrated. See Understanding migrated Navigator entities.
  2. Estimate the time and disk space required for each phase of the migration.

    The following migration rates are approximate and depend on the resources available on the Atlas host and other unknown factors. Note that the number of entities actually imported may be considerably less than the number of entities extracted, because the migration process discards HDFS entities that are not referenced by migrated processes (such as Hive, Impala, and Spark).

| Migration Phase | Migration Rate | Disk Space | Output File Size | Trial Data Points |
|---|---|---|---|---|
| Extraction | 4 minutes / 1 million entities | 100 MB / 1 million entities, less as volumes increase | 65 MB / 1 million entities | 10 million entities takes about 30 minutes; 256 million takes about 18 hours. |
| Transformation | 1.5 minutes / 1 million entities | 100 to 150 MB / 1 million entities, higher end of range with larger volumes | 150 MB / 1 million entities | 10 million entities takes about 20 minutes; 256 million takes about 6 hours. |
| Import | 35 minutes / 1 million migrated entities | N/A | N/A | 10 million entities takes about 4 hours; 256 million takes about 6 days. |
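
As a rough planning aid, the per-million rates in the table above can be turned into a back-of-the-envelope estimate. The following Python sketch is illustrative only: the function name `estimate_migration` and its parameters are hypothetical and not part of any migration tooling. It assumes that time and disk space scale linearly with entity count, which the trial data points show is only approximately true, and it uses the upper end of the transformation disk-space range. Transformation is assumed to run over all extracted entities.

```python
ENTITIES_PER_MILLION = 1_000_000

# Approximate rates from the table:
# (minutes per million entities, disk MB per million entities or None if N/A)
PHASE_RATES = {
    "extraction": (4.0, 100.0),
    "transformation": (1.5, 150.0),  # upper end of the 100 to 150 MB range
    "import": (35.0, None),          # the table lists no disk figure for import
}


def estimate_migration(extracted_entities, imported_entities=None):
    """Return a rough per-phase estimate of minutes and disk space (MB).

    extracted_entities: number of Navigator entities to be extracted.
    imported_entities: number of entities expected to be imported; defaults
        to extracted_entities, which is a pessimistic assumption because
        unreferenced HDFS entities are discarded before import.
    """
    if imported_entities is None:
        imported_entities = extracted_entities

    counts = {
        "extraction": extracted_entities,
        "transformation": extracted_entities,  # assumption: same entity count
        "import": imported_entities,
    }

    estimate = {}
    for phase, (minutes_per_million, mb_per_million) in PHASE_RATES.items():
        millions = counts[phase] / ENTITIES_PER_MILLION
        estimate[phase] = {
            "minutes": millions * minutes_per_million,
            "disk_mb": None if mb_per_million is None else millions * mb_per_million,
        }
    return estimate


if __name__ == "__main__":
    for phase, values in estimate_migration(10_000_000).items():
        print(phase, values)
```

For 10 million entities, this linear model gives 40 minutes for extraction, 15 minutes for transformation, and 350 minutes for import, against trial observations of about 30 minutes, 20 minutes, and 4 hours. Treat the output as an order-of-magnitude guide, not a commitment.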