Estimating the time and resources needed for transition
While the cluster is starting up, you can plan for and start the transition process.
- Inspect the Navigator installation to determine the number of Navigator entities that will be transitioned.
- Estimate the time and disk space required for each phase of the transition.
The following transition rates are approximate and depend on the resources available on the Atlas host and on other unknown factors. Note that the number of entities actually imported may be considerably less than the number of entities extracted, because the transition process discards HDFS entities that are not referenced by the processes that are transitioned (Hive, Impala, Spark).
Transition Phase | Transition Rate | Disk Space | Output File Size | Trial Data Points |
---|---|---|---|---|
Extraction | 4 minutes / 1 million entities | 100 MB / 1 million entities, less as volumes increase | 65 MB / 1 million entities | 10 million entities takes about 30 minutes; 256 million takes about 18 hours. |
Transformation | 1.5 minutes / 1 million entities | 100 to 150 MB / 1 million entities, higher end of range with larger volumes | 150 MB / 1 million entities | 10 million entities takes about 20 minutes; 256 million takes about 6 hours. |
Import | 35 minutes / 1 million migrated entities | N/A | N/A | 10 million entities takes about 4 hours; 256 million takes about 6 days. |
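
As a rough planning aid, the per-million rates in the table can be applied linearly to an entity count. The following Python sketch is illustrative only: the function name `estimate_transition` and the `imported_entities` parameter are hypothetical, and the constants are copied from the table. The trial data points show that real runs do not scale exactly linearly, so treat the results as ballpark figures rather than guarantees.

```python
# Approximate per-million-entity figures copied from the table above.
MINUTES_PER_MILLION = {"extraction": 4.0, "transformation": 1.5, "import": 35.0}
# (low, high) disk space in MB per million entities. The table notes that
# extraction needs less per million as volumes increase, so 100 MB is used
# here as a conservative figure. Import has no disk figure (N/A).
DISK_MB_PER_MILLION = {"extraction": (100, 100), "transformation": (100, 150)}


def estimate_transition(extracted_entities, imported_entities=None):
    """Return rough linear estimates for each transition phase.

    imported_entities is a hypothetical parameter: the number of entities
    expected to reach the import phase. It defaults to the extracted count,
    which gives a worst-case import time.
    """
    if imported_entities is None:
        imported_entities = extracted_entities
    extracted_m = extracted_entities / 1_000_000
    imported_m = imported_entities / 1_000_000

    estimate = {}
    for phase, rate in MINUTES_PER_MILLION.items():
        # Import time depends on migrated entities; the other phases
        # depend on the number of entities extracted from Navigator.
        millions = imported_m if phase == "import" else extracted_m
        entry = {"minutes": rate * millions}
        if phase in DISK_MB_PER_MILLION:
            low, high = DISK_MB_PER_MILLION[phase]
            entry["disk_mb"] = (low * extracted_m, high * extracted_m)
        estimate[phase] = entry
    return estimate


if __name__ == "__main__":
    for phase, entry in estimate_transition(10_000_000).items():
        print(phase, entry)
```

Passing a smaller `imported_entities` value models the case where unreferenced HDFS entities are discarded before import, which shortens the import estimate without changing the extraction or transformation estimates.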