Reviewing prerequisites before migration
Before migrating from CDH 5, CDH 6, or Cloudera Private Cloud Base to Cloudera Public Cloud, review the prerequisites required for the migration process.
- Ensure that the Cloudera Migration Assistant server is deployed as described in Setting up Cloudera Migration Assistant server.
- The minimum supported version for a CDH 5 source cluster is CDH 5.16.1, or CDH 5.16.2 for HBase migration.
- The minimum supported version for a CDH 6 source cluster is CDH 6.3.3.
- The minimum supported version for a Cloudera Private Cloud Base source cluster is 7.1.7.
- For HBase migration, you need one of the following parcels, procured from Cloudera Professional Services:
  - CLOUDERA_OPDB_REPLICATION-1.0-1.CLOUDERA_OPDB_REPLICATION5.14.4.p0.31473501-el7.parcel
  - CLOUDERA_OPDB_REPLICATION-1.0-1.CLOUDERA_OPDB_REPLICATION6.3.3.p0.8959316-el7.parcel
- For data and metadata migration, you need a Data Lake cluster already created in a Cloudera Public Cloud environment. To create a Data Lake cluster, you can follow the process described in Registering an AWS environment and Registering an Azure environment based on your cloud provider.
- For a Hive workload migration, you need a Cloudera Data Engineering Data Hub already created in a Cloudera Public Cloud environment. To create a Cloudera Data Engineering Data Hub cluster, you can follow the process described in Creating a cluster on AWS and Creating a cluster on Azure based on your cloud provider.
- You must use the Cluster Connectivity Manager to manually register the source CDH cluster as a classic cluster in the Cloudera Control Plane, following the process described in Adding a CDH cluster (CCMv2).
- Information to gather before you begin the migration:
  - For the source CDH cluster: the Cloudera Manager URL, admin username and password, and the SSH user, port, and private key of the source nodes
  - For the destination Cloudera cluster/environment: the Cloudera Control Plane URL, admin username and password, and the SSH user, port, and private key
  - For S3: the S3 bucket access key, secret key, and credential name. You might also need the S3 bucket base path for HDFS files and the S3 bucket path for Hive external tables (these paths auto-fill from the selected destination cluster, but can be changed if needed)
- The Cloudera Manager node of the source CDH cluster must have Python 3.8.12 or higher installed.
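As a quick way to confirm the requirement above, you can run a small check with the node's default `python3` interpreter. This is only an illustrative sketch; the `meets_minimum` helper is not part of any Cloudera tooling.

```python
# Verify that the interpreter on the Cloudera Manager node meets the
# Python 3.8.12 minimum required for the migration.
import sys

MINIMUM = (3, 8, 12)  # minimum Python version from the prerequisites


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the given interpreter version is at least the minimum."""
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    required = ".".join(map(str, MINIMUM))
    if meets_minimum():
        print(f"OK: Python {current} meets the {required} minimum")
    else:
        sys.exit(f"Python {current} is older than the required {required}")
```

Run it as `python3 check_python.py` on the Cloudera Manager node and upgrade Python if the check fails.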
- Redaction must be disabled in Cloudera Manager. To disable it, follow the process described in Disabling Redaction of sensitive information.