Migrating from source cluster to destination cluster
After registering the source and destination clusters, and labeling the scanned datasets and workloads on the source cluster, you can start the migration process.
- Click Migrations on the left navigation pane.
- Click Start Your First Migration.
- Select Cloudera Distributed Hadoop 5, Cloudera Distributed Hadoop 6 or CDP Private Cloud Base as Source Type.
The registered source cluster is selected by default. You can select any other cluster using the drop-down menu. If you have not registered a source cluster at this point, click New Source and complete the steps in Registering the source cluster.
- Click Next.
CDP Public Cloud and the registered destination cluster are selected by default. You can select any other cluster using the drop-down menu. If you have not registered a destination cluster at this point, click New Target and complete the steps in Registering the destination cluster.
- Click Next.
- Click Next to confirm the migration path.
- Select one or more labels to migrate to the destination cluster.
You can choose whether the migration should Run Now or be completed in a Scheduled Run. Run Now means that all of the datasets and workloads selected with the labels are migrated as soon as the process starts. When you choose Scheduled Run, you can select the start date of the migration and set the frequency at which the migration runs.
- Enable YARN migration if required, and provide the Knox Token to access Cloudera Manager of the Data Hub cluster in CDP Public Cloud. You also must set the S3 Bucket Base Path for HDFS or Cloud Storage Path when migrating HDFS data.
The remaining settings on the Configurations page are automatically filled out, but can be changed based on your requirements.
- Click Next.
- Review the information on the Overview page and ensure that the information is correct.
At this point, you can go back and change any configuration if the information is not correct.
- Click Create to save the migration plan. You can follow the progress of creating the migration plan.
- Click Go to Migrations, and select the created CDH to CDP PC or CDP Private Cloud Base to CDP PC migration.
- Click Run First Step to start the migration.
You can see the status and steps of the migration process.
The Master Table shows a read-only version of the label and the related datasets, and the Configuration details the migration configurations.
The Data & Metadata Migration executes the data migration of the labeled datasets with Replication Manager.
You can also view the migration process of the data and workloads based on the selected services. For example, the Hive SQL Migration replicates the Hive SQL queries that were fixed to be Hive compliant during the Hive Workload migration steps. The Finalization waits until all the Replication Manager policies complete their jobs. If the label is created as a frequently scheduled migration, the Replication Manager waits only for the first jobs.
When migrating from CDP Private Cloud Base to CDP Public Cloud, you need to manually export and import the Ranger policies from the source cluster to the destination cluster using the following curl commands:
- Exporting policies
- To export all policies:
curl -X GET --header "text/json" -H "Content-Type: text/json" -o file.json -u [***USERNAME***]:[***PASSWORD***] "http://[***HOSTNAME***]:[***RANGER PORT***]/service/plugins/policies/exportJson"
- To export policies for a specific HDFS resource:
curl -X GET --header "text/json" -H "Content-Type: text/json" -o file.json -u [***USERNAME***]:[***PASSWORD***] "http://[***HOSTNAME***]:[***RANGER PORT***]/service/plugins/policies/exportJson?resource%3Apath=[***PATH NAME***]"
- To export policies for specific resources, such as a Hive database and Hive column:
curl -X GET --header "text/json" -H "Content-Type: text/json" -o file.json -u [***USERNAME***]:[***PASSWORD***] "http://[***HOSTNAME***]:[***RANGER PORT***]/service/plugins/policies/exportJson?resource%3Adatabase=[***DATABASE NAME***]&resource%3Acolumn=[***COLUMN NAME***]"
- Importing policies
- To import policies from a JSON file without servicesMap:
curl -i -X POST -H "Content-Type: multipart/form-data" -F 'file=@/path/file.json' -u [***USERNAME***]:[***PASSWORD***] http://[***HOSTNAME***]:[***RANGER PORT***]/service/plugins/policies/importPoliciesFromFile?isOverride=true
- To import policies from a JSON file with servicesMap:
curl -i -X POST -H "Content-Type: multipart/form-data" -F 'file=@/path/file.json' -F 'servicesMapJson=@/path/servicesMapping.json' -u [***USERNAME***]:[***PASSWORD***] http://[***HOSTNAME***]:[***RANGER PORT***]/service/plugins/policies/importPoliciesFromFile?isOverride=true
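The filtered export commands above share one pattern: each resource:<name>=<value> pair is appended to the exportJson URL as a query parameter, with the colon sent as %3A. The following is a minimal shell sketch of that pattern, not an official tool. The function name build_export_url and the host ranger.example.com:6080 are hypothetical placeholders, and the sketch only encodes the colon, so values that contain spaces, & or other reserved characters would need additional URL encoding.

```shell
#!/bin/sh
# Sketch only: build_export_url is a hypothetical helper, not part of
# Ranger or Cloudera tooling.
#
# Usage: build_export_url BASE_URL [resource:<name>=<value> ...]
# Prints the exportJson URL with every filter appended as a query parameter.
build_export_url() {
  url="$1/service/plugins/policies/exportJson"
  shift
  sep="?"                       # first filter starts the query string
  for filter in "$@"; do
    # Send the first colon (the one in "resource:<name>") as %3A,
    # matching the export commands in this section.
    encoded=$(printf '%s' "$filter" | sed 's/:/%3A/')
    url="${url}${sep}${encoded}"
    sep="&"                     # later filters are joined with &
  done
  printf '%s\n' "$url"
}

# Example (assumed host and filter values):
#   build_export_url "http://ranger.example.com:6080" \
#     "resource:database=sales" "resource:column=id"
# The printed URL can then be passed, quoted, as the last argument of the
# export curl command shown in this section.
```

Keeping the URL construction in one function makes it easier to quote the result once, so that the shell does not interpret the ? and & characters in the query string.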