Use Cloudera Replication Manager to migrate to Cloudera Public Cloud
Cloudera Replication Manager is a service that copies and migrates data from CDH
5.13 and higher clusters (HDFS, Hive, and HBase data) and Cloudera Private Cloud
Base 7.1.4 and higher clusters (HDFS, Hive external tables, and HBase data) to
Cloudera Public Cloud clusters. The supported Public Cloud services include
Amazon S3 and Microsoft Azure ADLS Gen2 (ABFS). Replication from HDP clusters to
Cloudera Public Cloud on Azure using Replication Manager is a beta feature and is
not available for general use.

About Replication Manager
Replication Manager is a service in Cloudera Public Cloud. You can create replication policies in Replication Manager to copy and migrate data from CDH (version 5.13 and higher) clusters (HDFS, Hive, and HBase data) and Cloudera Private Cloud Base (version 7.1.1 and higher) clusters (HDFS, Hive external tables, and HBase data) to Cloudera Public Cloud clusters. You can also replicate HDFS data from cloud storage to classic clusters (CDH or Cloudera Private Cloud Base clusters), and Hive external tables to Data Hubs. The supported Public Cloud services include Amazon S3 and Microsoft Azure ADLS Gen2 (ABFS). Replicating Hive managed tables using Replication Manager from HDP clusters to Cloudera Public Cloud is a beta feature and is not available for general use.

Fine-grained permission to access Cloudera Replication Manager
You can restrict the ability to view and use Cloudera Replication Manager in a Cloudera Public Cloud environment to specific users, so that you can govern access to critical replication functionality.

Accessing Replication Manager UI
You can access the Replication Manager user interface by logging in to Cloudera Data Platform and selecting Replication Manager.

How replication policies work
In Cloudera Replication Manager, you create replication policies to establish the rules you want applied to your replication jobs. The policy rules you set can include which cluster is the source and which is the destination, what data is replicated, what day and time the replication job occurs, the frequency of job runs, and bandwidth restrictions.

Using HDFS replication policies
You can use the HDFS replication policies in Cloudera Replication Manager to replicate HDFS data.
The HDFS replication policies can replicate HDFS data and metadata from classic clusters (CDH, Cloudera Private Cloud Base, and HDP) to Cloudera Public Cloud storage buckets such as S3 and ABFS, and from cloud storage to classic clusters (CDH or Cloudera Private Cloud Base clusters). To use an on-premises cluster (CDH or Cloudera Private Cloud Base) in the replication policy, you must register it as a classic cluster in the Cloudera Management Console. To use the cloud storage for data replication, you must register the cloud credentials in Replication Manager so that the Replication Manager service can access the cloud storage. You must also verify cluster access and configure minimum ports for replication before you create HDFS replication policies.

Using Hive replication policies
To create a Hive replication policy in Cloudera Replication Manager, you must configure the required Ranger policy in Ranger, register the on-premises cluster (CDH or Cloudera Private Cloud Base) as a classic cluster in Cloudera Management Console, register cloud account credentials in the Replication Manager service, verify cluster access, and configure minimum ports for replication. The replication load happens on the source on-premises cluster. You can replicate on-premises data to the cloud with a single cluster if the Metastore is running on the cloud.

Using HBase replication policies
To create an HBase replication policy in Cloudera Replication Manager, you must register the on-premises cluster (CDH or Cloudera Private Cloud Base) as a classic cluster in Cloudera Management Console, register cloud account credentials in the Replication Manager service, verify cluster access, and configure minimum ports for replication.
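
The prerequisites listed for HDFS, Hive, and HBase replication policies overlap heavily, so it can help to track them as a checklist. The following is a minimal sketch of that idea only. It does not call any Replication Manager API, and every name in it (`PolicyType`, `COMMON_PREREQUISITES`, `missing_prerequisites`, and the step strings) is a hypothetical label chosen for illustration.

```python
from enum import Enum


class PolicyType(Enum):
    """The three replication policy types named in this document."""
    HDFS = "hdfs"
    HIVE = "hive"
    HBASE = "hbase"


# Steps the text lists for every policy type (labels are illustrative).
COMMON_PREREQUISITES = (
    "register on-premises cluster as a classic cluster",
    "register cloud credentials in Replication Manager",
    "verify cluster access",
    "configure minimum ports for replication",
)

# Hive policies additionally require a Ranger policy, per the text.
EXTRA_PREREQUISITES = {
    PolicyType.HIVE: ("configure required Ranger policy",),
}


def required_prerequisites(policy_type):
    """Return every step needed before creating a policy of this type."""
    return EXTRA_PREREQUISITES.get(policy_type, ()) + COMMON_PREREQUISITES


def missing_prerequisites(policy_type, completed):
    """Return the required steps that are not yet in `completed`, in order."""
    done = set(completed)
    return [step for step in required_prerequisites(policy_type) if step not in done]


if __name__ == "__main__":
    # With only the common steps done, a Hive policy still lacks the Ranger step.
    print(missing_prerequisites(PolicyType.HIVE, COMMON_PREREQUISITES))
    # prints ['configure required Ranger policy']
```

Keeping the shared steps in one tuple and the per-type extras in a small mapping mirrors how the text describes them: the same four steps throughout, plus Ranger configuration for Hive.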

Troubleshooting replication policies in Cloudera Replication Manager
The troubleshooting scenarios in this topic help you resolve issues in Cloudera Replication Manager.

Appendix
Before you create replication policies, you must register the Amazon S3 or Azure cloud credentials to use as cloud storage in Cloudera Replication Manager, and register the on-premises clusters (CDH or Cloudera Private Cloud Base) as classic clusters in the Management Console.
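
Once the credentials and classic clusters are registered, a policy definition carries the rules listed under "How replication policies work": source, destination, data to replicate, schedule, frequency, and bandwidth restrictions. The sketch below models those rules as a plain Python data class to make the shape concrete. It is an assumption-laden illustration, not the Replication Manager policy schema: the class name, field names, units, and validation checks are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ReplicationPolicySketch:
    """Hypothetical container for the rules a replication policy sets."""
    name: str
    source_cluster: str                 # which cluster is the source
    destination: str                    # which cluster or bucket is the destination
    data_paths: Tuple[str, ...]         # what data is replicated
    start_time: str                     # day and time of the first run (ISO 8601 text)
    frequency_seconds: int              # how often the job runs
    max_bandwidth_mb_per_s: Optional[int] = None  # None means no restriction

    def validate(self):
        """Return a list of problems; the checks are illustrative assumptions."""
        problems = []
        if not self.data_paths:
            problems.append("at least one dataset or path is required")
        if self.frequency_seconds <= 0:
            problems.append("frequency must be positive")
        if self.max_bandwidth_mb_per_s is not None and self.max_bandwidth_mb_per_s <= 0:
            problems.append("bandwidth limit must be positive when set")
        return problems


if __name__ == "__main__":
    # All values below are made-up examples.
    policy = ReplicationPolicySketch(
        name="nightly-sales",
        source_cluster="onprem-cdh",
        destination="s3a://example-bucket/target",
        data_paths=("/data/sales",),
        start_time="2024-01-01T02:00:00",
        frequency_seconds=86400,
    )
    print(policy.validate())  # prints []
```

Validating a definition like this before submitting it is one way to catch an empty dataset list or a non-positive schedule interval early. The rules that Replication Manager actually enforces may differ.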