Support matrix for Replication Manager on CDP Private Cloud Base

Replication Manager replicates HDFS, Hive, and Impala data, and supports Sentry to Ranger replication from CDH (version 5.10 and higher) clusters to CDP Private Cloud Base (version 7.0.3 and higher) clusters.

Replication Manager supports the following features:
  • Replication to and from Amazon S3 from CDH 5.14 and Cloudera Manager version 5.13.

    Replication Manager does not support S3 as a source or destination when S3 is configured to use SSE-KMS.

  • Replication to and from Microsoft ADLS Gen1 from CDH 5.13 and Cloudera Manager 5.15, 5.16, 6.1.
  • Replication to Microsoft ADLS Gen2 (ABFS) from CDH 5.13 and Cloudera Manager 6.1.
  • Snapshots from CDH 5.15 and Cloudera Manager 5.15.

Starting from Cloudera Manager 6.1.0, Replication Manager ignores Hive tables backed by Kudu during replication. The change does not affect functionality because Replication Manager does not support tables backed by Kudu. This change was made to guard against data loss due to how the Hive Metastore, Impala, and Kudu interact.

Replicate data from CDH and CDP Private Cloud Base source clusters

The following tables show the support matrix for Replication Manager for CDH source clusters and CDP Private Cloud Base source clusters. Each table lists the source and destination clusters, the lowest supported versions of Cloudera Manager and Cloudera Runtime, and the services that Replication Manager supports:

CDH source clusters:

Source cluster | Lowest supported source Cloudera Manager version | Lowest supported source Cloudera Runtime version | Destination cluster | Supported services on Replication Manager
CDH 5, CDH 6 | 6.3.0 | 5.10 | CDP Private Cloud Base 7.0.3 | HDFS, Sentry to Ranger, Hive external tables

CDP Private Cloud Base source clusters:

Source cluster | Lowest supported source Cloudera Manager version | Lowest supported source Cloudera Runtime version | Destination cluster | Supported services on Replication Manager
CDP Private Cloud Base | 7.1.1 | 7.1.1 | CDP Private Cloud Base | HDFS, Hive external tables
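
The minimum versions in the support matrix above can be checked programmatically before you plan a replication. The following is a minimal sketch, not a Cloudera API: the dictionary layout and the function names `parse_version` and `is_supported_source` are hypothetical, and only the version numbers are taken from the matrix.

```python
def parse_version(text, width=3):
    """Turn a dotted version such as '6.3.0' into a tuple of ints.

    Short versions are padded with zeros so that '5.10' and '5.10.0'
    compare as equal.
    """
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


# Lowest supported source versions, copied from the support matrix.
# The keys are illustrative labels, not identifiers used by Cloudera Manager.
SUPPORT_MATRIX = {
    "CDH": {"min_cm": "6.3.0", "min_runtime": "5.10"},
    "CDP Private Cloud Base": {"min_cm": "7.1.1", "min_runtime": "7.1.1"},
}


def is_supported_source(cluster_type, cm_version, runtime_version):
    """Return True if the source cluster meets the listed minimum versions."""
    entry = SUPPORT_MATRIX.get(cluster_type)
    if entry is None:
        # Cluster types that are not in the matrix (for example HDP) are
        # treated as unsupported by this sketch.
        return False
    return (
        parse_version(cm_version) >= parse_version(entry["min_cm"])
        and parse_version(runtime_version) >= parse_version(entry["min_runtime"])
    )
```

Versions are compared as integer tuples rather than strings so that 5.10 sorts after 5.9.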

Replicate data from HDP 2 and HDP 3 source clusters

Replication Manager does not support replication to or from HDP clusters with Cloudera Manager 7.x. However, you can replicate data using other methods. The following table lists the alternate methods that are supported for replicating data to CDP Private Cloud Base clusters:

Lowest supported source version | Services that require alternate replication methods
HDP 2.6.5 | HDFS. Use DistCp to replicate data.
HDP 3.1.1 | HDFS. Use DistCp to replicate data.
HDP 3.1.1 | HBase. Use HBase replication to replicate HBase data.
HDP 3.1.1 | Hive external tables. For information about replicating this data, contact Cloudera Support.
HDP 3.1.5 | Hive ACID tables to CDP 7.1.6 and higher clusters. Use REPL commands to replicate data.
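
For the HDFS rows, a basic DistCp copy takes a source URI and a target URI. The following sketch only assembles the command line and does not run it. The host names, port, and paths are hypothetical placeholders, and a real copy between HDP and CDP clusters may need additional DistCp options that are not covered here.

```python
import shlex


def build_distcp_command(source_uri, target_uri):
    """Return the argument list for a basic 'hadoop distcp' copy."""
    return ["hadoop", "distcp", source_uri, target_uri]


# Hypothetical NameNode addresses and paths, for illustration only.
command = build_distcp_command(
    "hdfs://hdp-nn.example.com:8020/data/sales",
    "hdfs://cdp-nn.example.com:8020/data/sales",
)

# Print the command so it can be reviewed before it is run on a cluster host.
print(shlex.join(command))
```

Building the command as a list keeps each URI as a single argument if you later pass it to `subprocess.run`.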

Supported replication scenarios

Sentry-related replication
To perform Sentry to Ranger replication using HDFS and Hive replication policies, you must have Cloudera Manager version 6.3.1 or higher installed on the source cluster and Cloudera Manager version 7.1.1 or higher installed on the target cluster.
When the source cluster is Sentry-enabled and you want to run HDFS replication policies, use the hdfs user to run the replication policy. The replication policy copies the permissions of replicated files and tables to the target cluster. To use any other user account, make sure that you configure the user account to bypass Sentry ACLs during replication.
When you create a Hive replication policy, choose the appropriate options to ensure that the Sentry permissions are migrated to Ranger permissions. Replication Manager uses the authzmigrator tool to migrate the permissions from Sentry to Ranger during Hive replication.
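
The Sentry-related prerequisites above can be collected into a simple pre-flight check. This is a minimal sketch under stated assumptions: the function `sentry_replication_issues` and its parameters are hypothetical, and the `bypasses_sentry_acls` flag is something you would set yourself after configuring the user account, not a value read from Cloudera Manager.

```python
def version_tuple(text, width=3):
    """Turn '6.3.1' into (6, 3, 1), padding short versions with zeros."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (width - len(parts))
    return tuple(parts)


def sentry_replication_issues(source_cm, target_cm, run_as_user,
                              bypasses_sentry_acls=False):
    """Return a list of unmet prerequisites; an empty list means none found."""
    issues = []
    if version_tuple(source_cm) < (6, 3, 1):
        issues.append("source Cloudera Manager must be 6.3.1 or higher")
    if version_tuple(target_cm) < (7, 1, 1):
        issues.append("target Cloudera Manager must be 7.1.1 or higher")
    # HDFS policies on a Sentry-enabled source run as the hdfs user, unless
    # another account has been configured to bypass Sentry ACLs.
    if run_as_user != "hdfs" and not bypasses_sentry_acls:
        issues.append("run the policy as hdfs or configure the user to bypass Sentry ACLs")
    return issues
```

For example, `sentry_replication_issues("6.3.0", "7.1.1", "hdfs")` reports only the source Cloudera Manager version.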
Kerberos
Replication Manager supports the following replication scenarios when Kerberos authentication is used on a cluster:
  • Secure source to a secure destination.
  • Insecure source to an insecure destination.
  • Insecure source to a secure destination. The following requirements must be met for this scenario:
    • When a destination cluster has multiple source clusters, the source clusters must either all be secure or all be insecure. Replication Manager does not support a mix of secure and insecure source clusters.
    • The destination cluster must run Cloudera Manager 7.x or higher.
    • The source cluster must run a compatible Cloudera Manager version.
    • This replication scenario requires additional configuration. For more information, see Replicating from unsecure to secure clusters.
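
The Kerberos scenarios listed above reduce to a small decision rule. The sketch below is illustrative only: the function name and its boolean inputs are hypothetical. It treats a secure source with an insecure destination as unsupported because that combination is not listed, and it does not check the Cloudera Manager version requirements or the additional configuration needed for the insecure-to-secure case.

```python
def kerberos_scenario_supported(sources_secure, destination_secure):
    """Check a replication topology against the listed Kerberos scenarios.

    sources_secure: list of booleans, one per source cluster
                    (True means Kerberos is enabled on that cluster).
    destination_secure: True if Kerberos is enabled on the destination.
    """
    if not sources_secure:
        return False  # nothing to replicate from
    if len(set(sources_secure)) > 1:
        return False  # mix of secure and insecure sources is not supported
    source_secure = sources_secure[0]
    if source_secure and not destination_secure:
        return False  # secure to insecure is not among the listed scenarios
    return True
```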
Transport Layer Security (TLS)
You can use TLS with Replication Manager. Additionally, Replication Manager supports replication scenarios where TLS is enabled for non-Hadoop services (such as Hive and Impala) and TLS is disabled for Hadoop services (such as HDFS, YARN, and MapReduce).
Apache Knox
When Cloudera Manager is configured with Knox and the source and target clusters are Knox-SSO enabled, you must ensure that you use the Cloudera Manager port in the peer URL when you add the source and target clusters as peers.
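
As an illustration of the peer URL format, the following sketch builds a URL that points directly at the Cloudera Manager port. The host name is a hypothetical placeholder, and the ports 7183 (TLS) and 7180 (non-TLS) are assumed here because they are the usual Cloudera Manager defaults. Confirm the actual port for your deployment before using it.

```python
def peer_url(cm_host, use_tls=True, port=None):
    """Build a peer URL that targets the Cloudera Manager port directly."""
    if port is None:
        # Assumed Cloudera Manager defaults; override if your deployment differs.
        port = 7183 if use_tls else 7180
    scheme = "https" if use_tls else "http"
    return f"{scheme}://{cm_host}:{port}"


# Hypothetical host name, for illustration only.
print(peer_url("cm-source.example.com"))
```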