Support matrix for Cloudera Replication Manager

You can use Replication Manager or alternate replication methods to replicate HDFS data, Hive external tables, and HBase data between on-premises clusters (CDH, Cloudera Base on premises, and HDP clusters) and Cloudera on cloud clusters (Amazon S3 (AWS), Microsoft Azure ADLS Gen2 (ABFS), and Google Cloud Platform (GCP)). Replication from HDP clusters to Cloudera on cloud Azure using Replication Manager is a beta feature and is not available for general use.

Replication Manager provides replication policies that you can create, edit, and manage to accomplish your data replication goals. You can use alternate replication methods for scenarios that Replication Manager does not support. Certain features in Cloudera Replication Manager are available only if the Cloudera Manager versions of both the source and target clusters support them. Verify that the Cloudera Manager versions of your source and target clusters support the feature you need.
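The version check described above can be sketched as a small script. This is a minimal illustration, not part of Replication Manager: the function names are hypothetical, and the sketch assumes that only the dotted numeric part of a version matters, so suffixes such as "CHF7", "-h7", or "(patch-5017)" are ignored. Patch and hotfix levels still need to be checked by hand against the tables on this page.

```python
import re


def parse_version(text):
    """Return the leading dotted-numeric part of a version string as a tuple of ints.

    Assumption: suffixes such as " CHF7", "-h7", or " (patch-5017)" are ignored.
    """
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(actual, minimum):
    """True if `actual` is at least `minimum`, comparing numeric components only."""
    a, m = parse_version(actual), parse_version(minimum)
    width = max(len(a), len(m))
    # Pad with zeros so that "7.12" and "7.12.0.0" compare as equal.
    a += (0,) * (width - len(a))
    m += (0,) * (width - len(m))
    return a >= m


def feature_supported(source_cm, target_cm, min_source_cm, min_target_cm):
    """Both clusters must meet their own minimum Cloudera Manager version."""
    return meets_minimum(source_cm, min_source_cm) and meets_minimum(target_cm, min_target_cm)


# Minimums from the CDH 6.3.3 to Data Hub row of Table 1: source 7.3.1, target 7.6.0.
print(feature_supported("7.4.4 (patch-5017)", "7.6.0", "7.3.1", "7.6.0"))
print(feature_supported("7.2.4", "7.6.0", "7.3.1", "7.6.0"))
```

The first call prints True because both versions meet their minimums; the second prints False because the source Cloudera Manager version 7.2.4 is lower than 7.3.1.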

You can use the following replication policies in Cloudera Replication Manager:
HDFS replication policies
Replicate HDFS data and metadata from:
  • on-premises clusters (CDH, Cloudera Base on premises, and HDP) to cloud storage.
  • cloud storage to Classic Clusters (CDH or Cloudera Base on premises clusters).
You can choose the replication frequency when you create the policy.
Hive replication policies
Support table-level replication, and can replicate Hive external tables from on-premises clusters (CDH and Cloudera Base on premises) to cloud storage and to Data Hubs. The replication policies can also:
  • replicate data stored in Hive tables, Hive metadata, data in Hive metastore, and Impala metadata (catalog server metadata) associated with Impala tables registered in the Hive metastore, and
  • migrate Sentry permissions to Ranger.
You can choose the replication frequency when you create the policy.
HBase replication policies
Replicate HBase data from a source classic cluster (CDH or Cloudera Base on premises cluster), COD, or Data Hub to a target Data Hub or COD cluster. You can also copy or replicate HBase data between different environments within a Virtual Private Cloud (VPC) using these policies.
Table 1. Supported cluster and runtime versions for HBase replication policies
| Source Cluster Type | Lowest Supported Source CDH/Cloudera Version | Lowest Supported Source Cloudera Manager Version | Target Cluster Type | Lowest Supported Target Cloudera Version | Lowest Supported Target Cloudera Manager Version |
| --- | --- | --- | --- | --- | --- |
| CDH | 5.16.2 | 7.4.4 (patch-5017) | COD (AWS) | 7.2.14 | - |
| CDH | 5.16.2 | 7.6.1 (patch-5610); 7.6.7 CHF1 and higher | COD (Azure) | 7.2.14 | - |
| CDH | 6.3.3 | 7.3.1 | Data Hub in Cloudera on cloud AWS/Azure | 7.2.14 | 7.6.0 |
| Cloudera Base on premises | 7.1.6* | 7.3.1 | Data Hub in Cloudera on cloud AWS/Azure | 7.2.14 | 7.6.0 |
| Cloudera Base on premises | 7.1.9 SP1 | 7.11.3 CHF7 | GCP | 7.2.16.1 | - |
| COD (AWS/Azure) | 7.2.14 | - | COD (AWS/Azure) | 7.2.14 | - |
| COD (GCP) | 7.2.16.1** | - | COD (GCP) | 7.2.16.1** | - |
| COD (GCP) | 7.2.16.500, 7.2.17.300, 7.2.18.0 | - | COD (GCP) | 7.2.16.500, 7.2.17.300, 7.2.18.0 | - |
*Cloudera Base on premises 7.1.6 and higher clusters must be Kerberos enabled to use them as source classic clusters in an HBase replication policy.

**You must add key-value pairs to register a Google account to use in Replication Manager. For more information about the key-value pairs, see Preparing to create an HBase replication policy.

HBase replication policies replicate all the data from the specified tables and then continue to replicate the changed data automatically without user intervention.
The following table lists the features and the Cloudera Manager instances that are required for source clusters and target clusters to use the features:
| Feature | Lowest supported source Cloudera Manager version | Lowest supported target Cloudera Manager version |
| --- | --- | --- |
| Register the GCP credentials to use in Replication Manager on the Cloud Credentials page. | 7.9.0-h7 and higher; 7.11.0-h3 and higher; 7.12.0.0 and higher | Supports all Cloudera on cloud Cloudera Manager versions. |
| Replicate HBase data simultaneously between multiple clusters*. | 7.9.0-h7 and higher; 7.11.0-h2 and higher; 7.12.0.0 and higher | 7.9.0-h7 and higher; 7.11.0-h2 and higher; 7.12.0.0 and higher |
| Replicate only those HBase tables where the replication scope is already enabled using the Select Source > Replicate only tables where replication is already enabled* option during the HBase replication policy creation process. | Supports all Cloudera on cloud Cloudera Manager versions. | 7.9.0-h7 and higher; 7.11.0-h3 and higher; 7.12.0.0 and higher |
| Specify the network load balancer (NLB) Endpoint after you enable the Select Destination > Replicate via a Network Load Balancer* option during the HBase replication policy creation process if the on-premises cluster uses NLB to communicate with the COD clusters. | CDH 5.16.2 | 7.12.0.100 |
| Specify the YARN queue bandwidth using the Initial Snapshot Settings > Maximum Bandwidth* option during the HBase replication policy creation process to export the HBase initial snapshot. | 7.9.0-h7 and higher; 7.11.0-h3 and higher; 7.12.0.0 and higher | 7.9.0-h7 and higher; 7.11.0-h3 and higher; 7.12.0.0 and higher |
| Enter Initial Snapshot Settings > Maximum parallel snapshots* to specify the maximum number of tables to process in parallel during the initial snapshot export and import step for an HBase replication policy. If you do not enter any value, Replication Manager chooses an appropriate value, depending on the resources in the source and target cluster, to optimize the performance. | Supports all Cloudera on cloud Cloudera Manager versions. | 7.9.0-h7 and higher; 7.11.0-h3 and higher; 7.12.0.0 and higher |
| Add IDBroker credentials* to use in Replication Manager on the Cloud Credentials page. | 7.11.3 CHF7 | 7.11.3 CHF7 |
| Enter the Select Source > Export snapshot user* field during the HBase replication policy creation process to specify the username to export the initial snapshot to the target. | 7.11.3 CHF7 | 7.11.3 CHF7 |
*To enable this feature, contact your Cloudera Account team.

Replication Manager replicates HDFS (Cloudera Base on premises source clusters and Cloudera on cloud storage on AWS and Azure), Hive external tables (Cloudera Base on premises source clusters), and HBase (Cloudera Base on premises source clusters) data to Cloudera on cloud (Amazon S3 and Microsoft Azure ADLS Gen2 (ABFS)) clusters. You can use the replication plugin as an alternate replication method to replicate HBase data for scenarios that are not supported by Replication Manager.

The following tables list the minimum source and destination cluster versions, minimum Cloudera Manager versions, supported cloud providers, and supported scenarios:

| Source cluster | Lowest supported source Cloudera Manager version | Lowest supported source Cloudera Runtime version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
| --- | --- | --- | --- | --- | --- |
| Cloudera Base on premises | 7.1.1 | 7.1.1 | Cloudera on cloud AWS/Azure | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell and HBase data replication. |
| Cloudera Base on premises | 7.1.1 | 7.1.1 | Data Lake in Cloudera on cloud AWS/Azure | Hive external tables | |
| Cloudera Base on premises | 7.9.0 | 7.1.1 | Data Hub in Cloudera on cloud AWS/Azure | Hive external tables | None |
| Cloudera Base on premises | 7.3.1 | 7.1.6 | Data Hub in Cloudera on cloud AWS/Azure | HBase | None |
| Cloudera Base on premises | 7.11.3 CHF7 | 7.1.9 SP1 | Cloudera on cloud GCP | HDFS, Hive external tables, HBase | None |
Consider the following limitations while using Cloudera on cloud source and Cloudera on cloud target clusters:
  • Cross-cloud replication, that is, replication from AWS to Azure and vice versa, is not supported.
  • The source and target clusters must use the same account.
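The two limitations above lend themselves to a simple pre-check before you create a policy. The following is a minimal sketch only: the `CloudCluster` class, its fields, and the `replication_blockers` function are hypothetical names introduced for illustration and do not correspond to any Replication Manager API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudCluster:
    """Hypothetical description of a Cloudera on cloud cluster."""
    name: str
    provider: str    # "AWS", "Azure", or "GCP"
    account_id: str  # assumed identifier of the account the cluster belongs to


def replication_blockers(source, target):
    """Return the limitations from the list above that this cluster pair violates."""
    blockers = []
    if source.provider.lower() != target.provider.lower():
        blockers.append(
            f"cross-cloud replication ({source.provider} to {target.provider}) is not supported"
        )
    if source.account_id != target.account_id:
        blockers.append("source and target clusters must use the same account")
    return blockers


src = CloudCluster("cod-src", "AWS", "acct-1")
dst = CloudCluster("cod-dst", "Azure", "acct-1")
print(replication_blockers(src, dst))
```

An empty list means that neither limitation applies to the pair. The check does not cover the version requirements listed in the tables on this page.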
| Source cluster | Destination cluster | Supported services on Replication Manager | Services that require alternate replication methods |
| --- | --- | --- | --- |
| Cloudera on cloud AWS* / Azure | CDH 5.x; CDH 6.x; HDP 2.x; HDP 3.x | Not applicable | HBase. To replicate HBase data, see COD replication in a Nutshell and HBase data replication. |
| Cloudera on cloud AWS* | CDH 5.9.0 and higher; Cloudera Base on premises 7.1.7 SP1 and higher | HDFS | None |
| Cloudera on cloud Azure | CDH 6.1.0 and higher; Cloudera Base on premises 7.1.7 SP1 and higher | HDFS | None |
| Cloudera on cloud GCP 7.2.18 and higher | Cloudera Base on premises 7.1.9 SP1 and higher | HDFS | None |
| COD version 7.2.14 and higher - Cloudera on cloud AWS | AWS | HBase | None |
| COD version 7.2.14 and higher - Cloudera on cloud Azure | Azure | HBase | None |
| COD version 7.2.16.1 and higher - Cloudera on cloud GCP | GCP | HBase | None |
*Replication Manager does not support S3 as a source or destination when S3 is configured to use SSE-KMS.

Replication Manager replicates HDFS data (CDH source clusters and HDP source clusters), Hive external tables (CDH source clusters), and HBase data (CDH 6 source clusters) to Cloudera on cloud (Amazon S3 and Microsoft Azure ADLS Gen2 (ABFS)) clusters. Replication from HDP clusters to Cloudera on cloud Azure using Replication Manager is a beta feature and is not available for general use. You can use alternate methods to replicate Hive external tables and HBase data in scenarios that Replication Manager does not support.

The following tables list the minimum CDH and HDP source cluster versions, minimum Cloudera Manager versions, supported cloud providers, and supported scenarios:

Table 2. Replicate data from CDH 5 source clusters
| Source cluster | Lowest supported source Cloudera Runtime version | Lowest supported source Cloudera Manager version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
| --- | --- | --- | --- | --- | --- |
| CDH 5 | 5.10 | 6.3.0 | Cloud storage in Cloudera on cloud AWS | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell, Migrating HBase data, and HBase data replication. |
| CDH 5 | 5.10 | 6.3.0 | Data Lake in Cloudera on cloud AWS | Sentry to Ranger*; Hive external tables | |
| CDH 5 | 5.10 | 6.3.4 | Cloud storage in Cloudera on cloud Azure | HDFS | |
| CDH 5 | 5.10 | 6.3.4 | Data Lake in Cloudera on cloud Azure | Sentry to Ranger*; Hive external tables | |
| CDH 5 | 5.10 | 7.9.0 | Data Hub in Cloudera on cloud AWS/Azure | Hive external tables | None |
*To perform the Sentry policy replication, you must be running the Sentry service on CDH 5.12 or higher, or any CDH 6.x version.
Table 3. Replicate data from CDH 6 source clusters
| Source cluster | Lowest supported source Cloudera Runtime version | Lowest supported source Cloudera Manager version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
| --- | --- | --- | --- | --- | --- |
| CDH 6 | 6.1 | 6.3.0 | Cloud storage in Cloudera on cloud AWS | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell, Migrating HBase data, and HBase data replication. |
| CDH 6 | 6.1 | 6.3.0 | Data Lake in Cloudera on cloud AWS | Sentry to Ranger*; Hive external tables | |
| CDH 6 | 6.1 | 7.1.1 / 6.3.4 | Cloud storage in Cloudera on cloud Azure | HDFS | |
| CDH 6 | 6.1 | 7.1.1 / 6.3.4 | Data Lake in Cloudera on cloud Azure | Sentry to Ranger*; Hive external tables | |
| CDH 6 | 6.1 | 7.9.0 | Data Hub in Cloudera on cloud AWS/Azure | Sentry to Ranger*; Hive external tables | |
| CDH 6 | 6.3.3 | 7.3.1 | Data Hub in Cloudera on cloud AWS/Azure | HBase | None |
*To perform the Sentry policy replication, you must be running the Sentry service on CDH 5.12 or higher, or any CDH 6.x version.
Table 4. Replicate data from HDP 2 and HDP 3 source clusters
| Lowest supported source HDP version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
| --- | --- | --- | --- |
| HDP 2.6.5* | AWS | HDFS | |
| HDP 2.6.5* | Azure | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell and HBase data replication. |
| HDP 3.1.1* | AWS; Azure | HDFS | |
*No alternate replication methods are available for HDFS, Ranger, and Atlas replication.
