Support matrix for Cloudera Replication Manager
You can use Replication Manager or alternate replication methods to replicate HDFS, Hive external tables, and HBase data between on-premises clusters (CDH, Cloudera Private Cloud Base, and HDP clusters) and Cloudera Public Cloud clusters on Amazon S3 (AWS), Microsoft Azure ADLS Gen2 (ABFS), and Google Cloud Platform (GCP). Replication from HDP clusters to Cloudera Public Cloud Azure using Replication Manager is a beta feature and is not available for general use.
List of features supported by Cloudera Replication Manager
Replication Manager provides replication policies that you can create, edit, and manage to accomplish your data replication goals. You can use alternate replication methods for scenarios that Replication Manager does not support. Certain features in Cloudera Replication Manager are available only if the Cloudera Manager versions of both the source and target clusters support the feature. Verify that the Cloudera Manager versions of your source and target clusters support the required feature.
Supported replication policies
- HDFS replication policies
- Replicate HDFS data and metadata from:
- on-premises clusters (CDH, Cloudera Private Cloud Base, and HDP) to cloud storage.
- cloud storage to classic clusters (CDH or Cloudera Private Cloud Base clusters).
- Hive replication policies
- Support table-level replication, and can replicate Hive external tables from on-premises clusters (CDH and Cloudera Private Cloud Base) to cloud storage and to Data Hubs. The replication policies can also:
- replicate data stored in Hive tables, Hive metadata, data in Hive metastore, and Impala metadata (catalog server metadata) associated with Impala tables registered in the Hive metastore, and
- migrate Sentry permissions to Ranger.
- HBase replication policies
- Replicate HBase data from a source classic cluster (CDH or Cloudera Private Cloud Base cluster), COD, or Data Hub to a target Data Hub or COD cluster. You can also copy or replicate HBase data between different environments within a Virtual Private Cloud (VPC) using these policies.
- Use the replication plugin for HBase data. For more information, see COD replication in a Nutshell, Cloudera replication plugin, and HBase data replication.
- Contact Cloudera Support for Hive external tables.
Supported features
Feature | Lowest supported source Cloudera Manager version | Lowest supported target Cloudera Manager version |
---|---|---|
Register the GCP credentials to use in Replication Manager on the Cloud Credentials page. | | Supports all Cloudera Public Cloud Cloudera Manager versions. |
Replicate HBase data simultaneously between multiple clusters*. | | |
Replicate only those HBase tables where the replication scope is already enabled, using the corresponding option* during the HBase replication policy creation process. | Supports all Cloudera Public Cloud Cloudera Manager versions. | |
Specify the network load balancer (NLB) Endpoint after you enable the option during the HBase replication policy creation process if the on-premises cluster uses NLB to communicate with the COD clusters. | CDH 5.16.2 | 7.12.0.100 |
Specify the YARN queue bandwidth using the corresponding option* during the HBase replication policy creation process to export the HBase initial snapshot. | | |
Enter a value to specify the maximum number of tables to process in parallel during the initial snapshot export and import step for an HBase replication policy. If you do not enter any value, Replication Manager chooses an appropriate value, depending on the resources in the source and target cluster, to optimize the performance. | Supports all Cloudera Public Cloud Cloudera Manager versions. | |
Add IDBroker credentials* to use in Replication Manager on the Cloud Credentials page. | 7.11.3 CHF7 | 7.11.3 CHF7 |
Enter a value in the corresponding field* during the HBase replication policy creation process to specify the username to export the initial snapshot to the target. | 7.11.3 CHF7 | 7.11.3 CHF7 |
*To enable this feature, contact your Cloudera Account team. |
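The version check described above can be sketched in a few lines of Python. This is only an illustration, not a Cloudera tool: the function names `parse_cm_version` and `meets_minimum` are hypothetical, and the ordering rule (numeric comparison of the dotted version, with a `CHF` suffix sorting after the same version without one) is an assumption about how the version strings in the table compare.

```python
import re


def parse_cm_version(text):
    """Turn a string such as '7.11.3 CHF7' or '7.12.0.100' into a sortable tuple.

    Assumption: versions compare numerically, component by component, and a
    CHF (cumulative hotfix) suffix sorts after the same base version without it.
    """
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)*)(?:\s+CHF(\d+))?\s*", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    base = tuple(int(part) for part in match.group(1).split("."))
    # Pad to four components so that '7.11.3' and '7.11.3.0' compare equal.
    base = base + (0,) * (4 - len(base))
    hotfix = int(match.group(2)) if match.group(2) else 0
    return base, hotfix


def meets_minimum(actual, minimum):
    """Return True if the actual version is at least the minimum version."""
    return parse_cm_version(actual) >= parse_cm_version(minimum)


# Example: the IDBroker credentials row lists 7.11.3 CHF7 for both clusters.
source_ok = meets_minimum("7.12.0.100", "7.11.3 CHF7")
target_ok = meets_minimum("7.11.3", "7.11.3 CHF7")
print(source_ok, target_ok)
```

A feature is usable only when both checks pass, so in this example the target cluster would need an upgrade first.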
Replicate data from Cloudera Private Cloud Base and Cloudera Public Cloud source clusters
Replication Manager replicates HDFS (Cloudera Private Cloud Base source clusters and Cloudera Public Cloud storage on AWS and Azure), Hive external tables (Cloudera Private Cloud Base source clusters), and HBase (Cloudera Private Cloud Base source clusters) data to Cloudera Public Cloud (Amazon S3 and Microsoft Azure ADLS Gen2 (ABFS)) clusters. You can use the replication plugin as an alternate replication method to replicate HBase data for scenarios that are not supported by Replication Manager.
The following tables list the minimum source and destination cluster versions, minimum Cloudera Manager versions, supported cloud providers, and supported scenarios:
Replicate data from Cloudera Private Cloud Base source clusters
Source cluster | Lowest supported source Cloudera Manager version | Lowest supported source Cloudera Runtime version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
---|---|---|---|---|---|
Cloudera Private Cloud Base | 7.1.1 | 7.1.1 | Cloudera Public Cloud AWS/Azure | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell and HBase data replication. |
Cloudera Private Cloud Base | 7.1.1 | 7.1.1 | Data Lake in Cloudera Public Cloud AWS/Azure | Hive external tables | |
Cloudera Private Cloud Base | 7.9.0 | 7.1.1 | Data Hub in Cloudera Public Cloud AWS/Azure | Hive external tables | None |
Cloudera Private Cloud Base | 7.3.1 | 7.1.6 | Data Hub in Cloudera Public Cloud AWS/Azure | HBase | None |
Cloudera Private Cloud Base | 7.11.3 CHF7 | 7.1.9 SP1 | Cloudera Public Cloud GCP | HDFS, Hive external tables, HBase | None |
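For scripted pre-checks, the table above can be encoded as a small lookup. The following is a minimal sketch: the names `Requirement`, `PRIVATE_CLOUD_BASE_MATRIX`, and `lookup_requirement` are hypothetical, and the values are copied from the table rows rather than taken from any Cloudera API.

```python
from typing import NamedTuple, Optional


class Requirement(NamedTuple):
    min_cm_version: str       # lowest supported source Cloudera Manager version
    min_runtime_version: str  # lowest supported source Cloudera Runtime version


# Keys are (target as named in the table, service); values come from the table.
PRIVATE_CLOUD_BASE_MATRIX = {
    ("Cloudera Public Cloud AWS/Azure", "HDFS"): Requirement("7.1.1", "7.1.1"),
    ("Data Lake in Cloudera Public Cloud AWS/Azure", "Hive external tables"):
        Requirement("7.1.1", "7.1.1"),
    ("Data Hub in Cloudera Public Cloud AWS/Azure", "Hive external tables"):
        Requirement("7.9.0", "7.1.1"),
    ("Data Hub in Cloudera Public Cloud AWS/Azure", "HBase"):
        Requirement("7.3.1", "7.1.6"),
    ("Cloudera Public Cloud GCP", "HDFS"): Requirement("7.11.3 CHF7", "7.1.9 SP1"),
    ("Cloudera Public Cloud GCP", "Hive external tables"):
        Requirement("7.11.3 CHF7", "7.1.9 SP1"),
    ("Cloudera Public Cloud GCP", "HBase"): Requirement("7.11.3 CHF7", "7.1.9 SP1"),
}


def lookup_requirement(target: str, service: str) -> Optional[Requirement]:
    """Return the minimum versions for a target/service pair, or None if the
    table lists no Replication Manager support for that combination."""
    return PRIVATE_CLOUD_BASE_MATRIX.get((target, service))


print(lookup_requirement("Data Hub in Cloudera Public Cloud AWS/Azure", "HBase"))
```

A `None` result means the combination is not listed as supported by Replication Manager in the table, so an alternate replication method may be needed.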
Replicate data from Cloudera Public Cloud source clusters
- Replication across cloud providers, that is, from AWS to Azure and vice versa, is not supported.
- The source and target clusters must use the same account.
Source cluster | Destination cluster | Supported services on Replication Manager | Services that require alternate replication methods |
---|---|---|---|
Cloudera Public Cloud AWS* / Azure | CDH 5.x, CDH 6.x, HDP 2.x, HDP 3.x | Not applicable | HBase. To replicate HBase data, see COD replication in a Nutshell and HBase data replication. |
Cloudera Public Cloud AWS* | CDH 5.9.0 and higher; Cloudera Private Cloud Base 7.1.7 SP1 and higher | HDFS | None |
Cloudera Public Cloud Azure | CDH 6.1.0 and higher; Cloudera Private Cloud Base 7.1.7 SP1 and higher | HDFS | None |
Cloudera Public Cloud GCP 7.2.18 and higher | Cloudera Private Cloud Base 7.1.9 SP1 and higher | HDFS | None |
COD version 7.2.14 and higher - Cloudera Public Cloud AWS | AWS | HBase | None |
COD version 7.2.14 and higher - Cloudera Public Cloud Azure | Azure | HBase | None |
COD version 7.2.16.1 and higher - Cloudera Public Cloud GCP | GCP | HBase | None |
*Replication Manager does not support S3 as a source or destination when S3 is configured to use SSE-KMS. |
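The restrictions stated for Cloudera Public Cloud source clusters (same cloud provider, same account, no SSE-KMS on S3) can be checked before a policy is created. The sketch below is illustrative only: `find_replication_blockers` and its parameters are hypothetical names, and it assumes that both clusters are cloud clusters identified by the provider strings `"AWS"`, `"Azure"`, or `"GCP"`.

```python
def find_replication_blockers(source_provider, target_provider,
                              same_account, s3_uses_sse_kms=False):
    """Return a list of reasons why a cloud-to-cloud replication is unsupported.

    An empty list means none of the three documented restrictions applies.
    """
    blockers = []
    if source_provider != target_provider:
        # Replication across cloud providers is not supported.
        blockers.append(
            f"cross-cloud replication ({source_provider} to {target_provider}) "
            "is not supported"
        )
    if not same_account:
        blockers.append("source and target clusters must use the same account")
    if s3_uses_sse_kms and "AWS" in (source_provider, target_provider):
        # The SSE-KMS footnote applies to S3 as a source or as a destination.
        blockers.append("S3 configured with SSE-KMS is not supported")
    return blockers


for reason in find_replication_blockers("AWS", "Azure", same_account=True):
    print("blocked:", reason)
```

Returning every blocker, rather than stopping at the first, lets a caller report all of the problems in one pass.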
Replicate data from CDH and HDP source clusters
Replication Manager replicates HDFS data (CDH source clusters and HDP source clusters), Hive external tables (CDH source clusters), and HBase data (CDH 6 source clusters) to Cloudera Public Cloud (Amazon S3 and Microsoft Azure ADLS Gen2 (ABFS)) clusters. Replication from HDP clusters to Cloudera Public Cloud Azure using Replication Manager is a beta feature and is not available for general use. You can use alternate methods to replicate Hive external tables and HBase data for scenarios that are not supported by Replication Manager.
The following tables list the minimum CDH and HDP source cluster versions, minimum Cloudera Manager versions, supported cloud providers, and supported scenarios:
Source cluster | Lowest supported source Cloudera Runtime version | Lowest supported source Cloudera Manager version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
---|---|---|---|---|---|
CDH 5 | 5.10 | 6.3.0 | Cloud storage in Cloudera Public Cloud AWS | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell, Migrating HBase data, and HBase data replication. |
CDH 5 | 5.10 | 6.3.0 | Data Lake in Cloudera Public Cloud AWS | | |
CDH 5 | 5.10 | 6.3.4 | Cloud storage in Cloudera Public Cloud Azure | HDFS | |
CDH 5 | 5.10 | 6.3.4 | Data Lake in Cloudera Public Cloud Azure | | |
CDH 5 | 5.10 | 7.9.0 | Data Hub in Cloudera Public Cloud AWS/Azure | Hive external tables | None |
*To perform the Sentry policy replication, you must be running the Sentry service on CDH 5.12 or higher, or any CDH 6.x version. |
Source cluster | Lowest supported source Cloudera Runtime version | Lowest supported source Cloudera Manager version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
---|---|---|---|---|---|
CDH 6 | 6.1 | 6.3.0 | Cloud storage in Cloudera Public Cloud AWS | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell, Migrating HBase data, and HBase data replication. |
CDH 6 | 6.1 | 6.3.0 | Data Lake in Cloudera Public Cloud AWS | | |
CDH 6 | 6.1 | 7.1.1 / 6.3.4 | Cloud storage in Cloudera Public Cloud Azure | HDFS | |
CDH 6 | 6.1 | 7.1.1 / 6.3.4 | Data Lake in Cloudera Public Cloud Azure | | |
CDH 6 | 6.1 | 7.9.0 | Data Hub in Cloudera Public Cloud AWS/Azure | | |
CDH 6 | 6.3.3 | 7.3.1 | Data Hub in Cloudera Public Cloud AWS/Azure | HBase | None |
*To perform the Sentry policy replication, you must be running the Sentry service on CDH 5.12 or higher, or any CDH 6.x version. |
Lowest supported source HDP version | Cloud provider | Supported services on Replication Manager | Services that require alternate replication methods |
---|---|---|---|
HDP 2.6.5* | AWS | HDFS | |
HDP 2.6.5* | Azure | HDFS | HBase. To replicate HBase data, see COD replication in a Nutshell and HBase data replication. |
HDP 3.1.1* | AWS, Azure | HDFS | |
*No alternate replication methods are available for HDFS, Ranger, and Atlas replication. |