Support matrix for Replication Manager on Cloudera Base on premises

Cloudera Base on premises Replication Manager can replicate HDFS directories, Hive external tables, Impala data, Hive ACID tables, Iceberg tables on HDFS and Ozone storage, Ranger policies and roles for HDFS, Hive, and HBase services, and data in Ozone buckets.

The following prerequisites apply for creating replication policies:
  • The Cloudera Base on premises and Cloudera Manager versions of the target cluster must be the same as or higher than the versions of the source cluster.
  • The target database name must be the same as the source database name; otherwise, issues can occur during or after data replication.
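
The version prerequisite above can be checked before a policy is created. The following is a minimal sketch, not part of Replication Manager: the function names are hypothetical, and it assumes plain dotted version strings such as 7.11.3 or 7.13.1.100 (suffixes such as CHF or SP are not handled here).

```python
def parse_version(text):
    """Turn a dotted version such as '7.11.3' into a tuple of integers."""
    return tuple(int(part) for part in text.strip().split("."))


def target_meets_source(source_version, target_version):
    """Return True when the target version is the same as or higher than the source."""
    src = parse_version(source_version)
    tgt = parse_version(target_version)
    # Pad the shorter tuple with zeros so '7.13.1' compares equal to '7.13.1.0'.
    width = max(len(src), len(tgt))
    src += (0,) * (width - len(src))
    tgt += (0,) * (width - len(tgt))
    return tgt >= src


def same_database_name(source_db, target_db):
    """Return True when the source and target database names are identical."""
    return source_db == target_db
```

A call such as `target_meets_source("7.11.3", "7.13.1")` returns `True`, and swapping the arguments returns `False`.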

Supported replication scenarios

  • Kerberos – Replication Manager supports the following replication scenarios when Kerberos authentication is enabled on a cluster:
    • Secure source to a secure target.
    • Insecure source to an insecure target.
    • Insecure source to a secure target. The following requirements must be met for this scenario:
      • When a target cluster has multiple source clusters, the source clusters must be either all secure or all insecure. Replication Manager does not support mixing secure and insecure source clusters.
      • The target cluster must use Cloudera Manager 7.x or higher versions.
      • The source cluster must use a compatible Cloudera Manager version.
      • This replication scenario requires additional configuration. For more information, see Replicating from unsecure to secure clusters.
  • Transport Layer Security (TLS) – You can use TLS with Replication Manager. Replication Manager also supports replication scenarios in which TLS is enabled for non-Hadoop services (Hive and Impala) and disabled for Hadoop services (such as HDFS, YARN, and MapReduce).
  • Apache Knox – If Cloudera Manager is configured with Knox and the source and target clusters are Knox-SSO-enabled, you must use the Cloudera Manager port in the peer URL when adding the source and target clusters as peers.
  • FIPS clusters – In Cloudera Base on premises 7.3.1 CHF1 and higher versions using Cloudera Manager 7.13.1.100 and higher versions, you can use FIPS-enabled source and target clusters in Replication Manager.
    To use FIPS clusters in Replication Manager, run the following commands on the source cluster after adding the source cluster as a peer to use in replication policies:
    /usr/java/default/bin/keytool -exportcert -keystore /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks -alias CMRootCA-0 -file ./source-cert.txt -storepass $STOREPASS -provider com.safelogic.cryptocomply.jcajce.provider.CryptoComplyFipsProvider -providerpath com-safelogic-cryptocomply-fips-core.jar
    
    /usr/java/default/bin/keytool -importcert -noprompt -v -trustcacerts -keystore /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks -alias cmrootca-1 -file ./source-cert.txt -storepass $STOREPASS -provider com.safelogic.cryptocomply.jcajce.provider.CryptoComplyFipsProvider -providerpath com-safelogic-cryptocomply-fips-core.jar
    The following replication policies support FIPS source and target clusters:
    • Atlas replication policies
    • HDFS replication policies
    • Hive replication policies
    • Hive ACID replication policies
    • Hive external replication policies
    • Iceberg replication policies
    • Ranger replication policies
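
The Kerberos pairing rules listed earlier in this section can be expressed as a small pre-check. This is an illustrative sketch only: the names `SUPPORTED_PAIRINGS` and `check_sources` are hypothetical, and it treats any pairing that the list does not mention (a secure source with an insecure target) as unsupported, which is an assumption drawn from the list rather than a statement in it.

```python
# Pairings the text lists as supported, written as (source_is_secure, target_is_secure).
SUPPORTED_PAIRINGS = {
    (True, True),    # secure source to a secure target
    (False, False),  # insecure source to an insecure target
    (False, True),   # insecure source to a secure target (needs extra configuration)
}


def check_sources(target_is_secure, sources_secure):
    """Return a list of problems for one target cluster and its source clusters.

    sources_secure holds one boolean per source cluster (True means secure).
    An empty list means no problem was found by this sketch.
    """
    problems = []
    kinds = set(sources_secure)
    # A target with several sources needs them to be all secure or all insecure.
    if len(kinds) > 1:
        problems.append("mixed secure and insecure source clusters")
    for is_secure in sorted(kinds):
        if (is_secure, target_is_secure) not in SUPPORTED_PAIRINGS:
            problems.append("secure source to insecure target is not listed as supported")
    return problems
```

For example, `check_sources(True, [False, False])` returns an empty list, while a target that has both secure and insecure sources gets the mixed-sources message.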

Replicate from Cloudera Base on premises source clusters

The following tables list the lowest supported source Cloudera Manager and Cloudera Runtime versions, the target cluster, and the services that Replication Manager supports for Cloudera Base on premises source clusters that use the same storage as the target or different storage:

Table 1. Replicate data between Cloudera Base on premises clusters using same storage
Lowest supported source Cloudera Manager version | Lowest supported source Cloudera Runtime version | Target cluster | Supported services on Replication Manager
7.11.3 | 7.1.9 | Cloudera Base on premises |
  • HDFS
  • Hive external tables
7.11.3 | 7.1.9 | Cloudera Base on premises |
  • Hive ACID tables

    You can also use REPL commands to replicate Hive ACID tables.

  • For Ozone bucket replication, use Cloudera Manager APIs.
7.11.3 | 7.1.9 | Cloudera Base on premises | Ozone buckets
7.11.3 | 7.1.9 | Cloudera Base on premises |
  • Iceberg tables
  • Ranger policies and roles, and Ranger audit logs in HDFS

    You can also create Ranger replication policies on Kerberos-enabled clusters if the Ranger replication feature flag is enabled.

7.11.3 CHF7 | 7.1.9 SP1 | Cloudera Base on premises | Atlas replication policies

Contact your Cloudera account team to enable this feature.
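
As a worked illustration of reading Table 1, the sketch below looks up which services a given source Cloudera Manager version meets the minimum for. The names `TABLE_1_MIN_CM`, `version_key`, and `services_for` are hypothetical, and the ordering rule (a release without a CHF suffix sorts before the same release with one, and CHF numbers sort numerically) is an assumption, not something the table states. The sketch ignores the Cloudera Runtime column and any feature flags.

```python
import re

# Lowest supported source Cloudera Manager versions, copied from Table 1.
TABLE_1_MIN_CM = {
    "HDFS": "7.11.3",
    "Hive external tables": "7.11.3",
    "Hive ACID tables": "7.11.3",
    "Ozone buckets": "7.11.3",
    "Iceberg tables": "7.11.3",
    "Ranger policies and roles": "7.11.3",
    "Atlas replication policies": "7.11.3 CHF7",  # also needs enablement by the account team
}


def version_key(text):
    """Convert '7.11.3' or '7.11.3 CHF7' into a sortable (numbers, chf) pair."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)(?: CHF(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    numbers = tuple(int(part) for part in match.group(1).split("."))
    chf = int(match.group(2) or 0)  # assumption: no suffix means CHF 0
    return numbers, chf


def services_for(source_cm_version):
    """List the Table 1 services whose minimum the given source version meets."""
    have = version_key(source_cm_version)
    return sorted(
        service
        for service, minimum in TABLE_1_MIN_CM.items()
        if have >= version_key(minimum)
    )
```

With these assumptions, `services_for("7.11.3")` includes HDFS but not Atlas replication policies, which first appear at 7.11.3 CHF7.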

Table 2. Replicate data between Cloudera Base on premises clusters using different storage
Lowest supported source Cloudera Manager version | Lowest supported source Cloudera Runtime version | Target cluster | Supported services on Replication Manager
7.11.3 CHF1 | 7.1.9 | Cloudera Base on premises | Replicate HDFS and Hive external tables between the following clusters:
  • From source clusters using Cloudera HDFS to target clusters using Dell Powerscale storage (Powerscale)
  • From source clusters using Powerscale to target clusters using Cloudera HDFS
  • From source clusters using Powerscale to target clusters using Powerscale
  • From source clusters using Powerscale to target clusters using AWS, Azure, or GCP
7.11.3 CHF2 | 7.1.9 | Cloudera Base on premises | Replicate Hive ACID tables and Iceberg tables from source clusters using Powerscale to target clusters using Powerscale.
7.11.3 CHF7 | 7.1.9 SP1 | Cloudera Base on premises | Replicate metadata only for Ozone storage-backed Hive external tables using Hive external table replication policies. You must replicate the data using Ozone replication policies.
7.13.2 | 7.3.2 | Cloudera Base on premises | Replicate Iceberg V1 and V2 tables stored on Ozone buckets using Iceberg replication policies.

Replicate data from on-premises to cloud storage

Replication scenario | Lowest supported Cloudera Base on premises version | Cloud storage
HDFS and Hive external tables (for Hive external tables, back up the external tables to the cloud and restore to the same cluster) | Cloudera Base on premises 7.1.9 using Cloudera Manager 7.11.3 |
  • Amazon S3

    Replication Manager does not support S3 as a source or target cluster when S3 is configured to use SSE-KMS.

  • Microsoft ADLS Gen1 and Microsoft ADLS Gen2 (ABFS)
Snapshot backup | Cloudera Base on premises 7.1.9 and Cloudera Manager 7.11.3 |
  • Amazon S3
  • Microsoft ADLS Gen1 and Microsoft ADLS Gen2 (ABFS)
HDFS and Hive external tables | Cloudera Base on premises 7.11.3 CHF3 and higher clusters using Dell Powerscale storage | AWS, Azure, and GCP
HDFS and Hive external tables | Cloudera Base on premises 7.1.9 SP1 | GCP

To view all supported clusters and features, including earlier and end of support (EOS) versions, see the Replication Manager support matrix.