Verify cluster requirements

Before you create a replication policy, verify that the cluster requirements are met.

The following table lists the features and the steps to complete before you create a replication policy in Replication Manager:
Feature/functionality Verify whether the following steps are complete
To use source on-premises clusters

Is the source CDH cluster or source CDP Private Cloud Base cluster registered on the Management Console?

For more information, see Add a CDH cluster and Adding a CDP Private Cloud Base cluster.

To use AWS or ADLS as target clusters

Do you have valid cloud credentials to access and use AWS or ADLS as target clusters?

For more information, see Working with Cloud Credentials.

To verify cluster access
  • Do you have the required access to create replication policies?

    Power users, the user who onboarded the source and target clusters, and users with the ClassicClusterAdmin or ClassicClusterUser resource role can create replication policies on clusters to which they have access.

    For more information, see Understanding account roles and resource roles.

  • Do you have the required access to view the replication policies?
    • HDFS replication policies - The existing policies are visible to users who have access to the source cluster in the policy.
    • Hive and HBase replication policies - The existing policies are visible to Replication Manager users if they have access to the destination cluster in the policy. A warning appears if they do not have access to the source cluster.

      If you can view a policy, you can also perform other actions on it, including updating and deleting the policy.
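
The visibility rules above can be restated as a small decision function. This is only an illustrative sketch of the rules as written here, not how Replication Manager implements them; the function name `policy_visibility` and its parameters are hypothetical.

```python
def policy_visibility(service: str, source_access: bool, destination_access: bool):
    """Return (visible, warning) for an existing policy, following the stated rules.

    Hypothetical helper for illustration; not part of any Cloudera API.
    """
    service = service.lower()
    if service == "hdfs":
        # HDFS policies: visibility depends on access to the source cluster.
        return source_access, False
    if service in ("hive", "hbase"):
        # Hive and HBase policies: visibility depends on access to the
        # destination cluster; a warning appears without source access.
        visible = destination_access
        return visible, visible and not source_access
    raise ValueError(f"unknown service: {service}")
```

For example, `policy_visibility("hive", False, True)` describes a user who can see a Hive policy but receives the source-access warning.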

To replicate data securely

Have you configured an SSL/TLS certificate exchange between the two Cloudera Manager instances that manage the source and target clusters?

For more information, see Configuring SSL/TLS certificate exchange between two Cloudera Manager instances.

To create HDFS replication policies
  1. Do the source cluster and target cluster meet the requirements specified in the Support Matrix?

  2. Is the source on-premises cluster registered as a classic cluster?

  3. Do you have the required cluster access to create or view replication policies?

  4. Do you have the required cloud credentials to access and use AWS or ADLS (target cluster)?

  5. Is an external account available in the Cloudera Manager instance that has access to the bucket or container to which the HDFS data is copied?

  6. To replicate data securely, have you configured an SSL/TLS certificate exchange between the two Cloudera Manager instances that manage the source and target clusters?

To create Hive replication policies
  1. Do the source cluster and target cluster meet the requirements specified in the Support Matrix?

  2. Is the source on-premises cluster registered as a classic cluster?

  3. Do you have the required cloud credentials to access and use AWS or ADLS (target cluster)?

  4. Do you have the required cluster access to create or view replication policies?

  5. Is an external account configured on the source CDH cluster's Cloudera Manager that allows the CDH cluster to access CDP cloud storage?

  6. To replicate data securely, have you configured an SSL/TLS certificate exchange between the two Cloudera Manager instances that manage the source and target clusters?

To create HBase replication policies
  1. Do the source cluster and target cluster meet the requirements specified in the Support Matrix?

  2. Is the source on-premises cluster registered as a classic cluster?

  3. Are the following steps complete on the CDP Private Cloud Base source cluster or CDH source cluster (these steps are not required for COD sources)?
    1. Have you installed the HBase replication plugin parcel in the CDH source clusters?

      Applicable to CDH versions 7.2.x lower than 7.2.2, versions 7.1.x lower than 7.1.5, and versions lower than 7.x. For more information, see Cloudera Replication Plugin.

    2. Have you created the /user/hbase folder for the hbase user in HDFS in the source cluster?

      Applicable for Cloudera Manager versions 7.4.3 or lower.

      This folder allows the HBase replication policy to replicate the existing data in the source cluster.

  4. Do you have the required cloud credentials to access and use AWS or ADLS (target cluster)?

  5. When you use COD on Microsoft Azure, have you assigned the Storage Blob Data Owner or Storage Blob Data Contributor role to the managed identity of the source on the destination storage data container, and vice versa for bidirectional replication?

    The roles allow writing a snapshot in the destination cluster container.

  6. Do you have the required cluster access to create or view replication policies?

  7. Is the required target cluster (Data Hub or COD) available and healthy?

    For more information, see Data Hub and COD.

  8. Are the required ports open and available, including ports 2181 and 16020 on the destination hosts of the AWS or ADLS cluster (target cluster) and the Cloudera Manager server port on the source cluster?

    Verify that port 16020 for the worker security group and port 2181 for the worker, master, and leader groups are open for connections from the source cluster to the destination cluster on AWS or Azure. This ensures that the source HBase service can communicate without interruption with the ZooKeeper and HBase services on the destination hosts. For more information, see Ports for HBase replication.

  9. Does DNS resolution work as expected between the source and destination clusters?

  10. To replicate data securely, have you configured an SSL/TLS certificate exchange between the two Cloudera Manager instances that manage the source and target clusters?
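
Several of the HBase checks above (the replication plugin version rule, the open ports, and DNS resolution) lend themselves to a quick script. The following is a minimal sketch, assuming it is run from a host in the source cluster and uses only the Python standard library. All function names are hypothetical, and how 7.x release lines other than 7.1.x and 7.2.x are handled is an assumption, because the text does not mention them.

```python
import socket


def plugin_parcel_needed(version: str) -> bool:
    """Apply the version rule stated for the HBase replication plugin parcel."""
    parts = [int(p) for p in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))  # pad short versions such as "6.3"
    major, minor, patch = parts
    if major < 7:
        return True
    if (major, minor) == (7, 1):
        return patch < 5
    if (major, minor) == (7, 2):
        return patch < 2
    # Assumption: other 7.x lines are not mentioned, so treat them as not needing it.
    return False


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def dns_resolves(host: str) -> bool:
    """Return True if the local resolver can resolve the host name."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def hbase_preflight(destination_hosts, ports=(2181, 16020)):
    """Collect DNS and port results for each destination host."""
    report = {}
    for host in destination_hosts:
        report[host] = {"dns": dns_resolves(host)}
        for port in ports:
            report[host][port] = port_open(host, port)
    return report
```

Calling `hbase_preflight(["worker1.example.internal"])` (a placeholder host name) returns a dictionary with one entry per host. A successful TCP connection only shows that the port is reachable from the machine running the script; it does not confirm that ZooKeeper or HBase is healthy behind it.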

You can also use CDP CLI commands for HDFS and Hive replication policies. The CDP CLI commands for Replication Manager are under the replicationmanager CDP CLI option. For more information, see CDP CLI for Replication Manager.
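
As a rough illustration, the CLI can be driven from a script to confirm, for example, that the source cluster is registered. The sketch below assumes the `cdp` executable is installed and configured, that it prints JSON (its default output format), and that the `list-clusters` and `list-policies` subcommands are available in your CLI version; confirm the subcommand names against the CDP CLI reference before relying on them. The helper names are hypothetical.

```python
import json
import shutil
import subprocess


def replicationmanager_argv(subcommand: str) -> list:
    """Build the argument vector for a replicationmanager subcommand."""
    return ["cdp", "replicationmanager", subcommand]


def run_replicationmanager(subcommand: str):
    """Run the subcommand and return its parsed JSON output.

    Returns None when the cdp executable is not on PATH.
    """
    if shutil.which("cdp") is None:
        return None
    result = subprocess.run(
        replicationmanager_argv(subcommand),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

For instance, `run_replicationmanager("list-clusters")` would return the parsed response, which you can inspect for the source and target clusters you expect to see.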