Troubleshooting replication policies in CDP Public Cloud

The scenarios in this topic help you troubleshoot replication policy jobs in the Replication Manager service in CDP Public Cloud.

What do I do if the source cluster hosts for HBase replication cannot reach the destination cluster hosts when I ping them by host name?

This might occur with on-premises source clusters, such as CDP Private Cloud Base clusters or CDH clusters, because they are not on the same network as the destination Data Hub. As a result, the DNS service on the source cluster cannot resolve the destination host names.

To resolve this issue, add IP-address-to-host-name mappings for the destination Region Servers and ZooKeeper servers to the /etc/hosts file on every Region Server in the source cluster.

The following snippet shows the contents in a sample /etc/hosts file:
10.115.74.181 dx-7548-worker2.dx-hbas.x2-8y.dev.dr.work
10.115.72.28 dx-7548-worker1.dx-hbas.x2-8y.dev.dr.work
10.115.73.231 dx-7548-worker0.dx-hbas.x2-8y.dev.dr.work
10.115.72.20 dx-7548-master1.dx-hbas.x2-8y.dev.dr.work
10.115.74.156 dx-7548-master0.dx-hbas.x2-8y.dev.dr.work
10.115.72.70 dx-7548-leader0.dx-hbas.x2-8y.dev.dr.work
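After editing /etc/hosts, it can help to confirm that every destination host name actually has a mapping. The following is a minimal Python sketch, not part of Replication Manager. The function names (parse_hosts, missing_hosts, unresolved_hosts) are hypothetical, and the list of required host names is something you supply yourself.

```python
import socket


def parse_hosts(text):
    """Return {hostname: ip} from /etc/hosts-style text.

    Comments (after '#') and blank lines are ignored. The first
    mapping found for a name wins.
    """
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            mapping.setdefault(name.lower(), ip)
    return mapping


def missing_hosts(hosts_text, required):
    """Return the required host names that have no entry in hosts_text."""
    mapping = parse_hosts(hosts_text)
    return [name for name in required if name.lower() not in mapping]


def unresolved_hosts(required):
    """Return the required host names that the local resolver cannot resolve.

    This asks the operating system resolver, so it reflects /etc/hosts
    and DNS together on the machine where it runs.
    """
    failed = []
    for name in required:
        try:
            socket.gethostbyname(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(name)
    return failed
```

On a source Region Server, you could read the file with open("/etc/hosts").read(), pass the text to missing_hosts together with the destination host names, and then call unresolved_hosts with the same names to check live resolution. An empty list from both calls suggests that the mappings are in place on that host.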

When a replication policy job fails, how do I debug the issue?

Use any of the following methods to identify the errors behind a job failure:

  • On the Replication Policies page, click the failed job in the Job History pane. The errors for the failed job appear.
    The following sample image shows the Job History pane for a replication policy job:
  • In the source and target Cloudera Manager, click Running Commands on the left-hand navigation bar. The recent command history shows the failed commands.
    The following sample image shows the Running Commands page for an HBase replication policy:
  • On the source cluster and target cluster, open the HBase service logs to track the errors.

    You can also search on the Cloudera Manager > Diagnostics > Logs page to view the logs.
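
As a rough aid for the log review described above, the following Python sketch filters error-level lines out of a log file that you have copied locally. It assumes log lines carry a level token such as ERROR or FATAL, as log4j-style logs commonly do. The function name error_lines and the optional keyword filter are hypothetical, and the sketch does not depend on any Cloudera Manager API.

```python
import re

# Assumption: the log level appears as a standalone ERROR or FATAL token.
LEVEL_RE = re.compile(r"\b(ERROR|FATAL)\b")


def error_lines(lines, keyword=None):
    """Return (line_number, text) pairs for error-level lines.

    lines   -- any iterable of strings, for example an open file
    keyword -- optional case-insensitive substring that a line must
               also contain, for example "replication"
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        if not LEVEL_RE.search(line):
            continue
        if keyword is not None and keyword.lower() not in line.lower():
            continue
        hits.append((number, line.rstrip("\n")))
    return hits
```

For example, open the copied log file and pass the file object to error_lines with keyword="replication" to narrow the output to replication-related errors. Compare the line numbers it returns with the timestamps of the failed job in the Job History pane.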