Troubleshooting replication policies in CDP Public Cloud
The scenarios in this topic help you troubleshoot replication policy jobs in the Replication Manager service in CDP Public Cloud.
What do I do if the source cluster hosts for HBase replication cannot find the destination cluster hosts when I ping them by host name?
This might occur for on-premises clusters such as CDP Private Cloud Base clusters or CDH clusters because the source clusters are not on the same network as the destination Data Hub. Therefore, hostnames cannot be resolved by the DNS service on the source cluster.
To resolve this issue, add the IP address to host name mappings for the destination Region Servers and ZooKeeper nodes to the /etc/hosts file on every Region Server in the source cluster. For example:
10.115.74.181 dx-7548-worker2.dx-hbas.x2-8y.dev.dr.work
10.115.72.28 dx-7548-worker1.dx-hbas.x2-8y.dev.dr.work
10.115.73.231 dx-7548-worker0.dx-hbas.x2-8y.dev.dr.work
10.115.72.20 dx-7548-master1.dx-hbas.x2-8y.dev.dr.work
10.115.74.156 dx-7548-master0.dx-hbas.x2-8y.dev.dr.work
10.115.72.70 dx-7548-leader0.dx-hbas.x2-8y.dev.dr.work
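Before editing /etc/hosts on each source Region Server, it can help to check which of the required mappings are still missing. The following is a minimal sketch, not part of Replication Manager: the function names parse_hosts and missing_entries are hypothetical, and the "existing" text stands in for the contents of a real /etc/hosts file.

```python
def parse_hosts(text):
    """Return {host_name: ip} for every entry in hosts-file formatted text."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            # Keep the first mapping seen for a name, as hosts-file lookups do.
            mapping.setdefault(name, ip)
    return mapping


def missing_entries(required_text, hosts_text):
    """List (ip, host_name) pairs from required_text that hosts_text lacks or maps to another IP."""
    current = parse_hosts(hosts_text)
    return [
        (ip, name)
        for name, ip in parse_hosts(required_text).items()
        if current.get(name) != ip
    ]


# Mappings required for the destination cluster (taken from the example above).
required = """
10.115.74.181 dx-7548-worker2.dx-hbas.x2-8y.dev.dr.work
10.115.72.28 dx-7548-worker1.dx-hbas.x2-8y.dev.dr.work
"""

# Stand-in for the current /etc/hosts contents on a source Region Server.
existing = """
127.0.0.1 localhost
10.115.74.181 dx-7548-worker2.dx-hbas.x2-8y.dev.dr.work
"""

result = missing_entries(required, existing)
for ip, name in result:
    print(f"{ip} {name}")  # lines that still need to be added
```

On a real host, the "existing" text would be read from /etc/hosts, and the printed lines are the ones to append.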
When a replication policy job fails, how do I debug the issue?
Use one of the following methods to identify the errors behind a job failure:
- On the Replication Policies page, click the failed job in the Job History pane. The errors for the failed job appear. The following sample image shows the Job History pane for a replication policy job:
- In the source and target Cloudera Manager, click Running Commands on the left-hand navigation bar. The recent command history shows the failing commands. The following sample image shows the Running Commands page for an HBase replication policy:
- On the source cluster and target cluster, open the HBase service logs to track the errors. You can also search on the page to view the logs.
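Once you have an HBase service log file, filtering it for replication-related errors can narrow the search. The following is a minimal sketch under stated assumptions: it assumes the log lines carry an upper-case level token such as ERROR or FATAL, the function name replication_errors is hypothetical, and the sample lines are made up for illustration and are not real HBase output.

```python
import re

# Assumption: each log line contains its level as an upper-case word.
LEVEL_PATTERN = re.compile(r"\b(ERROR|FATAL)\b")


def replication_errors(lines, keyword="replication"):
    """Return ERROR/FATAL lines that mention the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [
        line.rstrip("\n")
        for line in lines
        if LEVEL_PATTERN.search(line) and keyword in line.lower()
    ]


# Made-up sample lines, for illustration only.
sample = [
    "2024-01-01 10:00:00,000 INFO example.Component: replication source started\n",
    "2024-01-01 10:00:05,000 ERROR example.Component: Replication peer not reachable\n",
    "2024-01-01 10:00:06,000 ERROR example.Component: unrelated failure\n",
]

matches = replication_errors(sample)
for line in matches:
    print(line)
```

Because the function accepts any iterable of lines, an open file object for a log file can be passed in place of the sample list.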
An HBase replication policy fails for a COD on Microsoft Azure when I choose the Perform Initial Snapshot option. However, the replication of data is successful when I do not choose the option. How do I resolve this issue?
Before you replicate HBase data with the Perform Initial Snapshot option, ensure that the managed identity of the source is assigned the Storage Blob Data Owner or Storage Blob Data Contributor role on the destination storage data container. For bidirectional replication, also assign the role to the managed identity of the destination on the source storage data container. These roles allow a snapshot to be written to the destination cluster container.
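One way to make such an assignment is the Azure CLI command az role assignment create, scoped to the storage container. The following is a minimal sketch that only builds and prints the command and does not run it. The helper names container_scope and role_assignment_command are hypothetical, and every identifier passed in (principal ID, subscription, resource group, storage account, container) is a placeholder that you must replace with values from your own environment.

```python
import shlex


def container_scope(subscription_id, resource_group, storage_account, container):
    """Build the Azure resource ID of a blob container, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
        f"/blobServices/default/containers/{container}"
    )


def role_assignment_command(principal_id, scope, role="Storage Blob Data Contributor"):
    """Return the az CLI argument list that assigns the role to the identity at the scope."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", principal_id,
        "--role", role,
        "--scope", scope,
    ]


# All values below are placeholders, not real identifiers.
scope = container_scope("SUBSCRIPTION_ID", "RESOURCE_GROUP", "STORAGE_ACCOUNT", "CONTAINER")
command = role_assignment_command("SOURCE_MANAGED_IDENTITY_PRINCIPAL_ID", scope)
print(shlex.join(command))  # review the command before running it
```

For bidirectional replication, the same helper can be called a second time with the destination identity and the source container scope.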