Using HashTable and SyncTable tools to copy data between Cloudera Operational Database clusters
You can use the HashTable and SyncTable tools to copy data from one Cloudera Operational Database cluster to another. You can use these tools to synchronize data prior to replication.
The HashTable and SyncTable CLI tools provide a one-way synchronization method for data in Cloudera Operational Database clusters. The CldrSyncTable job is an extension of the upstream SyncTable tool. For more information about the HashTable and SyncTable tools, see Use HashTable and SyncTable tool.
When you use these tools, ensure that the HashTable output directory and the source table are at the same location as the CLI. This means that you cannot set the sourcezkcluster and sourcehashdir properties to a remote cluster that the command-line executor cannot authenticate against.
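As an illustration of where the sourcezkcluster and sourcehashdir values appear, the following minimal sketch assembles the two command lines without running them. It assumes the upstream Apache HBase class names and argument order (`HashTable <tablename> <outputpath>` and `SyncTable [options] <sourcehashdir> <sourcetable> <targettable>`); the invocation of the CldrSyncTable extension may differ. The host names, paths, and table name are hypothetical placeholders.

```python
from typing import List

# Upstream Apache HBase MapReduce job classes. This is an assumption:
# the CldrSyncTable extension may be launched with a different class name.
HASHTABLE_CLASS = "org.apache.hadoop.hbase.mapreduce.HashTable"
SYNCTABLE_CLASS = "org.apache.hadoop.hbase.mapreduce.SyncTable"


def hashtable_command(table: str, hash_dir: str) -> List[str]:
    """Build the argv that hashes the source table into hash_dir."""
    return ["hbase", HASHTABLE_CLASS, table, hash_dir]


def synctable_command(
    source_zk: str,
    source_hash_dir: str,
    source_table: str,
    target_table: str,
    dry_run: bool = True,
) -> List[str]:
    """Build the argv that compares the target table with the source hashes.

    source_zk uses the upstream form quorum:clientPort:znodeParent.
    dry_run=True asks SyncTable to report differences without writing.
    """
    return [
        "hbase",
        SYNCTABLE_CLASS,
        f"--dryrun={'true' if dry_run else 'false'}",
        f"--sourcezkcluster={source_zk}",
        source_hash_dir,
        source_table,
        target_table,
    ]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    zk = "zk1.source.example.com,zk2.source.example.com:2181:/hbase"
    hashes = "hdfs://source-nn.example.com:8020/hashes/my_table"
    print(" ".join(hashtable_command("my_table", hashes)))
    print(" ".join(synctable_command(zk, hashes, "my_table", "my_table")))
```

Starting with a dry run keeps the first pass read-only, so you can review the reported differences before you let the job write to the target table.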
- Ensure that all RegionServers and DataNodes on the source cluster are accessible by NodeManagers on the target cluster where SyncTable job tasks are running.
- On secured clusters, users on the target cluster who execute the SyncTable job must be able to do the following on the HDFS and HBase services of the source cluster:
  - Authenticate: for example, using centralized authentication or a cross-realm setup.
  - Be authorized: have at least read permission.
- Ensure that the target table is created and enabled on the target cluster.
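The network accessibility requirement in the list above can be spot-checked before you submit the job. The following is a minimal sketch, not a Cloudera-provided tool: it only tests whether a TCP connection can be opened, and it says nothing about authentication or authorization. The host names are hypothetical, and the ports (16020 for RegionServers, 9866 for DataNodes) are common defaults that may not match your cluster configuration. To be meaningful, run it from the NodeManager hosts of the target cluster.

```python
import socket
from typing import Iterable, List, Tuple

Endpoint = Tuple[str, int]


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_endpoints(endpoints: Iterable[Endpoint]) -> List[Endpoint]:
    """Return the endpoints that could not be reached."""
    return [(host, port) for host, port in endpoints if not is_reachable(host, port)]


if __name__ == "__main__":
    # Hypothetical source-cluster hosts; the ports are assumed defaults.
    source_endpoints = [
        ("rs1.source.example.com", 16020),  # RegionServer RPC port (assumed)
        ("dn1.source.example.com", 9866),   # DataNode transfer port (assumed)
    ]
    for host, port in unreachable_endpoints(source_endpoints):
        print(f"Cannot reach {host}:{port}")
```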