Using the CldrCopyTable utility to copy data

You can use the CldrCopyTable utility to copy data from one cluster to another, for example, to bring data in sync for replication.

CldrCopyTable is Cloudera’s version of the upstream CopyTable utility.

The --cldr.cross.domain option of CldrCopyTable enables you to copy data across realms.

  1. Ensure that the following properties have the correct values in the hbase-site.xml configuration file of the target cluster:
    <property>
      <name>hbase.security.replication.credential.provider.path</name>
      <value>cdprepjceks://hdfs@[***NAMENODE_HOST***]:[***NAMENODE_PORT***]/hbase-replication/credentials.jceks</value>
    </property>

    <property>
      <name>hbase.security.replication.user.name</name>
      <value>srv_[***WORKLOAD USER NAME***]</value>
    </property>
                        
  2. Ensure that the source cluster can communicate with the target cluster:
    1. Get the ZooKeeper quorum address of the target cluster.
    2. Set the address as an environment parameter in the source cluster.
    3. Set a subnet that allows connections from the source cluster.
      For example, enable port 2181 for the ZooKeeper client.
  3. Issue the CldrCopyTable command from the source cluster to write to the target cluster.
    Depending on your target cluster setup, use either the --cldr.cross.domain or the --cldr.unsecure.peer option.
    Use the --cldr.cross.domain option:
    hbase org.apache.hadoop.hbase.mapreduce.CldrCopyTable --cldr.cross.domain --peer.adr=[***ZOOKEEPER QUORUM***]:[***ZOOKEEPER PORT***]:[***ZOOKEEPER ROOT FOR HBASE***] --new.name="[***NEW TABLE NAME***]" "[***SOURCE TABLE NAME***]"
    Use the --cldr.unsecure.peer option:
    hbase org.apache.hadoop.hbase.mapreduce.CldrCopyTable --cldr.unsecure.peer --peer.adr=[***ZOOKEEPER QUORUM***]:[***ZOOKEEPER PORT***]:[***ZOOKEEPER ROOT FOR HBASE***] --new.name="[***NEW TABLE NAME***]" "[***SOURCE TABLE NAME***]"
  4. Once the job is finished, check the target cluster and ensure that the copy was successful.
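
The command in step 3 has several parts that are easy to mistype, so the following is a minimal shell sketch of how it could be assembled. The function names build_peer_adr and build_copy_cmd are hypothetical helpers, not part of HBase or CldrCopyTable, and the host, znode, and table names in the usage lines are placeholder assumptions. The sketch only prints the command; it does not run it.

```shell
#!/usr/bin/env bash
# Sketch only: assembles the CldrCopyTable command line from step 3.
# build_peer_adr and build_copy_cmd are hypothetical helpers, not HBase tools.

# Join the ZooKeeper quorum, client port, and HBase root znode
# into the value expected by --peer.adr.
build_peer_adr() {
  local quorum="$1" port="$2" znode="$3"
  printf '%s:%s:%s' "$quorum" "$port" "$znode"
}

# Print the full command for the chosen mode: "cross-domain" or "unsecure".
build_copy_cmd() {
  local mode="$1" peer_adr="$2" new_name="$3" source_table="$4" flag
  case "$mode" in
    cross-domain) flag="--cldr.cross.domain" ;;
    unsecure)     flag="--cldr.unsecure.peer" ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  printf 'hbase org.apache.hadoop.hbase.mapreduce.CldrCopyTable %s --peer.adr=%s --new.name="%s" "%s"\n' \
    "$flag" "$peer_adr" "$new_name" "$source_table"
}

# Usage with placeholder values (assumptions, replace with your own).
peer_adr="$(build_peer_adr "zk1.example.com" 2181 "/hbase")"
build_copy_cmd cross-domain "$peer_adr" "orders_copy" "orders"
```

Printing the command before running it lets you review the peer address and table names first. Run the printed command on the source cluster only after completing steps 1 and 2.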