You can use the CldrCopyTable utility to copy data from one cluster to another. You
can use it to bring data in sync for replication.
CldrCopyTable is Cloudera’s version of the upstream CopyTable utility.
The --cldr.cross.domain
option of CldrCopyTable enables you to copy
data cross-realm.
-
Ensure that the following properties have the correct values in the
hbase-site.xml
configuration file of the target
cluster:
<property>
<name>hbase.security.replication.credential.provider.path</name>
<value>cdprepjceks://hdfs@[***NAMENODE_HOST***]:[***NAMENODE_PORT***]/hbase-replication/credentials.jceks</value>
</property>
<property>
<name>hbase.security.replication.user.name</name>
<value>srv_[***WORKLOAD USER NAME***]</value>
</property>
-
Ensure that the source cluster can communicate with the target cluster:
-
Get the ZooKeeper quorum address of the target cluster.
-
Set the address as an environment parameter in the source
cluster.
-
Set a subnet that allows connection from the source cluster.
For example by enabling the port 2181 for the ZooKeeper client.
-
Issue the
CldrCopyTable
command from the source cluster to
write to the target cluster.
Based on your target cluster setup you have to use either the
--cldr.cross.domain
or the
--cldr.unsecure.peer
option.
Use the
--cldr.cross.domain
option:
hbase org.apache.hadoop.hbase.mapreduce.CldrCopyTable --cldr.cross.domain --peer.adr=[***ZOOKEEPER QUORUM***]:[***ZOOKEEPER PORT***]:[***ZOOKEEPER ROOT FOR HBASE***] --new.name="[***NEW TABLE NAME***]" "[***SOURCE TABLE NAME***]"
Use the
--cldr.unsecure.peer
option:
hbase.org.apache.hadoop.hbase.mapreduce.CldrCopyTable --cldr.unsecure.peer --peer.adr=[***ZOOKEEPER QUORUM***]:[***ZOOKEEPER PORT***]:[***ZOOKEEPER ROOT FOR HBASE***] --new.name="[***NEW TABLE NAME***]" "[***SOURCE TABLE NAME***]"
-
Once the job is finished, check the target cluster and ensure that the copy was
successful.