Verify that replication works
Confirm data has been replicated from a source cluster to a remote destination cluster.
Install and configure YARN on the source cluster.
If YARN cannot be used on the source cluster, configure YARN on the destination cluster to verify replication.
If YARN cannot be installed on either the source or the destination cluster, you can configure the tool to run in local mode; however, performance and consistency may be negatively affected.
Ensure that you have the required permissions:
- You have sudo permissions to run commands as the hbase user, or as a user with admin permissions on both clusters.
- You are an hbase user configured for submitting jobs with YARN.
Run the VerifyReplication command on the source cluster:
src-node$ sudo -u hbase hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication peer1 table1
...
org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication$Verifier$Counters
    BADROWS=2
    CONTENT_DIFFERENT_ROWS=1
    GOODROWS=1
    ONLY_IN_PEER_TABLE_ROWS=1
File Input Format Counters
    Bytes Read=0
File Output Format Counters
    Bytes Written=0
The following table describes the VerifyReplication counters:
Table 1. VerifyReplication Counters
Counter                      Description
GOODROWS                     Number of rows that are present on both clusters with all values the same.
CONTENT_DIFFERENT_ROWS       The row key is the same on both source and destination clusters, but the value differs.
ONLY_IN_SOURCE_TABLE_ROWS    Rows that are present only in the source cluster and not in the destination cluster.
ONLY_IN_PEER_TABLE_ROWS      Rows that are present only in the destination cluster and not in the source cluster.
BADROWS                      Total number of rows that differ between the source and destination clusters; the sum of CONTENT_DIFFERENT_ROWS, ONLY_IN_SOURCE_TABLE_ROWS, and ONLY_IN_PEER_TABLE_ROWS.
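The relationship between the counters can be checked mechanically. The following is a minimal sketch, assuming the job output has been saved as text; counter_value is a hypothetical helper (not part of HBase), and a counter that does not appear in the output, such as ONLY_IN_SOURCE_TABLE_ROWS in the sample run above, is treated as 0.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of counter $1 read from job output on stdin.
# A counter that does not appear in the output is reported as 0.
counter_value() {
  local value
  value=$(grep -owE "$1=[0-9]+" | head -n 1 | cut -d= -f2 || true)
  echo "${value:-0}"
}

# Sample counter lines copied from the command output shown above.
job_output='BADROWS=2
CONTENT_DIFFERENT_ROWS=1
GOODROWS=1
ONLY_IN_PEER_TABLE_ROWS=1'

bad=$(counter_value BADROWS <<<"$job_output")
different=$(counter_value CONTENT_DIFFERENT_ROWS <<<"$job_output")
only_source=$(counter_value ONLY_IN_SOURCE_TABLE_ROWS <<<"$job_output")
only_peer=$(counter_value ONLY_IN_PEER_TABLE_ROWS <<<"$job_output")

# BADROWS should equal the sum of the three difference counters.
expected_bad=$((different + only_source + only_peer))
if [ "$bad" -eq "$expected_bad" ]; then
  echo "BADROWS=$bad matches the sum of the difference counters"
else
  echo "BADROWS=$bad but the difference counters sum to $expected_bad" >&2
fi
```

In practice the sample text would be replaced with the captured output of the VerifyReplication job; a BADROWS value of 0 indicates that no differing rows were found in the compared range.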
VerifyReplication compares the entire content of table1 on the source cluster against table1 on the destination cluster that is configured to use the replication peer peer1.
Use the following options to define the period of time, versions, or column families:
Table 2. VerifyReplication Options
Option                      Description
--starttime=<timestamp>     Beginning of the time range, in milliseconds. The time range is unbounded if no end time is defined.
--endtime=<timestamp>       End of the time range, in milliseconds.
--versions=<versions>       Number of cell versions to verify.
--families=<cf1,cf2,...>    Column families to verify, separated by commas.
The following example verifies replication only for rows with timestamps within a one-day range:
src-node$ sudo -u hbase hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --starttime=1472499077000 --endtime=1472585477000 --families=c1 peer1 table1
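The two timestamps in the example are epoch milliseconds exactly one day (86,400,000 ms) apart. As a minimal sketch of how such a window might be computed, the following script derives the start time from a chosen end time; the variable names are illustrative, and the command is printed rather than executed so the script can be tried without a cluster.

```shell
#!/usr/bin/env bash
# Length of one day in milliseconds: 24 h * 60 min * 60 s * 1000 ms.
day_ms=$((24 * 60 * 60 * 1000))

# End of the window, taken from the example above.
# For a window ending now, use: end_ms=$(( $(date +%s) * 1000 ))
end_ms=1472585477000
start_ms=$((end_ms - day_ms))

# Print the resulting command instead of running it.
echo "sudo -u hbase hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication --starttime=${start_ms} --endtime=${end_ms} --families=c1 peer1 table1"
```

With the end time shown, the computed start time is 1472499077000, which matches the value used in the example.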