Initiate replication when data already exist

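While a replication peer is disabled, the source cluster stops shipping edits to it but keeps track of the write-ahead log (WAL) entries that the peer will need once it is re-enabled. You can take advantage of this accumulation to initiate replication when data already exist.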

You may need to start replication from some point in the past. For example, suppose you have a primary HBase cluster in one location and are setting up a disaster-recovery (DR) cluster in another. To initialize the DR cluster, you must first copy the existing data from the primary cluster, so that the DR cluster holds a full copy of the primary's data when you need to switch to it. After that, replication of new data proceeds as normal.

  1. Start replication.
  2. Add the destination cluster as a peer.
  3. Immediately disable it using disable_peer.
  4. Take a snapshot of the table on the source cluster and export it.
    By default, the snapshot command first flushes the table's in-memory data to disk, so the snapshot includes recent writes.
  5. Import and restore the snapshot on the destination cluster.
  6. Run enable_peer to re-enable the peer. The source cluster then ships the edits that accumulated while the peer was disabled.
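
The steps above can be sketched as the following dry-run script, which only prints the commands for each step and does not contact a cluster. The peer ID, cluster key, table name, snapshot name, and destination root directory are hypothetical placeholders. The sketch assumes that replication is already enabled for the table's column families (REPLICATION_SCOPE set to 1) and that the table does not yet exist on the destination, which is why it uses clone_snapshot; for an existing, disabled table, restore_snapshot would be the counterpart. The add_peer syntax varies between HBase versions, so check it against your release.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands for each step; it does not run them.
# Every value below is a hypothetical placeholder.
PEER_ID='1'
CLUSTER_KEY='zk1.dr.example.com:2181:/hbase'   # destination ZooKeeper quorum:port:znode
TABLE='my_table'
SNAPSHOT='my_table_snapshot'
DEST_ROOTDIR='hdfs://dr-namenode.example.com:8020/hbase'   # destination hbase.rootdir

# Steps 2-4: HBase shell commands to run on the source cluster.
source_shell_commands() {
  printf "add_peer '%s', CLUSTER_KEY => '%s'\n" "$PEER_ID" "$CLUSTER_KEY"
  printf "disable_peer '%s'\n" "$PEER_ID"
  printf "snapshot '%s', '%s'\n" "$TABLE" "$SNAPSHOT"
}

# Step 4 (export): operating-system command that copies the snapshot
# to the destination cluster's HBase root directory.
export_command() {
  printf "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot -snapshot %s -copy-to %s\n" \
    "$SNAPSHOT" "$DEST_ROOTDIR"
}

# Step 5: HBase shell command to run on the destination cluster.
destination_shell_commands() {
  printf "clone_snapshot '%s', '%s'\n" "$SNAPSHOT" "$TABLE"
}

# Step 6: HBase shell command to run on the source cluster.
enable_shell_commands() {
  printf "enable_peer '%s'\n" "$PEER_ID"
}

echo '# Source cluster, HBase shell:'
source_shell_commands
echo '# Source cluster, OS shell:'
export_command
echo '# Destination cluster, HBase shell:'
destination_shell_commands
echo '# Source cluster, HBase shell:'
enable_shell_commands
```

To run a step for real, one option is to pipe the printed HBase shell lines into the shell on the relevant cluster, for example `source_shell_commands | hbase shell`, and to run the ExportSnapshot line directly from the operating-system prompt.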