Choosing Region Servers to Replicate to
When a master cluster Region Server initiates a replication source to a slave cluster,
it first connects to the ZooKeeper ensemble of the slave using the provided cluster key.
Then it scans the rs/ directory to discover all the available 'sinks' (Region
Servers that are accepting incoming streams of edits to replicate) and randomly chooses a
subset of them using a configured ratio, which has a default value of 10 per cent. For
example, if a slave cluster has 150 servers, 15 are chosen as potential recipients for
edits sent by the master cluster Region Server. Because each master Region Server performs
this selection independently, the probability that all slave Region Servers are used is
very high. This method works for clusters of any size. For example, when a master cluster
of 10 servers replicates to a slave cluster of 5 servers with a ratio of 10 per cent, the
count is rounded up, so each master cluster Region Server chooses one slave server at random.
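
The selection described above can be sketched in a few lines of Python. This is an
illustration only, not the actual implementation: the function name choose_sinks, the
ratio_percent parameter, and the server names are hypothetical, and rounding the count
up is an assumption inferred from the five-server example.

```python
import random


def choose_sinks(slave_servers, ratio_percent=10, rng=random):
    """Return a random subset of slave Region Servers to use as sinks.

    ratio_percent is the configured ratio expressed as a whole percentage.
    The count is rounded up (an assumption), so any non-empty slave
    cluster yields at least one sink.
    """
    servers = list(slave_servers)
    # Integer ceiling of len(servers) * ratio_percent / 100.
    count = (len(servers) * ratio_percent + 99) // 100
    return rng.sample(servers, count)


large_cluster = [f"rs{i}" for i in range(150)]
small_cluster = [f"rs{i}" for i in range(5)]

print(len(choose_sinks(large_cluster)))  # 15
print(len(choose_sinks(small_cluster)))  # 1
```

Each master Region Server would run this selection on its own, so different masters
generally end up with different subsets of the slave cluster.
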
Each master cluster Region Server places a ZooKeeper watcher on the
${zookeeper.znode.parent}/rs node of the slave cluster. This watcher monitors
changes in the composition of the slave cluster. When nodes are removed from the slave
cluster, or if nodes go down or come back up, the master cluster Region Servers respond by
selecting a new pool of slave Region Servers to which to replicate.
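
The watcher-driven re-selection can be simulated without a ZooKeeper connection. In the
sketch below, the SinkPool class and its on_children_changed method are hypothetical
names: the method stands in for the callback a watcher would fire with the current
children of the rs node. Rounding the sink count up is likewise an assumption.

```python
import random


class SinkPool:
    """Hypothetical holder for the sinks one master Region Server uses."""

    def __init__(self, ratio_percent=10, seed=None):
        self.ratio_percent = ratio_percent
        self._rng = random.Random(seed)
        self.sinks = []

    def on_children_changed(self, live_servers):
        """Rebuild the pool from the servers currently listed under rs."""
        servers = sorted(live_servers)
        # Integer ceiling of len(servers) * ratio_percent / 100.
        count = (len(servers) * self.ratio_percent + 99) // 100
        self.sinks = self._rng.sample(servers, count)


cluster = {"rs1", "rs2", "rs3", "rs4", "rs5"}
pool = SinkPool(seed=42)
pool.on_children_changed(cluster)   # initial selection

lost = pool.sinks[0]
cluster.discard(lost)               # the chosen sink goes down
pool.on_children_changed(cluster)   # the watcher fires and a new pool is selected

print(lost in pool.sinks)  # False
```

Discarding the whole pool and sampling again keeps the logic simple and guarantees
that a server no longer listed under rs is never kept as a sink.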