Spreading Queue Failover Load
Set configuration parameters to maintain an even distribution of replication activity over the servers in the cluster.
When replication is active, a subset of RegionServers in the source cluster is responsible for shipping edits to the sink. This responsibility must be failed over like all other RegionServer functions if a process or node crashes. The following configuration settings are recommended for maintaining an even distribution of replication activity over the remaining live servers in the source cluster:
- Set replication.source.maxretriesmultiplier to 300.
- Set replication.source.sleepforretries to 1 (1 second). This value, combined with the value of replication.source.maxretriesmultiplier, causes the retry cycle to last about 5 minutes.
- Set replication.sleep.before.failover to 30000 (30 seconds) in the source cluster site configuration.
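
As an illustration, the three settings could be expressed as properties in the source cluster's site configuration. This is a minimal sketch that assumes the settings live in hbase-site.xml and copies the values exactly as listed above. Verify the units each property expects in your HBase release before applying it.

```xml
<!-- Fragment for hbase-site.xml on the source cluster (file location assumed). -->
<!-- Values are taken verbatim from the recommendations above. -->
<property>
  <name>replication.source.maxretriesmultiplier</name>
  <value>300</value>
</property>
<property>
  <name>replication.source.sleepforretries</name>
  <value>1</value>
</property>
<property>
  <name>replication.sleep.before.failover</name>
  <value>30000</value>
</property>
```

These properties go inside the existing configuration element of the file. A site configuration change of this kind typically requires restarting the RegionServers in the source cluster to take effect.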