Optimize HBase replication policy performance when replicating HBase tables with several
TB data
Can HBase replication policy performance be optimized when replicating HBase tables
with several TB of data if the "Perform Initial Snapshot" option is chosen during HBase
replication policy creation?
Complete the following manual steps to optimize HBase replication policy
performance when replicating several TB of HBase data if you choose the
Perform Initial Snapshot option during the HBase
replication policy creation process.
Remedy
Before you create the HBase replication policy, perform the following
steps:
Navigate to the source Cloudera Manager > YARN service > Configuration tab.
Search for the mapreduce.task.timeout
parameter.
Increase the value or set it to 0 to switch off the timeout.
Restart the YARN service.
Navigate to the source Cloudera Manager > HBase service > Configuration tab.
Search and configure the following key-value pairs:
Perform steps e through
g on the target Cloudera
Manager.
When you create the HBase replication policy for the first time using
the above configured source cluster, you must increase the
Maximum Map Slots value to a higher number on
the Advanced Settings page.
If Store File Tracking (SFT) is enabled in the target COD, perform the
steps mentioned in the COD migration topic after the
replication policy creation is complete.