Configuring HA Reads for HBase
To enable High Availability for HBase reads, specify the server-side and client-side
configuration properties in your hbase-site.xml configuration file, and then restart
the HBase Master and RegionServers.
- Set the server-side properties in your hbase-site.xml configuration file for all servers in your HBase cluster that use region replicas. The following table describes server-side properties.

Table 1. Server-Side Configuration Properties for HBase HA

Property: hbase.regionserver.storefile.refresh.period
Example value: 30000
Description: Specifies the period (in milliseconds) for refreshing the store files for secondary regions. The default value is 0, which indicates that the feature is disabled. Secondary regions receive new files from the primary region after the secondary replica refreshes the list of files in the region.
Note: Too-frequent refreshes might put extra pressure on the NameNode. If files cannot be refreshed for longer than the HFile TTL, specified with hbase.master.hfilecleaner.ttl, the requests are rejected. The refresh period should be a non-zero number if META replicas are enabled (see hbase.meta.replica.count). If you specify a refresh period, we recommend configuring the HFile TTL to a larger value than its default.

Property: hbase.region.replica.replication.enabled
Example value: true
Description: Determines whether asynchronous WAL replication is enabled. The value can be true or false. The default is false.
If this property is enabled, a replication peer named region_replica_replication is created. The replication peer replicates changes to region replicas for any tables that have region replication set to a value greater than 1. After enabling this property, disabling it requires setting it to false and disabling the replication peer using the shell or the ReplicationAdmin Java class. When replication is explicitly disabled and then re-enabled, you must set hbase.replication to true.

Property: hbase.master.hfilecleaner.ttl
Example value: 3600000
Description: Specifies the period (in milliseconds) to keep store files in the archive folder before deleting them from the file system.

Property: hbase.master.loadbalancer.class
Example value: org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer
Description: Specifies the Java class used for balancing regions across the RegionServers in the cluster. The default value is org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer, which is the only load balancer that supports reading data from RegionServers in secondary mode.

Property: hbase.meta.replica.count
Example value: 3
Description: Specifies the region replication count for the meta regions. The default value is 1.

Property: hbase.regionserver.meta.storefile.refresh.period
Example value: 30000
Description: Specifies the period (in milliseconds) for refreshing the store files for the secondary regions of the HBase META table. If this is set to 0, the feature is disabled. When the secondary region refreshes the list of files in the region, it sees the new files that are flushed and compacted from the primary region. There is no notification mechanism.
Note: Refreshing the secondary region too frequently might put pressure on the NameNode. Requests are rejected if the files cannot be refreshed for longer than the HFile TTL, which is specified with hbase.master.hfilecleaner.ttl. Configuring the HFile TTL to a larger value is recommended with this setting. If META replicas are enabled (that is, hbase.meta.replica.count is set to a value greater than 1), set this property to a non-zero number.

Property: hbase.region.replica.wait.for.primary.flush
Example value: true
Description: Specifies whether to wait for a full flush cycle from the primary before starting to serve data in a secondary replica. Disabling this feature might cause secondary replicas to read stale data when a region is transitioning to another RegionServer.

Property: hbase.region.replica.storefile.refresh.memstore.multiplier
Example value: 4
Description: Specifies the multiplier for a “store file refresh” operation for the secondary region replica. This multiplier is used to refresh a secondary region instead of flushing a primary region. With the default value (4), a file refresh is triggered when the memstore of the biggest secondary region replica is more than 4 times bigger than the memstore of the biggest primary region. Disabling this feature is not recommended. However, if you want to do so, set this property to a large value.
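
As an illustration, the server-side properties described above could appear in hbase-site.xml as in the following sketch. The values are simply the example values from the table, not tuning recommendations, and the fragment assumes that you merge these entries into the existing configuration element of your file rather than replacing the file.

```xml
<configuration>
  <!-- Refresh store files of secondary regions every 30 seconds (0 disables the feature). -->
  <property>
    <name>hbase.regionserver.storefile.refresh.period</name>
    <value>30000</value>
  </property>
  <!-- Enable asynchronous WAL replication to region replicas. -->
  <property>
    <name>hbase.region.replica.replication.enabled</name>
    <value>true</value>
  </property>
  <!-- Keep archived store files for one hour before deleting them. -->
  <property>
    <name>hbase.master.hfilecleaner.ttl</name>
    <value>3600000</value>
  </property>
  <!-- Load balancer that supports reads from secondary replicas (this is the default). -->
  <property>
    <name>hbase.master.loadbalancer.class</name>
    <value>org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer</value>
  </property>
  <!-- Number of replicas of the meta regions. -->
  <property>
    <name>hbase.meta.replica.count</name>
    <value>3</value>
  </property>
  <!-- Must be non-zero when META replicas are enabled. -->
  <property>
    <name>hbase.regionserver.meta.storefile.refresh.period</name>
    <value>30000</value>
  </property>
  <!-- Wait for a full flush cycle from the primary before a secondary serves data. -->
  <property>
    <name>hbase.region.replica.wait.for.primary.flush</name>
    <value>true</value>
  </property>
  <!-- Multiplier for the store file refresh operation on secondary replicas. -->
  <property>
    <name>hbase.region.replica.storefile.refresh.memstore.multiplier</name>
    <value>4</value>
  </property>
</configuration>
```

Because the example sets a store file refresh period, it also sets hbase.master.hfilecleaner.ttl explicitly, in line with the note on HFile TTL above.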
- Set the client-side properties in your hbase-site.xml configuration file for all clients, applications, and servers in your HBase cluster that use region replicas. The following table lists client-side properties.

Table 2. Client-Side Properties for HBase HA

Property: hbase.ipc.client.specificThreadForWriting
Example value: true
Description: Specifies whether to enable interruption of RPC threads at the client side. This is required for region replicas with fallback RPCs to secondary regions.

Property: hbase.client.primaryCallTimeout.get
Example value: 10000
Description: Specifies the timeout (in microseconds) before secondary fallback RPCs are submitted for get requests with Consistency.TIMELINE to the secondary replicas of the regions. The default value is 10 ms. Setting this to a smaller value increases the number of RPCs, but lowers 99th-percentile latencies.

Property: hbase.client.primaryCallTimeout.multiget
Example value: 10000
Description: Specifies the timeout (in microseconds) before secondary fallback RPCs are submitted for multi-get requests (HTable.get(List<Get>)) with Consistency.TIMELINE to the secondary replicas of the regions. The default value is 10 ms. Setting this to a smaller value increases the number of RPCs, but lowers 99th-percentile latencies.

Property: hbase.client.primaryCallTimeout.scan
Example value: 1000000
Description: Specifies the timeout (in microseconds) before secondary fallback RPCs are submitted for scan requests with Consistency.TIMELINE to the secondary replicas of the regions. The default value is 1 second. Setting this to a smaller value increases the number of RPCs, but lowers 99th-percentile latencies.

Property: hbase.meta.replicas.use
Example value: true
Description: Specifies whether to use META table replicas. The default value is false.
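
For reference, a client-side hbase-site.xml fragment using the example values from the client-side table might look like the following sketch. It assumes these entries are merged into the existing configuration element of the client's file. The timeout values are in microseconds, so 10000 corresponds to 10 ms and 1000000 to 1 second.

```xml
<configuration>
  <!-- Allow client-side RPC threads to be interrupted (required for fallback RPCs). -->
  <property>
    <name>hbase.ipc.client.specificThreadForWriting</name>
    <value>true</value>
  </property>
  <!-- Wait 10 ms on the primary before sending fallback RPCs for get requests. -->
  <property>
    <name>hbase.client.primaryCallTimeout.get</name>
    <value>10000</value>
  </property>
  <!-- Wait 10 ms on the primary before sending fallback RPCs for multi-get requests. -->
  <property>
    <name>hbase.client.primaryCallTimeout.multiget</name>
    <value>10000</value>
  </property>
  <!-- Wait 1 second on the primary before sending fallback RPCs for scan requests. -->
  <property>
    <name>hbase.client.primaryCallTimeout.scan</name>
    <value>1000000</value>
  </property>
  <!-- Let the client read from META table replicas. -->
  <property>
    <name>hbase.meta.replicas.use</name>
    <value>true</value>
  </property>
</configuration>
```

The fallback timeouts only take effect for requests issued with Consistency.TIMELINE, as described in the property descriptions.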
- Restart the HBase Master and RegionServers.