Chapter 9. Highly Available Reads with HBase

HDP 2.1.1 enables HBase administrators to configure their HBase clusters with read-only High Availability, or HA. This feature benefits HBase applications that require low-latency queries but can tolerate potentially stale data, as well as applications that work with table data that isn't frequently updated, such as remote sensor data. HBase provides read-only HA on a per-table basis by replicating table regions on multiple Region Servers. See Primary and Secondary Regions for more information.

HA for HBase features the following functionality:

  • Data safely protected in HDFS

  • Failed nodes are automatically recovered

  • No single point of failure

HBase administrators should carefully consider the costs associated with using secondary regions, including increased usage of the memstore and block cache, as well as increased network traffic.

Important

This release of HA for HBase is not compatible with region splits and merges. Do not execute region merges on tables with region replicas. Instead, HBase administrators must pre-split tables before enabling HA for those tables and disable region splits with the DisabledRegionSplitPolicy. The policy can be set either through the HBase API or with the hbase.regionserver.region.split.policy property in the region server's hbase-site.xml configuration file. The value set in hbase-site.xml serves as the default and can be overridden for individual HBase tables.
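
As a minimal sketch of the hbase-site.xml approach, the fragment below sets the default split policy for a region server so that regions are never split automatically. The property name comes from the text above. The fully qualified class name is an assumption based on the standard HBase package layout, so verify it against the HBase version shipped with your HDP release.

```xml
<!-- hbase-site.xml on each region server -->
<property>
  <name>hbase.regionserver.region.split.policy</name>
  <!-- Assumed fully qualified name of DisabledRegionSplitPolicy -->
  <value>org.apache.hadoop.hbase.regionserver.DisabledRegionSplitPolicy</value>
</property>
```

Because this setting is only a default, a table that specifies its own split policy is not affected by it. Confirm that tables with region replicas do not override the policy with one that permits splits.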