1. Configuring BlockCache

If you have less than 20 GB of RAM available for use by HBase, consider tailoring the default on-heap BlockCache implementation (LruBlockCache) for your cluster.

If you have more than 20 GB of RAM available, consider adding off-heap BlockCache (BucketCache).

The first few steps are the same for both options:

  • Specify the maximum amount of on-heap RAM to allocate to the HBase RegionServer on each node. The default is 1 GB, which is too small for production.

    To alter the default allocation, set the "RegionServers maximum Java heap size" value (Ambari), or set the HBASE_HEAPSIZE environment variable in hbase-env.sh (manual installation). Specify the value in megabytes. The HBase startup script uses $HBASE_HEAPSIZE to override the default maximum JVM heap size (-Xmx).

    The following example sets the maximum on-heap memory allocation to 20 GB in hbase-env.sh:

        export HBASE_HEAPSIZE=20480

  • Determine (or estimate) the proportions of reads and writes in your workload, and use these proportions to specify on-heap memory for BlockCache and MemStore. The sum of the two allocations must be less than or equal to 0.8. The following table describes the two properties.

    | Property                                      | Default Value | Description |
    |-----------------------------------------------|---------------|-------------|
    | hfile.block.cache.size                        | 0.4           | Proportion of maximum JVM heap size (Java -Xmx setting) to allocate to BlockCache. A value of 0.4 allocates 40% of the maximum heap size. |
    | hbase.regionserver.global.memstore.upperLimit | 0.4           | Proportion of maximum JVM heap size (Java -Xmx setting) to allocate to MemStore. A value of 0.4 allocates 40% of the maximum heap size. |

    Use the following guidelines to determine the two proportions:

    • The default configuration for each property is 0.4, which configures BlockCache for a mixed workload with roughly equal proportions of random reads and writes.

    • If your workload is read-heavy and you do not plan to configure off-heap cache (that is, you have less than 20 GB of RAM available), increase hfile.block.cache.size and decrease hbase.regionserver.global.memstore.upperLimit so that the values reflect your workload proportions. This improves read performance.

    • If your workload is write-heavy, decrease hfile.block.cache.size and increase hbase.regionserver.global.memstore.upperLimit proportionally.

    • As noted earlier, the sum of hfile.block.cache.size and hbase.regionserver.global.memstore.upperLimit must be less than or equal to 0.8 (80%) of the maximum Java heap size specified by HBASE_HEAPSIZE (-Xmx). If you allocate more than 0.8 across BlockCache and MemStore combined, the HBase RegionServer process returns an error and does not start.

    • Do not set hfile.block.cache.size to zero. At a minimum, specify a proportion that allocates enough space for HFile index blocks. To review index block sizes, use the RegionServer Web GUI for each server.
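
    The guidelines above can be sketched as a small calculation. This is only an illustration, not an HBase or Ambari tool: the function names, the strictly proportional split, and the 0.05 floor are assumptions made for the example. The right minimum for BlockCache depends on the index block sizes reported by your RegionServers.

```python
# Sketch only: every function and parameter name here is hypothetical,
# not part of HBase or Ambari.

MAX_COMBINED = 0.8  # ceiling for BlockCache + MemStore stated in this section


def split_heap_fractions(read_share, combined=MAX_COMBINED, floor=0.05):
    """Split `combined` between BlockCache and MemStore by read share.

    read_share: estimated fraction of operations that are reads (0 to 1).
    floor: assumed minimum for either side, so BlockCache is never zero.
    Returns (hfile.block.cache.size, memstore upperLimit).
    """
    if not 0.0 <= read_share <= 1.0:
        raise ValueError("read_share must be between 0 and 1")
    if not 2 * floor <= combined <= MAX_COMBINED:
        raise ValueError("combined must be between 2 * floor and 0.8")
    block_cache = round(combined * read_share, 2)
    # Clamp so that neither BlockCache nor MemStore drops below the floor.
    block_cache = min(max(block_cache, floor), round(combined - floor, 2))
    memstore = round(combined - block_cache, 2)
    return block_cache, memstore


def check_fractions(block_cache, memstore):
    """Raise ValueError if the pair breaks the rules in this section."""
    if block_cache <= 0:
        raise ValueError("hfile.block.cache.size must not be zero")
    # Small tolerance so that 0.4 + 0.4 is not rejected by float rounding.
    if block_cache + memstore > MAX_COMBINED + 1e-9:
        raise ValueError("BlockCache + MemStore exceeds 0.8 of the heap")


def to_megabytes(fraction, heap_mb):
    """Convert a heap fraction to megabytes for a given HBASE_HEAPSIZE."""
    return round(fraction * heap_mb)


if __name__ == "__main__":
    heap_mb = 20480  # same value as the HBASE_HEAPSIZE example above
    for share in (0.5, 0.75, 0.25):
        bc, ms = split_heap_fractions(share)
        check_fractions(bc, ms)
        print(f"reads={share:.0%}: hfile.block.cache.size={bc} "
              f"({to_megabytes(bc, heap_mb)} MB), "
              f"hbase.regionserver.global.memstore.upperLimit={ms} "
              f"({to_megabytes(ms, heap_mb)} MB)")
```

    Treat the result as a starting point. A split this simple ignores cache hit rates and flush behavior, so adjust it against what you observe on your cluster.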

  • Edit the corresponding values in your hbase-site.xml file(s). Here are the default definitions:

    <property>
        <name>hfile.block.cache.size</name>
        <value>0.4</value>
        <description>Percentage of maximum heap (-Xmx setting) to allocate to block
          cache used by HFile/StoreFile. Default of 0.4 allocates 40%.
        </description>
    </property>

    <property>
        <name>hbase.regionserver.global.memstore.upperLimit</name>
        <value>0.4</value>
        <description>Maximum size of all memstores in a region server before new
          updates are blocked and flushes are forced. Defaults to 40% of heap.
        </description>
    </property>

  • If you have less than 20 GB of RAM for use by HBase, the configuration process is complete. Restart your cluster (or perform a rolling restart), and check the log files for error messages. If you have more than 20 GB of RAM for use by HBase, consider configuring the variables and properties in the next subsection.
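
Before restarting, it can help to check the edited file against the 0.8 limit described above. The following is a minimal sketch, assuming the standard hbase-site.xml layout of property elements with name and value children. The helper names and the SAMPLE content are hypothetical, and a property missing from the file is assumed to fall back to the documented default of 0.4.

```python
import xml.etree.ElementTree as ET

BLOCK_CACHE = "hfile.block.cache.size"
MEMSTORE = "hbase.regionserver.global.memstore.upperLimit"
DEFAULT_FRACTION = 0.4  # documented default for both properties
MAX_COMBINED = 0.8      # limit stated in this section


def read_fractions(xml_text):
    """Return (block cache, memstore) fractions from hbase-site.xml text."""
    root = ET.fromstring(xml_text)
    values = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = prop.findtext("value")
        if name in (BLOCK_CACHE, MEMSTORE) and value is not None:
            values[name] = float(value)
    # Assumption: a property absent from the file uses the default of 0.4.
    return (values.get(BLOCK_CACHE, DEFAULT_FRACTION),
            values.get(MEMSTORE, DEFAULT_FRACTION))


def validate(xml_text):
    """Return a list of problems; an empty list means no rule is broken."""
    block_cache, memstore = read_fractions(xml_text)
    problems = []
    if block_cache <= 0:
        problems.append(f"{BLOCK_CACHE} must be greater than zero")
    # Small tolerance so that 0.4 + 0.4 is not rejected by float rounding.
    if block_cache + memstore > MAX_COMBINED + 1e-9:
        problems.append(
            f"{BLOCK_CACHE} + {MEMSTORE} = {block_cache + memstore:.2f}, "
            f"which exceeds {MAX_COMBINED}")
    return problems


# Hypothetical file content with a combined allocation that is too large.
SAMPLE = """<configuration>
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.6</value>
  </property>
  <property>
    <name>hbase.regionserver.global.memstore.upperLimit</name>
    <value>0.3</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    for problem in validate(SAMPLE):
        print("ERROR:", problem)
```

To check a real file, read its contents with open() and pass the text to validate(). The location of hbase-site.xml depends on your installation, so no path is assumed here.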