BucketCache IO engine

Use the hbase.bucketcache.ioengine parameter to define where the content of the BucketCache is stored. Its value can be offheap, file:PATH, mmap:PATH, pmem:PATH, or empty. By default it is empty, which means that BucketCache is disabled.

Each non-empty value of hbase.bucketcache.ioengine selects a different storage location for the BucketCache:

  • offheap: The content of the BucketCache is stored off-heap.
  • file:PATH: The BucketCache uses file caching, storing its content in a file at the specified path.
  • mmap:PATH: The content of the BucketCache is stored in a file under the specified path and accessed through memory mapping.
  • pmem:PATH: The BucketCache uses direct memory access to and from a file on the specified path. The path must be on a volume that is mounted on a persistent memory device that supports direct access to its own address space.

    The advantage of the pmem engine over the mmap engine is that it supports a large cache size. pmem reads straight from the device address, so in this mode no copy is created in DRAM. As a result, swapping caused by exhausting free DRAM is not an issue when a large cache size is specified. With devices currently available, the bucket cache size can be set in the order of hundreds of GBs or even a few TBs.

    When the bucket cache size is set to more than 256 GB, the OS limit on memory mappings must be increased, which is configured through the max_map_count property. Allow an extra 10% for other processes on the host that use memory mapping; the actual overhead depends on the load of the processes running on the RegionServer (RS) hosts. To calculate the OS limit, express the bucket cache size in MB, divide it by 4 MB, and multiply the result by 1.1: (bucket cache size in MB / 4 MB) * 1.1.
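
The max_map_count calculation above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of HBase: the function name required_max_map_count is hypothetical, the conversion 1 GB = 1024 MB is an assumption, and rounding the result up to the next whole number is also an assumption, since the formula does not say how to round.

```python
MAPPING_SIZE_MB = 4   # from the text: the cache size is divided by 4 MB
MB_PER_GB = 1024      # assumption: binary gigabytes


def required_max_map_count(cache_size_gb: int) -> int:
    """Return (cache size in MB / 4 MB) * 1.1, rounded up (hypothetical helper)."""
    mappings = cache_size_gb * MB_PER_GB // MAPPING_SIZE_MB
    # Multiply by 1.1 using integer arithmetic (x * 11 / 10) and round up,
    # which avoids floating-point rounding surprises.
    return (mappings * 11 + 9) // 10


if __name__ == "__main__":
    for size_gb in (256, 512, 1024):
        print(f"{size_gb} GB cache -> max_map_count of at least "
              f"{required_max_map_count(size_gb)}")
```

Under these assumptions, a 512 GB bucket cache needs a limit of at least 144180. On Linux hosts this limit is usually exposed as the vm.max_map_count kernel parameter; check the documentation of your operating system before changing it.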