Size the BlockCache
When you use the LruBlockCache, the blocks needed to satisfy each read are cached, and old blocks are evicted to make room for new blocks using a Least-Recently-Used algorithm. Set the size of the BlockCache to satisfy your read requirements.
The size of the cached objects for a given read may be significantly larger than the actual result of the read. For instance, if HBase needs to scan through 20 HFile blocks to return a 100-byte result, and the HFile blocksize is 100 KB, the read will add 20 * 100 KB (roughly 2 MB) to the LruBlockCache.
The LruBlockCache resides entirely within the Java heap, so the amount of heap available to HBase and the percentage of that heap available to the LruBlockCache strongly impact performance. By default, the fraction of HBase heap reserved for the LruBlockCache (hfile.block.cache.size) is 0.40, or 40%. To determine the amount of heap available for the LruBlockCache, use the following formula. The 0.99 factor allows 1% of heap to remain available as a "working area" for evicting items from the cache. If you use the BucketCache, the on-heap LruBlockCache only stores indexes and Bloom filters, and data blocks are cached in the off-heap BucketCache.

number of RegionServers * heap size * hfile.block.cache.size * 0.99
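
As a hypothetical worked example (the cluster figures below are illustrative, not recommendations): with 5 RegionServers each running a 16 GB heap and the default hfile.block.cache.size of 0.40, the formula gives

5 * 16 GB * 0.40 * 0.99 = approximately 31.7 GB of aggregate BlockCache across the cluster, or about 6.3 GB per RegionServer.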
To tune the size of the LruBlockCache, you can add RegionServers or increase the total Java heap on a given RegionServer to increase it, or you can tune hfile.block.cache.size to reduce it. Reducing the cache size will cause cache evictions to happen more often, but will reduce the time it takes to perform a cycle of garbage collection. Increasing the heap will cause garbage collection to take longer but happen less frequently.
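
As a minimal sketch of reducing the cache fraction, hfile.block.cache.size can be set in hbase-site.xml on each RegionServer; the 0.25 value below is purely illustrative, not a recommendation, and should be chosen to match your read workload.

  <!-- hbase-site.xml: illustrative value only; tune to your read workload -->
  <property>
    <name>hfile.block.cache.size</name>
    <value>0.25</value>
  </property>

The RegionServer typically needs to be restarted for a change to this value to take effect.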