Block cache size

Kudu uses an LRU cache for recently read data. On workloads that scan a subset of the data repeatedly, raising the size of this cache can offer significant performance benefits. To increase the amount of memory dedicated to the block cache, increase the value of the --block_cache_capacity_mb flag. The default is 512 MiB.

Kudu provides a set of useful metrics for evaluating the performance of the block cache, which can be found on the /metrics endpoint of the Web UI. The following is an example set:

{
  "name": "block_cache_inserts",
  "value": 64
},
{
  "name": "block_cache_lookups",
  "value": 512
},
{
  "name": "block_cache_evictions",
  "value": 0
},
{
  "name": "block_cache_misses",
  "value": 96
},
{
  "name": "block_cache_misses_caching",
  "value": 64
},
{
  "name": "block_cache_hits",
  "value": 0
},
{
  "name": "block_cache_hits_caching",
  "value": 352
},
{
  "name": "block_cache_usage",
  "value": 6976
}

To judge the efficiency of the block cache on a tablet server, first wait until the server has been running and serving normal requests for some time, so the cache is not cold. Unless the server stores very little data or is idle, block_cache_usage should be equal or nearly equal to block_cache_capacity_mb. Once the cache has reached steady state, compare block_cache_lookups to block_cache_misses_caching. The latter metric counts the number of blocks that Kudu expected to read from cache but which weren’t found in the cache. If a significant amount of lookups result in misses on expected cache hits, and theblock_cache_evictions metric is significant compared to block_cache_inserts, then raising the size of the block cache may provide a performance boost. However, the utility of the block cache is highly dependent on workload, so it’s necessary to test the benefits of a larger block cache.