Virtual memory handling

Learn about how virtual memory handling affects Kafka performance.

Kafka uses system page cache extensively for producing and consuming the messages. The Linux kernel parameter, vm.swappiness, is a value from 0-100 that controls the swapping of application data (as anonymous pages) from physical memory to virtual memory on disk. The higher the value, the more aggressively inactive processes are swapped out from physical memory. The lower the value, the less they are swapped, forcing filesystem buffers to be emptied. It is an important kernel parameter for Kafka because the more memory allocated to the swap space, the less memory can be allocated to the page cache. Cloudera recommends to set vm.swappiness value to 1.

  • To check memory swapped to disk, run vmstat and look for the swap columns.

Kafka heavily relies on disk I/O performance. vm.dirty_ratio and vm.dirty_background_ratio are kernel parameters that control how often dirty pages are flushed to disk. Higher vm.dirty_ratio results in less frequent flushes to disk.

  • To display the actual number of dirty pages in the system, run egrep "dirty|writeback" /proc/vmstat