Apache Kafka Component Guide

Preparing the Environment

The following factors can affect Kafka performance:

  • Operating system settings

  • File system selection

  • Disk drive configuration

  • Java version

  • Ethernet bandwidth

Operating System Settings

Consider the following when configuring Kafka:

  • Kafka uses page cache memory as a buffer for active writers and readers. After you specify the JVM heap size (using the -Xmx and -Xms Java options), leave the remaining RAM available to the operating system for page caching.

  • Kafka needs open file descriptors for files and network connections. You should set the file descriptor limit to at least 128000.

  • You can increase the maximum socket buffer size to enable high-performance data transfer.
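The checks above can be sketched as a short shell script. This is a minimal sketch, assuming a Linux host; fd_limit_ok is an illustrative helper (not part of Kafka), and net.core.rmem_max and net.core.wmem_max are the Linux kernel settings for maximum socket receive and send buffer sizes.

```shell
# Succeeds when the given descriptor limit meets the minimum (default 128000).
# fd_limit_ok is a hypothetical helper name used for illustration.
fd_limit_ok() {
  limit=$1
  min=${2:-128000}
  [ "$limit" = "unlimited" ] || [ "$limit" -ge "$min" ]
}

# Check the open file limit of the current shell session.
current=$(ulimit -n)
if fd_limit_ok "$current"; then
  echo "open file limit OK: $current"
else
  echo "open file limit too low: $current (want at least 128000)"
fi

# Show the kernel's maximum socket buffer sizes (Linux only).
sysctl net.core.rmem_max net.core.wmem_max 2>/dev/null || true
```

Run the check as the user that starts the Kafka broker, because limits are applied per user session.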

File System Selection

Kafka uses regular Linux disk files for storage. We recommend using the EXT4 or XFS file system. Recent improvements to XFS have improved its performance characteristics for Kafka workloads without compromising stability.

Caution
  • Do not use mounted shared drives or any network file systems with Kafka, due to the risk of index failures and (in the case of network file systems) issues related to the use of memory-mapped files to store the offset index.

  • Encrypted file systems such as SafenetFS are not supported for Kafka. Index file corruption can occur.

Disk Drive Considerations

For throughput, we recommend dedicating multiple drives to Kafka data. Kafka typically performs better with more drives than with fewer. Do not share these drives with any other application, and do not use them for Kafka application logs.

You can configure multiple drives by specifying a comma-separated list of directories for the log.dirs property in the server.properties file. Kafka uses a round-robin approach to assign partitions to the directories specified in log.dirs. The default value of log.dirs is /tmp/kafka-logs.

The num.io.threads property should be set to a value equal to or greater than the number of disks dedicated to Kafka. Recommendation: start by setting this property equal to the number of disks.
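As a rough sketch of the two settings above, the following shell helper prints log.dirs and num.io.threads lines for server.properties from a list of data directories. The directories under /data are hypothetical placeholders, and kafka_disk_props is an illustrative helper, not part of Kafka. It assumes one data directory per dedicated disk.

```shell
# Print server.properties lines for a set of Kafka data directories.
# Assumes one directory per dedicated disk.
kafka_disk_props() {
  # Join all arguments with commas to form the log.dirs value.
  ( IFS=,; printf 'log.dirs=%s\n' "$*" )
  # Starting point: one I/O thread per dedicated disk.
  printf 'num.io.threads=%s\n' "$#"
}

# Hypothetical mount points; replace with your own.
kafka_disk_props /data/disk1/kafka-logs /data/disk2/kafka-logs /data/disk3/kafka-logs
```

Review the printed lines before copying them into server.properties, and raise num.io.threads from this starting value if testing shows a benefit.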

Disk speed also matters, depending on how you configure flush behavior (see "Log Flush Management"). A faster disk drive is beneficial if the log.flush.interval.messages property is set to flush the log file after approximately every 100,000 messages.

Kafka performs best when data access loads are balanced among partitions, leading to balanced loads across disk drives. In addition, data distribution across disks is important. If one disk becomes full and other disks have available space, this can cause performance issues. To avoid slowdowns or interruptions to Kafka services, you should create usage alerts that notify you when available disk space is low.
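A usage alert of the kind described above could start from a check like the following. This is a minimal sketch, not a complete monitoring setup: disks_over_threshold is an illustrative helper, the 85 percent threshold is an assumed value, and the parsing assumes POSIX df -P output with mount points that contain no spaces.

```shell
# Print mount points whose usage is at or above the threshold percentage.
# Reads `df -P` style output on stdin; the first line is the header.
disks_over_threshold() {
  threshold=$1
  awk -v t="$threshold" 'NR > 1 {
    use = $5
    sub(/%/, "", use)          # strip the percent sign from the Capacity column
    if (use + 0 >= t) print $6 # column 6 is the mount point
  }'
}

# Report every mounted file system that is at least 85% full.
df -P | disks_over_threshold 85
```

To watch only the Kafka data disks, pass their mount points to df, and hook the output into whatever alerting mechanism you already use.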

RAID can potentially improve load balancing among the disks, but it can also become a performance bottleneck because of slower writes, and it reduces available disk space. Although RAID can tolerate disk failures, rebuilding a RAID array is I/O-intensive and effectively disables the server. Therefore, RAID does not provide substantial improvements in availability.

Java Version

With Apache Kafka on HDP 2.5, use the latest update of Java 1.8 and make sure that G1 garbage collection support is enabled. (G1 support is enabled by default in recent versions of Java.) If you prefer to use Java 1.7, make sure that you use update 51 (u51) or later.

Here are several recommended settings for the JVM:

-Xmx6g 
-Xms6g 
-XX:MetaspaceSize=96m 
-XX:+UseG1GC
-XX:MaxGCPauseMillis=20 
-XX:InitiatingHeapOccupancyPercent=35 
-XX:G1HeapRegionSize=16M
-XX:MinMetaspaceFreeRatio=50 
-XX:MaxMetaspaceFreeRatio=80

To set JVM heap size for the Kafka broker, export KAFKA_HEAP_OPTS; for example:

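export KAFKA_HEAP_OPTS="-Xmx2g -Xms2g"
# The start script expects the broker properties file; adjust the path to your installation.
./kafka-server-start.sh ../config/server.properties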
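The remaining recommended JVM settings can be passed in a similar way. The sketch below assumes the stock Apache Kafka start scripts, where kafka-run-class.sh reads the KAFKA_JVM_PERFORMANCE_OPTS environment variable; setting it replaces the script's built-in defaults rather than adding to them. Whether your HDP installation's scripts behave the same way is an assumption to verify, and on a managed cluster these values may instead belong in the cluster manager's Kafka environment settings.

```shell
# Garbage collection and metaspace flags from the recommended settings.
export KAFKA_JVM_PERFORMANCE_OPTS="-XX:MetaspaceSize=96m -XX:+UseG1GC \
-XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 \
-XX:G1HeapRegionSize=16M -XX:MinMetaspaceFreeRatio=50 \
-XX:MaxMetaspaceFreeRatio=80"

# Heap size from the recommended settings.
export KAFKA_HEAP_OPTS="-Xmx6g -Xms6g"

# Show what the broker start script will inherit.
echo "$KAFKA_HEAP_OPTS $KAFKA_JVM_PERFORMANCE_OPTS"
```

Export both variables in the same shell session that starts the broker so that the start script inherits them.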

Ethernet Bandwidth

Ethernet bandwidth can have an impact on Kafka performance; make sure it is sufficient for your throughput requirements.
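One way to make this check concrete is to compare the link speed with a target throughput. The following is a rough sketch only: link_mbytes_per_sec and link_sufficient are illustrative helpers, the 200 MB/s target is a hypothetical figure, and the result is a theoretical ceiling that ignores protocol overhead. On Linux, the negotiated link speed in Mb/s can usually be read from /sys/class/net/<interface>/speed.

```shell
# Convert a link speed in megabits per second to megabytes per second.
link_mbytes_per_sec() {
  echo $(( $1 / 8 ))
}

# Succeeds when the link (in Mb/s) can carry the target throughput (in MB/s).
link_sufficient() {
  speed_mbits=$1
  target_mbytes=$2
  [ "$(link_mbytes_per_sec "$speed_mbits")" -ge "$target_mbytes" ]
}

# Example: a 1 Gb/s link (1000 Mb/s) against a hypothetical 200 MB/s target.
if link_sufficient 1000 200; then
  echo "link is sufficient"
else
  echo "link is too slow: $(link_mbytes_per_sec 1000) MB/s available"
fi
```

When you choose the target, remember that producer traffic, replication between brokers, and consumer reads all share the same link.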