Apache Kafka Component Guide

Kafka Producer Settings

If performance is important and you have not yet upgraded to the new Kafka producer (client version 0.9.0.1 or later), consider doing so. The new producer is generally faster and more fully featured than the previous client.

To use the new producer client, add the associated Maven dependency for the client JAR; for example:

<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>0.9.0.1</version>
</dependency>

For more information, see the KafkaProducer javadoc.
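
For reference, the following minimal sketch sends a single record with the new producer client. The broker address, topic name, class name, and serializers are placeholder assumptions; substitute values appropriate for your environment.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SimpleProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:6667");   // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        Producer<String, String> producer = new KafkaProducer<>(props);
        try {
            // send() is asynchronous; the record is buffered and sent by a background I/O thread.
            producer.send(new ProducerRecord<>("test-topic", "key-1", "value-1"));
        } finally {
            producer.close();   // flushes buffered records and releases resources
        }
    }
}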

The following subsections describe several types of configuration settings that influence the performance of Kafka producers.

Important Producer Settings

The lifecycle of a request from producer to broker involves several configuration settings:

  1. The producer polls for a batch of messages from the batch queue, one batch per partition. A batch is ready when one of the following is true:

    • batch.size is reached. Note: Larger batches typically have better compression ratios and higher throughput, but they have higher latency.

    • linger.ms (time-based batching threshold) is reached. Note: There is no simple guideline for setting linger.ms values; you should test settings on specific use cases. For small events (100 bytes or less), this setting does not appear to have much impact. (Both batching thresholds appear in the configuration sketch after this list.)

    • Another batch destined for the same broker is ready.

    • The producer calls flush() or close().

  2. The producer groups the ready batches by their leader broker.

  3. The producer sends the grouped batch to the broker.
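
As an illustration of the first two thresholds, the following sketch (continuing the Properties setup from the earlier producer example) sets batch.size and linger.ms explicitly; the values are examples for experimentation, not recommendations.

props.put("batch.size", "65536");   // send a partition's batch once it reaches 64 KB...
props.put("linger.ms", "10");       // ...or after 10 ms, whichever threshold is reached first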

The following paragraphs list additional settings related to the request lifecycle:

max.in.flight.requests.per.connection (pipelining)

The maximum number of unacknowledged requests the client will send on a single connection before blocking. If this setting is greater than 1, pipelining is used when the producer sends the grouped batch to the broker. Pipelining improves throughput, but if sends fail while retries are enabled, there is a risk of out-of-order delivery. Note also that excessive pipelining reduces throughput.
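
If strict ordering matters more than throughput, one common approach (sketched below, continuing the same Properties object) is to limit pipelining to a single in-flight request while keeping retries enabled; the retry count shown is only an example.

props.put("max.in.flight.requests.per.connection", "1");   // at most one unacknowledged request per connection
props.put("retries", "3");                                  // retried sends cannot be reordered when only one request is in flight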

compression.type

Compression is an important part of a producer’s work, and compression speed varies considerably across codec types.

To specify the compression type, use the compression.type property. On the producer it accepts the standard compression codecs ('gzip', 'snappy', 'lz4') as well as 'none' (the default, no compression). The values 'uncompressed' and 'producer' (retain the codec set by the producer) apply to the topic-level compression.type setting rather than to the producer client.

Compression is performed by the user thread, so if compression is slow, adding more producer threads can help. In addition, batching efficiency affects the compression ratio: more batching leads to more efficient compression.
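
For example, to enable compression on the producer (continuing the same Properties object; snappy is chosen here only for illustration):

props.put("compression.type", "snappy");   // each batch is compressed before it is sent; larger batches compress better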

acks

The acks setting specifies the number of acknowledgments the producer requires the leader to have received before considering a request complete. This setting determines the durability level for records sent by the producer.

  • acks=0: high throughput, low latency, no durability guarantee. The producer does not wait for any acknowledgment from the server.

  • acks=1: medium throughput, medium latency. The leader writes the record to its local log and responds without awaiting full acknowledgment from all followers.

  • acks=-1: low throughput, high latency, strongest durability. The leader waits for the full set of in-sync replicas (ISRs) to acknowledge the record, which guarantees that the record is not lost as long as at least one ISR remains active.
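
For example, to wait for the full set of in-sync replicas before a request is considered complete (continuing the same Properties object):

props.put("acks", "-1");   // "all" is equivalent; strongest durability, lowest throughput
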
flush()

The new Producer API supports an optional flush() call, which makes all buffered records immediately available to send (even if linger.ms is greater than 0).

When using flush(), the number of bytes between two flush() calls is an important factor for performance.

  • In microbenchmarking tests, a setting of approximately 4 MB performed well for events 1 KB in size.

  • A general guideline is to set batch.size equal to the total bytes between flush() calls divided by the number of partitions:

    (total bytes between flush() calls) / (partition count)
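
As a worked example of this guideline, the sketch below assumes roughly 4 MB written between flush() calls to a topic with 8 partitions, which gives a batch.size of 524288 bytes. The class name, broker address, topic, and partition count are all hypothetical placeholders.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.Producer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class FlushSizingExample {
    public static void main(String[] args) {
        int bytesBetweenFlushes = 4 * 1024 * 1024;   // assume ~4 MB is written between flush() calls
        int partitionCount = 8;                      // assume the target topic has 8 partitions

        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:6667");   // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("batch.size", String.valueOf(bytesBetweenFlushes / partitionCount));   // 4 MB / 8 = 524288 bytes

        Producer<byte[], byte[]> producer = new KafkaProducer<>(props);
        for (int i = 0; i < 4096; i++) {
            producer.send(new ProducerRecord<>("test-topic", new byte[1024]));   // 4096 x 1 KB ~= 4 MB of payload
        }
        producer.flush();   // makes all buffered records immediately available to send
        producer.close();
    }
}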

Additional Considerations

A producer thread that sends messages to a single partition is faster than one that sends messages to multiple partitions.

If a producer reaches maximum throughput but there is spare CPU and network capacity on the server, additional producer processes can increase overall throughput.

Performance is sensitive to event size: larger events generally achieve higher throughput. In microbenchmarking tests, 1 KB events streamed faster than 100-byte events.