Kafka Tuning: Handling Large Messages
Before configuring Kafka to handle large messages, first consider the following options to reduce message size:
- The Kafka producer can compress messages. For example, if the original message is a text-based format (such as XML), in most cases the compressed message will be sufficiently small.
- Use the compression.type producer configuration parameter to enable compression. Supported values include gzip, lz4, and snappy.
- If shared storage (such as NAS, HDFS, or S3) is available, consider placing large files on the shared storage and using Kafka to send a message with the file location. In many cases, this can be much faster than using Kafka to send the large file itself.
- Split large messages into 1 KB segments with the producing client, using partition keys to ensure that all segments are sent to the same Kafka partition in the correct order. The consuming client can then reconstruct the original large message.
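
The segmenting option above can be sketched as follows. This is a minimal illustration in plain Python, not a Kafka client API: the `Segment` type and the `split_message` and `reassemble` helpers are hypothetical names introduced here. It assumes the producing client uses `message_id` as the partition key so that all segments of one message land on the same partition.

```python
import uuid
from dataclasses import dataclass
from typing import List

SEGMENT_SIZE = 1024  # 1 KB segments, as suggested in the list above


@dataclass(frozen=True)
class Segment:
    message_id: str  # shared by all segments of one message (assumed partition key)
    index: int       # position of this segment within the message
    total: int       # number of segments that make up the message
    payload: bytes


def split_message(data: bytes, segment_size: int = SEGMENT_SIZE) -> List[Segment]:
    """Split data into fixed-size segments that share one message_id."""
    message_id = uuid.uuid4().hex
    # An empty message still produces one (empty) segment.
    chunks = [data[i:i + segment_size] for i in range(0, len(data), segment_size)] or [b""]
    return [Segment(message_id, i, len(chunks), chunk) for i, chunk in enumerate(chunks)]


def reassemble(segments: List[Segment]) -> bytes:
    """Rebuild the original message; raise ValueError if any segment is missing."""
    if not segments:
        raise ValueError("no segments received")
    total = segments[0].total
    by_index = {s.index: s.payload for s in segments}
    if set(by_index) != set(range(total)):
        raise ValueError("missing segments")
    # Joining by index makes reassembly independent of arrival order.
    return b"".join(by_index[i] for i in range(total))
```

Carrying `index` and `total` in every segment lets the consuming client detect an incomplete message instead of relying only on partition ordering.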
If you still need to send large messages with Kafka, modify the configuration parameters in the following sections to match your requirements.
Property | Default Value | Description |
---|---|---|
message.max.bytes | 1000000 (1 MB) | Maximum message size the broker accepts. Must be no larger than the consumer's fetch.message.max.bytes, or the consumer cannot consume the message. |
log.segment.bytes | 1073741824 (1 GiB) | Size of a Kafka data file. Must be larger than any single message. |
replica.fetch.max.bytes | 1048576 (1 MiB) | Maximum message size a broker can replicate. Must be larger than message.max.bytes, or a broker can accept messages it cannot replicate, potentially resulting in data loss. |
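
The relationships in the table above can be checked before a configuration change is rolled out. The following is a minimal sketch, assuming the three values are available as plain integers; `check_broker_limits` is a hypothetical helper, not part of Kafka, and it applies the table's wording literally.

```python
from typing import List


def check_broker_limits(
    message_max_bytes: int = 1_000_000,          # message.max.bytes default
    replica_fetch_max_bytes: int = 1_048_576,    # replica.fetch.max.bytes default
    log_segment_bytes: int = 1_073_741_824,      # log.segment.bytes default
) -> List[str]:
    """Return a list of violated constraints (empty if the settings are consistent)."""
    problems = []
    # A broker must be able to replicate every message it accepts.
    if replica_fetch_max_bytes <= message_max_bytes:
        problems.append("replica.fetch.max.bytes must be larger than message.max.bytes")
    # A data file must be able to hold the largest accepted message.
    if log_segment_bytes <= message_max_bytes:
        problems.append("log.segment.bytes must be larger than message.max.bytes")
    return problems
```

For example, raising only message.max.bytes to 5000000 while leaving the other defaults in place reports the replica.fetch.max.bytes constraint as violated.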
If a single message batch is larger than any of the default values below, the consumer is still able to consume the batch, but the batch is sent alone, which can cause performance degradation.
Property | Default Value | Description |
---|---|---|
max.partition.fetch.bytes | 1048576 (1 MiB) | The maximum amount of data per partition the server will return. |
fetch.max.bytes | 52428800 (50 MiB) | The maximum amount of data the server should return for a fetch request. |
fetch.message.max.bytes | 1048576 (1 MiB) | Maximum message size a consumer can read. Must be at least as large as message.max.bytes. |
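
To see which consumer limits a given batch size would exceed, the defaults from the table above can be compared directly. This is a rough sketch only: `CONSUMER_DEFAULTS` and `exceeded_limits` are hypothetical names introduced for illustration, and the function reports the comparison without modeling how a consumer actually behaves.

```python
from typing import Dict, List, Optional

# Default values copied from the table above.
CONSUMER_DEFAULTS = {
    "max.partition.fetch.bytes": 1_048_576,
    "fetch.max.bytes": 52_428_800,
    "fetch.message.max.bytes": 1_048_576,
}


def exceeded_limits(batch_bytes: int, overrides: Optional[Dict[str, int]] = None) -> List[str]:
    """Return the names of the consumer limits that batch_bytes is larger than."""
    config = {**CONSUMER_DEFAULTS, **(overrides or {})}
    return sorted(name for name, limit in config.items() if batch_bytes > limit)
```

An empty result means the batch fits within every limit; a non-empty result names the properties to raise if batches of that size are expected regularly.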