Chapter 2. What's New
Hortonworks Data Platform (HDP) 2.5 introduces new features and changes for Apache Kafka, along with documentation updates. Both are described in the following sections.
Apache Kafka
HDP 2.5 supports Apache Kafka version 0.10.0. Important new features include the following:
Fault tolerance
- Balancing replicas across racks
This rack awareness feature limits the risk of data loss if all brokers on a rack fail at once. The feature distributes replicas of a partition across different racks, extending guarantees that Kafka provides for broker failures so that they now cover rack failure. For more information, see Balancing Replicas Across Racks at apache.org.
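The effect of spreading replicas across racks can be illustrated with a small placement routine. The sketch below is a simplified illustration and is not Kafka's actual assignment algorithm; the class name RackAwareSketch and its methods are hypothetical. It assumes every broker has a known rack (in Kafka, a broker declares its rack through the broker.rack configuration property) and that the replication factor does not exceed the number of brokers.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; not Kafka's internal assignment code.
class RackAwareSketch {

    // Order brokers so that neighbours come from different racks where possible,
    // e.g. {0:rack1, 1:rack1, 2:rack2, 3:rack2} becomes [0, 2, 1, 3].
    static List<Integer> interleaveByRack(Map<Integer, String> brokerRack) {
        Map<Integer, String> sorted = new TreeMap<>(brokerRack);
        Map<String, Deque<Integer>> byRack = new TreeMap<>();
        for (Map.Entry<Integer, String> e : sorted.entrySet()) {
            byRack.computeIfAbsent(e.getValue(), r -> new ArrayDeque<>()).add(e.getKey());
        }
        List<Integer> ordered = new ArrayList<>();
        while (ordered.size() < brokerRack.size()) {
            for (Deque<Integer> brokers : byRack.values()) {
                if (!brokers.isEmpty()) {
                    ordered.add(brokers.poll());
                }
            }
        }
        return ordered;
    }

    // Choose replicas for one partition, preferring brokers on racks not used yet.
    static List<Integer> assign(Map<Integer, String> brokerRack, int partition, int replicationFactor) {
        if (replicationFactor > brokerRack.size()) {
            throw new IllegalArgumentException("replication factor exceeds broker count");
        }
        List<Integer> ordered = interleaveByRack(brokerRack);
        int n = ordered.size();
        List<Integer> replicas = new ArrayList<>();
        Set<String> usedRacks = new HashSet<>();
        // Pass 1: take at most one broker per rack, starting at an offset that depends on the partition.
        for (int i = 0; i < n && replicas.size() < replicationFactor; i++) {
            int broker = ordered.get((partition + i) % n);
            if (usedRacks.add(brokerRack.get(broker))) {
                replicas.add(broker);
            }
        }
        // Pass 2: if there are more replicas than racks, fill up with any unused broker.
        for (int i = 0; i < n && replicas.size() < replicationFactor; i++) {
            int broker = ordered.get((partition + i) % n);
            if (!replicas.contains(broker)) {
                replicas.add(broker);
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        Map<Integer, String> brokerRack = new TreeMap<>();
        brokerRack.put(0, "rack1");
        brokerRack.put(1, "rack1");
        brokerRack.put(2, "rack2");
        brokerRack.put(3, "rack2");
        for (int p = 0; p < 4; p++) {
            System.out.println("partition " + p + " -> " + assign(brokerRack, p, 2));
        }
    }
}
```

With two racks and a replication factor of 2, every partition in this sketch ends up with one replica on each rack, so losing a whole rack still leaves one copy of each partition available.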
Security
- Security/SASL improvements
Kafka supports authentication using SASL/PLAIN. For more information, see Apache JIRA KAFKA-2658.
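As a rough illustration, a client might select SASL/PLAIN through its configuration properties. The sketch below uses the property names and values from the Apache Kafka client configuration (security.protocol and sasl.mechanism); the broker address and the class name are placeholders, and the values your HDP cluster expects may differ, so check the HDP security documentation before relying on them. The user name and password are not set here: with Kafka 0.10.0 they are typically supplied through a JAAS login configuration file.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys and values come from Apache Kafka.
class SaslPlainClientConfigSketch {

    static Properties clientProps() {
        Properties props = new Properties();
        // Placeholder broker address (assumption, not from the release notes).
        props.put("bootstrap.servers", "broker1.example.com:9092");
        // SASL authentication over an unencrypted connection.
        props.put("security.protocol", "SASL_PLAINTEXT");
        // Use the PLAIN mechanism (user name and password).
        props.put("sasl.mechanism", "PLAIN");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(clientProps());
    }
}
```

Because SASL/PLAIN sends credentials without encrypting them, a production setup would normally pair it with TLS (the SASL_SSL protocol in Apache Kafka) rather than the plaintext transport used in this sketch.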
Application Development
- New client-side event interceptors
Two new plugin interfaces, ProducerInterceptor on the producer and ConsumerInterceptor on the consumer, allow developers to implement and configure custom interceptors.
The ProducerInterceptor interface allows processes to intercept events happening to a producer record, such as sending the producer record or receiving an acknowledgment when a record is published. For more information, see the ProducerInterceptor javadoc.
The ConsumerInterceptor interface allows processes to intercept consumer events, such as a record being received or a record being consumed by a client. For more information, see the ConsumerInterceptor javadoc.
For more information, see Add Producer and Consumer Interceptors at apache.org.
- New Kafka Streams client library
The Kafka Streams API allows developers to implement distributed stream processing applications that consume data from, and produce data to, Kafka topics.
Note that the Kafka Streams API is a technical preview; the code is considered to be at alpha quality level. Public APIs are likely to change in future releases.
For more information, see Streams API at apache.org.
- New timestamp field for messages
Messages are now tagged with timestamps when they are produced. For more information, see Apache JIRA KAFKA-3025.
- New configuration parameter max.poll.records
max.poll.records is a Kafka Consumer parameter that allows developers to limit the number of messages returned in a single call to poll(). For more information, see Apache JIRA KAFKA-3007.
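Two of the features above are enabled through client configuration: interceptors are registered with the interceptor.classes property, and the poll limit is set with max.poll.records. The following is a minimal sketch of a consumer configuration that sets both. The broker address, group ID, interceptor class name (com.example.AuditConsumerInterceptor), and the limit of 100 are assumptions chosen for illustration; the interceptor class would have to exist on the classpath and implement ConsumerInterceptor.

```java
import java.util.Properties;

// Hypothetical helper that only builds the configuration; it does not connect to Kafka.
class ConsumerConfigSketch {

    static Properties consumerProps() {
        Properties props = new Properties();
        // Placeholder connection settings (assumptions).
        props.put("bootstrap.servers", "broker1.example.com:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Return at most 100 records from a single poll() call.
        props.put("max.poll.records", "100");
        // Comma-separated list of interceptor classes; this class name is hypothetical.
        props.put("interceptor.classes", "com.example.AuditConsumerInterceptor");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps());
    }
}
```

A Properties object like this is the kind of input the KafkaConsumer constructor accepts. A producer registers its interceptors the same way, using interceptor.classes in the producer configuration.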
For detailed information about new features in Kafka version 0.10.0, see the Apache Release Notes for Kafka 0.10.0.
Content Updates
Added detailed instructions for installing Kafka on an Ambari-managed cluster; see Installing Kafka using Ambari.