What's New in Apache Kafka
Learn about the new features of Apache Kafka in Cloudera Runtime 7.2.18.
Rebase on Kafka 3.4.1
Kafka shipped with this version of Cloudera Runtime is based on Apache Kafka 3.4.1. For more information, see the upstream Apache Kafka 3.4.1 release notes and documentation.
Kafka log directory monitoring improvements
A new Cloudera Manager chart, trigger, and action are added for the Kafka service. These assist you in monitoring the log directory space of the Kafka Brokers, and enable you to prevent Kafka disks from filling up.
The chart is called Log Directory Free Capacity. It shows the free capacity of each Kafka Broker log directory.
The trigger is called Broker Log Directory Free Capacity Check. It fires if the free capacity of any log directory falls below 10%. The trigger is automatically created for all newly deployed Kafka services, but must be created with the Create Kafka Log Directory Free Capacity Check action for existing services following an upgrade.
The chart and trigger are available on the Kafka service Status page in Cloudera Manager. The action is available in the Kafka service Actions menu.
Kafka is safely stopped during operating system upgrades
During OS upgrades, Cloudera Manager now ensures that Kafka brokers are safely stopped. Specifically, Cloudera Manager now performs a rolling restart check before stopping a broker. This ensures that the Kafka service stays healthy during the upgrade. The level of health guarantee that Cloudera Manager ensures is determined by the restart check type set in the Cluster Health Guarantee During Rolling Restart Kafka property. Cloudera recommends that you set this property to all partitions stay healthy to avoid service outages. For more information, see Rolling restart checks.
useSubjectCredsOnly set to true by default in Kafka Connect
In previous versions, the javax.security.auth.useSubjectCredsOnly JVM property was set to false in Kafka Connect. Because of this, connectors running with an invalid JAAS configuration, or with no JAAS configuration at all, could use the credentials of other connectors to establish connections. Starting with this release, useSubjectCredsOnly is set to true by default. As a result, connectors are required to use their own credentials.
This new default applies to newly provisioned clusters. On upgraded clusters, useSubjectCredsOnly remains set to false to ensure backwards compatibility. If you are migrating connectors from a cluster running a previous version of Runtime to a new cluster running 7.2.18 or later, you must ensure that credentials are added to the connector configuration when it is migrated. Otherwise, migrated connectors might not work on the new cluster.
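For example, one way to carry credentials with a migrated connector is to include them directly in the connector configuration when you register it on the new cluster. The following sketch submits such a configuration through the Kafka Connect REST API; the connector class, host names, port, topic, user name, and password are placeholder values chosen for illustration, not defaults from this release.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Minimal sketch: registering a migrated connector whose configuration carries
// its own SASL credentials. Replace the placeholder connector class, hosts,
// topic, user name, and password with the values used in your environment.
public class RegisterMigratedConnector {
    public static void main(String[] args) throws Exception {
        String connectorJson = """
            {
              "name": "migrated-connector",
              "config": {
                "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
                "source.cluster.alias": "primary",
                "source.cluster.bootstrap.servers": "primary-broker.example.com:9093",
                "target.cluster.bootstrap.servers": "target-broker.example.com:9093",
                "topics": "orders",
                "source.cluster.security.protocol": "SASL_SSL",
                "source.cluster.sasl.mechanism": "SCRAM-SHA-512",
                "source.cluster.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username=\\"connector-user\\" password=\\"connector-secret\\";"
              }
            }
            """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://connect-worker.example.com:28083/connectors"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(connectorJson))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        // A 201 Created response indicates the connector was registered with its own credentials.
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```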
In addition to the default value change, a new Kafka Connect property is introduced in Cloudera Manager that you can use to set useSubjectCredsOnly. The property is called Add Use Subject Credentials Only JVM Option With True Value. Setting this property to false does not expressly set useSubjectCredsOnly to false. Instead, it sets useSubjectCredsOnly to the cluster default value.
Kafka Connect metrics reporter security configurable in Cloudera Manager
The security of the Kafka Connect metrics reporter can now be configured in Cloudera Manager. The following new properties are introduced:
- Secure Jetty Metrics Port
- Enable Basic Authentication for Metrics Reporter
- Jetty Metrics User Name
- Jetty Metrics Password
As a result of these changes, the setup steps required to configure Prometheus as the metrics store for SMM are changed. For updated deployment instructions, see Setting up Prometheus for Streams Messaging Manager.
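As an illustration, once basic authentication is enabled, any client scraping the metrics endpoint must present the configured credentials. The sketch below sends an authenticated request with Java's built-in HTTP client; the host, port, and endpoint path are placeholders standing in for the values set through the properties above, not documented defaults.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Minimal sketch: scraping the secured Kafka Connect metrics endpoint with
// HTTP basic authentication. Host, port, path, user name, and password are
// placeholders for the values you configure in Cloudera Manager.
public class ScrapeConnectMetrics {
    public static void main(String[] args) throws Exception {
        String credentials = Base64.getEncoder().encodeToString(
                "metrics-user:metrics-password".getBytes(StandardCharsets.UTF_8));

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://connect-worker.example.com:28084/metrics"))
                .header("Authorization", "Basic " + credentials)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```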
Kafka load balancer is automatically configured with the LDAP handler if LDAP authentication is configured
When a load balancer and LDAP authentication are configured for Kafka, the PLAIN mechanism is automatically added to the enabled authentication mechanisms of the load balancer listener. Additionally, the load balancer is automatically configured to use LdapPlainServerCallbackHandler as the callback handler.
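For reference, a client connecting through such a load balancer listener authenticates with SASL/PLAIN and supplies its LDAP user name and password in the JAAS configuration. The following sketch shows a producer configured this way; the bootstrap address, topic, and credentials are placeholders for your environment's values.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

// Minimal sketch: a producer authenticating over the load balancer listener
// with SASL/PLAIN backed by LDAP. Replace the placeholder address, topic,
// and LDAP credentials with your own.
public class LdapPlainProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "kafka-lb.example.com:9093");
        props.put("security.protocol", "SASL_SSL");
        // PLAIN is the mechanism that is now enabled automatically on the
        // load balancer listener when LDAP authentication is configured.
        props.put("sasl.mechanism", "PLAIN");
        props.put("sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                        + "username=\"ldap-user\" password=\"ldap-password\";");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
        }
    }
}
```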
Kafka Connect now supports Kerberos auth-to-local (ATL) rules with SPNEGO authentication
Kafka Connect now uses the cluster-wide Kerberos auth-to-local (ATL) rules by default. A new configuration property called Kafka Connect SPNEGO Auth To Local Rules is introduced. This property is used to manually specify the ATL rules. During an upgrade, the property is set to DEFAULT to ensure backward compatibility. Following an upgrade, if you want to use the cluster-wide rules, clear the existing value from the Kafka Connect SPNEGO Auth To Local Rules property.
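If you specify rules manually, they follow the standard Kerberos auth-to-local syntax. As a rough illustration, the following sketch, which assumes the hadoop-auth library is on the classpath, evaluates one such rule against an example SPNEGO principal; the realm and principal names are examples only.

```java
import org.apache.hadoop.security.authentication.util.KerberosName;

// Minimal sketch: evaluating an example auth-to-local rule set. The rule maps
// any single-component principal in the EXAMPLE.COM realm to its first
// component and falls back to DEFAULT behaviour for everything else.
public class AtlRuleExample {
    public static void main(String[] args) throws Exception {
        KerberosName.setRules("RULE:[1:$1@$0](.*@EXAMPLE\\.COM)s/@.*//\nDEFAULT");
        KerberosName principal = new KerberosName("connect-user@EXAMPLE.COM");
        System.out.println(principal.getShortName()); // prints "connect-user"
    }
}
```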
Debezium connector version update
All Debezium connectors shipped with Cloudera Runtime are upgraded to version 1.9.7. Existing instances of the connectors are automatically upgraded to the new version during the cluster upgrade. Deploying the previously shipped version of the connectors is not possible. For more information, see Kafka Connectors in Runtime or the Debezium documentation.
Persistent MQTT sessions support for the MQTT Source connector
Version 1.1.0 of the MQTT Source connector is released. The connector now supports MQTT persistent sessions. This enables the connector to resume (persist) a previous session with an MQTT broker after a session is interrupted. Enabling this feature can ensure that no messages are lost if the connector is momentarily stopped or if the network connection is interrupted.
To support persistent sessions, the following new properties are introduced:
- MQTT Client ID
This property specifies the MQTT client ID that the connector uses.
- MQTT Clean Session
This property controls whether the connector should start clean or persistent sessions. Set this property to false to enable persistent sessions.
Existing connectors continue to function; however, upgrading them is not possible. If you want to use the new version of the connector, you must deploy a new instance of the connector, as outlined in the sketch below. For more information, see MQTT Source connector and MQTT Source properties reference.
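The following sketch outlines what a persistent-session configuration could look like for a new instance of the connector. The property keys and connector class are illustrative placeholders rather than documented names; see the MQTT Source properties reference for the exact keys.

```java
import java.util.Map;

// Illustrative sketch only: the keys and connector class below are
// placeholders, not the documented property names of the MQTT Source
// connector. The point is that the client ID must stay stable and clean
// session must be disabled for the MQTT broker to keep session state.
public class MqttPersistentSessionConfig {
    public static void main(String[] args) {
        Map<String, String> config = Map.of(
                "connector.class", "<MQTT Source connector class>",
                "tasks.max", "1",
                "mqtt.broker.uri", "tcp://mqtt-broker.example.com:1883",
                "mqtt.topics", "sensors/#",
                // Stable client ID so the broker can associate the resumed session.
                "mqtt.client.id", "kafka-connect-mqtt-source-1",
                // false = persistent session; undelivered messages are retained by the broker.
                "mqtt.clean.session", "false",
                "kafka.topic", "mqtt-ingest");
        config.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```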
Parquet support for the S3 Sink connector
The S3 Sink connector now supports writing output files in Parquet format. The following changes are made to the connector:
- A new property, Parquet Compression Type, is added. This property specifies the compression type used for writing Parquet files. Accepted values are UNCOMPRESSED, SNAPPY, GZIP, LZO, BROTLI, LZ4, and ZSTD.
- The Output File Data Format property now accepts Parquet as a value.
For more information, see S3 Sink connector and S3 Sink properties reference.
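As an illustration, a configuration that writes Snappy-compressed Parquet output could take the following shape. The property keys and connector class are placeholders that mirror the property display names, not the documented configuration keys; see the S3 Sink properties reference for the exact keys.

```java
import java.util.Map;

// Illustrative sketch only: the keys below are placeholders mirroring the
// property display names, not the documented configuration keys of the
// S3 Sink connector.
public class S3SinkParquetConfig {
    public static void main(String[] args) {
        Map<String, String> config = Map.of(
                "connector.class", "<S3 Sink connector class>",
                "topics", "orders",
                "aws.s3.bucket", "example-bucket",
                // New in this release: Parquet is now an accepted output file data format.
                "output.file.data.format", "Parquet",
                // New property controlling Parquet compression; SNAPPY is one accepted value.
                "parquet.compression.type", "SNAPPY");
        config.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```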
Support schema ID encoding in the payload or message header in Stateless NiFi connectors
The Kafka Connect connectors powered by Stateless NiFi that support record processing are updated to support content-encoded schema references for Avro messages. These connectors now properly support integration with Schema Registry and SMM.
- A new value, HWX Content-Encoded Schema Reference, is introduced for the Schema Access Strategy property
- If this value is set, the schema is read from Schema Registry, and the connector expects that the Avro messages contain a content-encoded schema reference. That is, the message contains a schema reference that is encoded in the message content. The new value is introduced for the following connectors:
- ADLS Sink
- HDFS Sink
- HTTP Sink
- InfluxDB Sink
- JDBC Sink
- JDBC Source
- Kudu Sink
- S3 Sink
- The Schema Write Strategy property is removed from the following connectors:
- ADLS Sink
- HDFS Sink
- S3 Sink
- InfluxDB Sink
- A new property, Avro Schema Write Strategy, is introduced
- This property specifies whether and how the record schema is attached to the output data file when the format of the output is Avro. The property supports the following values:
- Do Not Write Schema: neither the schema nor a reference to the schema is attached to the output Avro messages.
- Embed Avro Schema: the schema is embedded in every output Avro message.
- HWX Content-Encoded Schema Reference: a reference to the schema (identified by Schema Name) within Schema Registry is encoded in the content of the outgoing Avro messages.
This property is introduced for the following connectors:
- ADLS Sink
- HDFS Sink
- S3 Sink
- SFTP Source
- Syslog TCP Source
- Syslog UDP Source
- The minor or major version of all affected connectors is updated
- Existing connectors continue to function; however, upgrading them is not possible. If you want to use the new version of a connector, you must deploy a new instance of the connector.
For more information, see the documentation for each connector in Kafka Connectors in Runtime and Streams Messaging Reference.
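As a rough illustration of how these strategies pair up, a source connector can write a content-encoded schema reference that a sink connector later resolves against Schema Registry. In the sketch below, the property keys, connector classes, and Schema Registry URL are placeholders that mirror the property display names, not the documented configuration keys; see the connector property references for the exact names.

```java
import java.util.Map;

// Illustrative sketch only: property keys mirror the display names and are not
// the documented configuration keys of the Stateless NiFi based connectors.
public class ContentEncodedSchemaReferenceConfigs {
    public static void main(String[] args) {
        // A source connector (for example SFTP Source) writing Avro messages that
        // carry a content-encoded reference to a schema stored in Schema Registry.
        Map<String, String> source = Map.of(
                "connector.class", "<SFTP Source connector class>",
                "schema.registry.url", "https://schema-registry.example.com:7790/api/v1",
                "schema.name", "orders",
                "avro.schema.write.strategy", "HWX Content-Encoded Schema Reference");

        // A sink connector (for example S3 Sink) reading those messages and
        // resolving the embedded reference against the same Schema Registry.
        Map<String, String> sink = Map.of(
                "connector.class", "<S3 Sink connector class>",
                "schema.registry.url", "https://schema-registry.example.com:7790/api/v1",
                "schema.access.strategy", "HWX Content-Encoded Schema Reference");

        source.forEach((k, v) -> System.out.println("source: " + k + "=" + v));
        sink.forEach((k, v) -> System.out.println("sink:   " + k + "=" + v));
    }
}
```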