What's New in Apache Kafka

Learn about the new features of Kafka in Cloudera Runtime 7.1.8.

Rebase on Kafka 3.1.1

Kafka shipped with this version of Cloudera Runtime is based on Apache Kafka 3.1.1. For more information, see the following upstream resources:

Apache Kafka Notable Changes:
Apache Kafka Release Notes:

Kerberos principal used by MirrorMaker is configurable

The Kerberos principal used by the MirrorMaker role is configurable. Configuration can be done using the Role-Specific Kerberos Principal (kerberos_role_princ_name) property. In addition, on newly installed clusters, the default principal (kafka_mirror_maker) is automatically granted correct access rights in Ranger.

Kafka broker rolling restart checks

Cloudera Manager can now be configured to perform different types of checks on the Kafka brokers during a rolling restart. Using these checks can ensure that the brokers remain healthy during and after a rolling restart. As a result of this change, Kafka rolling restarts may take longer than in previous versions. This is true even if you disable the rolling restart checks. For more information, see Rolling restart checks.

Http Metrics Report Exclude Filter introduced for Kafka

A new property, Http Metrics Report Exclude Filter (kafka.http.metrics.reporter.exclude.filter), is introduced for the Kafka service. This property can be used to specify a regular expression that is used to filter metrics. Any metric matching the specified regular expression is not reported by Cloudera Manager. As a result, these metrics are also not displayed in SMM. Use JMX metric names when configuring this property.
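As a rough illustration of how such a filter behaves, the sketch below applies a regular expression to a few JMX metric names and splits them into reported and excluded groups. The sample metric names and the pattern are illustrative assumptions. The sketch uses Python's re module and treats a match anywhere in the name as an exclusion; the matching semantics of the actual reporter are not described here, so verify your expression on a test cluster before relying on it.

```python
import re

def split_by_exclude_filter(metric_names, exclude_pattern):
    """Return (reported, excluded) lists for the given exclude regex."""
    regex = re.compile(exclude_pattern)
    reported, excluded = [], []
    for name in metric_names:
        # Assumption: a metric is excluded when the pattern matches anywhere in its name.
        if regex.search(name):
            excluded.append(name)
        else:
            reported.append(name)
    return reported, excluded

# Illustrative JMX metric names, not an exhaustive or authoritative list.
sample_metrics = [
    "kafka.server:type=BrokerTopicMetrics,name=BytesInPerSec",
    "kafka.server:type=BrokerTopicMetrics,name=BytesOutPerSec",
    "kafka.network:type=RequestMetrics,name=RequestsPerSec,request=Produce",
]

# Hypothetical filter value: exclude every request metric.
reported, excluded = split_by_exclude_filter(
    sample_metrics, r"kafka\.network:type=RequestMetrics.*"
)
print("reported:", reported)
print("excluded:", excluded)
```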

Enable JMX Authentication by default

JMX authentication is now enabled by default for the Kafka service. Randomly generated passwords are set for both the JMX monitor (read-only access) and control (read and write access) users. The default passwords can be changed at any time using the Password of User with read-only Access to the JMX agent and the Password of user with read-write access to the JMX agent Kafka service properties. Additionally, JMX authentication can be turned off using the Enable Authenticated Communication with the JMX Agent property.

OAuth2 authentication available for Kafka

OAuth2 authentication support is added for the Kafka service. You can now configure Kafka brokers to authenticate clients using OAuth2. For more information, see OAuth2 authentication.

HSTS header is included by default in Kafka Connect REST API responses

Kafka Connect REST API responses now include the HSTS header by default.

Kafka load balancer support

The Kafka service can now be provided with the host of a load balancer that balances bootstrap connections between multiple brokers. The host can be configured using the Kafka Broker Load Balancer Host property. Additionally, if a host is configured, the Kafka service configures a listener for accepting requests from the load balancer. The port of this listener is customizable using the Kafka Broker Load Balancer Listener Port property. Configuring these properties lets clients connect to the brokers through the load balancer without encountering ticket mismatch issues in Kerberized environments or TLS/SSL hostname verification failures.
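From the client side, the main difference is the bootstrap address. The following sketch builds a minimal client property set that bootstraps through the load balancer. The host name kafka-lb.example.com, the port 9094, and the choice of SASL_SSL are placeholder assumptions; bootstrap.servers, security.protocol, and sasl.kerberos.service.name are standard Kafka client properties, but the values that your cluster requires may differ.

```python
def client_config_via_load_balancer(lb_host, lb_port, security_protocol="SASL_SSL"):
    """Build a minimal client property map that bootstraps through a load balancer."""
    return {
        # Clients point at the load balancer instead of listing individual brokers.
        "bootstrap.servers": f"{lb_host}:{lb_port}",
        "security.protocol": security_protocol,
        # Assumption: the brokers run under the default "kafka" service name.
        "sasl.kerberos.service.name": "kafka",
    }

def to_properties(config):
    """Render the map in Java .properties style, one key=value pair per line."""
    return "\n".join(f"{key}={value}" for key, value in config.items())

# Hypothetical load balancer host and listener port.
config = client_config_via_load_balancer("kafka-lb.example.com", 9094)
print(to_properties(config))
```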

Importing Kafka entities into Atlas

Kafka topics and clients can now be imported into Atlas as entities (metadata) using a new action available for the Kafka service in Cloudera Manager. The new action is available at Kafka service > Actions > Import Kafka Topics Into Atlas. The action serves as an alternative to the kafka-import.sh tool. For more information, see Importing Kafka entities into Atlas.

Kafka data directories are now only readable by Kafka

Previously, Kafka data directories were configured as world-readable on the filesystem of the broker host. From now on, they are only readable by Kafka. Although Kafka brokers are usually locked down, and as a result world-readable directories should not pose a security risk, this change applies more restrictive access control on the data directories of the Kafka brokers, hardening their security.
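If you want to confirm the effect of this change on a broker host, one option is to inspect the permission bits of each data directory. The sketch below assumes a POSIX filesystem and checks whether users outside the owner and group have any access. The exact mode that is applied to the directories is not stated here, and the temporary directory only stands in for a real data directory.

```python
import os
import stat
import tempfile

def is_world_accessible(path):
    """Return True if 'other' users have any read, write, or execute permission on path."""
    mode = os.stat(path).st_mode
    return bool(mode & stat.S_IRWXO)

# Demonstration on a throwaway directory; replace it with one of your broker's
# configured data directories when checking a real host.
with tempfile.TemporaryDirectory() as demo_dir:
    os.chmod(demo_dir, 0o750)  # owner and group only
    restricted = is_world_accessible(demo_dir)
    os.chmod(demo_dir, 0o755)  # readable and traversable by everyone
    open_to_all = is_world_accessible(demo_dir)

print("0750 world-accessible:", restricted)
print("0755 world-accessible:", open_to_all)
```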

Collect ControlPlaneNetworkProcessorAvgIdlePercent metric

The ControlPlaneNetworkProcessorAvgIdlePercent Kafka metric is now collected by Cloudera Manager and is available as control_plane_network_processor_avg_idle.

Set sasl.kerberos.principal.to.local.rules based on HDFS auth_to_local

The Kafka service now uses the hadoop.security.auth_to_local setting from Hadoop's core-site.xml to automatically populate the sasl.kerberos.principal.to.local.rules Kafka broker property. The value can still be overridden in Kafka using the Kafka Broker Advanced Configuration Snippet (Safety Valve) for kafka.properties property.
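To show how the two settings relate, the sketch below converts a multi-line hadoop.security.auth_to_local value into the comma-separated list format that sasl.kerberos.principal.to.local.rules expects. This is only an approximation under stated assumptions: the EXAMPLE.COM rules are made up, the helper name is hypothetical, and the translation that Cloudera Manager actually performs is not described here.

```python
def auth_to_local_to_kafka_rules(auth_to_local):
    """Convert a multi-line auth_to_local value into a comma-separated rule list."""
    # Assumption: each non-empty line holds exactly one rule.
    rules = [line.strip() for line in auth_to_local.splitlines() if line.strip()]
    return ",".join(rules)

# Hypothetical Hadoop mapping rules for an EXAMPLE.COM realm.
hadoop_value = """
RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//
RULE:[2:$1@$0](.*@EXAMPLE.COM)s/@.*//
DEFAULT
"""

kafka_rules = auth_to_local_to_kafka_rules(hadoop_value)
print("sasl.kerberos.principal.to.local.rules=" + kafka_rules)
```

A value produced this way is also the form you would paste into the safety valve if you decide to override the automatically generated rules.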

Bootstrap servers are automatically configured for Kafka Connect

The Bootstrap Servers property of the Kafka Connect role is now automatically configured to include the bootstrap servers of its co-located Kafka brokers. This is only done if the property is left empty (default). You can provide a custom value for this property if you want to override the default host:port pairs that Kafka Connect uses when it establishes a connection with the Kafka brokers.
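The fallback behavior can be pictured with a small sketch: a non-empty property value is used as-is, and an empty one is replaced with host:port pairs derived from the brokers. The function name, the broker host names, and the port 9092 are hypothetical, and this is not how Cloudera Manager implements the logic internally.

```python
def resolve_bootstrap_servers(configured_value, broker_hosts, broker_port):
    """Return the custom value if one is set, else host:port pairs built from the brokers."""
    if configured_value and configured_value.strip():
        # A custom value overrides the automatically derived list.
        return configured_value.strip()
    return ",".join(f"{host}:{broker_port}" for host in broker_hosts)

# Hypothetical broker hosts and port.
brokers = ["broker-1.example.com", "broker-2.example.com"]

derived = resolve_bootstrap_servers("", brokers, 9092)
custom = resolve_bootstrap_servers("other-broker.example.com:9093", brokers, 9092)
print(derived)
print(custom)
```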

Stateless NiFi Source and Sink

The Stateless NiFi Source and Sink connectors enable you to run NiFi dataflows within Kafka Connect. Using these connectors can grant you access to a number of NiFi features without having the need to deploy or maintain NiFi on your cluster. For more information on the connectors, best practices on building dataflows to use with these connectors, as well as information on how to deploy the connectors, see Stateless NiFi Source and Sink.

New Cloudera-developed Kafka Connect connectors

In addition to the introduction of the Stateless NiFi Source and Sink, 14 new Cloudera-developed connectors are available for use with Kafka Connect. These are powered by the Stateless NiFi engine and run Cloudera-developed dataflows. They provide an out-of-the-box solution for some of the most common use cases for moving data in or out of Kafka. For more information, see Connectors in the Kafka Connect documentation.

Debezium Connector support

The following change data capture (CDC) connectors are added to Kafka Connect:

  • Debezium MySQL Source
  • Debezium Postgres Source
  • Debezium SQL Server Source
  • Debezium Oracle Source

Each of the connectors requires CDP-specific steps before it can be deployed. For more information, see Connectors.

Secure Kafka Connect

A number of significant improvements and new features are introduced related to Kafka Connect security. These are as follows:

SPNEGO authentication for the Kafka Connect REST API
You can secure the Kafka Connect REST API by enabling SPNEGO authentication. If SPNEGO authentication is enabled, only users authenticated with Kerberos are able to access and use the REST API. Additionally, if Ranger authorization is enabled for the Kafka service, authenticated users are only able to perform the operations that they are authorized for. For more information, see Configuring SPNEGO Authentication and trusted proxies for the Kafka Connect REST API.
Kafka Connect Authorization model
An authorization model is introduced for Kafka Connect. Implementations are pluggable and it is up to the implementation how the capabilities of the model are utilized. The authorization model is implemented by default in Ranger. For more information about the model, see Kafka Connect authorization model. For more information about the Ranger integration of the model, see Kafka Connect Ranger integration.
Kafka Connect connector configurations can now be secured
A new feature called Kafka Connect Secrets Storage is introduced. This feature enables you to mark properties within connector configurations as a secret. If a property is marked as a secret, the feature stores and handles the value of that property in a secure manner. For more information, see Kafka Connect Secrets Storage.
Kafka Connect connectors can be configured to override the JAAS configuration and restrict the usage of the worker principal
Kafka Connect now allows users to force connectors to override the JAAS configuration of the Kafka connection, and to forbid the use of the same Kerberos credentials that the Connect worker uses. For more information, see Configuring connector JAAS configuration and Kerberos principal overrides.
Nexus allow list for Stateless NiFi Source and Sink connectors
A new configuration property, List Of Allowed Nexus Repository Urls, is introduced for the Kafka service. This property enables you to specify a list of Nexus repositories that Kafka Connect connectors are allowed to connect to when fetching NiFi extensions. Configuring an allow list using the property can harden the security of your Kafka Connect deployment. For more information, see Configuring a Nexus repository allow list.