What's New in Streams Messaging
Learn about the new Streams Messaging features in Cloudera DataFlow for Data Hub 7.3.2.
Cloudera DataFlow for Data Hub 7.3.2 introduces new Streams Messaging features and includes all service packs and cumulative hotfixes from Cloudera Runtime 7.3.1.100 through 7.3.1.706. For a comprehensive record of all Streams Messaging updates in Cloudera Runtime 7.3.1.x, see New Features.
What's New in Apache Kafka
New features and functional updates for Kafka are introduced in Cloudera DataFlow for Data Hub 7.3.2, its service packs, and cumulative hotfixes.
7.3.2
- Rebase on Kafka 3.9
- Kafka shipped with this version of Cloudera Runtime is based on Apache Kafka 3.9.1 (previously 3.4.1). For more information, see the following resources:
- Notable changes for releases 3.5.0 through 3.9.1: Upgrading | Apache Kafka.
- The Apache Kafka release notes for the following versions:
- The Apache Kafka release announcements: Release Announcements | Apache Kafka
- KRaft is generally available and ZooKeeper is deprecated
- KRaft (Kafka Raft) is generally available. KRaft is now the recommended metadata management mode for Kafka in Cloudera. Additionally, migrating existing ZooKeeper-based Kafka clusters to KRaft is now possible.
With the general availability of KRaft, deploying new Kafka clusters in ZooKeeper mode, as well as continuing to run existing ones, is deprecated. Support for ZooKeeper-based Kafka clusters will be removed in a future release.
Cloudera recommends the following:
- Deploy all new Kafka clusters in KRaft mode.
- Migrate existing ZooKeeper-based clusters to KRaft following an upgrade to Cloudera Runtime 7.3.2. This is the only version where migration is possible. Neither previous nor future major, minor, or maintenance versions support migration.
For additional information, see the following resources:
- Kafka protocol and metadata version is set automatically during upgrades
- When upgrading Kafka, Cloudera Manager now automatically sets the inter.broker.protocol.version property for ZooKeeper-based clusters and the metadata.version property for KRaft-based clusters. You no longer need to manually set these properties to the current protocol or metadata version before an upgrade. This feature is only available when upgrading to Cloudera Runtime 7.3.2 or higher.
After the upgrade, clearing these properties remains a manual task. However, in Cloudera Runtime 7.3.2 and higher, both inter.broker.protocol.version and metadata.version are available for direct configuration in Cloudera Manager. The label names of the properties are Kafka Inter-Broker Protocol Version and Kafka Metadata Version. This means you can set or clear these properties directly from the UI, without needing to use advanced configuration snippets.
- Connector-level offset flush control
- A new connector-level property, cloudera.offset.flush.interval.ms, is added. Use this property to override the Kafka Connect role-level Offset Flush Interval (offset.flush.interval.ms) property. Overriding enables you to control the interval at which connector task offsets are committed on a per-connector basis.
Configure cloudera.offset.flush.interval.ms in connectors that need a different offset flush interval than the role default. This is commonly useful for connectors where the interval controls how often data is flushed to target systems, for example NiFiStatelessSink, HDFSSink, and S3Sink.
- IPv6 support for Kafka
- Starting with the 7.3.2 release, Kafka supports IPv6 with dual-stack functionality, allowing seamless communication over both IPv4 and IPv6 networks. This capability improves network scalability, future-proofs deployments, and enhances overall platform security.
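For the connector-level offset flush control described earlier in this list, the following is a minimal sketch of adding the override to a connector configuration before submitting it. Only the cloudera.offset.flush.interval.ms property name comes from this document. The helper name, the connector name, the connector class, and the topic are hypothetical placeholders.

```python
import json


def with_flush_override(connector_config, interval_ms):
    """Return a copy of a connector config with a per-connector flush interval."""
    if interval_ms <= 0:
        raise ValueError("interval_ms must be positive")
    updated = dict(connector_config)
    # Kafka Connect configuration values are conventionally passed as strings.
    updated["cloudera.offset.flush.interval.ms"] = str(interval_ms)
    return updated


# Hypothetical sink connector; every value below is a placeholder.
base = {
    "name": "example-sink",
    "connector.class": "com.example.PlaceholderSinkConnector",
    "topics": "example-topic",
    "tasks.max": "1",
}

config = with_flush_override(base, 30000)
print(json.dumps(config, indent=2))
```

Connectors that do not set the property keep using the role-level Offset Flush Interval.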
- Offline Log Directories chart
- A new default chart, Offline Log Directories, is added for Kafka in Cloudera Manager. This chart can help you quickly identify and track storage issues on your brokers. It is available by default for the Kafka service as well as for individual Kafka Broker role instances.
The chart shows offline log directories and their mount paths for Kafka brokers. A non-zero value indicates an active error state for a specific log directory, while a value of 0 means the directory was in an error state during the selected timeframe but is now healthy. The chart only displays log directories that had errors during the selected timeframe.
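The chart semantics above can be summarized as a small decision rule. This is an illustrative sketch only: the function name and the idea of passing the chart values as a time-ordered list are assumptions, and reading the most recent sample as the current state is one interpretation of the description above, not a documented API.

```python
def describe_log_dir(samples):
    """Interpret chart values for one log directory over the selected timeframe.

    `samples` is a time-ordered list of numeric chart values. An empty list
    stands for a directory that the chart does not display at all.
    """
    if not samples:
        # The chart only displays directories that had errors in the timeframe.
        return "not displayed: no errors in timeframe"
    if samples[-1] != 0:
        # A non-zero value indicates an active error state.
        return "offline: active error"
    # A value of 0 means the directory had an error earlier but is healthy now.
    return "recovered: had an error in timeframe, now healthy"


print(describe_log_dir([1, 1, 0]))
print(describe_log_dir([0, 1]))
print(describe_log_dir([]))
```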
- New actions for collecting Kafka diagnostic data
- The following new service-specific actions are available for collecting Kafka diagnostic data in Cloudera Manager:
- Collect Kafka Cluster Diagnostics - gathers detailed cluster-wide data, including topics, configurations, consumer groups, and more.
- Describe Kafka Topics - provides detailed information about all Kafka topics.
These actions are available in the Actions dropdown on the Kafka service and Kafka Broker role instance pages. Diagnostic data is printed to stdout for immediate access and also saved as a compressed archive on the host where the action runs.
For more information, see Collecting Kafka diagnostic data using Cloudera Manager actions.
Connect
- Debezium connectors upgraded from 1.9.8.Final to 3.3.1.Final
- This release of Cloudera Runtime ships version 3.3.1.Final of the following Debezium connectors:
- MySQL
- PostgreSQL
- Oracle
- SQL Server
- Db2
Existing connector instances are automatically upgraded to the new version as part of a cluster upgrade. However, you must update connector configurations before you can upgrade your cluster. Critical changes that affect all Debezium connectors are summarized below.
- Property renaming (configuration namespace changes)
New, more consistent namespaces for configuration properties are introduced. The old database.* prefixes have been removed. The connector configuration keys listed in the following table must be updated before an upgrade.
| Old Property Prefix (Debezium 1.9) | New Property Prefix (Debezium 3.3) |
| --- | --- |
| database.server.name | topic.prefix |
| database.history.* | schema.history.internal.* |
| database.* (JDBC pass-through) | driver.* |
| database.dbname (SQL Server) | database.names |
- Database driver version requirements are updated
The recommended and supported JDBC driver versions used by the majority of connectors have changed. The following table lists the JDBC drivers you need to deploy on your cluster before an upgrade.
| Component | New Driver Version / Notes |
| --- | --- |
| MySQL | 9.1.0 |
| PostgreSQL | 42.7.7 |
| Oracle | 21.x, 23.x; use a Java 11+ Oracle JDBC driver (ojdbc11.jar) |
| SQL Server | 12.4.2.jre8 |
| Db2 | 11.5.0.0 |
For more information, see Getting started with upgrades for Cloudera on cloud.
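The Debezium property renames listed earlier in this item can be applied to an existing connector configuration with a small script. The following is a minimal sketch, not a Cloudera-provided tool. The function name and its parameters are hypothetical, the caller must state which database.* keys are JDBC pass-through properties, and keys not covered by the rename table are assumed to stay unchanged. The host, topic, and database values are placeholders.

```python
def migrate_debezium_config(config, passthrough_keys=(), sql_server=False):
    """Apply the documented Debezium 1.9 -> 3.3 key renames to a config dict."""
    history_prefix = "database.history."
    migrated = {}
    for key, value in config.items():
        if key == "database.server.name":
            new_key = "topic.prefix"
        elif key.startswith(history_prefix):
            new_key = "schema.history.internal." + key[len(history_prefix):]
        elif key == "database.dbname" and sql_server:
            # This rename applies to the SQL Server connector only.
            new_key = "database.names"
        elif key in passthrough_keys and key.startswith("database."):
            # Caller-declared JDBC pass-through properties move to driver.*
            new_key = "driver." + key[len("database."):]
        else:
            # Assumption: keys not in the rename table are left as they are.
            new_key = key
        migrated[new_key] = value
    return migrated


old = {
    "connector.class": "io.debezium.connector.sqlserver.SqlServerConnector",
    "database.hostname": "db.example.com",
    "database.server.name": "inventory",
    "database.dbname": "sales",
    "database.history.kafka.topic": "schema-changes.inventory",
    "database.applicationName": "cdc",
}

new = migrate_debezium_config(
    old,
    passthrough_keys={"database.applicationName"},
    sql_server=True,
)
for key in sorted(new):
    print(f"{key}={new[key]}")
```

Review the migrated keys against the Debezium documentation for your connector before an upgrade, because the sketch covers only the four renames in the table.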
What's New in Schema Registry
New features and functional updates for Schema Registry are introduced in Cloudera DataFlow for Data Hub 7.3.2, its service packs, and cumulative hotfixes.
7.3.2
There are no new features in this release.
What's New in Streams Messaging Manager
New features and functional updates for Streams Messaging Manager are introduced in Cloudera DataFlow for Data Hub 7.3.2, its service packs, and cumulative hotfixes.
7.3.2
There are no new features in this release.
What's New in Streams Replication Manager
New features and functional updates for Streams Replication Manager are introduced in Cloudera DataFlow for Data Hub 7.3.2, its service packs, and cumulative hotfixes.
7.3.2
- Reverse Checkpointing
- Streams Replication Manager now supports reverse checkpointing. This feature enables the tracking and replication of consumer offsets from a target cluster back to a source cluster. By tracking offsets in the reverse direction, you ensure that the progress made by consumer groups on a backup cluster is preserved and translated back to the primary cluster during a failback scenario.
Reverse checkpointing minimizes message duplication upon failback by mapping the offsets from the replica topic back to the equivalent offsets in the source topic. To enable this feature, you must configure the following in Cloudera Manager:
- Set the cloudera.reverse.checkpointing.enabled property to true.
- Enable bidirectional replication in the Streams Replication Manager's Replication Configs property.
In addition to service configurations, you must use the srm-control tool to explicitly allowlist topics for reverse checkpointing using the reverse-checkpointed-topics command. Consumer group replication must also be enabled in both directions.
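To illustrate the idea of mapping a replica-topic offset back to a source-topic offset, the following sketch translates a consumer position using a list of known offset pairs. It is a conceptual example only and does not show how Streams Replication Manager implements reverse checkpointing. The function name, the sync point data structure, and the sample offsets are all assumptions made for illustration.

```python
from bisect import bisect_right


def translate_to_source(sync_points, replica_offset):
    """Map a consumer offset on the replica topic back to the source topic.

    `sync_points` is a list of (source_offset, replica_offset) pairs sorted by
    replica offset. The result is the source offset of the latest sync point at
    or before `replica_offset`, or None when no such sync point exists.
    """
    replica_offsets = [replica for _, replica in sync_points]
    index = bisect_right(replica_offsets, replica_offset) - 1
    if index < 0:
        return None
    source_offset, _ = sync_points[index]
    # Rounding down to a known sync point can re-deliver a few records after
    # failback, but it never skips records the group has not consumed.
    return source_offset


# Hypothetical sync points: source offset 150 corresponds to replica offset 50.
sync_points = [(100, 10), (150, 50), (220, 120)]

print(translate_to_source(sync_points, 75))
print(translate_to_source(sync_points, 120))
print(translate_to_source(sync_points, 5))
```

The closer the sync points are to each other, the fewer records a consumer group re-reads after failback in this model.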
- Single REST server for all replication flows
- Streams Replication Manager now uses a single REST server with a single port to handle inter-worker communication for all replication flows. Previously, a dedicated REST server was started for each replication flow. The new implementation exposes only the endpoints required for inter-worker coordination and task configuration updates. These endpoints are restricted to inter-worker communication and cannot be accessed externally. The legacy per-flow REST server implementation is deprecated in 7.3.2 and will be removed in a future release. Cloudera recommends that you migrate your Streams Replication Manager clusters to the new implementation.
- Suppressing internal metrics topics
- You can now configure the Streams Replication Manager Service to suppress the eager creation of srm-metrics topics for all possible replication flows. This prevents the creation of unused topics. To enable this behavior, set the metrics.topic.creation.for.possible.flows.enabled property to false.
- Configurable timeout for Streams Application Kafka Connection Health Test
- A new SRM Service Streams Application Connection Test Timeout (streams.replication.manager.service.streams.application.connection.test.timeout) Cloudera Manager configuration option is now available for the Streams Replication Manager Service. It sets the timeout, in seconds, for the Streams Application Kafka Connection Health Test, which periodically checks connectivity to the target Kafka cluster. The default is 1 second.
What's New in Cruise Control
New features and functional updates for Cruise Control are introduced in Cloudera DataFlow for Data Hub 7.3.2, its service packs, and cumulative hotfixes.
7.3.2
- New configuration parameter for controlling IP stack preference
- A new cc.additional.java.options configuration parameter is available on the Cruise Control configuration page in Cloudera Manager. The default value sets the IP protocol to IPv4.
- New intra.broker.goals configuration for Cruise Control
- Cloudera Manager introduces a new intra.broker.goals configuration for Cruise Control. The default value includes com.linkedin.kafka.cruisecontrol.analyzer.goals.IntraBrokerDiskCapacityGoal and com.linkedin.kafka.cruisecontrol.analyzer.goals.IntraBrokerDiskUsageDistributionGoal.
This affects the existing Default Goals (default.goals) configuration, which must be a subset of Supported Goals and Supported Intra Broker Goals.
Additionally, if you previously defined intra.broker.goals in an advanced configuration snippet, you no longer need to do so.
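The subset requirement on Default Goals can be checked before you save a configuration change. The following is a minimal sketch that assumes the requirement means every default goal must appear in either Supported Goals or Supported Intra Broker Goals. The function name is hypothetical and the goal lists are sample values, not the Cloudera Manager defaults.

```python
def unsupported_default_goals(default_goals, supported_goals, intra_broker_goals):
    """Return the default goals found in neither supported list, in input order."""
    allowed = set(supported_goals) | set(intra_broker_goals)
    return [goal for goal in default_goals if goal not in allowed]


PREFIX = "com.linkedin.kafka.cruisecontrol.analyzer.goals."

# Sample values for illustration only.
supported_goals = [PREFIX + "RackAwareGoal", PREFIX + "DiskCapacityGoal"]
intra_broker_goals = [
    PREFIX + "IntraBrokerDiskCapacityGoal",
    PREFIX + "IntraBrokerDiskUsageDistributionGoal",
]
default_goals = [
    PREFIX + "RackAwareGoal",
    PREFIX + "IntraBrokerDiskCapacityGoal",
    PREFIX + "CpuCapacityGoal",
]

missing = unsupported_default_goals(default_goals, supported_goals, intra_broker_goals)
print(missing)
```

An empty result means that the Default Goals list satisfies the subset requirement under the assumption stated above.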
