What's new in Streams Messaging

Learn about the new Streams Messaging features in Cloudera DataFlow for Data Hub 7.2.17.

Kafka

Rebase on Kafka 3.4.0
Kafka shipped with this version of Cloudera Runtime is based on Apache Kafka 3.4.0. For more information, see the following upstream resources:
  • Apache Kafka Notable Changes
  • Apache Kafka Release Notes
Kafka KRaft [TECHNICAL PREVIEW]
Apache Kafka Raft (KRaft) is a consensus protocol used for metadata management that was developed as a replacement for Apache ZooKeeper. Using KRaft for managing Kafka metadata instead of ZooKeeper offers various benefits including a simplified architecture and a reduced operational footprint.
Kafka KRaft in this release of Cloudera Runtime is in technical preview and does not support the following:
  • Deployments with multiple log directories. This includes deployments that use JBOD for storage.
  • Delegation token based authentication.
  • Migrating an already running Kafka service from ZooKeeper to KRaft.
  • Atlas Integration.

For a conceptual overview on KRaft, see Kafka KRaft. For more information on how to deploy a Streams Messaging Data Hub cluster that is running KRaft mode, see Setting up your Streams Messaging cluster.

SMT plugins for binary conversion
Two Cloudera-developed Single Message Transforms (SMT) plugins are added: ConvertToBytes and ConvertFromBytes. You can use these plugins to convert binary data to or from the Kafka Connect internal data format.
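
For illustration, the following is a minimal connector configuration sketch that applies the ConvertToBytes plugin as a transform. The transform alias is arbitrary, and the class name placeholder is an assumption; use the fully qualified class name documented for the plugin in your distribution.

    transforms=toBytes
    # The package prefix is deliberately left as a placeholder; substitute the fully
    # qualified class name of the ConvertToBytes plugin from the plugin documentation.
    transforms.toBytes.type=<plugin package>.ConvertToBytes

The ConvertFromBytes plugin is configured the same way, using its own class name.
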
For more information, see the following resources:
EOS for source connectors
Exactly-once semantics (EOS) support is added for Kafka Connect source connectors. For more information, see Configuring EOS for source connectors.
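
As a rough sketch, exactly-once support for source connectors builds on the exactly-once properties available in Apache Kafka Connect. The values below are illustrative only; the exact settings to use are described in Configuring EOS for source connectors.

    # Kafka Connect worker level
    exactly.once.source.support=enabled

    # Source connector configuration
    exactly.once.support=required
    transaction.boundary=poll
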
Rolling restart checks provide high cluster health guarantees by default

The default value of the Cluster Health Guarantee During Rolling Restart property is changed from none to healthy partitions stay healthy. This property defines what types of checks are performed on the restarted broker during a rolling restart. Each setting guarantees a different level of cluster health during rolling restarts. With the none setting, no checks are performed, which means that in previous versions no cluster health guarantees were provided by default.

The new default, healthy partitions stay healthy, provides a high level of guarantee on cluster health. This setting ensures that no partitions go into an under-min-isr state when a broker is stopped. This is achieved by waiting before each broker is stopped so that all other brokers can catch up with all replicas that are in an at-min-isr state. Additionally, the setting ensures that the restarted broker is accepting requests on its service port before the next broker is restarted. Partitions that are already in an under-min-isr state are ignored by this setting. For more information, see Rolling restart checks.

LDAPS SSL configurations are inherited from the Kafka broker
The SSL configurations of LDAP over SSL (LDAPS) are now inherited from the Kafka broker. Previously, the JDK default was used. If the JDK default certificate store contains certificates that were used to set up the SSL connection to LDAP, these certificates should be imported into the broker stores.
Aliases for Kafka CLI tools
Aliases are added for the kafka-storage.sh, kafka-cluster.sh, and kafka-features.sh command line tools. These tools can now be called globally with kafka-storage, kafka-cluster, and kafka-features.

Schema Registry

KafkaAvroSerializer and KafkaAvroDeserializer improvements
KafkaAvroSerializer and KafkaAvroDeserializer can now handle null values without Avro serialization
The KafkaAvroSerializer and KafkaAvroDeserializer now support a configuration property called null.passthrough.enabled, which is false by default. When enabled, null data is handled as null and no schema is sent to Schema Registry. This behavior enables client applications to write tombstone messages into compacted topics. The KafkaAvroDeserializer also handles null values by returning null without regard to the schema.
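
The following is a minimal, hedged producer sketch that writes a tombstone with null.passthrough.enabled turned on. The serializer class, the Schema Registry URL property, and all host, topic, and key names are assumptions for illustration and may differ in your environment.

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class TombstoneProducerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker-1.example.com:9092");    // hypothetical broker address
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            // Assumed Schema Registry serializer class and URL property; verify both in your distribution.
            props.put("value.serializer",
                "com.hortonworks.registries.schemaregistry.serdes.avro.kafka.KafkaAvroSerializer");
            props.put("schema.registry.url", "https://schemaregistry.example.com:7790/api/v1");
            // New in this release: null values are passed through and no schema is sent to Schema Registry.
            props.put("null.passthrough.enabled", "true");

            try (KafkaProducer<String, Object> producer = new KafkaProducer<>(props)) {
                // A null value acts as a tombstone on a compacted topic; no schema lookup is performed for it.
                producer.send(new ProducerRecord<>("customers-compacted", "customer-42", null));
            }
        }
    }
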
Support deserialization when the topic and schema names don't match
From now on, the KafkaAvroDeserializer uses the schema version's ID in the Avro byte stream to access the actual schema and disregards schema names.
Logical type conversion for the KafkaAvroSerializer and KafkaAvroDeserializer
The KafkaAvroSerializer and KafkaAvroDeserializer can now properly handle and convert Avro logical types at the record level. This means that if a record has a field with a built-in Avro logical type (for example, a BigDecimal field with the bytes type and decimal logical type), the record is serialized correctly. After deserialization, a GenericRecord is returned that includes the typed BigDecimal field instead of a ByteBuffer. Logical type conversion can be enabled using the logical.type.conversion.enabled property. This property is set to false by default for backward compatibility.
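
The following is a hedged sketch of producing a record that carries a decimal logical type with conversion enabled. As above, the serializer class, the Schema Registry URL property, and the host, topic, and schema names are assumptions for illustration.

    import java.math.BigDecimal;
    import java.util.Properties;
    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericData;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class LogicalTypeProducerSketch {
        // A record with a single bytes field that carries the decimal logical type.
        private static final String PAYMENT_SCHEMA =
            "{\"type\":\"record\",\"name\":\"Payment\",\"fields\":["
            + "{\"name\":\"amount\",\"type\":{\"type\":\"bytes\","
            + "\"logicalType\":\"decimal\",\"precision\":10,\"scale\":2}}]}";

        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker-1.example.com:9092");    // hypothetical broker address
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "com.hortonworks.registries.schemaregistry.serdes.avro.kafka.KafkaAvroSerializer");
            props.put("schema.registry.url", "https://schemaregistry.example.com:7790/api/v1");
            // New in this release: convert built-in Avro logical types at the record level.
            props.put("logical.type.conversion.enabled", "true");

            Schema schema = new Schema.Parser().parse(PAYMENT_SCHEMA);
            GenericRecord payment = new GenericData.Record(schema);
            // With conversion enabled, the BigDecimal is serialized as a decimal logical type; a
            // deserializer with the same setting returns a BigDecimal instead of a ByteBuffer.
            payment.put("amount", new BigDecimal("19.99"));

            try (KafkaProducer<String, GenericRecord> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("payments", "payment-1", payment));
            }
        }
    }
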
Principal mapping rules can be defined without quotes
The SSL Client Authentication Mapping Rules (schema.registry.ssl.principal.mapping.rules) property now accepts rules that are defined without quotes. As a result, when adding multiple rules, you no longer need to enclose each rule in quotes.
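For example, a hypothetical rule list such as RULE:^CN=(.*?),OU=.*$/$1/,DEFAULT can now be entered as is; previously, each rule in the list had to be enclosed in quotes.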
Remove modules section from registry.yaml
In previous versions, the registry.yaml configuration file contained a modules section. This section was used to list pluggable modules that extended Schema Registry’s functionality. However, modules were never fully supported and have been removed in a previous release. The modules section in registry.yaml was kept for backwards compatibility. Starting with this version, the modules section is removed by default from registry.yaml.

Streams Messaging Manager

UI updates
The style of the SMM UI is updated. This update includes various changes to the colors, fonts, and overall style of the UI. Additionally, the following functional changes and improvements are made:
Data Explorer
  • The modal window that you use to view messages now includes a copy to clipboard button if the message you are viewing is long.
  • A Refresh option is added next to the FROM OFFSET field. This option refreshes the partition offset range and fetches the latest messages.
Connector Configuration and Connector Settings pages
  • A new option, Add, is added to the Import a Connector config… modal. This option enables you to import connector configuration properties without overriding existing properties.
  • Property keys can now be filtered based on their group and importance.
  • A Reset Filters option is added that resets all search filters.
  • Three new actions are added that modify the configuration as a whole. The options are Remove all, Reset, and Export. These actions are available in a new Actions drop-down.
  • The Import Connector Configuration… option is moved to the Actions drop-down and is renamed to Import.
  • The Deployment Status modal now correctly displays the status of the deployment process.
  • An error message is added that notifies you if validation errors are found for properties that are currently filtered.
  • If available, the display names of configuration property keys are shown above the property keys.
Highly available Kafka Connect integration
SMM uses the Kafka Connect service role’s REST URL to establish a connection with Connect and serve Connect metrics. Previously, even if your Connect deployment was highly available and had multiple service roles deployed, SMM could only be configured with a single connection URL. From now on, multiple URLs can be configured. If the Connect service role that SMM is connected to fails, SMM automatically connects to a different instance that is available.

As a result of this change, the Kafka Connect Host and Kafka Connect Port properties are replaced by the Kafka Connect Rest HostPorts property. If Kafka Connect Rest HostPorts is left empty (default), SMM is automatically configured with the host, port, and protocol of the Connect service role instances belonging to the Kafka service selected with the Kafka Service SMM property.
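
For example, a hypothetical value such as connect-host-1.example.com:28083,connect-host-2.example.com:28083 configures SMM with two Connect REST endpoints that it can fail over between.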

If you previously configured Kafka Connect Host and Kafka Connect Port, the values set in the properties are automatically migrated to Kafka Connect Rest HostPorts when you upgrade.

The SMM API now hides email notifier SMTP passwords in its response

Previously, the /notifiers endpoint returned the full configuration of the notifier. In the case of email notifiers, this configuration included the password of the SMTP server. From now on, API responses do not include the password. As a result of this change, the PASSWORD field of existing email notifiers is left blank when you edit them. If you edit a notifier, you must reenter the password.

Streams Replication Manager

Improved SRM logging

SRM’s logging capabilities are improved. From now on:

  • Kafka clients created by SRM’s internal connectors reference the replication flow they are a part of (KAFKA-14838 backport).
  • SRM now includes references to the replication flow in the log context of its internal connectors.

These changes enable differentiation between the logs associated with each replication flow.

Cruise Control

Rebasing Cruise Control to 2.5.116
Cruise Control in Cloudera Runtime is rebased to version 2.5.116. For more information about the fixes and features in Cruise Control 2.5.116, see the Cruise Control Rebase Summary.
New endpoint for Cruise Control
The GET /kafkacruisecontrol/permissions endpoint is added to Cruise Control. It lists the permission levels of the current user. If authentication is not configured for a user, the GET call returns the "Unable to retrieve privilege information for an unsecure connection" message.
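
As a minimal, hedged sketch, the new endpoint can be called with any HTTP client. The host name and port below are assumptions, and secured deployments may require additional authentication configuration.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class CruiseControlPermissionsCheck {
        public static void main(String[] args) throws Exception {
            // Hypothetical Cruise Control host and port; adjust to your deployment.
            String url = "http://cruisecontrol.example.com:8899/kafkacruisecontrol/permissions";
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            // Prints the permission levels of the current user, or the "Unable to retrieve
            // privilege information for an unsecure connection" message when authentication
            // is not configured for the user.
            System.out.println(response.body());
        }
    }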