What's new in Streams Messaging
Learn about the new Streams Messaging features in Cloudera DataFlow for Data Hub 7.2.17.
Kafka
- Rebase on Kafka 3.4.0
- Kafka shipped with this version of Cloudera Runtime is based on Apache Kafka 3.4.0.
For more information, see the upstream Apache Kafka Notable Changes.
- Kafka KRaft [TECHNICAL PREVIEW]
- Apache Kafka Raft (KRaft) is a consensus protocol used for metadata management that
was developed as a replacement for Apache ZooKeeper. Using KRaft for managing Kafka
metadata instead of ZooKeeper offers various benefits including a simplified
architecture and a reduced operational footprint. Kafka KRaft in this release of Cloudera Runtime is in technical preview and does not support the following:
- Deployments with multiple log directories. This includes deployments that use JBOD for storage.
- Delegation token based authentication.
- Migrating an already running Kafka service from ZooKeeper to KRaft.
- Atlas Integration.
For a conceptual overview of KRaft, see Kafka KRaft. For more information on how to deploy a Streams Messaging Data Hub cluster that runs in KRaft mode, see Setting up your Streams Messaging cluster.
- SMT plugins for binary conversion
- Two Cloudera-developed Single Message Transforms (SMT) plugins are added: ConvertToBytes and ConvertFromBytes. You can use these plugins to convert binary data to or from the Kafka Connect internal data format.
- EOS for source connectors
- Exactly-once semantics (EOS) support is added for Kafka Connect source connectors. For more information, see Configuring EOS for source connectors.
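As a rough illustration, the sketch below lists the upstream Apache Kafka (KIP-618) property names that relate to exactly-once source support and builds a connector definition as JSON. The connector name, class, file path, and topic are hypothetical placeholders, and a Cloudera-managed cluster may expose these settings differently, so treat the linked documentation as authoritative.

```python
import json

# Worker-level setting from upstream Apache Kafka (KIP-618).
# How it is applied in a Cloudera-managed cluster is not shown here.
worker_config = {"exactly.once.source.support": "enabled"}

# Hypothetical source connector definition. The name, class, file, and topic
# are placeholders chosen for illustration only.
connector = {
    "name": "example-source",
    "config": {
        "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
        "file": "/tmp/example.txt",
        "topic": "example-topic",
        # Reject the connector if it cannot provide exactly-once semantics
        # with this configuration.
        "exactly.once.support": "required",
        # Start and commit one producer transaction per batch of records
        # that a task returns to Connect.
        "transaction.boundary": "poll",
    },
}

# JSON body that could be submitted to the Kafka Connect REST API.
payload = json.dumps(connector, indent=2)
print(payload)
```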
- Rolling restart checks provide high cluster health guarantees by default
- The default value of the Cluster Health Guarantee During Rolling Restart property is changed from "none" to "healthy partitions stay healthy". This property defines what type of checks are performed during a Rolling Restart on the restarted broker. Each setting guarantees a different level of cluster health during Rolling Restarts. With the "none" setting, no checks are performed. This means that in previous versions no guarantees were provided on cluster health by default.
The new default, "healthy partitions stay healthy", provides strong guarantees on cluster health. This setting ensures that no partitions go into an under-min-isr state when a broker is stopped. This is achieved by waiting before each broker is stopped so that all other brokers can catch up with all replicas that are in an at-min-isr state. Additionally, the setting ensures that the restarted broker is accepting requests on its service port before the next broker is restarted. This setting ignores partitions that are already in an under-min-isr state. For more information, see Configuring EOS for source connectors.
- LDAPS SSL configurations are inherited from the Kafka broker
- The SSL configurations of LDAP over SSL (LDAPS) are inherited from the Kafka broker. Previously, the JDK default was used. If the JDK default certificate store contains certificates that were used to set up the SSL connection to LDAP, they should be imported into the broker stores.
- Aliases for Kafka CLI tools
- Aliases are added for the kafka-storage.sh, kafka-cluster.sh, and kafka-features.sh command line tools. These tools can now be called globally with kafka-storage, kafka-cluster, and kafka-features.
Schema Registry
- KafkaAvroSerializer and KafkaAvroDeserializer improvements
- KafkaAvroSerializer and KafkaAvroDeserializer can now handle null values without Avro
- The KafkaAvroSerializer and KafkaAvroDeserializer now support a configuration property called null.passthrough.enabled, which is false by default. If enabled, null data is handled as null, and no schema is sent to Schema Registry. This behavior enables client applications to write tombstone messages into compacted topics. The KafkaAvroDeserializer also handles null values by returning null, regardless of the schema.
- Support deserialization when the topic and schema names don't match
- From now on, the KafkaAvroDeserializer uses the schema version's ID in the Avro byte stream to access the actual schema and disregards schema names.
- Logical types conversion for the KafkaAvroSerializer and KafkaAvroDeserializer
- The KafkaAvroSerializer and KafkaAvroDeserializer can now properly handle and convert Avro logical types at a record level. This means that if you have a record that has a field with a built-in Avro logical type (for example, a BigDecimal field with BYTES type and decimal logical type), you can now properly serialize the records. After deserialization, the returned GenericRecord contains a typed BigDecimal field instead of a ByteBuffer. Logical type conversion can be enabled using the logical.type.conversion.enabled property. This property is set to false by default for backward compatibility.
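To make the BYTES plus decimal logical type mapping concrete, here is a minimal Python sketch of the encoding defined by the Avro specification: the unscaled value of the decimal is stored as big-endian two's-complement bytes, and the scale comes from the schema. The helper names are hypothetical and Python's Decimal stands in for BigDecimal. This is not the serializer's code, only an illustration of what logical type conversion does in each direction.

```python
from decimal import Decimal


def decimal_to_avro_bytes(value: Decimal, scale: int) -> bytes:
    """Encode a decimal as Avro bytes carrying the decimal logical type."""
    scaled = value.scaleb(scale)  # for example, 12.34 with scale 2 becomes 1234
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more fractional digits than the schema scale")
    unscaled = int(scaled)
    # Reserve enough bytes for the magnitude plus the sign bit.
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def avro_bytes_to_decimal(data: bytes, scale: int) -> Decimal:
    """Decode Avro decimal bytes back into a typed decimal value."""
    unscaled = int.from_bytes(data, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


encoded = decimal_to_avro_bytes(Decimal("12.34"), scale=2)
decoded = avro_bytes_to_decimal(encoded, scale=2)
print(encoded, decoded)
```

Without logical type conversion, a consumer would see only the raw bytes from the first function and would have to apply the second step itself.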
- Principal mapping rules can be defined without quotes
- The SSL Client Authentication Mapping Rules (schema.registry.ssl.principal.mapping.rules) property now accepts rules that are defined without quotes. As a result, when adding multiple rules, you no longer need to enclose each rule in quotes.
- Remove modules section from registry.yaml
- In previous versions, the registry.yaml configuration file contained a modules section. This section was used to list pluggable modules that extended Schema Registry’s functionality. However, modules were never fully supported and were removed in a previous release. The modules section in registry.yaml was kept for backwards compatibility. Starting with this version, the modules section is removed by default from registry.yaml.
Streams Messaging Manager
- UI updates
- The style of SMM UI is updated. This update includes various changes to the colors,
fonts, and overall style of the UI. Additionally, the following functional changes and
improvements are made:
- Data Explorer
- The modal window that you use to view messages now includes a copy to clipboard button if the message you are viewing is long.
- A (Refresh) option is added next to the FROM OFFSET field. This option refreshes the partition offset range and fetches the latest messages.
- Connector Configuration and Connector Settings pages
- A new option, Add, is added to the Import a Connector config… modal. This option enables you to import connector configuration properties without overriding existing properties.
- Property keys can now be filtered based on their group and importance.
- A Reset Filters option is added. This option resets all search filters.
- Three new actions are added that modify the configuration as a whole. The options are Remove all, Reset, and Export. These actions are available in a new Actions drop-down.
- The Import Connector Configuration… option is moved to the Actions drop-down and is renamed to Import.
- The Deployment Status modal now correctly displays the status of the deployment process.
- An error message is added that notifies you if validation errors are found for properties that are currently filtered.
- If available, the display names of configuration property keys are displayed above the property key.
- Highly available Kafka Connect integration
- SMM uses the Kafka Connect service role’s REST URL to establish a connection with
Connect and serve Connect metrics. Previously, even if your Connect deployment was
highly available and had multiple service roles deployed, SMM could only be configured
with a single connection URL. From now on, multiple URLs can be configured. If the
Connect service role that SMM is connected to fails, SMM automatically connects to a
different instance that is available.
As a result of this change, the Kafka Connect Host and Kafka Connect Port properties are replaced by the Kafka Connect Rest HostPorts property. If Kafka Connect Rest HostPorts is left empty (default), SMM is automatically configured with the host, port, and protocol of the Connect service role instances belonging to the Kafka service selected with the Kafka Service SMM property.
If you previously configured Kafka Connect Host and Kafka Connect Port, the values set in the properties are automatically migrated to Kafka Connect Rest HostPorts when you upgrade.
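The failover behavior described above can be pictured with a small sketch: given several Connect REST URLs, use the first one that responds. The function name, the injected reachability probe, the host names, and the port are all hypothetical. The exact value format of Kafka Connect Rest HostPorts and SMM's actual selection logic are not stated here, so this only models the idea.

```python
from typing import Callable, Iterable, Optional


def pick_connect_url(
    urls: Iterable[str], is_reachable: Callable[[str], bool]
) -> Optional[str]:
    """Return the first URL whose Connect REST endpoint responds, else None."""
    for url in urls:
        if is_reachable(url):
            return url
    return None


# Hypothetical Connect service role instances.
host_ports = [
    "http://connect-1.example.com:28083",
    "http://connect-2.example.com:28083",
]

# Simulate the first instance being down; a real probe would issue an HTTP request.
down = {"http://connect-1.example.com:28083"}
active = pick_connect_url(host_ports, lambda url: url not in down)
print(active)
```

Passing the probe in as a parameter keeps the selection logic separate from any network code, which makes the sketch easy to exercise without a running cluster.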
- The SMM API now hides email notifier SMTP passwords in its response
- Previously, the /notifiers endpoint returned the full configuration of the notifier. In the case of email notifiers, the configuration included the password of the SMTP server. From now on, API responses do not include the password. As a result of this change, the PASSWORD field of existing email notifiers is left blank when you edit them. If you decide to edit the notifier, you must reenter the password.
Streams Replication Manager
- Improved SRM logging
- SRM’s logging capabilities are improved. From now on:
- Kafka clients created by SRM’s internal connectors reference the replication flow they are a part of (KAFKA-14838 backport).
- SRM now includes references to the replication flow in the log context of its internal connectors.
These changes enable differentiation between the logs associated with each replication flow.
Cruise Control
- Rebasing Cruise Control to 2.5.116
- Cruise Control in Cloudera Runtime is rebased to the 2.5.116 version. For more information about the fixes and features in Cruise Control 2.5.116, see the Cruise Control Rebase Summary.
- New endpoint for Cruise Control
- The GET /kafkacruisecontrol/permissions endpoint is added to Cruise Control. It lists the permission level of the current user. If authentication is not configured for a user, the GET call returns the message "Unable to retrieve privilege information for an unsecure connection".
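A minimal sketch of how a client might address the new endpoint with the Python standard library follows. The base URL, including host name and port, is a hypothetical placeholder, and authentication is deliberately left out because it depends on how the cluster is secured. The request is only built here, not sent.

```python
from urllib.request import Request


def build_permissions_request(base_url: str) -> Request:
    """Build a GET request for the Cruise Control permissions endpoint."""
    url = base_url.rstrip("/") + "/kafkacruisecontrol/permissions"
    return Request(url, method="GET")


# Hypothetical Cruise Control address; replace with the address of your own service.
req = build_permissions_request("https://cruise-control.example.com:8899")
print(req.get_method(), req.full_url)
```

To send the request, pass it to urllib.request.urlopen together with whatever credentials or TLS context your deployment requires.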