What's New in Cloudera Data Flow for Data Hub

This section lists major features and updates for Cloudera Data Flow for Data Hub.

June 30, 2020

General availability release for Streams Messaging clusters

This release marks the general availability (GA) of Streams Messaging cluster definitions and templates for installation using Data Hub. The Streams Messaging templates include Kafka, Schema Registry, Streams Messaging Manager, and ZooKeeper. There are four template options, depending on your cloud provider and operational objectives:

  • Streams Messaging Heavy Duty for AWS
  • Streams Messaging Heavy Duty for Azure
  • Streams Messaging Light Duty for AWS
  • Streams Messaging Light Duty for Azure

Streams Messaging provides advanced messaging and real-time processing on streaming data using Apache Kafka, centralized schema management using Schema Registry, as well as management and monitoring capabilities powered by Streams Messaging Manager.

These templates set up fault-tolerant standalone deployments of Apache Kafka and supporting Cloudera components (Schema Registry and Streams Messaging Manager), which can be used for Kafka workloads in the cloud or as a disaster recovery instance for on-premises Kafka clusters.

June 16, 2020

Rebase on Kafka 2.4.1

Kafka shipped with this version of Cloudera Runtime is based on Apache Kafka 2.4.1. For more information, see Apache Kafka Notable Changes for versions 2.4.0 and 2.4.1, as well as the Apache Kafka Release Notes for versions 2.4.0 and 2.4.1 in the upstream documentation.

Schema Registry High Availability

The Streams Messaging Heavy Duty cluster definitions are configured to include a highly available Schema Registry. Schema Registry is now available on two cluster nodes with an external database. For the updated cluster layout, see Planning your Streams Messaging Deployment.

Schema Registry Ranger integration

Schema Registry is now automatically integrated with Ranger. When you create a Streams Messaging cluster, a new corresponding entry appears in the environment's Ranger instance, where you can define authorization policies on schemas. You can define policies on entire schemas, a specific branch, or even a specific version of a schema. For more information, see Securing Schema Registry.

Updated Streams Messaging: Heavy Duty cluster layout

The cluster nodes have been updated with clearer node definitions and roles. As a result, a Streams Messaging: Heavy Duty cluster now consumes one more instance by default. For updated cluster layout information, see Planning your Streams Messaging Deployment.

Additionally, the Heavy Duty template now provisions an external HA database to store service information in a fault-tolerant way. Specifically for Streams Messaging clusters, this covers the Schema Registry database and SMM database.

Updated Streams Messaging: Light Duty cluster layout

The Light Duty template now provisions an external non-HA database to store service information. Specifically for Streams Messaging clusters, this covers the Schema Registry database and SMM database.