What's New in Cloudera Data Flow for Data Hub

This section lists major features and updates for Cloudera Data Flow for Data Hub.

May 1, 2020

This release introduces the following cluster definitions for Flow Management in CDP Public Cloud:

  • Flow Management Light Duty for AWS
  • Flow Management Light Duty for Azure
  • Flow Management Heavy Duty for AWS
  • Flow Management Heavy Duty for Azure

These cluster definitions support installing Flow Management clusters running Apache NiFi and Apache NiFi Registry.

Flow Management delivers high-scale data ingestion, transformation, and management to enterprises, moving data between any source and any destination environment. It addresses key enterprise use cases such as data movement, continuous data ingestion, log data ingestion, and acquisition of all types of streaming data, including social, mobile, clickstream, and IoT data.

The Flow Management template includes a no-code data ingestion and management solution powered by Apache NiFi. With NiFi’s intuitive graphical interface and 300+ processors, Flow Management enables easy data ingestion and movement between CDP services as well as third-party cloud services. NiFi Registry is set up automatically and provides a central place to manage versioned data flows.

December 18, 2019

This release introduces a technical preview of Streams Messaging cluster definitions for installation using Data Hub. The Streams Messaging templates include Kafka, Schema Registry, Streams Messaging Manager, and ZooKeeper. There are two template options, Heavy Duty and Light Duty, depending on your operational objectives, and each is available for AWS and Azure:

  • Streams Messaging Heavy Duty for AWS
  • Streams Messaging Heavy Duty for Azure
  • Streams Messaging Light Duty for AWS
  • Streams Messaging Light Duty for Azure

Streams Messaging provides advanced messaging and real-time processing on streaming data using Apache Kafka, centralized schema management using Schema Registry, as well as management and monitoring capabilities powered by Streams Messaging Manager.

These templates set up fault-tolerant standalone deployments of Apache Kafka and supporting Cloudera components (Schema Registry and Streams Messaging Manager). You can use these deployments for Kafka workloads in the cloud or as a disaster recovery instance for on-premises Kafka clusters.