Cloudera Streaming Analytics Cloudera Data Hub cluster definitions

Six cluster definitions are available for deploying Cloudera Streaming Analytics in Cloudera on cloud. You can choose between Light Duty and Heavy Duty options, and then select the cluster definition that matches your cloud provider.

You can choose from the following template options based on your operational objectives:
  • Cloudera Streaming Analytics Light Duty for AWS
  • Cloudera Streaming Analytics Light Duty for Azure
  • Cloudera Streaming Analytics Light Duty for GCP
  • Cloudera Streaming Analytics Heavy Duty for AWS
  • Cloudera Streaming Analytics Heavy Duty for Azure
  • Cloudera Streaming Analytics Heavy Duty for GCP

Cloudera Streaming Analytics offers real-time stream processing and stream analytics with low latency and high scalability, powered by Apache Flink.

Cloudera Streaming Analytics templates include Apache Flink, which works out of the box in both stateless and heavy-state environments. Besides Flink, the templates include its supporting services: YARN, ZooKeeper, and HDFS. The Heavy Duty templates come preconfigured with RocksDB as the state backend, while Light Duty clusters use the default heap state backend. You can build your streaming application with Kafka, Kudu, or HBase as DataStream connectors.
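The relationship between template tier and state backend can be sketched as a small lookup. This is only an illustration: the values "hashmap" and "rocksdb" are the names open-source Apache Flink accepts for its `state.backend` setting, and whether the Cloudera templates set that exact key or those exact values is an assumption. The helper name `state_backend_for` is hypothetical.

```python
# Sketch: map a template tier to the state backend described above.
# Assumption: "hashmap" (heap-based) and "rocksdb" are the open-source Flink
# names for the `state.backend` setting; the templates may configure this
# differently.
STATE_BACKEND_BY_TIER = {
    "Light Duty": "hashmap",  # state kept on the JVM heap
    "Heavy Duty": "rocksdb",  # state kept in RocksDB on local disk
}


def state_backend_for(template_name: str) -> str:
    """Return the assumed state backend for a template name such as
    'Cloudera Streaming Analytics Heavy Duty for AWS'."""
    for tier, backend in STATE_BACKEND_BY_TIER.items():
        if tier in template_name:
            return backend
    raise ValueError(f"unrecognized template: {template_name!r}")
```

As a rule of thumb, heap-based state suits jobs whose state fits comfortably in memory, while RocksDB suits jobs with state too large for the heap.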

You can also use SQL to query real-time data with Cloudera SQL Stream Builder, which is included in the Cloudera Streaming Analytics templates. With the Cloudera SQL Stream Builder service in Cloudera on cloud, you can declare expressions that filter, aggregate, route, and otherwise mutate streams of data. Cloudera SQL Stream Builder is a job management interface that you can use to compose and run SQL on streams, and to create durable data APIs for the results.
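To make "filter and aggregate" concrete, the following plain-Python sketch mimics what a streaming SQL query with a WHERE clause and a tumbling-window GROUP BY would compute. It does not use Flink or Cloudera SQL Stream Builder; the event fields, the sample data, and the function name `count_hot_readings` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical events: (epoch_seconds, sensor_id, temperature)
EVENTS = [
    (0, "a", 31.0),
    (10, "a", 25.0),
    (59, "b", 40.0),
    (60, "a", 35.0),
    (61, "a", 36.0),
]


def count_hot_readings(events, threshold=30.0, window_seconds=60):
    """Keep readings above `threshold` (the filter), then count them per
    sensor and per tumbling window (the aggregation)."""
    counts = defaultdict(int)
    for ts, sensor_id, temperature in events:
        if temperature > threshold:
            # Align the timestamp to the start of its fixed-size window.
            window_start = ts - ts % window_seconds
            counts[(sensor_id, window_start)] += 1
    return dict(counts)
```

In a real streaming job the input is unbounded and results are emitted as windows close, so this batch-style loop only illustrates the logic that the SQL declares, not how it is executed.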

In Cloudera on cloud, you can use the predefined cluster definitions within your environment to connect Flink and Cloudera SQL Stream Builder with the supported service connectors.

Table 1. Connector support with Cluster Definitions
Cluster Definition     Cloudera SQL Stream Builder Connector [1]   Flink Connector
---------------------  ------------------------------------------  ----------------------
Streams Messaging      Kafka, Schema Registry                      Kafka, Schema Registry
Real-Time Data Mart    Kudu                                        Kudu
Data Engineering       Hive [2]                                    Hive
Operational Database   -                                           HBase

[1] The Cloudera SQL Stream Builder connectors are implemented as the Flink SQL connectors.
[2] You can also use the available Hive service from the Data Lake of the environment.
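
The support matrix in Table 1 can also be expressed as a lookup, which may help when you script environment checks. This is a minimal sketch: the dictionary mirrors the table, while the keys "ssb" and "flink" and the helper name `clusters_providing` are hypothetical and not part of any Cloudera API.

```python
# Connector support per cluster definition, transcribed from Table 1.
# "ssb" = Cloudera SQL Stream Builder connectors, "flink" = Flink connectors.
CONNECTORS = {
    "Streams Messaging": {
        "ssb": ["Kafka", "Schema Registry"],
        "flink": ["Kafka", "Schema Registry"],
    },
    "Real-Time Data Mart": {"ssb": ["Kudu"], "flink": ["Kudu"]},
    "Data Engineering": {"ssb": ["Hive"], "flink": ["Hive"]},
    "Operational Database": {"ssb": [], "flink": ["HBase"]},
}


def clusters_providing(connector, engine):
    """List the cluster definitions that offer `connector` for `engine`
    ("ssb" or "flink")."""
    return [
        name
        for name, support in CONNECTORS.items()
        if connector in support[engine]
    ]
```

For example, `clusters_providing("HBase", "ssb")` returns an empty list, which reflects that Table 1 lists HBase only as a Flink connector.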