Cloudera Streams Messaging - Kubernetes Operator

Cloudera Streams Messaging - Kubernetes Operator brings enterprise-grade Kafka deployments to your existing Kubernetes infrastructure.

Key features and benefits

A Kafka deployment with Cloudera Streams Messaging - Kubernetes Operator provides the following key features and benefits:

  • Flexible, agile, and rapid deployment as well as scaling for variable workloads
  • Standardization of deployments on existing Kubernetes infrastructure
  • Operational efficiency with simple upgrades and swift creation of new clusters
  • Ability to deploy Kafka and related components on existing, shared Kubernetes infrastructure; no need for dedicated infrastructure
  • Lightweight dependencies and system requirements for Kafka-centric deployments

Components

Cloudera Streams Messaging - Kubernetes Operator ships multiple components, including Apache Kafka, Apache ZooKeeper, Cruise Control, and Strimzi, among others.

Strimzi

Strimzi is an open-source project that provides a way to run an Apache Kafka cluster on Kubernetes. Strimzi makes it possible to deploy and manage Kafka workloads in a Kubernetes environment using Kubernetes-native tooling and processes.

Strimzi itself consists of multiple components, including various operator applications, Custom Resource Definitions (CRDs), and Docker (container) images.

Operator applications are purpose-built Kubernetes applications that extend Kubernetes. They give you an easy way to deploy, manage, and configure Kafka and related components.

The CRDs created by Strimzi define the APIs used to interface with Kafka-related custom resources on Kubernetes, for example, Kafka, KafkaNodePool, and KafkaTopic. You create a custom resource as an instance of one of these APIs by providing the set of configurations to apply to the resource. Both CRDs and custom resources are defined in YAML files.
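
As an illustration of what a custom resource looks like, the following minimal Python sketch builds a KafkaTopic resource as a dictionary and prints it as JSON, which is also valid YAML. The API version shown (kafka.strimzi.io/v1beta2) and the strimzi.io/cluster label follow upstream Strimzi conventions and are assumptions here; the names my-topic and my-cluster are hypothetical. Check the Release Notes for what your version supports.

```python
import json


def kafka_topic_manifest(name, cluster, partitions, replicas):
    """Build a KafkaTopic custom resource as a plain dictionary."""
    return {
        # Assumed from upstream Strimzi; the shipped version may differ.
        "apiVersion": "kafka.strimzi.io/v1beta2",
        "kind": "KafkaTopic",
        "metadata": {
            "name": name,
            # Upstream Strimzi uses this label to tie the topic to a Kafka cluster.
            "labels": {"strimzi.io/cluster": cluster},
        },
        # The configuration to be applied to the resource.
        "spec": {"partitions": partitions, "replicas": replicas},
    }


# Hypothetical topic and cluster names, for illustration only.
topic = kafka_topic_manifest("my-topic", "my-cluster", partitions=3, replicas=3)
print(json.dumps(topic, indent=2))
```

The printed document could be saved to a file and submitted to Kubernetes in the same way as a hand-written YAML file.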

Apache Kafka

Apache Kafka is an open-source, high-performance, highly available, and redundant streams messaging platform. It supports millions of messages per second with low latency and high throughput, scaling elastically and transparently without downtime. Kafka addresses a wide range of streaming data initiatives, enabling enterprises to keep up with customer demand, provide better services, and proactively manage risk.

Apache ZooKeeper

Apache ZooKeeper is an open-source centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services. Kafka uses ZooKeeper for broker coordination and to store broker, topic, and partition metadata.

Note that Cloudera will replace ZooKeeper with KRaft in a future release.

Cruise Control

Cruise Control acts as a load-balancing component in large Kafka installations. It automatically balances Kafka partitions across the brokers of a Kafka cluster based on user-specified parameters (goals) and workload data.
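
To make the notion of goals concrete, the following minimal Python sketch builds a rebalance request in the form used by upstream Strimzi, a KafkaRebalance custom resource, and prints it as JSON (which is also valid YAML). The resource kind, API version, and label are taken from upstream Strimzi and are assumptions here, as is the particular selection of Cruise Control goals; the names my-rebalance and my-cluster are hypothetical.

```python
import json


def kafka_rebalance_manifest(name, cluster, goals):
    """Build a KafkaRebalance custom resource as a plain dictionary."""
    return {
        # Assumed from upstream Strimzi; the shipped version may differ.
        "apiVersion": "kafka.strimzi.io/v1beta2",
        "kind": "KafkaRebalance",
        "metadata": {
            "name": name,
            "labels": {"strimzi.io/cluster": cluster},
        },
        # User-specified goals that Cruise Control tries to satisfy.
        "spec": {"goals": list(goals)},
    }


# Illustrative goal selection; hypothetical resource and cluster names.
rebalance = kafka_rebalance_manifest(
    "my-rebalance",
    "my-cluster",
    ["DiskCapacityGoal", "ReplicaDistributionGoal"],
)
print(json.dumps(rebalance, indent=2))
```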

Component and feature support

Components shipped in Cloudera Streams Messaging - Kubernetes Operator are based on open-source projects and might contain additional changes or fixes to ensure that they work in Cloudera-supported environments.

Additionally, not all Kafka and Strimzi features are supported. For more information, see the Release Notes and Component versions.

Getting started

To get started, install Cloudera Streams Messaging - Kubernetes Operator.

After installation, you can deploy Kafka and Kafka Connect clusters, or set up replication flows between existing clusters.

You can learn more about the various components as well as Kubernetes Operators by visiting the following upstream resources: