Deployment and Configuration

Deploying Kafka

You deploy a Kafka cluster by creating a Kafka resource and one or more KafkaNodePool resources in the Kubernetes cluster. The Kafka cluster can use either KRaft (recommended) or ZooKeeper (deprecated) for metadata management. After cluster deployment you can validate your cluster with the console producer and consumer tools shipped with Kafka.

The Kafka resource describes a Kafka cluster instance. This resource specifies the following about Kafka:
  • Kafka configuration that is common for the whole Kafka cluster (Kafka version, cluster name, and so on)
  • Cruise Control configuration
  • Entity Operator configuration
  • ZooKeeper configuration (if ZooKeeper is used instead of KRaft)

A KafkaNodePool resource refers to a distinct group of Kafka nodes within a Kafka cluster. Using node pools enables you to specify different configurations for each node within the same Kafka cluster. Configuration options not specified in the node pool are inherited from the Kafka configuration.

You can deploy a Kafka cluster with one or more node pools. The number of node pools you create depends on how many groups of Kafka brokers with differing configurations you want to have. The node pool configuration includes mandatory and optional settings. Configuration for replicas, roles, and storage is mandatory.
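
For example, a minimal KafkaNodePool manifest that sets only the mandatory properties might look like the following sketch. The pool name, cluster name, and storage size are illustrative placeholders; the strimzi.io/cluster label links the pool to its Kafka resource.

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: pool-a
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      replicas: 3
      roles:
        - broker
      storage:
        type: persistent-claim
        size: 10Gi
        deleteClaim: false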

You can deploy Kafka in either KRaft or ZooKeeper mode. Cloudera recommends that you deploy clusters in KRaft mode, because ZooKeeper-based clusters are deprecated and ZooKeeper will be removed in a future release.

KRaft offers enhanced reliability, scalability, and throughput over ZooKeeper. Metadata operations are more efficient because metadata management is integrated directly into Kafka. Additionally, with KRaft you no longer need to maintain ZooKeeper, which reduces operational overhead.

When you deploy a Kafka cluster in KRaft mode, you assign roles to each node in the Kafka cluster. Roles are assigned in the KafkaNodePool resource. There are two roles, broker and controller.

  • Broker ‐ These nodes manage Kafka records stored in topic partitions. Nodes with the broker role are your Kafka brokers.

  • Controller ‐ These nodes manage cluster metadata and the state of the cluster using a Raft-based consensus protocol. Controller nodes are the KRaft equivalent of ZooKeeper nodes.

A single Kafka node can have a single role or both roles. If you assign both roles to the node, it performs both broker and controller tasks. Depending on role assignments, your cluster will be running in one of the following modes.

  • KRaft mode ‐ In this mode, each Kafka node is either a broker or controller. Recommended for production clusters.

  • KRaft combined mode ‐ In this mode, some or all nodes in the cluster have both controller and broker roles assigned to them.

Combined mode is not recommended or supported for production environments. Use combined mode in development environments. Cloudera recommends that you always fully separate controller and broker nodes to avoid resource contention between roles.

You deploy a Kafka cluster in KRaft mode by deploying a Kafka resource and at least two KafkaNodePool resources: one KafkaNodePool describes your brokers, the other describes your KRaft controllers. The Kafka resource must include the strimzi.io/kraft="enabled" annotation. KRaft mode is the recommended mode for deployment.

  • Ensure that the Strimzi Cluster Operator is installed and running. See Installation.

  • Ensure that a namespace is available where you can deploy your cluster. If not, create one.

    kubectl create namespace [***NAMESPACE***]
  • Ensure that the Secret containing credentials for the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted is available in the namespace where you plan on deploying your cluster. If the secret is not available, create it.
    kubectl create secret docker-registry [***SECRET NAME***] \
      --docker-server [***REGISTRY***] \
      --docker-username [***USERNAME***] \
      --docker-password [***PASSWORD***] \
      --namespace [***NAMESPACE***]
    • [***SECRET NAME***] must be the same as the name of the Secret containing registry credentials that you created during Strimzi installation.

    • Replace [***REGISTRY***] with the server location of the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted. If your Kubernetes cluster has internet access, use container.repository.cloudera.com. Otherwise, enter the server location of your self-hosted registry.

    • Replace [***USERNAME***] and [***PASSWORD***] with credentials that provide access to the registry. If you are using container.repository.cloudera.com, use your Cloudera credentials. Otherwise, enter credentials providing access to your self-hosted registry.

  • Scaling node pools that include KRaft controllers (controller roles) is not possible.

  • The following steps contain Kafka and KafkaNodePool resource examples. You can find additional examples on the Cloudera Archive.

  1. Create a YAML configuration containing your Kafka resource manifest.
    apiVersion: kafka.strimzi.io/v1beta2
    kind: Kafka
    metadata:
      name: my-cluster
      annotations:
        strimzi.io/node-pools: enabled
        strimzi.io/kraft: enabled
    spec:
      kafka:
        version: 3.9.0.1.3
        listeners:
          - name: plain
            port: 9092
            type: internal
            tls: false
          - name: tls
            port: 9093
            type: internal
            tls: true
        config:
          offsets.topic.replication.factor: 3
          transaction.state.log.replication.factor: 3
          transaction.state.log.min.isr: 2
          default.replication.factor: 3
          min.insync.replicas: 2
      entityOperator:
        topicOperator: {}
        userOperator: {}
    • strimzi.io/node-pools: enabled - Enables Kafka node pools. KRaft mode is only supported with node pools.

    • strimzi.io/kraft: enabled - Enables KRaft mode for the cluster.

    • spec.kafka.version - Specifies the Kafka version to use. You must specify a Cloudera Kafka version supported by Cloudera Streams Messaging - Kubernetes Operator, for example, 3.9.0.1.3. Do not specify Apache Kafka versions; they are not supported. You can find a list of supported Kafka versions in the Release Notes.

  2. Create a YAML configuration containing your KafkaNodePool resource manifest for brokers.
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: broker
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      replicas: 3
      roles:
        - broker
      storage:
        type: jbod
        volumes:
          - id: 0
            type: persistent-claim
            size: 10Gi
            kraftMetadata: shared
            deleteClaim: false
    • spec.roles - Specifies the roles of the nodes in this pool. The value broker means that the replicas in this node pool are all brokers.

    • spec.storage.volumes.kraftMetadata - Specifies whether the volume is also used to store KRaft metadata. Use this property to select which volume stores the metadata when a node has multiple volumes. In this example, volume 0 is used. This property is optional. See the sketch below for a multi-volume example.
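
    If a node pool defines multiple JBOD volumes, only one of them can store the KRaft metadata. A hedged sketch of such a storage section, keeping the metadata on volume 0 (the second volume and its size are illustrative):

    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 10Gi
          kraftMetadata: shared
          deleteClaim: false
        - id: 1
          type: persistent-claim
          size: 100Gi
          deleteClaim: false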

  3. Create a YAML configuration containing your KafkaNodePool resource manifest for KRaft controllers.
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: controller
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      replicas: 3
      roles:
        - controller
      storage:
        type: jbod
        volumes:
          - id: 0
            type: persistent-claim
            size: 10Gi
            kraftMetadata: shared
            deleteClaim: false
    • spec.roles - Specifies the roles of the nodes in this pool. The value controller means that the replicas in this node pool are all KRaft controllers.

    • spec.storage.volumes.kraftMetadata - Specifies whether the volume is also used to store KRaft metadata. Use this property to select which volume stores the metadata when a node has multiple volumes. In this example, volume 0 is used. This property is optional.

  4. Deploy the cluster.
    kubectl apply \
      --filename [***KAFKA YAML***],[***BROKER NODE POOL YAML***],[***CONTROLLER NODE POOL YAML***] \
      --namespace [***NAMESPACE***]
  5. Verify that pods are created.
    kubectl get pods --namespace [***NAMESPACE***]
    If cluster deployment is successful, you should see an output similar to the following.
    NAME                                          READY   STATUS    RESTARTS       
    my-cluster-broker-0                           1/1     Running   0              
    my-cluster-broker-1                           1/1     Running   0              
    my-cluster-broker-2                           1/1     Running   0              
    my-cluster-controller-3                       1/1     Running   0              
    my-cluster-controller-4                       1/1     Running   0              
    my-cluster-controller-5                       1/1     Running   0              
    my-cluster-entity-operator-858b7649df-v8jth   2/2     Running   0              
    strimzi-cluster-operator-589f9fd659-4bqnp     1/1     Running   0              

    The READY column shows the number of ready containers out of the total number of containers inside the pod, while the STATUS column shows whether the pod is running.

    In this example, there are a total of six nodes (each node is a pod). Three are dedicated brokers, and the other three are dedicated controllers.
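
    In addition to checking the pods, you can wait for the Kafka resource itself to report readiness. A hedged check, assuming the cluster name my-cluster used in these examples (the Kafka resource reports a Ready condition in its status):

    kubectl wait kafka/my-cluster \
      --for=condition=Ready \
      --timeout=300s \
      --namespace [***NAMESPACE***]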

Validate your cluster by completing Validating a Kafka cluster.

You deploy a Kafka cluster in KRaft combined mode by deploying a Kafka resource and one or more KafkaNodePool resources. Typically, you create two node pools: one describing nodes that have both roles, and one describing nodes that have the broker role only. Alternatively, you can create clusters where all nodes have both roles; in this case, a single node pool is sufficient. The Kafka resource must include the strimzi.io/kraft="enabled" annotation.

  • Ensure that the Strimzi Cluster Operator is installed and running. See Installation.

  • Ensure that a namespace is available where you can deploy your cluster. If not, create one.

    kubectl create namespace [***NAMESPACE***]
  • Ensure that the Secret containing credentials for the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted is available in the namespace where you plan on deploying your cluster. If the secret is not available, create it.
    kubectl create secret docker-registry [***SECRET NAME***] \
      --docker-server [***REGISTRY***] \
      --docker-username [***USERNAME***] \
      --docker-password [***PASSWORD***] \
      --namespace [***NAMESPACE***]
    • [***SECRET NAME***] must be the same as the name of the Secret containing registry credentials that you created during Strimzi installation.

    • Replace [***REGISTRY***] with the server location of the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted. If your Kubernetes cluster has internet access, use container.repository.cloudera.com. Otherwise, enter the server location of your self-hosted registry.

    • Replace [***USERNAME***] and [***PASSWORD***] with credentials that provide access to the registry. If you are using container.repository.cloudera.com, use your Cloudera credentials. Otherwise, enter credentials providing access to your self-hosted registry.

  • Scaling node pools that include KRaft controllers (controller roles) is not possible.

    Because of this limitation, you can only scale clusters running in combined mode if the cluster includes a node pool that has broker nodes only. The examples in these steps set up a broker-only node pool. A scaling sketch is shown after this list.

  • Ranger authorization does not work with combined mode.

  • The following steps contain Kafka and KafkaNodePool resource examples. You can find additional examples on the Cloudera Archive.
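
  A hedged sketch of scaling the broker-only node pool after deployment, assuming the pool name broker-only used in these steps (KafkaNodePool resources expose the Kubernetes scale subresource):

    kubectl scale kafkanodepool broker-only \
      --replicas=4 \
      --namespace [***NAMESPACE***]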

  1. Create a YAML configuration containing your Kafka resource manifest.
    apiVersion: kafka.strimzi.io/v1beta2
    kind: Kafka
    metadata:
      name: my-cluster
      annotations:
        strimzi.io/node-pools: enabled
        strimzi.io/kraft: enabled
    spec:
      kafka:
        version: 3.9.0.1.3
        listeners:
          - name: plain
            port: 9092
            type: internal
            tls: false
          - name: tls
            port: 9093
            type: internal
            tls: true
        config:
          offsets.topic.replication.factor: 3
          transaction.state.log.replication.factor: 3
          transaction.state.log.min.isr: 2
          default.replication.factor: 3
          min.insync.replicas: 2
      entityOperator:
        topicOperator: {}
        userOperator: {}
    
    • strimzi.io/node-pools: enabled - Enables Kafka node pools. KRaft mode is only supported with node pools.

    • strimzi.io/kraft: enabled - Enables KRaft mode for the cluster.

    • spec.kafka.version - Specifies the Kafka version to use. You must specify a Cloudera Kafka version supported by Cloudera Streams Messaging - Kubernetes Operator, for example, 3.9.0.1.3. Do not specify Apache Kafka versions; they are not supported. You can find a list of supported Kafka versions in the Release Notes.

  2. Create a YAML configuration containing your KafkaNodePool resource manifests.
    The configuration and number of KafkaNodePool resources you create depend on the deployment architecture that you want.

    The following example creates two node pools. One node pool specifies both the broker and controller roles; these nodes run in combined mode. A second node pool includes broker-only nodes.

    The second node pool is added because node pools that include controller nodes cannot be scaled. Creating a separate node pool for brokers when you first deploy the cluster makes it easier to scale the cluster in the future.

    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: combined
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      replicas: 3
      roles:
        - controller
        - broker
      storage:
        type: jbod
        volumes:
          - id: 0
            type: persistent-claim
            size: 10Gi
            kraftMetadata: shared
            deleteClaim: false
    ---
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: broker-only
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      replicas: 3
      roles:
        - broker
      storage:
        type: jbod
        volumes:
          - id: 0
            type: persistent-claim
            size: 10Gi
            kraftMetadata: shared
            deleteClaim: false
    
    • spec.roles - Specifies the roles of the nodes in the pool. The combined node pool has both the controller and broker roles specified. Therefore, the three Kafka nodes described in the combined node pool operate in combined mode. On the other hand, the broker-only node pool has broker specified as the role. The three Kafka nodes described by the broker-only pool operate as brokers.

    • spec.storage.volumes.kraftMetadata - Specifies whether the volume is also used to store KRaft metadata. Use this property to select which volume stores the metadata when a node has multiple volumes. In this example, volume 0 is used. This property is optional.

  3. Deploy the cluster.
    kubectl apply \
      --filename [***KAFKA YAML***],[***NODE POOL YAML***] \
      --namespace [***NAMESPACE***]
  4. Verify that pods are created.
    kubectl get pods --namespace [***NAMESPACE***]
    If cluster deployment is successful, you should see an output similar to the following.
    NAME                                          READY   STATUS    RESTARTS       
    my-cluster-broker-only-0                      1/1     Running   0              
    my-cluster-broker-only-1                      1/1     Running   0              
    my-cluster-broker-only-2                      1/1     Running   0              
    my-cluster-combined-3                         1/1     Running   0              
    my-cluster-combined-4                         1/1     Running   0              
    my-cluster-combined-5                         1/1     Running   0              
    my-cluster-entity-operator-74c95d6667-rstkf   2/2     Running   0              
    strimzi-cluster-operator-589f9fd659-4bqnp     1/1     Running   0              

    The READY column shows the number of ready containers out of the total number of containers inside the pod, while the STATUS column shows whether the pod is running.

    In this example, there are a total of six nodes (each node is a pod). Nodes 0, 1, and 2 are brokers, while nodes 3, 4, and 5 are both brokers and controllers. This means the cluster has a total of six brokers and three controllers.
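
    You can also list the node pools to confirm the roles and replica counts applied to each pool. A hedged check (the exact output columns depend on the Strimzi version):

    kubectl get kafkanodepool --namespace [***NAMESPACE***]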

Validate your cluster by completing Validating a Kafka cluster.

You deploy a Kafka cluster in ZooKeeper mode by deploying a Kafka resource and at least one KafkaNodePool resource. The Kafka resource must include ZooKeeper configuration.

  • Ensure that the Strimzi Cluster Operator is installed and running. See Installation.

  • Ensure that a namespace is available where you can deploy your cluster. If not, create one.

    kubectl create namespace [***NAMESPACE***]
  • Ensure that the Secret containing credentials for the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted is available in the namespace where you plan on deploying your cluster. If the secret is not available, create it.
    kubectl create secret docker-registry [***SECRET NAME***] \
      --docker-server [***REGISTRY***] \
      --docker-username [***USERNAME***] \
      --docker-password [***PASSWORD***] \
      --namespace [***NAMESPACE***]
    • [***SECRET NAME***] must be the same as the name of the Secret containing registry credentials that you created during Strimzi installation.

    • Replace [***REGISTRY***] with the server location of the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted. If your Kubernetes cluster has internet access, use container.repository.cloudera.com. Otherwise, enter the server location of your self-hosted registry.

    • Replace [***USERNAME***] and [***PASSWORD***] with credentials that provide access to the registry. If you are using container.repository.cloudera.com, use your Cloudera credentials. Otherwise, enter credentials providing access to your self-hosted registry.

  • The following steps contain Kafka and KafkaNodePool resource examples. You can find additional examples on the Cloudera Archive.

  1. Create a YAML configuration containing both your Kafka and KafkaNodePool resource manifests.
    The following examples deploy a simple Kafka cluster with three replicas in a single node pool.
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: first-pool
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      replicas: 3
      roles:
        - broker
      storage:
        type: jbod
        volumes:
          - id: 0
            type: persistent-claim
            size: 100Gi
            deleteClaim: false
    ---
    apiVersion: kafka.strimzi.io/v1beta2
    kind: Kafka
    metadata:
      name: my-cluster
      annotations:
        strimzi.io/node-pools: enabled
    spec:
      kafka:
        version: 3.9.0.1.3
        listeners:
          - name: plain
            port: 9092
            type: internal
            tls: false
          - name: tls
            port: 9093
            type: internal
            tls: true
        config:
          offsets.topic.replication.factor: 3
          transaction.state.log.replication.factor: 3
          transaction.state.log.min.isr: 2
          default.replication.factor: 3
          min.insync.replicas: 2
      zookeeper:
        replicas: 3
        storage:
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
      cruiseControl: {}
      entityOperator:
        topicOperator: {}
        userOperator: {}
    
    • The spec.kafka.version property in the Kafka resource must specify a Cloudera Kafka version supported by Cloudera Streams Messaging - Kubernetes Operator, for example, 3.9.0.1.3. Do not specify Apache Kafka versions; they are not supported. You can find a list of supported Kafka versions in the Release Notes.

    • You can find additional information about the properties configured in this example in the Strimzi and Apache Kafka documentation.

  2. Deploy the cluster.
    kubectl apply --filename [***YAML CONFIG***] --namespace [***NAMESPACE***]
  3. Verify that pods are created.
    kubectl get pods --namespace [***NAMESPACE***]
    If cluster deployment is successful, you should see an output similar to the following.
    NAME                                          READY   STATUS    RESTARTS 
    my-cluster-entity-operator-79846c5cbd-jqn9k   2/2     Running   0 
    my-cluster-cruise-control-8475c5gdw0-juqi7h   1/1     Running   0 
    my-cluster-first-pool-0                       1/1     Running   0
    my-cluster-first-pool-1                       1/1     Running   0 
    my-cluster-first-pool-2                       1/1     Running   0
    my-cluster-zookeeper-0                        1/1     Running   0  
    my-cluster-zookeeper-1                        1/1     Running   0 
    my-cluster-zookeeper-2                        1/1     Running   0 
    strimzi-cluster-operator-5b465446b8-jfpmr     1/1     Running   0
    

    The READY column shows the number of ready containers out of the total number of containers inside the pod, while the STATUS column shows whether the pod is running.

    In this example, there are a total of six nodes (each node is a pod). Three are Kafka broker nodes, and the other three are ZooKeeper nodes.

Validate your cluster by completing Validating a Kafka cluster.

After the Kafka broker pods have successfully started, you can use the Kafka console producer and consumer to validate the cluster. The following steps use the same Docker images that the Strimzi Cluster Operator used to deploy the Kafka cluster. These images contain all of the built-in Kafka tools, and you can start a custom Kubernetes pod that runs the Kafka tools in its container.

The following example commands assume that the cluster is accessed through an unauthenticated PLAINTEXT listener, so credentials do not need to be provided. If your cluster is secured, you must also pass the corresponding security parameters on the command line.
  1. Create a topic.
    IMAGE=$(kubectl get pod [***BROKER POD***] --namespace [***NAMESPACE***] --output jsonpath='{.spec.containers[0].image}')
    kubectl run kafka-admin -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-topics.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --create \
        --topic my-topic
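    Optionally, confirm that the topic was created by describing it. A hedged variant of the same command, reusing the $IMAGE variable set above:
    kubectl run kafka-admin -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-topics.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --describe \
        --topic my-topic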
  2. Produce messages to the topic using the Kafka console producer.
    kubectl run kafka-producer -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-console-producer.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --topic my-topic

    Start typing to produce messages.

    >hello
    >csm
    >operator
    >^C
  3. Consume the messages using the Kafka console consumer.
    kubectl run kafka-consumer -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-console-consumer.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --topic my-topic \
        --from-beginning

    If successful, the messages you produced are printed in the output.

    hello
    csm
    operator
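  4. Optionally, delete the test topic when you are done. A hedged sketch that reuses the same admin-pod pattern as the earlier steps:
    kubectl run kafka-admin -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-topics.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --delete \
        --topic my-topic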
    
