Deploying Kafka

You can deploy a Kafka cluster by creating Kafka and KafkaNodePool resources in the Kubernetes environment. After deployment, you can validate the cluster with the console producer and consumer tools shipped with Kafka.

Deploying a Kafka cluster

Learn how to deploy a Kafka cluster with the Strimzi Cluster Operator using Kafka and KafkaNodePool resources.

To deploy a Kafka cluster, you create two types of resources in the Kubernetes cluster: a Kafka resource and one or more KafkaNodePool resources. Based on these resources, the Strimzi Cluster Operator deploys the Kafka cluster.

The Kafka resource describes a Kafka cluster instance. This resource specifies the following about Kafka:
  • Kafka configuration that is common for the whole Kafka cluster (Kafka version, cluster name, and so on)
  • ZooKeeper configuration
  • Cruise Control configuration
  • Entity Operator configuration

A KafkaNodePool resource refers to a distinct group of Kafka nodes within a Kafka cluster. Node pools enable you to specify different configurations for different groups of nodes within the same Kafka cluster. Configuration options not specified in the node pool are inherited from the Kafka configuration.

You can deploy a Kafka cluster with one or more node pools. The number of node pools you create depends on how many groups of Kafka brokers with differing configurations you need. The node pool configuration includes mandatory and optional settings. Configuration for replicas, roles, and storage is mandatory.
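
As an illustration of groups of brokers with differing configurations, the following sketch writes two KafkaNodePool manifests to a file. It only shows the mandatory replicas, roles, and storage settings. The pool names pool-a and pool-b, the cluster name my-cluster, the file name, and all replica counts and storage sizes are assumptions chosen for the example, not recommended values.

```shell
# Write two illustrative KafkaNodePool manifests that differ in
# replica count and storage size. All names and sizes are assumptions.
MANIFEST="${MANIFEST:-node-pools.yaml}"

cat > "$MANIFEST" <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: pool-a
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 3
  roles:
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
---
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: pool-b
  labels:
    strimzi.io/cluster: my-cluster
spec:
  replicas: 2
  roles:
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 500Gi
        deleteClaim: false
EOF

echo "Wrote $MANIFEST"
```

Both pools carry the same strimzi.io/cluster label, which is what ties them to a single Kafka resource.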

  • Ensure that the Strimzi Cluster Operator is installed and running.
  • Ensure that the Secret containing credentials for the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted is available in the namespace where you plan on deploying your cluster. If the secret is not available, create it.
    kubectl create secret docker-registry [***SECRET NAME***] \
      --docker-server [***REGISTRY***] \
      --docker-username [***USERNAME***] \
      --docker-password [***PASSWORD***] \
      --namespace [***NAMESPACE***]
    • [***SECRET NAME***] must be the same as the name of the Secret containing registry credentials that you created during Strimzi installation.
    • Replace [***REGISTRY***] with the server location of the Docker registry where Cloudera Streams Messaging - Kubernetes Operator artifacts are hosted. If your Kubernetes cluster has internet access, use container.repository.cloudera.com. Otherwise, enter the server location of your self-hosted registry.
    • Replace [***USERNAME***] and [***PASSWORD***] with credentials that provide access to the registry. If you are using container.repository.cloudera.com, use your Cloudera credentials. Otherwise, enter credentials providing access to your self-hosted registry.
  • The following steps contain an example Kafka and KafkaNodePool resource. You can find additional examples on the Cloudera Archive.
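
The Secret prerequisite above can be scripted so that the Secret is created only when it is missing. The following is a minimal sketch, not part of the product tooling: ensure_registry_secret is a hypothetical helper name, and its arguments correspond to the placeholders used in the kubectl create secret command above.

```shell
# Hypothetical helper: create the registry Secret only if it is missing.
# Arguments: $1 secret name, $2 namespace, $3 registry server,
#            $4 username, $5 password.
ensure_registry_secret() {
  # `kubectl get secret` exits non-zero when the Secret does not exist.
  if kubectl get secret "$1" --namespace "$2" >/dev/null 2>&1; then
    echo "Secret $1 already exists in namespace $2"
    return 0
  fi

  kubectl create secret docker-registry "$1" \
    --docker-server "$3" \
    --docker-username "$4" \
    --docker-password "$5" \
    --namespace "$2"
}
```

For example, ensure_registry_secret [***SECRET NAME***] [***NAMESPACE***] [***REGISTRY***] [***USERNAME***] [***PASSWORD***] leaves an existing Secret untouched and creates it otherwise.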
  1. Create a YAML configuration containing both your Kafka and KafkaNodePool resource manifests.
    The following example deploys a simple Kafka cluster with three broker replicas in a single node pool.
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: first-pool
      labels:
        strimzi.io/cluster: my-cluster
    spec:
      replicas: 3
      roles:
        - broker
      storage:
        type: jbod
        volumes:
          - id: 0
            type: persistent-claim
            size: 100Gi
            deleteClaim: false
    ---
    apiVersion: kafka.strimzi.io/v1beta2
    kind: Kafka
    metadata:
      name: my-cluster
      annotations:
        strimzi.io/node-pools: enabled
    spec:
      kafka:
        version: 3.8.0.1.2
        listeners:
          - name: plain
            port: 9092
            type: internal
            tls: false
          - name: tls
            port: 9093
            type: internal
            tls: true
        config:
          offsets.topic.replication.factor: 3
          transaction.state.log.replication.factor: 3
          transaction.state.log.min.isr: 2
          default.replication.factor: 3
          min.insync.replicas: 2
      zookeeper:
        replicas: 3
        storage:
          type: persistent-claim
          size: 100Gi
          deleteClaim: false
      cruiseControl: {}
      entityOperator:
        topicOperator: {}
        userOperator: {}
    
    • The spec.kafka.version property in the Kafka resource must specify a Cloudera Kafka version supported by Cloudera Streams Messaging - Kubernetes Operator, for example, 3.8.0.1.2. Do not specify Apache Kafka versions because they are not supported. You can find a list of supported Kafka versions in the Release Notes.

    • You can find additional information about the properties configured in this example in the Strimzi and Apache Kafka documentation.
  2. Deploy the cluster.
    kubectl apply --filename [***YAML CONFIG***] --namespace [***NAMESPACE***]
  3. Verify that pods are created.
    kubectl get pods --namespace [***NAMESPACE***]

    If cluster deployment is successful, you should see an output similar to the following.

    NAME                                          READY   STATUS    RESTARTS 
    my-cluster-entity-operator-79846c5cbd-jqn9k   2/2     Running   0 
    my-cluster-cruise-control-8475c5gdw0-juqi7h   1/1     Running   0 
    my-cluster-first-pool-0                       1/1     Running   0
    my-cluster-first-pool-1                       1/1     Running   0 
    my-cluster-first-pool-2                       1/1     Running   0
    my-cluster-zookeeper-0                        1/1     Running   0  
    my-cluster-zookeeper-1                        1/1     Running   0 
    my-cluster-zookeeper-2                        1/1     Running   0 
    strimzi-cluster-operator-5b465446b8-jfpmr     1/1     Running   0
    

    The READY column shows the number of ready containers out of the total number of containers in the pod. The STATUS column shows whether the pod is running.
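
Instead of polling the pod list, you can also wait on the Kafka resource itself, because the Cluster Operator reports a Ready condition in the resource status once the cluster is up. The following is a minimal sketch: wait_for_kafka is a hypothetical helper name, and the 300s default timeout is an assumption that you may need to raise for larger clusters.

```shell
# Hypothetical helper: block until the Kafka resource reports Ready.
# Arguments: $1 cluster name, $2 namespace, $3 optional timeout (default 300s).
wait_for_kafka() {
  kubectl wait "kafka/$1" \
    --for=condition=Ready \
    --timeout="${3:-300s}" \
    --namespace "$2"
}
```

For the example cluster, this would be called as wait_for_kafka my-cluster [***NAMESPACE***]. The command exits with a non-zero status if the condition is not met before the timeout.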

Validating a Kafka cluster

Validate your deployment using Kafka command line tools.

After the Kafka broker pods have started successfully, you can use the Kafka console producer and consumer to validate the environment. The following steps use the same Docker images that the Strimzi Cluster Operator used to deploy the Kafka cluster. These images contain all built-in Kafka tools, so you can start a custom Kubernetes pod from the image and run the tools in its container.

The following example commands assume that clients connect through an unauthenticated PLAINTEXT listener (port 9092 in these commands), so credentials do not need to be provided. If your cluster is secured, you must also pass the corresponding security parameters on the command line.
  1. Create a topic.
    IMAGE=$(kubectl get pod [***BROKER POD***] --namespace [***NAMESPACE***] --output jsonpath='{.spec.containers[0].image}')
    kubectl run kafka-admin -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-topics.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --create \
        --topic my-topic
  2. Produce messages to the topic using the Kafka console producer.
    kubectl run kafka-producer -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-console-producer.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --topic my-topic

    Start typing to produce messages.

    >hello
    >csm
    >operator
    >^C
  3. Consume the messages using the Kafka console consumer.
    kubectl run kafka-consumer -it \
      --namespace [***NAMESPACE***] \
      --image=$IMAGE \
      --rm=true \
      --restart=Never \
      --command -- /opt/kafka/bin/kafka-console-consumer.sh \
        --bootstrap-server [***CLUSTER NAME***]-kafka-bootstrap:9092 \
        --topic my-topic \
        --from-beginning

    If successful, the messages you produced are printed on the output.

    hello
    csm
    operator
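
As an optional follow-up to the steps above, you can describe the topic to see its partitions, leaders, replicas, and in-sync replicas. The following is a minimal sketch that reuses the same kubectl run pattern and the IMAGE variable set in step 1. describe_topic is a hypothetical helper name, and port 9092 assumes the unauthenticated listener used in the steps above.

```shell
# Hypothetical helper: run kafka-topics.sh --describe in a temporary pod.
# Arguments: $1 namespace, $2 cluster name, $3 topic name.
# Expects IMAGE to hold the broker image, as set in step 1.
describe_topic() {
  kubectl run kafka-admin -it \
    --namespace "$1" \
    --image="$IMAGE" \
    --rm=true \
    --restart=Never \
    --command -- /opt/kafka/bin/kafka-topics.sh \
      --bootstrap-server "$2-kafka-bootstrap:9092" \
      --describe \
      --topic "$3"
}
```

For the topic created earlier, this would be called as describe_topic [***NAMESPACE***] [***CLUSTER NAME***] my-topic.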