
Installing Cloudera Streaming Analytics - Kubernetes Operator in an air-gapped environment

Complete these steps to install Cloudera Streaming Analytics - Kubernetes Operator if your Kubernetes cluster does not have internet access, or if you want to install from a self-hosted registry. Installing the Cloudera Streaming Analytics - Kubernetes Operator enables you to deploy and manage Flink and Cloudera SQL Stream Builder in Kubernetes.

Cloudera Streaming Analytics - Kubernetes Operator is installed in your Kubernetes cluster with the provided Helm chart through the helm install command. When you install the chart, Helm installs the Custom Resource Definitions (CRDs) included in Cloudera Streaming Analytics - Kubernetes Operator, and deploys the Apache Flink Kubernetes Operator (Flink Operator), the Cloudera SQL Stream Builder engine (in Technical Preview), and a PostgreSQL database for Cloudera SQL Stream Builder.

Installing Cloudera Streaming Analytics - Kubernetes Operator does not create or deploy a Flink cluster. The Flink cluster is created after the installation by deploying the Flink Deployment resource in the Kubernetes cluster with kubectl or oc, or when you execute a SQL job in Streaming SQL Console.

By default, the Flink Operator (deployed with the installation) watches and manages all the Flink clusters that are deployed in the same namespace as the Flink Operator. However, you can also configure it to watch and manage multiple namespaces. This allows you to manage multiple Flink clusters deployed in different namespaces using a single Cloudera Streaming Analytics - Kubernetes Operator installation.
  • Ensure that your Kubernetes environment meets the requirements listed in System requirements.
  • A self-hosted Docker registry is required. Your registry must be accessible by your Kubernetes cluster.
  • The Kubernetes cluster itself does not need internet access in an air-gapped environment. However, the preparation steps that create the local (offline) repository from which you install Cloudera Streaming Analytics - Kubernetes Operator require that you can download the artifacts hosted on the Cloudera Docker registry and Cloudera Archive and move them to your registry.
  • Access to docker or an equivalent utility that can pull and push images is required. Cloudera recommends docker. If you use a different utility, replace the commands where necessary.
  • Ensure that you have access to your Cloudera credentials (username and password). Credentials are required to access the Cloudera Docker registry (and, if needed, the Cloudera Archive) where installation artifacts are hosted.
  • Ensure that you have access to a valid Cloudera license.
  • Review the Helm chart reference before installation.

    The Helm chart accepts various configuration properties that you can set during installation. Using these properties you can customize your installation.

  • If you want to use the webhook of the Flink Operator, ensure that cert-manager is installed on your Kubernetes cluster. You can install it using the following commands:
    kubectl create -f https://github.com/jetstack/cert-manager/releases/download/v1.8.2/cert-manager.yaml
    kubectl wait -n cert-manager --for=condition=Available deployment --all
    • The webhook functionality is enabled by default. To disable it and skip the cert-manager installation, add the following option to the helm install command:
      --set flink-kubernetes-operator.webhook.create=false
  1. Copy the following installation artifacts to your self-hosted registry.
    Table 1. Cloudera Streaming Analytics - Kubernetes Operator artifacts on the Cloudera Docker registry
    Artifact | Location | Description
    Flink Kubernetes Operator Docker image | container.repository.cloudera.com/cloudera/flink-kubernetes-operator:1.9-csaop1.1.2-b17 | Docker image used for deploying the various operator components shipped with the Cloudera Streaming Analytics - Kubernetes Operator.
    Flink Docker image | container.repository.cloudera.com/cloudera/flink:1.19.1-csaop1.1.2-b17 | Docker image used for deploying Apache Flink and its related components.
    SQL Runner Docker image | container.repository.cloudera.com/cloudera/ssb-sql-runner:1.19.1-csaop1.1.2-b17 | Docker image used for deploying a Flink application when a SQL query is executed in Cloudera SQL Stream Builder.
    SQL Stream Engine Docker image | container.repository.cloudera.com/cloudera/ssb-sse:1.19.1-csaop1.1.2-b17 | Docker image used for deploying Cloudera SQL Stream Builder and its UI.

    This step involves pulling the artifacts from the Cloudera Docker registry, retagging them, and then pushing them to your self-hosted registry. The exact steps depend on your environment and on how your registry is set up. The following substeps demonstrate a basic workflow using docker and helm.

    1. Log in to the Cloudera Docker registry with both docker and helm.
      Provide your Cloudera credentials when prompted.
      docker login container.repository.cloudera.com
      helm registry login container.repository.cloudera.com
    2. Pull the Docker images from the Cloudera Docker registry.
      docker pull \
        container.repository.cloudera.com/cloudera/[***IMAGE NAME***]:[***VERSION***]
    3. Pull the Cloudera Streaming Analytics - Kubernetes Operator Helm chart.
      helm pull \
        oci://container.repository.cloudera.com/cloudera-helm/csa-operator/csa-operator \
        --version 1.1.2-b17
    4. Retag the Docker images you pulled so that they contain the address of your registry.
      docker tag \
        [***ORIGINAL IMAGE TAG***] \
        [***REGISTRY HOSTNAME***]:[***PORT***]/cloudera/[***IMAGE NAME***]:[***VERSION***]
    5. Push the images and chart to your self-hosted registry.
      docker push \
        [***REGISTRY HOSTNAME***]:[***PORT***]/cloudera/[***IMAGE NAME***]:[***VERSION***]

      helm push \
        csa-operator-1.1.2-b17.tgz \
        oci://[***REGISTRY HOSTNAME***]:[***PORT***]/cloudera-helm/csa-operator/
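
The pull, retag, and push substeps above can be scripted for all four images in Table 1. The following is a minimal sketch, not an official Cloudera script. The registry address registry.example.internal:5000 and the helper names run and mirror_image are assumptions made for illustration. By default the script only prints the docker commands it would run, so you can review them first. Set DRY_RUN=0 to execute them after you have logged in to both registries.

```shell
#!/bin/sh
# Sketch: mirror the images listed in Table 1 to a self-hosted registry.
set -eu

SOURCE_REGISTRY="container.repository.cloudera.com"
# Hypothetical default; replace with your registry hostname and port.
TARGET_REGISTRY="${TARGET_REGISTRY:-registry.example.internal:5000}"
# With DRY_RUN=1 (the default in this sketch) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Image names and tags as listed in Table 1.
IMAGES="
cloudera/flink-kubernetes-operator:1.9-csaop1.1.2-b17
cloudera/flink:1.19.1-csaop1.1.2-b17
cloudera/ssb-sql-runner:1.19.1-csaop1.1.2-b17
cloudera/ssb-sse:1.19.1-csaop1.1.2-b17
"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Pull one image, retag it for the target registry, and push it.
mirror_image() {
  source_ref="${SOURCE_REGISTRY}/$1"
  target_ref="${TARGET_REGISTRY}/$1"
  run docker pull "$source_ref"
  run docker tag "$source_ref" "$target_ref"
  run docker push "$target_ref"
}

for image in $IMAGES; do
  mirror_image "$image"
done
```

The sketch keeps the cloudera/ path prefix unchanged so that the retagged names match the pattern used in the substeps above. The Helm chart is not covered here and still needs to be pulled and pushed with helm.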
  2. Create a namespace in your Kubernetes cluster where you will install and use the Cloudera Streaming Analytics - Kubernetes Operator.
    kubectl create namespace [***NAMESPACE***]
    This is the namespace where you install Flink and Cloudera SQL Stream Builder. Use the namespace you create here in all installation steps that follow.
  3. Create a Kubernetes secret that contains the credentials for your self-hosted registry.
    kubectl create secret docker-registry [***SECRET NAME***] \
        --docker-server [***REGISTRY HOSTNAME***]:[***PORT***] \
        --docker-username [***USERNAME***] \
        --docker-password [***PASSWORD***] \
        --namespace [***NAMESPACE***]
    Ensure that the placeholders are replaced with your specific information:
    1. Provide a desired name for [***SECRET NAME***].
    2. Replace [***REGISTRY HOSTNAME***]:[***PORT***] with your self-hosted registry hostname and port.
    3. Replace [***USERNAME***] and [***PASSWORD***] with the credentials for your self-hosted registry.
    4. Provide the same name for [***NAMESPACE***] that you created in the previous step.
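
To make clear what this secret holds, the following sketch reproduces the general shape of the JSON document that a docker-registry secret stores under its .dockerconfigjson key. It is an illustration only. The helper names registry_auth and docker_config_json, the registry address, and the user/pass values are assumptions, and the sketch does not escape special characters in the values. The document that kubectl generates may order or format its fields differently.

```shell
#!/bin/sh
# Sketch: show the approximate contents of a docker-registry secret.
set -eu

# Base64-encode "username:password", as stored in the "auth" field.
registry_auth() {
  printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Print a JSON document shaped like the .dockerconfigjson payload.
# No JSON escaping is performed, so this is for illustration only.
docker_config_json() {
  server="$1"
  username="$2"
  password="$3"
  printf '{"auths":{"%s":{"username":"%s","password":"%s","auth":"%s"}}}\n' \
    "$server" "$username" "$password" \
    "$(registry_auth "$username" "$password")"
}

# Hypothetical example values; these are not real credentials.
docker_config_json "registry.example.internal:5000" "user" "pass"
```

To compare this with the secret you created, you can decode the stored value, for example with kubectl get secret [***SECRET NAME***] --namespace [***NAMESPACE***] --output 'jsonpath={.data.\.dockerconfigjson}' | base64 --decode.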
  4. Log in to your self-hosted registry with helm.
    helm registry login [***REGISTRY HOSTNAME***]:[***PORT***]
    Enter your credentials when prompted.
  5. Install Cloudera Streaming Analytics - Kubernetes Operator with helm install.
    helm install csa-operator \
        --namespace [***NAMESPACE***] \
        --set 'flink-kubernetes-operator.image.repository=[***REGISTRY HOSTNAME***]:[***PORT***]/cloudera/[***IMAGE NAME***]' \
        --set 'ssb.sqlRunner.image.repository=[***REGISTRY HOSTNAME***]:[***PORT***]/cloudera/[***IMAGE NAME***]' \
        --set 'ssb.sse.image.repository=[***REGISTRY HOSTNAME***]:[***PORT***]/cloudera/[***IMAGE NAME***]' \
        --set 'flink-kubernetes-operator.imagePullSecrets[0].name=[***SECRET NAME***]' \
        --set 'ssb.sse.image.imagePullSecrets[0].name=[***SECRET NAME***]' \
        --set 'ssb.sqlRunner.image.imagePullSecrets[0].name=[***SECRET NAME***]' \
        --set-file flink-kubernetes-operator.clouderaLicense.fileContent=[***PATH TO LICENSE FILE***] \
    oci://[***REGISTRY HOSTNAME***]:[***PORT***]/cloudera-helm/csa-operator/csa-operator --version 1.1.2-b17
    Ensure that the placeholders are replaced with your specific information:
    1. Provide the same name for [***NAMESPACE***] that you created in Step 2.
    2. Replace [***REGISTRY HOSTNAME***]:[***PORT***] with your self-hosted registry hostname and port, and replace each [***IMAGE NAME***] with the name of the corresponding image that you pushed to your registry.
    3. Provide the same name for [***SECRET NAME***] that you created in Step 3. imagePullSecrets specifies which secret is used to pull images from your self-hosted registry. Setting this property is mandatory. Otherwise, the necessary images cannot be pulled from your registry.
    4. Replace [***PATH TO LICENSE FILE***] with the full (absolute) path to your Cloudera license file. clouderaLicense.fileContent is used to register your license. When this property is set, a secret is generated that contains the license you specify. Setting this property is mandatory. The Cloudera Streaming Analytics - Kubernetes Operator will not function without a valid license.
    5. You can use --set to set various other properties of the Helm chart and customize your installation. For more information on the available properties, see Helm chart reference. By default, the Flink Operator has access to watch all namespaces. However, you can configure a list of specific namespaces to watch using watchNamespaces. For example, if you created multiple namespaces, you can configure the Flink Operator to watch only specific ones with --set 'flink-kubernetes-operator.watchNamespaces={[***NAMESPACE1***],[***NAMESPACE2***]}'. Quote the value so that your shell does not apply brace expansion to it. For more information about deploying Flink and Cloudera SQL Stream Builder in multiple namespaces, see the Namespace management documentation.
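
If the list of namespaces to watch comes from a script, the watchNamespaces value can be assembled programmatically. The following is a small sketch under stated assumptions: the helper name watch_namespaces_value and the namespace names flink-team-a and flink-team-b are hypothetical. It joins the names into Helm's {a,b} list syntax and prints the resulting option wrapped in single quotes.

```shell
#!/bin/sh
# Sketch: build the value for flink-kubernetes-operator.watchNamespaces.
set -eu

# Join the given namespace names into Helm's list syntax: {a,b,c}
watch_namespaces_value() {
  list=""
  for ns in "$@"; do
    # Prepend a comma only when the list already has an entry.
    list="${list:+$list,}$ns"
  done
  printf '{%s}\n' "$list"
}

# Hypothetical namespace names; replace with your own.
value="$(watch_namespaces_value flink-team-a flink-team-b)"
printf '%s\n' "--set 'flink-kubernetes-operator.watchNamespaces=${value}'"
```

The printed option can be appended to the helm install command shown in this step.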
  6. Verify that the Flink Operator and the Cloudera SQL Stream Builder engine, along with its database, are running.
    kubectl get pods -n [***NAMESPACE***]
    NAME                            READY   STATUS      RESTARTS    AGE
    flink-kubernetes-operator       1/2     Running      0          7s
    ssb-postgresql                  1/1     Running      0          7s
    ssb-sse                         1/1     Running      0          7s
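
When reading this output, compare both numbers in the READY column as well as the STATUS column. As a hedged illustration, the sketch below filters kubectl get pods output and prints the pods that are not fully ready yet. The helper name not_ready_pods is hypothetical, and the input is the sample output shown above rather than live cluster data.

```shell
#!/bin/sh
# Sketch: list pods that are not fully ready in `kubectl get pods` output.
set -eu

# Read `kubectl get pods` output on stdin and print the name of every pod
# whose READY count is incomplete or whose STATUS is not Running.
not_ready_pods() {
  awk 'NR > 1 && NF >= 3 {
    split($2, ready, "/")
    if (ready[1] != ready[2] || $3 != "Running") print $1
  }'
}

# Sample output copied from the step above. In a live cluster, pipe the
# output of kubectl get pods into not_ready_pods instead.
not_ready_pods <<'EOF'
NAME                            READY   STATUS      RESTARTS    AGE
flink-kubernetes-operator       1/2     Running      0          7s
ssb-postgresql                  1/1     Running      0          7s
ssb-sse                         1/1     Running      0          7s
EOF
```

For the sample above, the sketch prints flink-kubernetes-operator, because only one of its two containers is reported as ready. If a pod stays in this state, rerun kubectl get pods after a short while, or block until all pods are ready with kubectl wait --for=condition=Ready pod --all -n [***NAMESPACE***] --timeout=120s, where the timeout value is only an example.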
After successfully installing the Cloudera Streaming Analytics - Kubernetes Operator, you can start using Flink and Cloudera SQL Stream Builder (in Technical Preview) on Kubernetes. The Getting Started with Flink and Getting Started with Cloudera SQL Stream Builder guides can help you with the basic operations.