Installing Schema Registry with Helm

Learn how to install Schema Registry in Cloudera Streams Messaging Operator for Kubernetes with Helm. Schema Registry is a standalone application that allows you to efficiently store and manage schemas for your streaming data.

Schema Registry is installed in your Kubernetes cluster with the Schema Registry Helm chart using the helm install command. When you install the chart, Helm deploys an instance of Schema Registry, which provides you with schema storage and management capabilities.

During installation, you configure Schema Registry using a custom values file (values.yaml) passed to the Helm chart with the --values (-f) option. This file contains properties for configuring Schema Registry, including network access, database connectivity, and security settings for TLS and OAuth authentication. Additionally, some properties are configured with --set options.

Installation instructions are provided for the following scenarios.

  • Installing in an internet environment – Follow these steps to install a fully secure instance of Schema Registry in a Kubernetes cluster with internet access.

  • Installing for evaluation – Follow these steps to install an unsecure instance of Schema Registry for development or proof of concept purposes.

Installing Schema Registry in an internet environment

Complete these steps to install Schema Registry if your Kubernetes cluster has internet access. These steps install a fully secure instance of Schema Registry that has authentication, authorization, and channel encryption configured, leveraging a PostgreSQL database for persistent schema storage.

  • General prerequisites:
    • Your Kubernetes environment meets requirements listed in System requirements.

    • Your Kubernetes cluster requires internet connectivity to complete these steps. It must be able to reach the Cloudera Docker registry.

    • You have access to your Cloudera credentials (username and password). Credentials are required to access the Cloudera Archive and Cloudera Docker registry where installation artifacts are hosted.

    • You have access to a valid Cloudera license.

    • Review the Helm chart reference before installation.

      The Helm chart accepts various configuration properties that you can set during installation. Using these properties you can customize your installation.

  • Prerequisites for channel encryption (TLS):
    • An Ingress controller is installed in your Kubernetes cluster. These steps use the Ingress-Nginx controller.

    • Optional: cert-manager is installed in your Kubernetes cluster.

  • Prerequisites for OAuth authentication:
    • An OAuth server is available that has TLS enabled.

    • The server is accessible from the Kubernetes cluster where Schema Registry is deployed.

    • At least one client must be configured in your realm that supports Client Credentials flow (sometimes referred to as Machine-to-Machine (M2M), Service Account, or Application Permissions).

    • Identify if your OAuth server issues tokens that contain a value in the aud claim. If a value is present, note it down as you will need to provide it in your configuration. Referred to as [***OAUTH EXPECTED AUDIENCE***] in the following steps.

    • Get the JWKS endpoint URL of your OAuth server. You will need to provide it in your configuration. Schema Registry requires this endpoint to validate the signatures of incoming tokens. Referred to as [***OAUTH JWKS URL***] in the following steps.

    • Identify which JWT claim in your token contains the username to authorize. Schema Registry checks the sub claim by default. If your provider uses a different field, note it down as you will need to provide it in your configuration. Referred to as [***OAUTH PRINCIPAL CLAIM***] in the following steps.

    • Collect the usernames that you want to set as admin and read-only users. You will provide these in your configuration. Referred to as [***ADMIN USERS***] and [***READ-ONLY USERS***] in the following steps.

  • Database prerequisites for persistent storage:
    • A PostgreSQL server with TLS is available.

    • Get the JDBC URL for the PostgreSQL server. Referred to as [***POSTGRESQL JDBC URL***] in the following steps.

    • Get a username that Schema Registry can use to connect to the PostgreSQL server. Referred to as [***POSTGRESQL USERNAME***].
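
To identify the aud claim and the principal claim described in the OAuth prerequisites, it can help to decode the payload of a sample access token issued by your OAuth server. The following is a minimal sketch and not part of the product tooling: decode_jwt_payload is a hypothetical helper name, and the sketch assumes a shell with cut, tr, and base64 available. It only decodes the token. It does not verify the signature.

```shell
# Hypothetical helper: print the JSON payload (claims) of a JWT.
# A JWT has three dot-separated, base64url-encoded parts: header.payload.signature.
decode_jwt_payload() {
  local payload
  # Take the second part and map the base64url alphabet back to standard base64.
  payload=$(printf '%s' "$1" | cut -d '.' -f 2 | tr '_-' '/+')
  # Restore the padding that base64url encoding strips.
  case $(( ${#payload} % 4 )) in
    2) payload="${payload}==" ;;
    3) payload="${payload}=" ;;
  esac
  printf '%s' "$payload" | base64 -d
}

# Example usage (TOKEN is assumed to hold an access token from your OAuth server):
# decode_jwt_payload "$TOKEN"
```

In the printed JSON, check whether an aud field is present and which field carries the username (sub by default).
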
  1. Create a namespace in your Kubernetes cluster.
    kubectl create namespace [***NAMESPACE***]

    This is the namespace where you install Schema Registry. Use the namespace you create in all installation steps that follow.

  2. Log in to the Cloudera Docker registry with helm.
    helm registry login container.repository.cloudera.com

    Enter your Cloudera credentials when prompted.

  3. Create a Kubernetes Secret containing your Cloudera credentials.
    kubectl create secret docker-registry [***REGISTRY CREDENTIALS SECRET***] \
      --namespace [***NAMESPACE***] \
      --docker-server container.repository.cloudera.com \
      --docker-username [***USERNAME***] \
      --docker-password "$(echo -n 'Enter your Cloudera password: ' >&2; read -rs password; echo >&2; printf '%s' "$password")"
    • Take note of the name you specify as [***REGISTRY CREDENTIALS SECRET***]. You will need to specify the name in a later step.

    • Replace [***USERNAME***] with your Cloudera username.

    • Enter your Cloudera password when prompted.

  4. Prepare a keystore for Schema Registry.
    • If you have cert-manager available, create a Certificate resource. Take note of the Secret name you configure in spec.secretName of the Certificate resource. You will need to specify it in a later step.

    • If you are managing keys manually, create a certificate and private key, store them in a keystore, and save the keystore to a Secret. The keystore should be in PKCS12 format.
      kubectl create secret generic [***KEYSTORE SECRET NAME***] \
        --namespace [***NAMESPACE***] \
        --from-file=[***KEYSTORE SECRET KEY***]=[***PATH TO KEYSTORE.P12***] \
        --from-file=[***KEYSTORE PASSWORD SECRET KEY***]=[***PATH TO KEYSTORE PASSWORD FILE***]
      Take note of the Secret name. You will need to specify it in a later step.
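
If you manage keys manually, a PKCS12 keystore can be assembled from a PEM certificate and private key with OpenSSL. The following is a minimal sketch under stated assumptions: make_pkcs12_keystore is a hypothetical helper, the file names and the schema-registry alias are illustrative, and the certificate and key are assumed to already exist. Whether the chart expects a particular alias is not stated in these steps, so check the Helm chart reference.

```shell
# Hypothetical helper: bundle a PEM certificate and private key into a PKCS12 keystore.
# Arguments: <cert.pem> <key.pem> <password file> <output keystore.p12>
make_pkcs12_keystore() {
  openssl pkcs12 -export \
    -in "$1" \
    -inkey "$2" \
    -name schema-registry \
    -passout "file:$3" \
    -out "$4"
}

# Example usage with illustrative file names:
# make_pkcs12_keystore tls.crt tls.key keystore-password.txt keystore.p12
```

The password file is then the file you pass as [***PATH TO KEYSTORE PASSWORD FILE***]. Because --from-file stores the file contents verbatim, writing the password file without a trailing newline (for example with printf) avoids a stray newline in the Secret.
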
  5. Prepare a certificate and private key for Ingress.
    • If you have cert-manager available, the certificate and private key for Ingress are automatically requested by the Ingress. You only need to ensure that you have a valid Issuer available in cert-manager. You specify the name of the Issuer resource in a later step.

    • If you are managing keys manually, create a certificate and private key and save them to a Secret. Take note of the Secret name. You will need to specify it in a later step.

    This Secret is referred to as [***INGRESS TLS CERT SECRET***] in the following steps.
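
For the manual case, the Ingress certificate and key can be stored with kubectl create secret tls. This is a sketch only: create_ingress_tls_secret is a hypothetical helper, and it assumes that the Ingress expects a standard Secret of type kubernetes.io/tls holding a PEM certificate and key.

```shell
# Hypothetical helper: store a PEM certificate and key in a Secret of type kubernetes.io/tls.
# Arguments: <secret name> <namespace> <cert.pem> <key.pem>
create_ingress_tls_secret() {
  kubectl create secret tls "$1" \
    --namespace "$2" \
    --cert="$3" \
    --key="$4"
}

# Example usage with illustrative names:
# create_ingress_tls_secret schema-registry-ingress-tls my-namespace ingress.crt ingress.key
```
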
  6. Set up resources for OAuth authentication and authorization.
    1. Generate a Java truststore (PKCS12) containing the TLS certificate of the root Certificate Authority (CA) of the OAuth certificate chain.
      keytool -import -trustcacerts -file [***OAUTH ROOT CA***] \
        -keystore [***TRUSTSTORE NAME***] \
        -storepass [***TRUSTSTORE PASSWORD***] \
        -storetype PKCS12
    2. Create a Secret containing the truststore and its password.
      kubectl create secret generic [***OAUTH TRUSTSTORE SECRET NAME***] \
        --namespace [***NAMESPACE***] \
        --from-file=[***OAUTH TRUSTSTORE SECRET KEY***]=[***TRUSTSTORE NAME***] \
        --from-file=[***OAUTH TRUSTSTORE PASSWORD SECRET KEY***]=[***PATH TO TRUSTSTORE PW FILE***]
      Take note of [***OAUTH TRUSTSTORE SECRET NAME***], [***OAUTH TRUSTSTORE SECRET KEY***], and [***OAUTH TRUSTSTORE PASSWORD SECRET KEY***].
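
Before you create the Secret, you may want to confirm that the truststore contains the expected CA certificate. The following sketch uses keytool -list for this. list_truststore_entries is a hypothetical helper name and the file name in the usage line is illustrative.

```shell
# Hypothetical helper: list the entries of a PKCS12 truststore.
# Arguments: <truststore file> <truststore password>
list_truststore_entries() {
  keytool -list \
    -keystore "$1" \
    -storepass "$2" \
    -storetype PKCS12
}

# Example usage with illustrative values:
# list_truststore_entries oauth-truststore.p12 "$TRUSTSTORE_PASSWORD"
```

The output should show one trustedCertEntry for the imported root CA. Note that keytool -import asks for confirmation before trusting the certificate unless you also pass -noprompt.
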
  7. Prepare Secrets for required PostgreSQL connection values.
    Typically, you will need a Secret containing the PostgreSQL server password, but additional files (for example, a truststore) might be needed depending on your setup.
    1. Create a Secret containing the PostgreSQL server password.
      kubectl create secret generic [***POSTGRESQL PASSWORD SECRET NAME***] \
        --namespace [***NAMESPACE***] \
        --from-file=[***POSTGRESQL PASSWORD SECRET KEY***]=[***PATH TO DATABASE PASSWORD FILE***]
    2. Create a Secret containing any additional files that you need to mount to the cluster to establish a PostgreSQL connection.
      For example, you might need to provide a truststore. In a following step, the example will contain [***POSTGRESQL TRUSTSTORE SECRET NAME***] which refers to a Secret containing a truststore.
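
The Secret for additional PostgreSQL connection files can be created with the same kubectl create secret generic pattern used for the password. This is a sketch: create_postgres_tls_secret is a hypothetical helper, and the Secret key and file path in the usage line are illustrative.

```shell
# Hypothetical helper: create a Secret from a file needed for the PostgreSQL connection.
# Arguments: <secret name> <namespace> <key in Secret> <path to file>
create_postgres_tls_secret() {
  kubectl create secret generic "$1" \
    --namespace "$2" \
    --from-file="$3=$4"
}

# Example usage with illustrative values:
# create_postgres_tls_secret postgres-tls my-namespace ca.crt /path/to/postgres-ca.crt
```
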
  8. Prepare a custom values file (values.yaml).

    The following example configures a fully secure deployment with a PostgreSQL database for persistent schema storage.

    tls:
      enabled: true
      keystore:
        secretKeyRef:
          name: [***KEYSTORE SECRET NAME***]
          key: [***KEYSTORE SECRET KEY***]
        password:
          secretKeyRef:
            name: [***KEYSTORE SECRET NAME***]
            key: [***KEYSTORE PASSWORD SECRET KEY***]
        type: PKCS12
    
    ingress:
      enabled: true
      className: "nginx"
      rules:
        path: "/"
        host: "my-domain.example.com"
      tls:
        enabled: true
        secretRef: [***INGRESS TLS CERT SECRET***]
      extraAnnotations:
        nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    
    authentication:
      oauth:
        enabled: true
        jwt:
          principalClaimName: [***OAUTH PRINCIPAL CLAIM***]
          expectedAudience: [***OAUTH EXPECTED AUDIENCE***]
        jwks:
          url: [***OAUTH JWKS URL***]
          tls:
            truststore:
              secretKeyRef:
                name: [***OAUTH TRUSTSTORE SECRET NAME***]
                key: [***OAUTH TRUSTSTORE SECRET KEY***]
              password:
                secretKeyRef:
                  name: [***OAUTH TRUSTSTORE SECRET NAME***]
                  key: [***OAUTH TRUSTSTORE PASSWORD SECRET KEY***]
              type: PKCS12
    
    authorization:
      simple:
        enabled: true
        adminUsers: [***ADMIN USERS***]
        readOnlyUsers: [***READ-ONLY USERS***]
    
    database:
      type: postgresql
      jdbcUrl: [***POSTGRESQL JDBC URL***]
      username: [***POSTGRESQL USERNAME***]
      password:
        secretKeyRef:
          name: [***POSTGRESQL PASSWORD SECRET NAME***]
          key: [***POSTGRESQL PASSWORD SECRET KEY***]
      tls:
        secretRef: [***POSTGRESQL TRUSTSTORE SECRET NAME***]
    • tls.enabled – Enables or disables TLS.

    • tls.keystore.secretKeyRef.name – The name of the Secret containing the TLS keystore.

    • tls.keystore.secretKeyRef.key – The key in the Secret specified by tls.keystore.secretKeyRef.name that contains the TLS keystore.

    • tls.keystore.password.secretKeyRef.name – The name of the Secret containing the TLS keystore password.

    • tls.keystore.password.secretKeyRef.key – The key in the Secret specified by tls.keystore.password.secretKeyRef.name that contains the TLS keystore password.

    • ingress.enabled – Enables or disables external access through Ingress.

    • ingress.tls.enabled – Enables or disables TLS for Ingress.

    • ingress.tls.secretRef – The name of the Secret containing Ingress TLS certificates.

    • ingress.extraAnnotations.* – Extra annotations to apply to the Ingress.

    • authentication.oauth.enabled – Enables OAuth authentication for the Schema Registry server.

    • authentication.oauth.jwt.principalClaimName – The name of the claim in the JWT token that contains the principal (username) used for authorization.

    • authentication.oauth.jwt.expectedAudience – The expected audience value. If the JWT token contains an aud claim, it must match this value, otherwise the token is considered invalid.

    • authentication.oauth.jwks.url – The URL to the JWKS endpoint.

    • authentication.oauth.jwks.tls.truststore.secretKeyRef.name – The name of the Secret that contains the truststore for accessing the JWKS endpoint. Configure this property if the backend of your JWKS has self-signed certificates.

    • authentication.oauth.jwks.tls.truststore.secretKeyRef.key – The key in the Secret specified by authentication.oauth.jwks.tls.truststore.secretKeyRef.name that contains the truststore for accessing the JWKS endpoint.

    • authentication.oauth.jwks.tls.truststore.password.secretKeyRef.name – The name of the Secret that contains the truststore password for accessing the JWKS endpoint.

    • authentication.oauth.jwks.tls.truststore.password.secretKeyRef.key – The key in the Secret specified by authentication.oauth.jwks.tls.truststore.password.secretKeyRef.name that contains the truststore password for accessing the JWKS endpoint.

    • authorization.simple.enabled – Enables or disables authorization.

    • authorization.simple.adminUsers – A list of admin usernames. Admin users can perform any operation in Schema Registry.

    • authorization.simple.readOnlyUsers – A list of read-only usernames. Read-only users can only perform read operations in Schema Registry.

    • database.jdbcUrl – The JDBC URL that points to your PostgreSQL database.

    • database.username – The PostgreSQL username for Schema Registry database connections.

    • database.password.secretKeyRef.name – The name of the Secret containing the PostgreSQL database password.

    • database.password.secretKeyRef.key – The key in the Secret specified by database.password.secretKeyRef.name that contains the PostgreSQL database password.

    • database.tls.secretRef – The name of a Secret containing TLS configuration for PostgreSQL connections (certificates, truststores, and so on). All keys from the Secret are mounted to /etc/schema-registry/postgres/tls. Reference mounted files in your JDBC URL (database.jdbcUrl) to configure SSL connections if SSL is required for PostgreSQL.
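
As described for database.tls.secretRef, files mounted from the Secret can be referenced in the JDBC URL. The following sketch prints such a URL. It assumes that the Secret holds a PEM CA certificate under a key named ca.crt (a hypothetical key name) and relies on the sslmode and sslrootcert connection parameters of the PostgreSQL JDBC driver. build_postgres_jdbc_url, the host, the port, and the database name are illustrative.

```shell
# Directory where all keys of the Secret set in database.tls.secretRef are mounted.
POSTGRES_TLS_DIR="/etc/schema-registry/postgres/tls"

# Hypothetical helper: print a JDBC URL that verifies the server certificate.
# Arguments: <host> <port> <database> <Secret key holding the CA certificate>
build_postgres_jdbc_url() {
  printf 'jdbc:postgresql://%s:%s/%s?sslmode=verify-full&sslrootcert=%s/%s\n' \
    "$1" "$2" "$3" "$POSTGRES_TLS_DIR" "$4"
}

# Example usage with illustrative values:
# build_postgres_jdbc_url postgres.example.com 5432 schemaregistry ca.crt
```

The printed value is what you would set as database.jdbcUrl. Adjust the parameters to match how your PostgreSQL server is configured.
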

  9. Install Schema Registry with helm install.
    helm install schema-registry \
      --namespace [***NAMESPACE***] \
      --values [***VALUES FILE***] \
      --set 'image.imagePullSecrets=[***REGISTRY CREDENTIALS SECRET***]' \
      oci://container.repository.cloudera.com/cloudera-helm/csm-operator/schema-registry \
      --version 1.6.0-b99
    • The string schema-registry is the Helm release name of the chart installation. This is an arbitrary, user-defined name. Cloudera recommends that you use a unique and easily identifiable name.

    • [***VALUES FILE***] is the values file you prepared in Step 8.

    • imagePullSecrets specifies what Secret is used to pull images from the Cloudera registry. Setting this property is mandatory. Otherwise, the necessary images cannot be pulled from the Cloudera Docker registry. Ensure that you replace [***REGISTRY CREDENTIALS SECRET***] with the name of the Secret you created in Step 3.

    • You can use --set to override properties that are defined in your values file, or add additional properties that are not present in your values file.
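
To see how your values file and --set options combine before installing, you can render the chart locally with helm template. Values passed with --set take precedence over the same properties in the values file. This is a sketch only: preview_schema_registry_manifests is a hypothetical helper, and it assumes you have already logged in to the registry as described in Step 2.

```shell
# Hypothetical helper: render the chart locally without installing anything.
# Arguments: <namespace> <values file> <registry credentials Secret name>
preview_schema_registry_manifests() {
  helm template schema-registry \
    oci://container.repository.cloudera.com/cloudera-helm/csm-operator/schema-registry \
    --version 1.6.0-b99 \
    --namespace "$1" \
    --values "$2" \
    --set "image.imagePullSecrets=$3"
}

# Example usage with illustrative values:
# preview_schema_registry_manifests my-namespace values.yaml cloudera-registry-credentials
```
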

  10. Verify your installation.
    This is done by listing the Deployments and Pods in your namespace. If installation is successful, a Schema Registry Deployment and two Pods will be present in the cluster.
    kubectl get deployments --namespace [***NAMESPACE***]
    NAME              READY   UP-TO-DATE   AVAILABLE   AGE
    #...
    schema-registry   2/2     2            2           13m
    kubectl get pods --namespace [***NAMESPACE***]
    NAME                               READY   STATUS    RESTARTS   AGE
    #...
    schema-registry-858f647cfc-82mkj   1/1     Running   0          13m
    schema-registry-858f647cfc-jl4nt   1/1     Running   0          13m
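
Instead of listing resources repeatedly, you can wait for the rollout to complete with kubectl rollout status. A minimal sketch: wait_for_schema_registry is a hypothetical helper, the Deployment name schema-registry is taken from the example output above and may differ in your installation, and the 300 second timeout is an arbitrary choice.

```shell
# Hypothetical helper: block until the Deployment is fully rolled out or the timeout expires.
# Arguments: <namespace>
wait_for_schema_registry() {
  kubectl rollout status deployment/schema-registry \
    --namespace "$1" \
    --timeout=300s
}

# Example usage with an illustrative namespace:
# wait_for_schema_registry my-namespace
```
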
Configure clients to interact with Schema Registry or review and use the REST API.

Installing Schema Registry for evaluation

Complete these steps to install a basic deployment of Schema Registry that has no security configured and uses an in-memory database. Use these instructions if you want to install quickly in a development environment for proof of concept or evaluation purposes.

  • Your Kubernetes environment meets requirements listed in System requirements.

  • Your Kubernetes cluster requires internet connectivity to complete these steps. It must be able to reach the Cloudera Docker registry.

  • You have access to your Cloudera credentials (username and password). Credentials are required to access the Cloudera Archive and Cloudera Docker registry where installation artifacts are hosted.

  • You have access to a valid Cloudera license.

  • Review the Helm chart reference before installation.

    The Helm chart accepts various configuration properties that you can set during installation. Using these properties you can customize your installation.

  1. Create a namespace in your Kubernetes cluster.
    kubectl create namespace [***NAMESPACE***]
    This is the namespace where you install Schema Registry. Use the namespace you create in all installation steps that follow.
  2. Log in to the Cloudera Docker registry with helm.
    helm registry login container.repository.cloudera.com

    Enter your Cloudera credentials when prompted.

  3. Create a Kubernetes Secret containing your Cloudera credentials.
    kubectl create secret docker-registry [***REGISTRY CREDENTIALS SECRET***] \
      --namespace [***NAMESPACE***] \
      --docker-server container.repository.cloudera.com \
      --docker-username [***USERNAME***] \
      --docker-password "$(echo -n 'Enter your Cloudera password: ' >&2; read -rs password; echo >&2; printf '%s' "$password")"
    • Take note of the name you specify as [***REGISTRY CREDENTIALS SECRET***]. You will need to specify the name in a later step.

    • Replace [***USERNAME***] with your Cloudera username.

    • Enter your Cloudera password when prompted.

  4. Prepare a custom values file (values.yaml).
    The following example configures an unsecure deployment with an in-memory database.
    tls:
      enabled: false
    
    authentication:
      oauth:
        enabled: false
    
    authorization:
      simple:
        enabled: false
    
    database:
      type: in-memory
    
    service:
      type: NodePort
    
    • All security-related properties are set to false to disable security. These properties must be explicitly set to false as the default value for all of them is true.

    • database.type – The type of database to use. The in-memory option starts Schema Registry with an ephemeral in-memory database that requires no additional configuration. However, in-memory mode is only suitable for testing and evaluation as all schemas will be lost when Pods restart.

    • service.type – The type of Kubernetes Service used for exposing the Schema Registry application. In this example NodePort is used instead of the default ClusterIP, so that Schema Registry is made accessible from outside the Kubernetes cluster.

  5. Install Schema Registry with helm install.
    helm install schema-registry \
      --namespace [***NAMESPACE***] \
      --values [***VALUES FILE***] \
      --set 'image.imagePullSecrets=[***REGISTRY CREDENTIALS SECRET***]' \
      oci://container.repository.cloudera.com/cloudera-helm/csm-operator/schema-registry \
      --version 1.6.0-b99
    • The string schema-registry is the Helm release name of the chart installation. This is an arbitrary, user-defined name. Cloudera recommends that you use a unique and easily identifiable name.

    • [***VALUES FILE***] is the values file you prepared in Step 4.

    • imagePullSecrets specifies what Secret is used to pull images from the Cloudera registry. Setting this property is mandatory. Otherwise, the necessary images cannot be pulled from the Cloudera Docker registry. Ensure that you replace [***REGISTRY CREDENTIALS SECRET***] with the name of the Secret you created in Step 3.

    • You can use --set to override properties that are defined in your values file, or add additional properties that are not present in your values file.

  6. Verify your installation.
    This is done by listing the Deployments and Pods in your namespace. If installation is successful, a Schema Registry Deployment and two Pods will be present in the cluster.
    kubectl get deployments --namespace [***NAMESPACE***]
    NAME              READY   UP-TO-DATE   AVAILABLE   AGE
    #...
    schema-registry   2/2     2            2           13m
    kubectl get pods --namespace [***NAMESPACE***]
    NAME                               READY   STATUS    RESTARTS   AGE
    #...
    schema-registry-858f647cfc-82mkj   1/1     Running   0          13m
    schema-registry-858f647cfc-jl4nt   1/1     Running   0          13m
  7. Access the Schema Registry UI.

    Installing with service.type: NodePort deploys a NodePort type Service for Schema Registry, making it accessible from any of the Kubernetes cluster nodes on the external port of the Service. List Services to get the external port.

    kubectl get service schema-registry-service --namespace [***NAMESPACE***]
    NAME                      TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
    schema-registry-service   NodePort   10.43.121.112   <none>        9090:31578/TCP   13m
    
    In this example, the external port is 31578.
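
The node address and external port can also be read with kubectl and combined into a URL. This is a sketch under stated assumptions: schema_registry_nodeport_url is a hypothetical helper, it uses the internal IP of the first node returned by the cluster, it assumes the Service exposes a single port, and it uses http because TLS is disabled in this evaluation setup.

```shell
# Hypothetical helper: print a URL built from a node address and the Service's NodePort.
# Arguments: <namespace>
schema_registry_nodeport_url() {
  local node_ip node_port
  # Internal IP of the first node; a NodePort Service is reachable on every node.
  node_ip=$(kubectl get nodes \
    --output jsonpath='{.items[0].status.addresses[?(@.type=="InternalIP")].address}')
  # External (node) port of the first port exposed by the Service.
  node_port=$(kubectl get service schema-registry-service \
    --namespace "$1" \
    --output jsonpath='{.spec.ports[0].nodePort}')
  printf 'http://%s:%s\n' "$node_ip" "$node_port"
}

# Example usage with an illustrative namespace:
# schema_registry_nodeport_url my-namespace
```

If your nodes are only reachable through an external address, use that address in place of the internal IP.
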
Configure clients to interact with Schema Registry or review and use the REST API.