
Security configurations

Configure Flink to secure your workloads and Cloudera SQL Stream Builder.

Cloudera SQL Stream Builder makes it easier to deploy Flink SQL jobs, but you still need to ensure that your workloads and Cloudera SQL Stream Builder itself are properly secured. The following tools are available for securing your FlinkDeployment in your Kubernetes cluster setup.

When deploying Cloudera SQL Stream Builder, you must specify a fernetKey, which is used to encrypt the job definition of every Flink job started with Cloudera SQL Stream Builder. Job definitions can include sensitive data, such as table DDL that contains a username and password as part of the connector configuration, or authentication information for connected storage. This sensitive information is encrypted with the Fernet key.
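A Fernet key is 32 random bytes encoded as URL-safe base64. The following is a minimal sketch of one way to generate such a value with standard command-line tools. The variable name fernet_key is illustrative, and where the value is passed to the installation depends on your chart configuration, which is not shown here.

```shell
# Read 32 random bytes and encode them as base64, then switch to the
# URL-safe alphabet ('/' becomes '_' and '+' becomes '-').
# The result is 44 characters long, including one trailing '=' padding character.
fernet_key="$(head -c 32 /dev/urandom | base64 | tr '/+' '_-')"

# Print the key so that it can be copied into the deployment configuration.
# Treat the output as a secret and avoid storing it in shell history or logs.
echo "${fernet_key}"
```

Keep the generated key in a safe place, because job definitions encrypted with one key cannot be read with another.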

The specified Fernet key is created as a Kubernetes Secret in the same namespace where the Cloudera Streaming Analytics - Kubernetes Operator is installed, and will be automatically mounted by Cloudera SQL Stream Builder and Flink pods.

When installing Cloudera SQL Stream Builder, a default user (admin/admin) is created automatically, and registration is enabled. You have the option to enable or disable user registration, and you can also modify the default user(s) based on your requirements.

By default, Cloudera SQL Stream Builder does not set up an Ingress. You can change this using the ingress configuration. The Ingress resource is created in the same namespace as Cloudera SQL Stream Builder.

Ingress can be used to enable TLS/HTTPS for Cloudera SQL Stream Builder, and it can also be used to set up authentication. For more information, see Ingress.

It is recommended to add persistent data storage so that Flink can save checkpoints and savepoints. In most cases this is blob storage (for example, S3) that requires authentication to access.

You can use the storageConfiguration configuration to set up the storage for Cloudera SQL Stream Builder. The configuration should be a valid flink-conf.yaml file, which can contain sensitive data, such as s3.access-key, s3.secret-key, and so on.
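As an illustration, the snippet below writes a small flink-conf.yaml style fragment of the kind described above to a local file. This is a sketch under assumptions: the bucket name, the file name storage-flink-conf.yaml, and the placeholder credentials are hypothetical, and state.checkpoints.dir and state.savepoints.dir are Flink options that do not appear in this document and whose names can differ between Flink versions. How the content is passed to storageConfiguration is not shown.

```shell
# Write an illustrative flink-conf.yaml fragment to a local file.
# The bucket name and credentials are placeholders; replace them before use.
cat > storage-flink-conf.yaml <<'EOF'
state.checkpoints.dir: s3://my-flink-bucket/checkpoints
state.savepoints.dir: s3://my-flink-bucket/savepoints
s3.access-key: REPLACE_WITH_ACCESS_KEY
s3.secret-key: REPLACE_WITH_SECRET_KEY
EOF

# Restrict file permissions, because the file holds credentials.
chmod 600 storage-flink-conf.yaml
```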

The Helm chart creates a Secret in the same namespace as Cloudera SQL Stream Builder, and Cloudera SQL Stream Builder creates a new Secret for each new Flink deployment created by the user.

It is possible to mount existing volumes to Cloudera SQL Stream Builder and all created Flink pods using the podVolumes and podVolumeMounts configurations. You need to ensure that these volumes exist in the namespace of the Cloudera SQL Stream Builder and Flink pods that will be created by SSB.

These configurations can be used to mount ConfigMaps, Secrets, or any other kind of volume to the SSB and Flink pods. For example, you can mount hive-site.xml, core-site.xml, krb5.conf, and keytabs as ConfigMaps and Secrets to be able to connect to Hive with SSB/Flink.
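The following sketch shows how a keytab could be stored in a Secret and mounted read-only. It rests on several assumptions: the Secret name ssb-keytab, the file name user.keytab, the namespace flink, the mount path, and the helper function create_keytab_secret are all hypothetical, and the entries under data are assumed to follow the standard Kubernetes volume schema (secret.secretName).

```shell
# Hypothetical helper: store a local keytab file in a Kubernetes Secret.
# $1 is the path to the keytab file on the local machine.
create_keytab_secret() {
  kubectl -n flink create secret generic ssb-keytab \
    --from-file=user.keytab="$1"
}

# Illustrative values fragment that mounts the Secret read-only.
# Assumes entries under "data" use the standard Kubernetes volume schema.
cat > keytab-values.yaml <<'EOF'
ssb:
  podVolumes:
    create: true
    data:
      - name: keytab-volume
        secret:
          secretName: ssb-keytab
  podVolumeMounts:
    create: true
    data:
      - name: keytab-volume
        mountPath: /etc/security/keytabs
        readOnly: true
EOF
```

With this in place, running create_keytab_secret ./user.keytab would create the Secret that the fragment refers to.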

To enable Kerberos authentication, you need to add the Hadoop dependencies to the Cloudera Streaming Analytics images as described in the Customize container images section.

After adding the dependencies, add the Hadoop configuration files and the krb5.conf file as ConfigMaps using the following commands:

kubectl -n flink create configmap hadoop-conf --from-file core-site.xml=core-site.xml --from-file hdfs-site.xml=hdfs-site.xml
kubectl -n flink create configmap krb5-conf --from-file krb5.conf=krb5.conf
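Before continuing, it can be useful to confirm that both ConfigMaps exist. The helper below is a minimal sketch; the function name check_kerberos_configmaps is hypothetical, and it assumes the same flink namespace and ConfigMap names used in the commands above.

```shell
# Confirm that both ConfigMaps exist in the "flink" namespace.
# Prints one line per ConfigMap and returns non-zero at the first missing one.
check_kerberos_configmaps() {
  local name
  for name in hadoop-conf krb5-conf; do
    if kubectl -n flink get configmap "$name" >/dev/null 2>&1; then
      echo "found: $name"
    else
      echo "missing: $name" >&2
      return 1
    fi
  done
}
```

Call check_kerberos_configmaps after creating the ConfigMaps. To inspect the stored file names, kubectl -n flink describe configmap hadoop-conf lists the keys of a ConfigMap.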

When the ConfigMaps are in place, update the following configuration properties in the values.yml file so that the configuration files are mounted in the containers:

ssb:
  podVolumes:
    create: true
    data:
      - name: hadoop-conf-volume
        configMap:
          name: hadoop-conf
      - name: krb5-conf-volume
        configMap:
          name: krb5-conf
  podVolumeMounts:
    create: true
    data:
      - name: hadoop-conf-volume
        mountPath: /etc/hadoop/conf
        readOnly: true
      - name: krb5-conf-volume
        mountPath: /etc/krb5.conf
        subPath: krb5.conf

After setting up the images and configurations, you can use the Streaming SQL Console to specify your keytabs in the Keytab Manager. Once a keytab is successfully validated, Kerberos is automatically configured when a new job is deployed.