Deployment architecture

CSA Operator can be installed on a Kubernetes cluster using Helm. Installing CSA Operator deploys the Apache Flink Kubernetes Operator (Flink Operator) and the Flink Operator Webhook, and registers the Custom Resource Definitions (CRDs) on the Kubernetes cluster. As an extension built on top of the Flink Operator, CSA Operator also deploys the SQL Stream Builder (SSB) engine (in Technical Preview) and its corresponding PostgreSQL database.
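For illustration, the installation has the following shape. The chart reference, namespace, and version are placeholders; substitute the registry location and credentials documented for your release:

    # Illustrative only: <registry> and <version> are placeholders.
    helm install csa-operator oci://<registry>/csa-operator \
      --namespace csa-operator --create-namespace \
      --version <version>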

The Flink Operator is deployed in a designated namespace. The installation registers the FlinkDeployment CRD, which describes the Flink clusters to be brought up. The Flink Operator controls Flink deployments in one or more managed namespaces. When you create a new Flink deployment, the JobManager pod is created along with the ConfigMaps the Flink Operator needs to function. Submitting a Flink deployment that includes a Flink job also deploys the TaskManagers required for the job to start.
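For illustration, a minimal FlinkDeployment resource, modeled on the upstream Flink Kubernetes Operator examples, looks like the following; the image, Flink version, and jar path are placeholders to adapt to your environment:

    apiVersion: flink.apache.org/v1beta1
    kind: FlinkDeployment
    metadata:
      name: basic-example            # created in a managed namespace
    spec:
      image: flink:1.19              # placeholder image
      flinkVersion: v1_19
      flinkConfiguration:
        taskmanager.numberOfTaskSlots: "2"
      serviceAccount: flink
      jobManager:
        resource:
          memory: "2048m"
          cpu: 1
      taskManager:
        resource:
          memory: "2048m"
          cpu: 1
      job:                           # omit this section for a plain session cluster
        jarURI: local:///opt/flink/examples/streaming/StateMachineExample.jar
        parallelism: 2
        upgradeMode: stateless

Applying this resource in a managed namespace makes the operator create the JobManager pod and its ConfigMaps and, because a job section is present, the TaskManagers the job needs.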

When you install CSA Operator with Helm, the Flink Operator Webhook is also installed as a custom admission plugin that enables dynamic admission control. Similarly to connectors, you can use the webhook to add plugins to the Flink Operator that apply custom rules when certain actions are triggered.

There are two types of webhooks:

  • mutating webhook: if you want to automatically set some values, or even force certain configuration values, whenever a user creates a new FlinkDeployment, you can create a FlinkResourceMutator. Whenever a new FlinkDeployment is submitted, Kubernetes calls the webhook of the operator, which applies the custom mutator to the deployment.
  • validating webhook: with this type, you cannot apply any changes to the deployment, but you can automatically reject its creation by implementing custom rules via the webhook. A sketch of the chart values that control both webhook types follows this list.

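Both webhook roles are controlled through the Helm values of the underlying Flink Kubernetes Operator chart. The following excerpt is a sketch only: the key names are taken from the upstream flink-kubernetes-operator chart (nested under the subchart name) and may differ between chart versions, so verify them against the values file shipped with your release.

    # values.yaml excerpt -- assumed keys, verify against your chart version
    flink-kubernetes-operator:
      webhook:
        create: true      # install the admission webhook at all
        mutator:
          create: true    # enable the mutating webhook
        validator:
          create: true    # enable the validating webhook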
The Flink Operator Webhook communicates over TLS by default, and it automatically loads and reloads the keystore file whenever the file changes.
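The keystore is configured through chart values as well. By default, the upstream chart uses a built-in keystore password, which you would replace in production; the keys below follow the upstream flink-kubernetes-operator documentation and should be treated as assumptions to verify for your version:

    # values.yaml excerpt -- assumed keys; <secret-name>/<key> are placeholders
    flink-kubernetes-operator:
      webhook:
        keystore:
          useDefaultPassword: false
          passwordSecrets:
            keystoreSecretName: <secret-name>   # Secret holding the keystore password
            keystoreSecretKey: <key>            # key within that Secret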

SSB integration [Technical Preview]

CSA Operator comes with seamless SQL Stream Builder (SSB) integration. SSB is built on top of the Flink Operator, offering an interactive user interface for creating streaming SQL jobs.

SSB is a comprehensive interactive user interface for creating stateful stream processing jobs using SQL. With SQL, you can declare expressions that filter, aggregate, route, and otherwise mutate streams of data. SSB offers a job management interface that you can use to compose and run SQL on streams, as well as to create durable data APIs for the results.

The Helm chart contains the SSB subchart, which adds two deployments to CSA Operator: sse, which provides the SSB engine and User Interface (UI), and postgres, which provides the default database SSB needs to function.

When you submit a SQL job using the SSB UI (Streaming SQL Console), the parsed SQL is serialized, compressed, and encrypted into an environment, and a Flink job is deployed. Under the hood, SSB creates the same Flink deployment as you would for a generic Flink job; the only difference is that a special Flink job, the SQL Runner, is created. The SQL Runner decrypts and decompresses the parsed SQL, sets up the environment inside the Flink job, and executes the SQL. By default, SSB deploys jobs in session mode. You can use the installed connectors in your SQL jobs as sources or sinks with the supported data formats.
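Because SSB runs SQL jobs in session mode, the cluster it targets corresponds to a FlinkDeployment without a job section. The following is a conceptual sketch, not the exact resource SSB generates; the name and sizing are hypothetical:

    apiVersion: flink.apache.org/v1beta1
    kind: FlinkDeployment
    metadata:
      name: ssb-session-cluster      # hypothetical name
    spec:
      image: flink:1.19              # placeholder image
      flinkVersion: v1_19
      serviceAccount: flink
      jobManager:
        resource:
          memory: "2048m"
          cpu: 1
      taskManager:
        resource:
          memory: "2048m"
          cpu: 1
      # no job section: this is a session cluster, and SSB's SQL Runner
      # jobs are submitted to it for execution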