CDC connectors

You can use the Debezium Change Data Capture (CDC) connectors in SQL Stream Builder (SSB) to stream changes in real time from MySQL, PostgreSQL, Oracle, and Db2, and feed the data to Kafka, JDBC, the Webhook sink, or Materialized Views.

Concept of Change Data Capture

Change Data Capture (CDC) is the process of capturing changes made in a source system and applying those changes to the data in a downstream system or application.

The Debezium implementation provides CDC through database connectors whose real-time change events are delivered using Kafka and Kafka Connect. Debezium captures every row-level change in each database table and records it in an event stream. Applications read these streams to see the change events in the same order in which they occurred. The change events are routed to a Kafka topic, from which Kafka Connect feeds the records to other systems and databases.
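To make the idea of an ordered stream of row-level change events more concrete, the following is a minimal sketch in Python, not Debezium code. It assumes events shaped like the Debezium change event envelope, with an "op" field ("c" for create, "u" for update, "d" for delete, "r" for a snapshot read) and "before" and "after" row images. The table contents, the "id" key column, and the apply_change helper are hypothetical and exist only for illustration.

```python
import json


def apply_change(replica, event, key="id"):
    """Apply one Debezium-style change event to an in-memory replica.

    The replica is a dict keyed by the (assumed) primary key column.
    """
    op = event["op"]
    if op in ("c", "r", "u"):
        # Creates, snapshot reads, and updates carry the new row in "after".
        row = event["after"]
        replica[row[key]] = row
    elif op == "d":
        # Deletes carry the removed row in "before".
        replica.pop(event["before"][key], None)
    else:
        raise ValueError(f"unsupported op: {op!r}")
    return replica


# Hypothetical change events, listed in the order they occurred at the source.
raw_events = [
    '{"op": "c", "before": null, "after": {"id": 1, "name": "Ann"}}',
    '{"op": "c", "before": null, "after": {"id": 2, "name": "Bob"}}',
    '{"op": "u", "before": {"id": 1, "name": "Ann"}, "after": {"id": 1, "name": "Anna"}}',
    '{"op": "d", "before": {"id": 2, "name": "Bob"}, "after": null}',
]

replica = {}
for raw in raw_events:
    apply_change(replica, json.loads(raw))

print(replica)  # {1: {'id': 1, 'name': 'Anna'}}
```

Because the events are applied in the order they occurred, the replica ends in the same state as the source table. Replaying them in a different order could leave a deleted row in place or an outdated value behind.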

For more information about Debezium, see the official Debezium site.

CDC in Cloudera Streaming Analytics (CSA) does not require Kafka or Kafka Connect, because Debezium is implemented as a library within the Flink runtime. The captured changes can therefore be propagated downstream to any connector that Flink supports. CSA also allows queries to run at change data capture time, so filtering, grouping, joining, and similar operations can be performed on the change stream as it arrives from the source database.
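As an illustration of what running a query directly on the change stream means, the sketch below keeps a filtered, grouped count up to date as change events arrive. It is a simplified Python model and not how Flink or CSA implements this. The "orders" rows, the "status" and "amount" columns, and the orders_per_status function are assumptions made up for the example. The result corresponds roughly to counting orders per status for orders at or above a minimum amount.

```python
from collections import Counter


def orders_per_status(events, min_amount=0):
    """Maintain a per-status row count from Debezium-style change events.

    Each event retracts its "before" image and adds its "after" image,
    but only for rows that pass the amount filter.
    """
    counts = Counter()
    for event in events:
        before, after = event.get("before"), event.get("after")
        if before is not None and before["amount"] >= min_amount:
            counts[before["status"]] -= 1  # retract the old row version
        if after is not None and after["amount"] >= min_amount:
            counts[after["status"]] += 1   # add the new row version
    # Drop groups that no longer contain any rows.
    return {status: n for status, n in counts.items() if n > 0}


# Hypothetical change events for an "orders" table, in source order.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new", "amount": 150}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new", "amount": 50}},
    {"op": "c", "before": None, "after": {"id": 3, "status": "new", "amount": 200}},
    {"op": "u", "before": {"id": 1, "status": "new", "amount": 150},
     "after": {"id": 1, "status": "shipped", "amount": 150}},
    {"op": "d", "before": {"id": 2, "status": "new", "amount": 50}, "after": None},
]

result = orders_per_status(events, min_amount=100)
print(result)  # {'new': 1, 'shipped': 1}
```

The update to order 1 moves one row from the "new" group to the "shipped" group, while order 2 never contributes because it fails the filter. The aggregate therefore stays consistent with the source table without re-reading it.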

For more information about the Flink implementation of Debezium, see the official Apache Flink documentation.

Of the available Debezium connectors, Cloudera Streaming Analytics supports MySQL, PostgreSQL, Oracle, and Db2.

Using the CDC connectors

You can access and import the templates of the CDC connectors from the Streaming SQL Console:

  1. Navigate to the Streaming SQL Console.
    1. Navigate to Management Console > Environments, and select the environment where you have created your cluster.
    2. Select the Streaming Analytics cluster from the list of Data Hub clusters.
    3. Select Streaming SQL Console from the list of services.

      The Streaming SQL Console opens in a new window.

  2. Select Console from the main menu.
  3. Click Templates under the SQL window.
  4. Select the CDC template you want to use.

    The template is imported to the SQL window.

  5. Fill in the mandatory fields of the template.
  6. Click Execute.