CDC connectors

You can use the Debezium Change Data Capture (CDC) connector to stream changes in real-time from MySQL, PostgreSQL, Oracle, Db2, SQL Server and feed data to Kafka, JDBC, the Webhook sink or Materialized Views using SQL Stream Builder (SSB).

Concept of Change Data Capture

Change Data Capture (CDC) is a process to capture changes in a source system, and update the data within a downstream system or application with the changes.

The Debezium implementation offers CDC with database connectors from which real-time events are updated using Kafka and Kafka Connect. Debezium captures every row-level change in each database table of an event stream. Applications read these streams to see the change events in the same order as they occurred. The change events are routed to a Kafka topic from which Kafka Connect feeds the records to other systems and databases.

For more information about Debezium, see the official Debezium documentation.

CDC in Cloudera Streaming Analytics (CSA) does not require Kafka or Kafka Connect as Debezium is implemented as a library within the Flink runtime. This means that the captured changes are propagated downstream to any connector that Flink supports. CSA allows queries to be issued at change data capture time, which means filtering, grouping, joining, and so on, can be performed on the change stream as it comes from the source database.

For more information about the Flink implementation of Debezium, see the official Apache Flink documentation.

From the supported set of Debezium connectors, MySQL, PostgreSQL, Oracle, Db2, and SQL Server are supported in Cloudera Streaming Analytics.

Using the CDC connectors

You can access and import the templates of the CDC connectors from Streaming SQL Console:

  1. Navigate to the Streaming SQL Console.
    1. Go to your cluster in Cloudera Manager.
    2. Click on SQL Stream Builder from the list of Services.
    3. Click on the SQLStreamBuilder Console.

      The Streaming SQL Console opens in a new window.

  2. Click Create Job or select a previous job on the Getting Started page.
  3. Click Templates at the SQL Editor.
  4. Select one of the CDC templates you want to use.

    The template is imported to the SQL window.

  5. Provide information to the mandatory fields of the template.
  6. Click Execute.