Using PostgreSQL CDC connector

The Community Edition comes pre-configured with a PostgreSQL database that can be used to try Change Data Capture (CDC). After setting up a table in the PostgreSQL database, you can use the connector for your SQL queries.

Learning goal

  • How to add a PostgreSQL CDC
  • How to use a CDC connector

Learning path

A database named ssb_cdc has been created with the necessary configuration. An ssb_cdc role (password: cloudera) has also been added that can be used to access the changelog streams.

Before using the PostgreSQL connector in, you need to up a CDC table in the PostgreSQL database.

As an example, create the following table in PostgreSQL:
CREATE TABLE cdc_table (a varchar, b int);
Set the permissions to the created table:
ALTER TABLE cdc_table REPLICA IDENTITY FULL;

After setting up the database, you can use it in Streaming SQL Console.

To subscribe to the changelog stream in SSB, create a table and check the data in the database using the following example:
CREATE TABLE `ssb`.`ssb_default`.`cdc_postgres_table` (
  `a` VARCHAR(2147483647),
  `b` INT,
  `ts` AS PROCTIME()
) WITH (
  'connector' = 'postgres-cdc',
  'hostname' = 'postgresql',
  'database-name' = 'ssb_cdc',
  'schema-name' = 'public',
  'table-name' = 'cdc_table',
  'username' = 'ssb_cdc',
  'password' = 'cloudera',
  'decoding.plugin.name' = 'pgoutput',
  'debezium.slot.name' = 'flink'
);

SELECT * from cdc_postgres_table;

The data stored in the PostgreSQL database is shown under the Results tab.

Next step

You can use the created PostgreSQL CDC table in your queries.