Getting started with Kafka Connect
Get started with Kafka Connect in CSP Community Edition.
Kafka Connect is a tool for streaming data between Apache Kafka and other systems in a reliable and scalable fashion. Kafka Connect makes it simple to quickly define connectors that move large collections of data into and out of Kafka. Source connectors can ingest entire databases or collect metrics from all your application servers into Kafka topics, making the data available for stream processing with low latency. Sink connectors can deliver data from Kafka topics into secondary storage and query systems or into batch systems for offline analysis.
CSP Community Edition ships with many Cloudera-developed and publicly available connectors, each covering a specific use case for streaming data. All of them can be deployed, managed, and monitored using Streams Messaging Manager (SMM).
The following tutorial walks you through a simple use case in which data is moved from a Kafka topic into a PostgreSQL database using the JDBC Sink connector, one of the Cloudera-developed connectors shipped with CSP Community Edition.
Before you begin
Look up the names of the Kafka and PostgreSQL containers using docker ps. For example:
docker ps -a --format '{{.ID}}\t{{.Names}}' --filter "name=kafka.(\d)" --filter "name=postgres"
The Kafka container is called either cspce-kafka-1 or cspce_kafka_1.
The PostgreSQL container is called either cspce-postgresql-1 or cspce_postgresql_1.
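Because the container names come in two variants, it can help to resolve them once and reuse the result. The following is a minimal sketch, not part of the product: the pick_container helper is a hypothetical name, and it assumes the names follow exactly the two patterns listed above.

```shell
#!/bin/sh
# Hypothetical helper: read container names (one per line) on stdin and
# print the first one that matches cspce-<service>-1 or cspce_<service>_1.
pick_container() {
  service="$1"
  grep -E "^cspce[-_]${service}[-_]1\$" | head -n 1
}

# Usage against a running Docker daemon (not executed here):
#   KAFKA_CONTAINER=$(docker ps --format '{{.Names}}' | pick_container kafka)
#   POSTGRES_CONTAINER=$(docker ps --format '{{.Names}}' | pick_container postgresql)
```

If neither variant is present, the helper prints nothing, so an empty variable indicates that the container is not running.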
Creating a database and table in PostgreSQL
To stream data using the JDBC Sink connector, the data needs a destination. In this tutorial, the database and table are created in the PostgreSQL instance that is already deployed in CSP Community Edition.
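As a rough sketch of what this step amounts to, the helper below prints the SQL for a database and a two-column table. The names demo_db and demo_table, the column layout, and the build_setup_sql function are all illustrative assumptions, not values taken from the tutorial.

```shell
#!/bin/sh
# Hypothetical helper: print SQL that creates a database, switches to it
# with psql's \c meta-command, and creates a simple two-column table.
# The column layout (id, message) is an assumption for illustration.
build_setup_sql() {
  db="$1"
  table="$2"
  printf 'CREATE DATABASE %s;\n' "$db"
  printf '\\c %s\n' "$db"
  printf 'CREATE TABLE %s (id INTEGER PRIMARY KEY, message TEXT);\n' "$table"
}

# Usage (not executed here; the postgres user name is an assumption):
#   build_setup_sql demo_db demo_table | docker exec -i cspce-postgresql-1 psql -U postgres
```

Generating the SQL separately from running it lets you inspect the statements before piping them into psql inside the PostgreSQL container.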
Creating a topic and producing messages
Before you can deploy the JDBC Sink connector, you need a Kafka topic that contains some messages. The connector connects to this topic and streams its data into the PostgreSQL database. You create the topic using SMM and produce the messages using the Kafka console producer.
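For a quick way to get test data, the sketch below generates a few JSON messages that can be piped into the console producer. The gen_messages helper, the message fields, the topic name demo_topic, and the localhost:9092 bootstrap address are assumptions; the message format the connector actually expects depends on how it is configured.

```shell
#!/bin/sh
# Hypothetical helper: print COUNT JSON messages, one per line.
# The fields (id, message) are illustrative and not prescribed by the tutorial.
gen_messages() {
  count="$1"
  i=1
  while [ "$i" -le "$count" ]; do
    printf '{"id":%d,"message":"hello %d"}\n' "$i" "$i"
    i=$((i + 1))
  done
}

# Usage (not executed here; script location and address are assumptions):
#   gen_messages 5 | docker exec -i cspce-kafka-1 \
#     kafka-console-producer.sh --bootstrap-server localhost:9092 --topic demo_topic
```

The console producer treats each input line as one message, so every generated line becomes a separate record in the topic.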
Deploying the connector
After both the topic and the database are set up, you deploy the JDBC Sink connector using the SMM UI.
Monitoring and managing the connector
Once the connector is deployed, you can monitor its activity using the SMM UI.
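The tutorial monitors the connector through the SMM UI. As a command-line alternative, Kafka Connect also has a REST API with a GET /connectors/<name>/status endpoint. The sketch below pulls the connector state out of that response. The connector_state helper is a hypothetical name, and it assumes the connector's own state appears before the task states in the JSON.

```shell
#!/bin/sh
# Hypothetical helper: read a Kafka Connect status JSON document on stdin
# and print the first "state" value, assumed to be the connector's state.
connector_state() {
  grep -o '"state" *: *"[A-Z_]*"' | head -n 1 | sed 's/.*"\([A-Z_]*\)"$/\1/'
}

# Usage (not executed here; host, port 8083, and connector name are assumptions):
#   curl -s http://localhost:8083/connectors/demo-connector/status | connector_state
```

This text-based extraction is deliberately simple; a JSON-aware tool is a better choice if you also need per-task states.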