Connectors

Learn what Kafka Connect connectors are shipped with Cloudera Runtime.

Cloudera Runtime comes prepackaged with a number of Cloudera-developed Kafka Connect connectors. In addition, the connectors packaged with the version of Apache Kafka included in Cloudera Runtime are also available for use. You can also manually install and use your own custom connectors. The following sections list the connectors shipped in Cloudera Runtime.

Debezium connectors

Debezium connectors capture changes from a wide variety of databases and produce the captured changes into Kafka. Using Debezium connectors makes it possible for your applications to consume and respond to change events regardless of where the changes originated from.

Table 1. Debezium connectors
Connector Description
Debezium MySQL Source The Debezium MySQL Source connector reads the binary log (binlog) of a MySQL server, produces change events for row-level INSERT, UPDATE, and DELETE operations, and transfers the changes to Kafka topics.
Debezium Oracle Source The Debezium Oracle Source connector captures and records row-level changes that occur in databases on an Oracle server, including tables that are added while the connector is running, and transfers the changes to Kafka topics.
Debezium PostgreSQL Source The Debezium PostgreSQL Source connector captures row-level INSERT, UPDATE, and DELETE operations, produces change events for each change, and transfers the changes to Kafka topics.
Debezium SQL Server Source The Debezium SQL Server Source connector captures row-level INSERT, UPDATE, and DELETE operations that occur in the schemas of a SQL Server database, produces change events for each change, and transfers the changes to Kafka topics.
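As an illustration of how a Debezium connector is typically configured when submitted to Kafka Connect, the following JSON sketches a Debezium MySQL Source instance. The property names shown here follow recent upstream Debezium releases and may differ in the Debezium version shipped with your Runtime release (for example, older 1.x releases use database.server.name instead of topic.prefix); hostnames, credentials, and table names are placeholders:

```json
{
  "name": "debezium-mysql-example",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "mysql.example.com",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "********",
    "database.server.id": "184054",
    "topic.prefix": "example-server",
    "table.include.list": "inventory.customers"
  }
}
```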

Stateless NiFi connectors

The Stateless NiFi Source and Sink connectors allow you to run NiFi dataflows within Kafka Connect. Using these connectors can grant you access to a number of NiFi features without needing to deploy or maintain NiFi on your cluster.

Stateless NiFi connectors fall into two categories. You have the base Stateless NiFi Source (StatelessNiFiSource) and Stateless NiFi Sink (StatelessNiFiSink) connectors. In addition, there are a number of ready-to-use connectors based on Stateless NiFi Source and Sink. These Stateless NiFi-based connectors run predefined dataflows developed by Cloudera and cover common data movement use cases.

Table 2. Base connectors
Connector Description
Stateless NiFi Source The Stateless NiFi Source connector runs a user-provided NiFi dataflow within Kafka Connect as a source and transfers the output of the dataflow to Kafka topics.
Stateless NiFi Sink The Stateless NiFi Sink connector runs a user-provided NiFi dataflow within Kafka Connect as a sink, passing data fetched from Kafka topics to the dataflow for processing and delivery.
Table 3. Predefined flows/Stateless NiFi-based connectors
Connector Description
HTTP Source The HTTP Source connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The HTTP Source connector listens on a port for HTTP POST requests and transfers the request body to a Kafka topic.
JDBC Source The JDBC Source connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The JDBC Source connector reads records from a database table and transfers each record to Kafka in Avro or JSON format.
JMS Source The JMS Source connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The JMS Source Connector consumes messages from a JMS broker and transfers the message body to Kafka.
MQTT Source The MQTT Source connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The MQTT Source connector consumes messages from an MQTT broker and transfers them to Kafka.
SFTP Source The SFTP Source connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The SFTP Source connector obtains files from an SFTP server and transfers them to Kafka.
Syslog TCP Source The Syslog TCP Source connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The Syslog TCP Source connector listens on a port for syslog messages over TCP and transfers them to Kafka.
Syslog UDP Source The Syslog UDP Source connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The Syslog UDP Source connector listens on a port for syslog messages over UDP and transfers them to Kafka.
ADLS Sink The ADLS Sink connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The ADLS Sink connector fetches messages from Kafka and uploads them to ADLS.
HTTP Sink The HTTP Sink connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The HTTP Sink connector obtains messages from a Kafka topic and transfers their content in HTTP POST requests to a specified endpoint.
JDBC Sink The JDBC Sink connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The JDBC Sink connector fetches messages from Kafka and loads them into a database table.
Kudu Sink The Kudu Sink connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The Kudu Sink connector fetches messages from Kafka and loads them into a table in Kudu.
S3 Sink The S3 Sink connector is a Stateless NiFi dataflow developed by Cloudera that is running in the Kafka Connect framework. The S3 Sink connector fetches messages from Kafka and uploads them to AWS S3.
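The base connectors described above are configured by pointing them at a dataflow definition. The following JSON is a rough sketch of what a Stateless NiFi Source deployment might look like, based on the upstream Apache NiFi Stateless Kafka Connect module; the connector class, property names (flow.snapshot, output.port), and all values are assumptions that may differ in your Runtime release and should be checked against its documentation:

```json
{
  "name": "stateless-nifi-source-example",
  "config": {
    "connector.class": "org.apache.nifi.kafka.connect.StatelessNiFiSourceConnector",
    "tasks.max": "1",
    "flow.snapshot": "/opt/dataflows/example-flow.json",
    "output.port": "Output",
    "topics": "example-topic"
  }
}
```

The flow.snapshot property references an exported NiFi dataflow definition, and output.port names the dataflow's output port whose data is transferred to the configured topic.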

Standard connectors

The following are either Cloudera-developed connectors or connectors that come packaged with Apache Kafka.

Table 4. Standard connectors
Connector Description
Amazon S3 Sink The Amazon S3 Sink connector is a Cloudera-developed connector that consumes data from Kafka topics and streams the data to an S3 bucket.
HDFS Sink The HDFS Sink connector is a Cloudera-developed connector that transfers data from Kafka topics to files on HDFS clusters.
MirrorSourceConnector The MirrorSourceConnector is a connector used internally by Streams Replication Manager (SRM). Within SRM, this connector is responsible for replicating topics between the source and target cluster. Standalone use of this connector is not recommended by Cloudera.
MirrorHeartbeatConnector The MirrorHeartbeatConnector is a connector used internally by Streams Replication Manager (SRM). Within SRM, this connector is responsible for creating the heartbeats topic in the target cluster. It also periodically produces heartbeats into the heartbeats topic. Standalone use of this connector is not recommended by Cloudera.
MirrorCheckpointConnector The MirrorCheckpointConnector is a connector used internally by Streams Replication Manager (SRM). Within SRM, this connector is responsible for replicating the committed group offsets between the source and target cluster. In addition, the connector is also capable of periodically applying the offsets to the consumer groups in the target cluster. Standalone use of this connector is not recommended by Cloudera.

Example connectors (non-production)

The FileStream Source and Sink connectors are example connectors packaged with Apache Kafka. Both FileStream connectors are meant to be used to demonstrate the capabilities of Kafka Connect and are not production ready.

Table 5. Non-production example connectors
Connector Description
FileStream Sink The FileStream Sink connector reads data from Kafka and transfers that data to a local file.
FileStream Source The FileStream Source connector reads data from a file and transfers the data to Kafka.
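To illustrate the minimal configuration these demo connectors need, a FileStream Source instance can be submitted to Kafka Connect with a JSON body like the following; the file path and topic name are placeholders:

```json
{
  "name": "filestream-source-demo",
  "config": {
    "connector.class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "tasks.max": "1",
    "file": "/tmp/connect-demo.txt",
    "topic": "connect-demo"
  }
}
```

Each line appended to the configured file is produced as a record to the configured topic, which makes this connector convenient for verifying that a Kafka Connect deployment is working before installing production connectors.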