ReadyFlow: Kafka filter to Kafka
You can use the Kafka filter to Kafka ReadyFlow in Cloudera DataFlow to move data between two Kafka topics while applying a schema to the data.
Kafka filter to Kafka ReadyFlow description
This ReadyFlow consumes JSON, CSV, or Avro data from a source Kafka topic and parses it using the schema retrieved by name from the Cloudera Schema Registry. You can filter events by specifying a SQL query in the Filter Rule parameter. The filtered events are then converted to the specified output data format and written to the destination Kafka topic. Failed Kafka write operations are retried automatically to handle transient issues. Define a KPI on the failure_WriteToKafka connection to monitor failed write operations.
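As a minimal sketch of what a Filter Rule can look like, assuming the rule is a Calcite-style SQL statement evaluated against the incoming record set (the convention used by Apache NiFi's QueryRecord processor, where the table is named FLOWFILE), you might keep only one type of event. The event_type field here is a hypothetical example and would come from your registered schema.

```sql
-- Hypothetical Filter Rule: keep only purchase events.
-- FLOWFILE refers to the incoming record set (QueryRecord convention);
-- event_type is an assumed field from the schema registered in the
-- Cloudera Schema Registry.
SELECT *
FROM FLOWFILE
WHERE event_type = 'purchase'
```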
Kafka filter to Kafka ReadyFlow details
| ReadyFlow details | |
| --- | --- |
| Source | Kafka Topic |
| Source Format | JSON, CSV, or Avro |
| Destination | Kafka Topic |
| Destination Format | JSON, CSV, or Avro |
Moving data with a Kafka filter to Kafka flow
You can use a Kafka filter to Kafka data flow when you want to filter specific events from a Kafka topic and write the filtered stream to another Kafka topic. For example, you could use this flow to filter out erroneous records that contain "null" for an important field, as sketched below. To filter the data with a SQL query, you must provide a schema for the events that you are processing in the topic.
Your data flow can consume JSON, CSV, or Avro data from the source Kafka topic and write to the destination Kafka topic in any of these formats. The data flow parses the events using the schema retrieved by name from the Cloudera Schema Registry. You can filter the events by specifying a SQL query. The filtered events are converted to the specified output data format and written to the destination Kafka topic.
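The following is a minimal sketch of a Filter Rule for the null-filtering example above, under the same assumptions as before (a Calcite-style SQL statement run against a FLOWFILE table, as in NiFi's QueryRecord processor); customer_id is a hypothetical field name standing in for whichever important field your schema defines.

```sql
-- Hypothetical Filter Rule: drop erroneous records where an
-- important field (here the assumed customer_id) is null.
SELECT *
FROM FLOWFILE
WHERE customer_id IS NOT NULL
```

Records that do not match the query are filtered out before format conversion, so only the filtered stream is written to the destination topic.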