Using ReadyFlows

Using a ReadyFlow to build your data flow allows you to get started with CDF quickly and easily. A ReadyFlow is a flow definition template optimized to work with a specific CDP source and destination. So instead of spending your time on building the data flow in NiFi, you can focus on deploying your flow and defining the right KPIs for easy monitoring.

The ReadyFlow Gallery is where you can find out-of-the-box flow definitions. To use a ReadyFlow, add it to the Catalog and then use it to create a Flow Deployment.

Kafka to S3 Avro

Kafka to S3 Avro ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source Kafka topic and merges the events into Avro files before writing the data to S3. The flow writes out a file whenever the merged content reaches 100 MB in size or five minutes have passed, whichever comes first.
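
The NiFi flow itself is deployed from the ReadyFlow, but the merge-and-flush behavior can be sketched in a few lines of Python. The broker, topic, bucket, and record schema below are placeholders, not values from the ReadyFlow:

```python
import io
import time

import boto3
from confluent_kafka import Consumer
from fastavro import parse_schema, writer

# Hypothetical record schema; the real ReadyFlow resolves the schema
# from the CDP Schema Registry instead of hard-coding it.
SCHEMA = parse_schema({
    "type": "record",
    "name": "Event",
    "fields": [{"name": "payload", "type": "string"}],
})

MAX_BYTES = 100 * 1024 * 1024  # 100 MB size trigger
MAX_AGE_S = 5 * 60             # five-minute age trigger

consumer = Consumer({
    "bootstrap.servers": "broker:9092",  # placeholder broker
    "group.id": "kafka-to-s3-avro-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["source-topic"])     # placeholder topic
s3 = boto3.client("s3")

batch, approx_bytes, started = [], 0, time.time()
while True:
    msg = consumer.poll(timeout=1.0)
    if msg is not None and msg.error() is None:
        batch.append({"payload": msg.value().decode("utf-8")})
        approx_bytes += len(msg.value())
    # Flush whenever ~100 MB has accumulated or five minutes have passed.
    if batch and (approx_bytes >= MAX_BYTES or time.time() - started >= MAX_AGE_S):
        buf = io.BytesIO()
        writer(buf, SCHEMA, batch)  # merge the batch into a single Avro file
        s3.put_object(
            Bucket="target-bucket",  # placeholder bucket
            Key=f"events-{int(time.time())}.avro",
            Body=buf.getvalue(),
        )
        batch, approx_bytes, started = [], 0, time.time()
```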

Ingesting Data using the Kafka to S3 Avro ReadyFlow

You can use the Kafka to S3 Avro ReadyFlow to move your data into an AWS bucket.

Kafka Filter to Kafka

Kafka Filter to Kafka ReadyFlow Summary

This ReadyFlow consumes JSON, CSV, or Avro data from a source Kafka topic and parses the schema by looking up the schema name in the CDP Schema Registry. You can filter events by specifying a SQL query in the Filter Rule parameter.
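
The Filter Rule parameter accepts a SQL query that is evaluated against each event. As a rough illustration of that idea, here is a Python sketch that applies a hypothetical rule to a batch of events using an in-memory SQLite table; NiFi evaluates the rule itself, so nothing here comes from the ReadyFlow:

```python
import sqlite3

# Hypothetical filter rule, in the spirit of the Filter Rule parameter.
FILTER_RULE = "SELECT * FROM events WHERE temperature > 25"

def filter_events(events: list[dict]) -> list[dict]:
    """Apply a SQL filter rule to a batch of flat JSON-style events."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    cols = sorted({key for event in events for key in event})
    conn.execute(f"CREATE TABLE events ({', '.join(cols)})")
    conn.executemany(
        f"INSERT INTO events ({', '.join(cols)}) "
        f"VALUES ({', '.join('?' for _ in cols)})",
        [tuple(event.get(col) for col in cols) for event in events],
    )
    return [{k: row[k] for k in row.keys()} for row in conn.execute(FILTER_RULE)]

# Only the first event satisfies the rule.
print(filter_events([
    {"sensor": "a", "temperature": 30},
    {"sensor": "b", "temperature": 20},
]))
```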

Ingesting Data using the Kafka Filter to Kafka ReadyFlow

You can use the Kafka Filter to Kafka ReadyFlow to move your data from a Kafka topic to another Kafka topic while applying a schema to the data in Cloudera DataFlow (CDF).

Kafka to Kafka

Kafka to Kafka ReadyFlow Summary

This ReadyFlow consumes JSON, CSV, or Avro data from a source Kafka topic and parses the schema by looking up the schema name in the CDP Schema Registry.
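
The schema lookup can be pictured as a REST call against the Schema Registry. The sketch below assumes a registry host, port, API path, and schemaText response field; verify all of these against your own Schema Registry instance, since the ReadyFlow performs this lookup inside NiFi:

```python
import json

import requests
from fastavro import parse_schema

# Assumed endpoint layout for a Cloudera Schema Registry; the host, port,
# and path are placeholders to check against your deployment.
REGISTRY_URL = "http://schema-registry-host:7788/api/v1/schemaregistry"

def fetch_latest_schema(schema_name: str):
    """Look up a schema by name and return it parsed for fastavro."""
    resp = requests.get(
        f"{REGISTRY_URL}/schemas/{schema_name}/versions/latest", timeout=10
    )
    resp.raise_for_status()
    return parse_schema(json.loads(resp.json()["schemaText"]))

schema = fetch_latest_schema("my-kafka-topic-schema")  # placeholder name
```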

Ingesting Data using the Kafka to Kafka ReadyFlow

You can use the Kafka to Kafka ReadyFlow to move your data from a Kafka topic to another Kafka topic.

Kafka to ADLS Avro

Kafka to ADLS Avro ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source Kafka topic and merges the events into Avro files before writing the data to ADLS. The flow writes out a file whenever the merged content reaches 100 MB in size or five minutes have passed, whichever comes first.
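
The batching logic matches the S3 variant above; only the sink changes. Here is a minimal sketch of the ADLS write step, with the storage account, credential, container, and schema as placeholders:

```python
import io

from azure.storage.filedatalake import DataLakeServiceClient
from fastavro import parse_schema, writer

# Placeholder account, key, and container; the real flow authenticates
# through CDP rather than a raw account key.
service = DataLakeServiceClient(
    account_url="https://myaccount.dfs.core.windows.net",
    credential="<storage-account-key>",
)
filesystem = service.get_file_system_client("target-container")

SCHEMA = parse_schema({
    "type": "record",
    "name": "Event",
    "fields": [{"name": "payload", "type": "string"}],
})

def write_avro_to_adls(batch: list[dict], path: str) -> None:
    """Merge a batch of records into one Avro file and upload it to ADLS."""
    buf = io.BytesIO()
    writer(buf, SCHEMA, batch)
    filesystem.get_file_client(path).upload_data(buf.getvalue(), overwrite=True)

write_avro_to_adls([{"payload": "hello"}], "events/events-0001.avro")
```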

Ingesting Data using the Kafka to ADLS Avro ReadyFlow

You can use the Kafka to ADLS Avro ReadyFlow to move your data into an ADLS container.

Kafka to Cloudera Operational Database

Kafka to Cloudera Operational Database ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source Kafka topic, parses the schema by looking up the schema name in the CDP Schema Registry and ingests it into an HBase table in COD.
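
Conceptually, each parsed event becomes a row in the destination HBase table. The sketch below uses the happybase client (which requires an HBase Thrift server) with a placeholder host, table, and column family; it is an illustration, not the ReadyFlow's own mechanism:

```python
import json

import happybase

connection = happybase.Connection(host="cod-edge-node")  # placeholder host
table = connection.table("events")                       # placeholder table

def ingest_event(event: dict) -> None:
    """Write one parsed event into HBase, one cell per field."""
    row_key = event["id"].encode("utf-8")
    table.put(row_key, {
        f"cf:{field}".encode("utf-8"): json.dumps(value).encode("utf-8")
        for field, value in event.items()
    })

ingest_event({"id": "42", "sensor": "a", "temperature": 30})
```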

Ingesting Data using the Kafka to Cloudera Operational Database ReadyFlow

You can use the Kafka to Cloudera Operational Database ReadyFlow to move data into Cloudera Operational Database through Cloudera DataFlow.

Kafka to Kudu

Kafka to Kudu ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source Kafka topic, parses the schema by looking up the schema name in the CDP Schema Registry and ingests it into a Kudu table.
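
The final step amounts to inserting each parsed event into Kudu. Here is a minimal sketch with the Kudu Python client, assuming a placeholder master address and a table whose columns match the event fields:

```python
import kudu

# Placeholder master address and table name.
client = kudu.connect(host="kudu-master", port=7051)
table = client.table("default.events")
session = client.new_session()

def ingest_event(event: dict) -> None:
    """Insert one event; the dict keys must match the Kudu table's columns."""
    session.apply(table.new_insert(event))

ingest_event({"id": 42, "sensor": "a", "temperature": 30})
session.flush()  # push buffered operations to the tablet servers
```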

Ingesting Data using the Kafka to Kudu ReadyFlow

You can use the Kafka to Kudu ReadyFlow to move your data from a Kafka topic into Apache Kudu in a CDP Public Cloud Real-time Data Mart cluster.

S3 to S3 Avro

S3 to S3 Avro ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source S3 bucket and transforms the data into Avro files before writing it to another S3 bucket.
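
The conversion step can be pictured as reading a source object, parsing it, and rewriting it as Avro. The buckets, key, and two-column schema below are placeholders; the real flow resolves the schema rather than hard-coding it:

```python
import csv
import io

import boto3
from fastavro import parse_schema, writer

# Hypothetical schema for a two-column CSV source.
SCHEMA = parse_schema({
    "type": "record",
    "name": "Row",
    "fields": [{"name": "sensor", "type": "string"},
               {"name": "temperature", "type": "string"}],
})
s3 = boto3.client("s3")

def csv_object_to_avro(src_bucket: str, key: str, dst_bucket: str) -> None:
    """Read one CSV object from the source bucket and rewrite it as Avro."""
    body = s3.get_object(Bucket=src_bucket, Key=key)["Body"].read()
    rows = list(csv.DictReader(io.StringIO(body.decode("utf-8"))))
    buf = io.BytesIO()
    writer(buf, SCHEMA, rows)
    s3.put_object(Bucket=dst_bucket, Key=key.rsplit(".", 1)[0] + ".avro",
                  Body=buf.getvalue())

csv_object_to_avro("source-bucket", "data/readings.csv", "target-bucket")
```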

Ingesting Data using the S3 to S3 Avro ReadyFlow

You can use the S3 to S3 Avro ReadyFlow to move your data between two AWS buckets, while transforming it to Avro format.

S3 to S3 Avro with S3 Notifications

S3 to S3 Avro with S3 Notifications ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source S3 bucket and transforms the data into Avro files before writing it to another S3 bucket. The ReadyFlow is configured with notifications about new files that arrive in the source AWS bucket.
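
One common way to receive S3 notifications outside of NiFi is via an SQS queue subscribed to the bucket's object-created events. The sketch below assumes such a queue exists and uses a placeholder queue URL:

```python
import json

import boto3

sqs = boto3.client("sqs")
# Placeholder queue that the source bucket's event notifications feed into.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/s3-events"

def poll_new_files() -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for newly arrived source files."""
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=10,
                               WaitTimeSeconds=20)
    arrivals = []
    for msg in resp.get("Messages", []):
        # Each SQS message body carries a standard S3 event payload.
        for record in json.loads(msg["Body"]).get("Records", []):
            arrivals.append((record["s3"]["bucket"]["name"],
                             record["s3"]["object"]["key"]))
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
    return arrivals
```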

Ingesting Data using the S3 to S3 Avro with S3 Notifications ReadyFlow

You can use the S3 to S3 Avro with S3 Notifications ReadyFlow to move your data between two AWS buckets, while transforming it to Avro format. The ReadyFlow is configured with notifications about new files that arrive in the source AWS bucket.

Non-CDP S3 to CDP S3

Non-CDP S3 to CDP S3 ReadyFlow Summary

This ReadyFlow moves data from a non-CDP-managed source S3 location to a CDP-managed destination S3 location.
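
Because the source bucket lives outside CDP, the copy involves two sets of credentials. Here is a minimal boto3 sketch with placeholder profiles and bucket names:

```python
import boto3

# One client per credential set: the external (non-CDP) source account
# and the CDP-managed destination account. Profiles are placeholders.
source_s3 = boto3.Session(profile_name="external-account").client("s3")
dest_s3 = boto3.Session(profile_name="cdp-account").client("s3")

def move_object(src_bucket: str, key: str, dst_bucket: str) -> None:
    """Read one object from the external bucket and write it to the CDP bucket."""
    body = source_s3.get_object(Bucket=src_bucket, Key=key)["Body"].read()
    dest_s3.put_object(Bucket=dst_bucket, Key=key, Body=body)

move_object("external-source-bucket", "data/readings.csv", "cdp-target-bucket")
```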

Ingesting Data using the Non-CDP S3 to CDP S3 ReadyFlow

You can use Cloudera DataFlow and the Non-CDP S3 to CDP S3 ReadyFlow to move your data from an external source S3 location to a CDP-managed destination S3 location.

Non-CDP ADLS to CDP ADLS

Non-CDP ADLS to CDP ADLS ReadyFlow Summary

This ReadyFlow moves data from a non-CDP-managed source ADLS location to a CDP-managed destination ADLS location.
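
As with the S3 variant, the move uses separate credentials for each side. A sketch with placeholder account URLs, keys, and container names:

```python
from azure.storage.filedatalake import DataLakeServiceClient

# Separate clients for the external source account and the CDP-managed
# destination account; all names and keys are placeholders.
source = DataLakeServiceClient(
    account_url="https://externalaccount.dfs.core.windows.net",
    credential="<external-account-key>",
).get_file_system_client("source-container")
dest = DataLakeServiceClient(
    account_url="https://cdpaccount.dfs.core.windows.net",
    credential="<cdp-account-key>",
).get_file_system_client("target-container")

def move_file(path: str) -> None:
    """Download one file from the external container and upload it to CDP."""
    data = source.get_file_client(path).download_file().readall()
    dest.get_file_client(path).upload_data(data, overwrite=True)

move_file("data/readings.csv")
```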

Ingesting Data using the Non-CDP ADLS to CDP ADLS ReadyFlow

You can use Cloudera DataFlow and the Non-CDP ADLS to CDP ADLS ReadyFlow to move your data from an external source ADLS location to a CDP-managed destination ADLS location.

ADLS to ADLS Avro

ADLS to ADLS Avro ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source ADLS container and transforms the data into Avro files before writing it to another ADLS container.

Ingesting Data using the ADLS to ADLS Avro ReadyFlow

You can use the ADLS to ADLS Avro ReadyFlow to move your data between two ADLS containers, while transforming it to Avro format.

Confluent Cloud to S3/ADLS

Confluent Cloud to S3/ADLS ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source Kafka topic in Confluent Cloud, parses the schema by looking up the schema name in the Confluent Schema Registry, and filters events based on a user-provided SQL query. The filtered events are then written to the destination S3 or ADLS location.
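
Connecting to Confluent Cloud requires SASL_SSL credentials, and the schema lookup goes against the Confluent Schema Registry. The bootstrap servers, API keys, and subject name below are placeholders:

```python
from confluent_kafka import Consumer
from confluent_kafka.schema_registry import SchemaRegistryClient

consumer = Consumer({
    "bootstrap.servers": "pkc-xxxxx.us-east-1.aws.confluent.cloud:9092",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "PLAIN",
    "sasl.username": "<cluster-api-key>",
    "sasl.password": "<cluster-api-secret>",
    "group.id": "confluent-to-s3-demo",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["source-topic"])  # placeholder topic

registry = SchemaRegistryClient({
    "url": "https://psrc-xxxxx.us-east-1.aws.confluent.cloud",
    "basic.auth.user.info": "<sr-api-key>:<sr-api-secret>",
})
# Look up the latest schema registered under the topic's value subject.
schema = registry.get_latest_version("source-topic-value").schema.schema_str
print(schema)
```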

Ingesting Data using the Confluent Cloud to S3/ADLS ReadyFlow

You can use the Confluent Cloud to S3/ADLS ReadyFlow to move your data between Confluent Cloud and Amazon S3 or Azure ADLS, while filtering it using a SQL query.

Confluent Cloud to Snowflake

Confluent Cloud to Snowflake ReadyFlow Summary

This ReadyFlow consumes JSON, CSV or Avro data from a source Kafka topic in Confluent Cloud and filters records based on a user-provided SQL query before writing it to a Snowflake table.
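
The final step writes filtered records into a Snowflake table. Here is a minimal sketch with the Snowflake Python connector, assuming placeholder account details and a hypothetical events table:

```python
import snowflake.connector

# Placeholder account and credentials.
conn = snowflake.connector.connect(
    account="<account-identifier>",
    user="<user>",
    password="<password>",
    warehouse="INGEST_WH",
    database="EVENTS_DB",
    schema="PUBLIC",
)

rows = [("a", 30), ("b", 27)]  # filtered (sensor, temperature) records
conn.cursor().executemany(
    "INSERT INTO events (sensor, temperature) VALUES (%s, %s)", rows
)
conn.close()
```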

Ingesting Data using the Confluent Cloud to Snowflake ReadyFlow

You can use the Confluent Cloud to Snowflake ReadyFlow to move your data between Confluent Cloud and Snowflake, while transforming it to Snowflake table format.

JDBC to S3/ADLS

JDBC to S3/ADLS ReadyFlow Summary

This ReadyFlow consumes data from a source database table and filters events based on a user-provided SQL query before writing it to a destination Amazon S3 or Azure Data Lake Storage (ADLS) location in the specified output data format.
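
From Python, the closest analogue to the source side is a JDBC query via jaydebeapi. The driver class, URL, credentials, JAR path, and filter query below are all placeholders:

```python
import jaydebeapi

# Any JDBC-compatible database works the same way; PostgreSQL shown here.
conn = jaydebeapi.connect(
    "org.postgresql.Driver",
    "jdbc:postgresql://db-host:5432/sales",
    ["<user>", "<password>"],
    "/path/to/postgresql.jar",
)
cursor = conn.cursor()
# A hypothetical filter query in the spirit of the ReadyFlow's SQL parameter.
cursor.execute("SELECT id, amount FROM orders WHERE amount > 100")
rows = cursor.fetchall()
conn.close()
print(rows)
```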

Ingesting Data using the JDBC to S3/ADLS ReadyFlow

You can use the JDBC to S3/ADLS ReadyFlow to move data from a source database table to a destination Amazon S3 or Azure Data Lake Storage (ADLS) location while filtering the events and converting to the specified output data format.

Azure Event Hub to ADLS

Azure Event Hub to ADLS ReadyFlow Summary

You can use the Azure Event Hub to ADLS ReadyFlow to ingest JSON, CSV or Avro events from an Azure Event Hub namespace, optionally parsing the schema using the CDP Schema Registry or direct schema input. The flow then filters records based on a user-provided SQL query and writes them to a target Azure Data Lake Storage (ADLS) location in the specified output data format.
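
Reading from an Event Hub can be sketched with the azure-eventhub client; the connection string, consumer group, and hub name are placeholders, and a real flow would parse, filter, and land each event rather than print it:

```python
from azure.eventhub import EventHubConsumerClient

client = EventHubConsumerClient.from_connection_string(
    conn_str="<event-hub-namespace-connection-string>",
    consumer_group="$Default",
    eventhub_name="source-hub",
)

def on_event(partition_context, event):
    """Handle one received event."""
    print(partition_context.partition_id, event.body_as_str())

with client:
    client.receive(on_event=on_event, starting_position="-1")  # from the start
```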

Ingesting Data using the Azure Event Hub to ADLS ReadyFlow

You can use the Azure Event Hub to ADLS ReadyFlow to move data from a source Azure Event Hub namespace to a destination Azure Data Lake Storage (ADLS) location while filtering the events and converting to the specified output data format.

ListenHTTP to CDP Kafka

ListenHTTP to CDP Kafka ReadyFlow Summary

You can use the ListenHTTP to CDP Kafka ReadyFlow to listen to JSON, CSV or Avro events on a specified port and write them to CDP Kafka.

Ingesting Data using the ListenHTTP to CDP Kafka ReadyFlow

You can use the ListenHTTP to CDP Kafka ReadyFlow to listen to a JSON, CSV or Avro data stream on a specified port and parse the schema by looking up the schema name in the CDP Schema Registry. You can filter events by specifying a SQL query. The filtered events are then converted to the specified output data format and written to the destination CDP Kafka topic.
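
The listen-and-forward behavior can be approximated with a small HTTP server that produces each POSTed body to Kafka. The port, broker, and topic are placeholders, and this sketch skips the schema parsing, filtering, and format conversion the ReadyFlow performs:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "broker:9092"})  # placeholder broker

class IngestHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the request body and forward it to the destination topic.
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        producer.produce("target-topic", value=body)  # placeholder topic
        producer.flush()
        self.send_response(200)
        self.end_headers()

HTTPServer(("0.0.0.0", 7001), IngestHandler).serve_forever()  # listening port
```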