ReadyFlow overview: S3 to S3 Avro with S3 Notifications

You can use the S3 to S3 Avro with S3 Notifications ReadyFlow to move your data between Amazon S3 buckets and to configure notifications.

This ReadyFlow consumes JSON, CSV, or Avro files from a source S3 location, converts the files to Avro, and writes them to a destination S3 location. You can specify the source format, the source and target locations, and the schema to use for reading the source data. To notify the ReadyFlow about new files in the source bucket, you configure S3 notifications in AWS.

ReadyFlow details
Source: Amazon S3
Source Format: JSON, CSV, Avro
Destination: Amazon S3
Destination Format: Avro

Moving data between S3 locations

You can use this ReadyFlow when you want to move data from one Amazon S3 location to another and convert the data to Avro format at the same time. You need to specify the source format, the source and target locations, and the schema to use for handling the data. Your flow can consume JSON, CSV, or Avro files from the source S3 location. It converts the files to Avro format and writes them to the destination S3 location. You define and store the data processing schema in the Schema Registry on a Cloudera Streams Messaging Data Hub cluster. The data flow retrieves the schema by looking up the schema name in the Schema Registry.
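
To illustrate the kind of schema the flow looks up by name, the following Python sketch builds a minimal Avro record schema and serializes it to JSON, which is the form Avro schemas take. The record name customer_order, the namespace, and all field names are hypothetical and chosen only for illustration; they are not part of the ReadyFlow.

```python
import json


def build_example_schema():
    """Return a hypothetical Avro record schema as a Python dict."""
    return {
        "type": "record",
        "name": "customer_order",          # hypothetical schema name
        "namespace": "example.readyflow",  # hypothetical namespace
        "fields": [
            {"name": "order_id", "type": "long"},
            {"name": "customer", "type": "string"},
            # A union with "null" makes the field optional in Avro.
            {"name": "amount", "type": ["null", "double"], "default": None},
        ],
    }


def schema_as_json(schema):
    """Serialize the schema dict to the JSON text form of an Avro schema."""
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(schema_as_json(build_example_schema()))
```

The printed JSON is what you would register under the schema name; the name you register is the one you then supply to the flow.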

Setting up S3 notifications

S3 buckets can be configured to generate notifications when new data arrives. You can configure this ReadyFlow to receive S3 notifications from an SQS queue so that new files are automatically moved from the source to the destination location.
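
On the AWS side, the notification setup can be sketched roughly as follows. This is a minimal Python sketch, not the ReadyFlow's own mechanism: it assumes the boto3 SDK is available for the actual AWS call, and the function names, bucket name, queue ARN, and prefix are hypothetical. It also assumes the SQS queue policy already allows the S3 bucket to send messages. The first function builds the bucket notification configuration, the second applies it, and the third shows how the object locations can be read from a notification message body.

```python
import json
from urllib.parse import unquote_plus


def build_queue_notification(queue_arn, prefix=""):
    """Build an S3 notification configuration that sends
    object-created events to the given SQS queue."""
    queue_config = {
        "QueueArn": queue_arn,
        "Events": ["s3:ObjectCreated:*"],
    }
    if prefix:
        # Optionally restrict notifications to keys under a prefix.
        queue_config["Filter"] = {
            "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
        }
    return {"QueueConfigurations": [queue_config]}


def apply_notification(bucket, queue_arn, prefix=""):
    """Apply the configuration to a bucket. Requires boto3 and AWS
    credentials. Note that this call replaces the bucket's existing
    notification configuration."""
    import boto3  # imported lazily so the other helpers need no AWS SDK

    s3 = boto3.client("s3")
    s3.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=build_queue_notification(queue_arn, prefix),
    )


def extract_new_objects(message_body):
    """Return (bucket, key) pairs for object-created records in an
    S3 event notification message body."""
    event = json.loads(message_body)
    objects = []
    for record in event.get("Records", []):
        if record.get("eventName", "").startswith("ObjectCreated"):
            s3_info = record["s3"]
            # Object keys in S3 event messages are URL-encoded.
            key = unquote_plus(s3_info["object"]["key"])
            objects.append((s3_info["bucket"]["name"], key))
    return objects
```

For example, apply_notification("my-source-bucket", "arn:aws:sqs:us-east-1:123456789012:example-queue", "incoming/") would request notifications only for new objects under the incoming/ prefix; all three argument values here are placeholders.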