ReadyFlow: S3 to S3 Avro
You can use the S3 to S3 Avro ReadyFlow to move data between source and destination S3 locations while converting the files into Avro format.
This ReadyFlow consumes JSON, CSV or Avro files from a source S3 location, converts the files into Avro and writes them to the destination S3 location. You can specify the source format, the source and target location as well as the schema to use for reading the source data.
S3 to S3 Avro ReadyFlow details | |
---|---|
Source | Amazon S3 |
Source Format | JSON, CSV, Avro |
Destination | Amazon S3 |
Destination Format | Avro |
Moving data with an S3 to S3 Avro flow
You can use an S3 to S3 Avro data flow when you want to move data from a location in AWS S3 to another S3 location, and at the same time convert the data to Avro format. You need to specify the source format, the source and target location as well as the schema to use for handling the data. Your flow can consume JSON, CSV or Avro files from the source S3 location. It converts the files into Avro format and writes them to the destination S3 location. You define and store the data processing schema in the Schema Registry on a Cloudera Streams Messaging Data Hub cluster. The data flow parses the schema by looking up the schema name in the Schema Registry.