ReadyFlow: S3 to CDW

You can use the S3 to CDW Readyflow to consume CSV files from a source S3 location and write them as Parquet files to a destination S3 location and CDW Impala table.

This ReadyFlow consumes CSV files from a source S3 location, parses the schema by looking up the schema name in the CDP Schema Registry, converts the files into Parquet and writes them to a destination S3 location and CDW Impala table. You can specify the source S3 location, the target S3 location and the destination Impala table name. The ReadyFlow polls the source bucket for new files (it performs a listing periodically). Define KPIs on the failure_WriteS3 and failure_CreateCDWImpalaTable connections to monitor failed write operations.

S3 to CDW ReadyFlow details
Source CDP Managed Amazon S3
Source Format CSV
Destination CDP Managed Amazon S3 and CDW Impala
Destination Format Parquet