Known Issues for Apache Sqoop

This topic describes known issues and workarounds for using Parquet and Avro imports in this release of Cloudera Runtime.

Avro, S3, and HCat do not work together properly
Problem: Importing an Avro file into S3 with HCat fails with a "Delegation Token not available" error.
CDPD-3089
Parquet columns inadvertently renamed
Problem: Column names that start with a number are renamed when you use the --as-parquetfile option to import data.
Workaround: Prepend column names in Parquet tables with one or more letters or underscore characters.
Apache JIRA: None
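One way to apply this workaround at import time is to alias the affected columns in a free-form query, so that the names Sqoop writes to Parquet no longer start with a number. The following is a minimal sketch, not a documented procedure. The helper functions, the c_ prefix, the table and column names, and the JDBC URL are all hypothetical, and identifier quoting rules depend on your source database.

```shell
# Hypothetical helper: return a Parquet-safe name for a column.
# Names that start with a digit get a "c_" prefix; others are unchanged.
safe_alias() {
  case "$1" in
    [0-9]*) printf 'c_%s' "$1" ;;
    *)      printf '%s' "$1" ;;
  esac
}

# Hypothetical helper: build a SELECT list that aliases only the columns
# whose names start with a digit. Identifiers are left unquoted here;
# add the quoting your source database requires.
build_select_list() {
  list=""
  for col in "$@"; do
    alias_name=$(safe_alias "$col")
    if [ "$col" = "$alias_name" ]; then
      item="$col"
    else
      item="$col AS $alias_name"
    fi
    list="${list:+$list, }$item"
  done
  printf '%s' "$list"
}

# Print the SELECT list for two example columns.
build_select_list id 1st_quarter
echo

# Example import using the generated list (shown as a comment, not executed):
# sqoop import \
#   --connect "jdbc:mysql://db.example.com/sales" \
#   --query "SELECT $(build_select_list id 1st_quarter) FROM orders WHERE \$CONDITIONS" \
#   --split-by id \
#   --target-dir /tmp/orders_parquet \
#   --as-parquetfile
```

A free-form query import requires the $CONDITIONS token and --target-dir, plus either --split-by or a single mapper.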
Importing Parquet files might cause out-of-memory (OOM) errors
Problem: Importing rows that are multiple megabytes in size before the initial page-run check (ColumnWriter) can cause OOM errors. OOM errors can also occur when rows vary significantly in size: if the next-page-size check is based on small rows, it is set very high, and the many large rows that follow can exhaust memory before the check runs.
Apache JIRA: PARQUET-99
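The arithmetic behind both cases can be sketched with a simplified model. This is not Parquet's actual code: the 1 MiB page size, the 100-row initial check, and the row sizes are illustrative assumptions chosen only to show how much data can accumulate between size checks.

```shell
# Simplified model (NOT Parquet's actual implementation): the writer checks
# the buffered page size only at certain row counts, so memory can grow
# unchecked between checks. All constants are illustrative assumptions.
PAGE_SIZE_BYTES=1048576   # assumed page size threshold: 1 MiB
INITIAL_CHECK_ROWS=100    # assumed number of rows before the first check

# MiB buffered after writing a number of rows of a given size in MiB.
buffered_mib() {
  echo $(( $1 * $2 ))
}

# Rows until the next check, estimated from the average row size in bytes
# observed so far.
rows_until_next_check() {
  echo $(( PAGE_SIZE_BYTES / $1 ))
}

# Case 1: every row is 2 MiB, so data piles up before the first check.
echo "case 1: $(buffered_mib "$INITIAL_CHECK_ROWS" 2) MiB buffered before the first check"

# Case 2: the first rows average 10 bytes, so the next check is scheduled
# far in the future; 1 MiB rows then arrive before that check runs.
gap=$(rows_until_next_check 10)
echo "case 2: up to $(buffered_mib "$gap" 1) MiB buffered before the next check"
```

In this model, the first case buffers 200 MiB and the second case can buffer roughly 100 GiB before a check runs, which is why either pattern can exhaust the heap.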