Fixed Issues in Apache Sqoop
Review the list of Sqoop issues that are resolved in Cloudera Runtime 7.2.18.
- CDPD-44397: Implement ORC support in Sqoop-Connector-Teradata component
- Cloudera Connector Powered by Teradata version 1.8.5.1c7 is released and includes ORC support in the Sqoop-Connector-Teradata component. You can use Teradata Manager to import data from the Teradata server to Hive in ORC format.
- CDPD-47175: Sqoop Hive import with ORC file fails with ClassCastException
- The Sqoop import process for ORC files has been updated.
Whenever an unsupported conversion is attempted, Sqoop now provides a comprehensive error message describing the issue.
Sqoop can now import the following data types:
- Byte, Short, Int, Long, Float, Double from the same RDBMS types
- BigDecimal to Long, Double, String
- Date, Timestamp to String, Date, Timestamp
- CDPD-56523: Sqoop does not take the --hive-compute-stats option into account for hs2-url Hive imports
- Sqoop now considers the --hive-compute-stats option for Hive imports when the hs2-url parameter is used.
- CDPD-58538: Oozie should upload and use the config files from sqoop-conf/managers.d when available
- Previously, Oozie did not honor Sqoop's managers.d configurations or the extra connector JARs from the lib folder. Both are now automatically available in Oozie's Sqoop action, so you can use connectors such as the Sqoop Teradata connector without manually updating configurations or copying JARs to the workflow's lib folder.
- CDPD-59557: Secure options to provide the Hive password for Sqoop Hive imports
- This fix introduces secure options that you can use to provide the Hive password during Sqoop Hive imports, instead of passing the password as plaintext on the command line.
- CDPD-59710: Fix time stamp conversion issue when exporting Parquet
- When available, Sqoop will incorporate the writer's time zone metadata from the Parquet file during the export operation.
- CDPD-61547: Sqoop should not close 'System.out' and 'System.err'
- In certain cases, the Sqoop process closed the 'System.out' and 'System.err' streams, making it impossible to write to them when Sqoop was invoked manually in a custom JVM. This issue is now fixed.
- CDPD-63723: Sqoop should determine files as Parquet by PAR1 in header
- Sqoop now checks the first 4 bytes of a file, instead of the first 3 bytes, to determine whether the file is a Parquet file.
- CDPD-63915: Sqoop Teradata export fails if the source table is empty
- Fixed the issue where Sqoop Teradata export failed if the source table was empty.
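The header check described under CDPD-63723 can be illustrated with a short sketch. This is not Sqoop's implementation, only a minimal Python illustration of the idea, assuming that a file is treated as Parquet when its first 4 bytes equal the Parquet magic number PAR1. The names looks_like_parquet and write_sample are hypothetical and exist only for this example.

```python
import os
import tempfile

# Parquet files start (and end) with this 4-byte magic number.
PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(path):
    """Return True if the file at `path` begins with the 4-byte Parquet magic."""
    with open(path, "rb") as f:
        header = f.read(len(PARQUET_MAGIC))
    # A file shorter than 4 bytes yields a shorter header and never matches.
    return header == PARQUET_MAGIC


def write_sample(directory, name, content):
    """Hypothetical helper: write `content` to a new file and return its path."""
    path = os.path.join(directory, name)
    with open(path, "wb") as f:
        f.write(content)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        good = write_sample(tmp, "good.bin", b"PAR1" + b"\x00" * 8)
        near = write_sample(tmp, "near.bin", b"PARX" + b"\x00" * 8)
        print(looks_like_parquet(good))  # True
        print(looks_like_parquet(near))  # False
```

Comparing all 4 bytes matters because a check limited to the 3-byte prefix PAR would also accept the second sample file, which is not a Parquet file.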
Apache patch information
None