Fixed Issues in Apache Sqoop

Review the list of Sqoop issues that are resolved in Cloudera Runtime 7.1.9.

CDPD-44397: Implement ORC support in Sqoop-Connector-Teradata component
Cloudera Connector Powered by Teradata version 1.8.5.1c7 is released and includes ORC support in the Sqoop-Connector-Teradata component. You can use Teradata Manager to import data from the Teradata server to Hive in ORC format.
CDPD-44431: Disable the Sqoop direct mode feature with ability to enable it again temporarily
Sqoop's direct mode is no longer supported and is disabled by default. However, you can still enable it temporarily, either by setting the sqoop.enable.deprecated.direct property globally in Cloudera Manager for Sqoop or by passing -Dsqoop.enable.deprecated.direct=true on the command line.
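As a minimal sketch of the command-line route, the following script assembles a Sqoop import command that re-enables direct mode for a single job. The JDBC URL, user name, and table name are hypothetical placeholders, and the script only prints the command rather than running it. The -D property is placed directly after the tool name because Sqoop expects generic Hadoop arguments before tool-specific ones.

```shell
#!/bin/sh
# Hypothetical connection details; replace with your own values.
JDBC_URL="jdbc:mysql://db.example.com/sales"
DB_USER="sqoop_user"
TABLE="orders"

# Build an import command that re-enables the deprecated direct mode
# for this job only. Generic Hadoop arguments such as -D must follow
# the tool name (import) and precede the tool-specific arguments.
direct_import_cmd() {
  echo sqoop import \
    -Dsqoop.enable.deprecated.direct=true \
    --connect "$JDBC_URL" \
    --username "$DB_USER" \
    --table "$TABLE" \
    --direct
}

# Print the command for review; remove the echo above to execute it.
direct_import_cmd
```

Setting the property per job in this way leaves the global default (direct mode disabled) untouched for all other Sqoop jobs.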
CDPD-44531: Sqoop cannot export Parquet data due to ClassCastException
Sqoop can now export the following data types from Avro and Parquet files:
  • Int, Float, Double to the same RDBMS types
  • Long to BigDecimal, Date, Time, Timestamp
  • Bytes to BigDecimal
  • Fixed to Decimal and Timestamp

    Note that Fixed to Timestamp does not work if the source date is based on the Julian calendar.

CDPD-47175: Sqoop Hive import with ORC file fails with ClassCastException
Sqoop's import to ORC files has been updated. When an unsupported conversion is attempted, Sqoop now reports a detailed error message that describes the issue.
Sqoop can now import the following data types to ORC files:
  • Byte, Short, Int, Long, Float, Double from the same RDBMS types
  • BigDecimal to Long, Double, String
  • Date, Timestamp to String, Date, Timestamp
CDPD-50423: Sqoop ClassCastExceptions when exporting from Parquet
Sqoop has been enhanced to support additional data type mappings when exporting from Parquet.
CDPD-56523: Sqoop does not take --hive-compute-stats option into account for hs2-url Hive imports
Sqoop now takes the --hive-compute-stats option into account for Hive imports when the hs2-url parameter is used.
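The following sketch shows what such an import might look like. The connection details and the HiveServer2 URL are hypothetical placeholders. The script also assumes that --hive-compute-stats is a bare flag that takes no value, which this document does not state. It only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical connection details; replace with your own values.
JDBC_URL="jdbc:mysql://db.example.com/sales"
DB_USER="sqoop_user"
TABLE="orders"
HS2_URL="jdbc:hive2://hs2.example.com:10000/default"

# Build a Hive import that goes through HiveServer2 (--hs2-url) and
# requests statistics computation. Assumption: --hive-compute-stats
# is passed without a value.
hive_import_cmd() {
  echo sqoop import \
    --connect "$JDBC_URL" \
    --username "$DB_USER" \
    --table "$TABLE" \
    --hive-import \
    --hs2-url "$HS2_URL" \
    --hive-compute-stats
}

# Print the command for review; remove the echo above to execute it.
hive_import_cmd
```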