HWC supported types mapping
To create HWC API apps, you must know how Hive Warehouse Connector maps Apache Hive types to Apache Spark types, and vice versa. Awareness of a few unsupported types helps you avoid problems.
Spark-Hive supported types mapping
The following types are supported by the HiveWarehouseConnector library:
* StringType (Spark) and String, Varchar (Hive)
A Hive String or Varchar column is converted to a Spark StringType column. When a Spark StringType column has maxLength metadata, it is converted to a Hive Varchar column; otherwise, it is converted to a Hive String column.
* TimestampType (Spark) and Timestamp (Hive)
The Hive Timestamp column loses submicrosecond precision when converted to a Spark TimestampType column because a Spark TimestampType column has microsecond precision, while a Hive Timestamp column has nanosecond precision.
Hive timestamps are interpreted as UTC. When reading data from Hive, timestamps are adjusted according to the local timezone of the Spark session. For example, if Spark is running in the America/New_York timezone, a Hive timestamp 2018-06-21 09:00:00 is imported into Spark as 2018-06-21 05:00:00, due to the 4-hour time difference between America/New_York and UTC.
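The string-mapping rule above can be sketched as a small helper. This is an illustrative function, not part of the HWC API: `metadata` stands in for a Spark StringType column's metadata, and the `maxLength` key is the one the connector is described as honoring.

```python
def hive_type_for_string(metadata):
    """Pick a Hive column type for a Spark StringType column.

    Mirrors the rule above: a StringType column carrying
    ``maxLength`` metadata maps to varchar(n); otherwise it
    maps to a plain Hive string. Illustrative only.
    """
    if metadata and "maxLength" in metadata:
        return "varchar({})".format(metadata["maxLength"])
    return "string"

print(hive_type_for_string({"maxLength": 25}))  # varchar(25)
print(hive_type_for_string({}))                 # string
```

In PySpark, column metadata such as this can be attached when aliasing a column, e.g. `col("c").alias("c", metadata={"maxLength": 25})`.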
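Both timestamp effects can be reproduced with the standard library, since Python's `datetime` (like Spark's TimestampType) holds microsecond precision. This is a sketch of the behavior described above, not HWC code:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Precision loss: a Hive timestamp carries nanoseconds, but a
# microsecond-precision type keeps only the first six digits of
# the fractional-second field, dropping the last three.
nanos = 123_456_789
micros = nanos // 1_000
print(micros)  # 123456 -- sub-microsecond digits are lost

# Timezone adjustment on read: a Hive timestamp interpreted as UTC
# is shifted into the Spark session's local timezone.
hive_ts = datetime(2018, 6, 21, 9, 0, 0, tzinfo=timezone.utc)
spark_local = hive_ts.astimezone(ZoneInfo("America/New_York"))
print(spark_local)  # 2018-06-21 05:00:00-04:00
```

The 4-hour offset applies because America/New_York observes daylight saving time (UTC-4) on the example date; in winter the same timestamp would shift by 5 hours.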
Spark-Hive unsupported types
* Timestamp With Timezone (Hive)