Integrating Apache Hive with Apache Spark and BI

JDBC mode limitations

You must understand the limitations of JDBC mode and what functionality is not supported.

Keep the following limitations of JDBC mode in mind:
  • JDBC_CLUSTER and JDBC_CLIENT are used for reads only and are recommended for production workloads of 1 GB or less. With larger workloads, bottlenecks develop in transferring data to Spark. (A session setup example that selects a JDBC read mode follows this list.)

    Writes of any size through HWC are recommended for production; writes do not use JDBC mode.

  • In JDBC_CLUSTER mode, HWC fails to correctly resolve queries that use the ORDER BY clause when run as hive.sql("<query>"). The query returns unordered rows even though it contains an ORDER BY clause. (An example follows this list.)
  • In JDBC read mode, a query against a table that has a column of a complex type, such as ARRAY, STRUCT, or MAP, incorrectly represents that type as String in the returned DataFrame. (A schema example follows this list.)
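
The following Scala sketch shows one way a JDBC read mode might be selected when the Spark session is built. It is a minimal sketch, not a definitive setup: the property names (spark.sql.hive.hiveserver2.jdbc.url, spark.datasource.hive.warehouse.read.mode), the placeholder JDBC URL, and the application name are assumptions; confirm the exact configuration properties for your HWC release.

    import org.apache.spark.sql.SparkSession
    import com.hortonworks.hwc.HiveWarehouseSession

    // Assumed configuration properties; the JDBC URL is a placeholder.
    val spark = SparkSession.builder()
      .appName("hwc-jdbc-read-example")
      .config("spark.sql.hive.hiveserver2.jdbc.url", "jdbc:hive2://<hiveserver2-host>:10000/")
      .config("spark.datasource.hive.warehouse.read.mode", "JDBC_CLUSTER") // or JDBC_CLIENT
      .getOrCreate()

    // Build the HiveWarehouseSession used by the read examples that follow.
    val hive = HiveWarehouseSession.session(spark).build()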
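
To illustrate the ORDER BY limitation, the following sketch assumes an existing HiveWarehouseSession named hive (as in the setup sketch above) and a hypothetical table named sales with an amount column. Re-sorting the DataFrame in Spark is one possible workaround; it is an assumption for illustration, not a documented fix.

    import org.apache.spark.sql.functions.col

    // In JDBC_CLUSTER mode, the ORDER BY in this query is not reflected in the
    // rows that HWC returns.
    val df = hive.sql("SELECT id, amount FROM sales ORDER BY amount DESC")

    // Possible workaround (assumption): re-apply the ordering on the Spark side.
    val ordered = df.orderBy(col("amount").desc)
    ordered.show()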
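
The complex-type limitation can be observed by inspecting the schema of the returned DataFrame. The table and column in this sketch (events, with tags of type ARRAY<STRING>) are hypothetical.

    // Assumes a hypothetical table such as: CREATE TABLE events (id INT, tags ARRAY<STRING>)
    val events = hive.sql("SELECT id, tags FROM events")

    // In JDBC read mode, the printed schema reports tags as string rather than
    // array<string>, matching the limitation described above.
    events.printSchema()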