Direct Reader limitations

Before you use Direct Reader mode, you must understand its limitations and the functionality it does not support.

Limitations

  • You cannot write data using HWC Direct Reader.
  • Transaction semantics of Spark RDDs are not ensured when using Spark Direct Reader to read ACID tables.
  • Supports only single-table transaction consistency. The direct reader does not guarantee that multiple tables referenced in a query read the same snapshot of data.
  • Does not auto-commit transactions submitted by RDD APIs. You must explicitly close transactions to release locks.
  • Requires read and execute access on the Hive-managed table locations.
  • Does not support Ranger authorization.

    You must configure read access to the HDFS location, or other storage location, of managed tables. You must have Read and Execute permissions on the Hive warehouse location (hive.metastore.warehouse.dir).

  • Blocks compaction on open read transactions.
  • The way Spark handles null and empty strings can cause a discrepancy between metadata and actual data when writing the data read by Spark Direct Reader to a CSV file.
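
The null and empty string limitation above can be illustrated in general terms with a minimal sketch. It uses Python's standard csv module, not Spark or the Direct Reader, so it only shows why plain CSV cannot by itself distinguish a null from an empty string. The function name round_trip and the sample rows are hypothetical; Spark's own CSV writer has separate options for these cases, and its actual output may differ.

```python
import csv
import io


def round_trip(rows):
    """Write rows to an in-memory CSV file, then read them back.

    Illustration only: this is the Python standard library,
    not Spark's CSV writer or the Direct Reader.
    """
    buffer = io.StringIO()
    # csv.writer emits None as an empty field, the same as "".
    csv.writer(buffer).writerows(rows)
    return list(csv.reader(io.StringIO(buffer.getvalue())))


# Hypothetical sample data: one null value and one empty string.
original = [("a", None), ("b", "")]
restored = round_trip(original)

# After the round trip, both values are empty strings, so the
# null recorded in the source data can no longer be recovered.
print(restored)
```

If downstream consumers need to tell the two cases apart, one option is to write an explicit marker for nulls rather than relying on empty fields.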

Unsupported functionality

Spark Direct Reader does not support the following functionality:
  • Writes
  • Streaming inserts
  • CTAS statements