Configuring caching for secure access mode
You can enable or disable caching for the Hive Warehouse Connector (HWC) secure access mode to have finer control over read queries and ensure that the content updated outside of a Spark session is considered during reads. Caching is enabled by default because queries that run with caching enabled tend to run faster.
spark.datasource.hive.warehouse.read.mode=secure_access
Caching is enabled by default, however, you can choose to disable caching by
setting the
spark.hadoop.secure.access.cache.disable
property
to true
either at a global-level, session-level, or at a
runtime-level when running a query. Queries that run with caching disabled tend
to run slower.
- Global-level — Specify the property in the spark-defaults.conf file.
- Session-level — Specify the property using the
--conf
option inspark-submit
:.bin/spark-submit \ --conf "spark.hadoop.secure.access.cache.disable=true"
- Runtime-level — Specify the property just before running your
queries:
scala> spark.conf.set("spark.hadoop.secure.access.cache.disable","true") scala> hive.sql("select * from anytable").show
The order of preference for configuration is as follows:
- Property passed at a run-time level
- Property passed at a Spark session-level
- Property set in the spark-defaults.conf file