Configuring the query results cache
You need to understand the query results cache to enable or disable it for debugging and to configure the memory allocated for the results.
Some operations support hundreds of thousands of users who connect to Hive using BI systems such as Tableau. In these situations, repetitious queries are inevitable. The query results cache, which is on by default, filters and stores common queries in a cache. When you issue the same query again, Hive retrieves the results from the cache instead of recomputing them, which takes load off the backend system.
Every query that runs in Hive 3 stores its result in a cache. Hive evicts invalid data from the cache if the input table changes. For example, if you perform aggregation and the base table changes, queries you run most frequently stay in the cache, but stale queries are evicted. The query results cache works with managed tables only because Hive cannot track changes to an external table. If you join external and managed tables, Hive falls back to executing the full query. The query results cache works with ACID tables. If you update an ACID table, Hive reruns the query automatically.
You can enable and disable the query results cache from the command line. You might want to do so to debug a query. You disable the query results cache in HiveServer by setting the following parameter to false:
SET hive.query.results.cache.enabled=false;
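As a rough sketch of a debugging session, you might turn the cache off, run the query so that Hive computes it in full, and then turn the cache back on. The table name sales and its columns are hypothetical and stand in for one of your own managed tables. Issuing SET with only the parameter name prints its current value.

```sql
-- Show the current setting for this session.
SET hive.query.results.cache.enabled;

-- Disable the cache so the next query is recomputed.
SET hive.query.results.cache.enabled=false;

-- Hypothetical table and columns, for illustration only.
SELECT region, COUNT(*) AS order_count
FROM sales
GROUP BY region;

-- Re-enable the cache when debugging is done.
SET hive.query.results.cache.enabled=true;
```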
By default, Hive allocates 2 GB for the query results cache. You can change this setting by configuring the following parameter, which takes a value in bytes:
hive.query.results.cache.max.size
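Because the value is given in bytes, it helps to work out the number first. The sketch below assumes binary units (1 GB = 1024 x 1024 x 1024 bytes), under which the 2 GB default corresponds to 2147483648 and a 4 GB cache would be 4294967296. It only inspects the current value. Where you change it, for example in the HiveServer configuration, depends on your deployment and is not shown here.

```sql
-- Print the configured cache size in bytes.
SET hive.query.results.cache.max.size;

-- Reference values, assuming 1 GB = 1073741824 bytes:
--   2 GB = 2147483648   (the default)
--   4 GB = 4294967296
```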