Configuring intermediate results caching
Learn about the configurations required to enable the intermediate results cache for Impala queries.
To use the intermediate results cache, configure it as described in the following steps. The cache is disabled by default.
In Cloudera Data Warehouse, intermediate results cache storage is shared with the data cache and scratch space. To accommodate this usage, you might need to adjust the existing quotas for these elements, such as the --data_cache startup flag.
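Because the three consumers share the same local storage, it can help to check that their quotas fit together before changing any of them. The following is a minimal sketch of that arithmetic; the quota names, the sizes, and the helper functions `parse_size` and `check_budget` are hypothetical and are not part of Impala or Cloudera Data Warehouse.

```python
# Hypothetical sizing helper: names and values are illustrative only.
UNITS = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size string such as '500GB' into a byte count."""
    text = text.strip().upper()
    for suffix, factor in UNITS.items():
        if text.endswith(suffix):
            return int(float(text[: -len(suffix)]) * factor)
    return int(text)  # no suffix: treat the value as plain bytes


def check_budget(disk: str, quotas: dict[str, str]) -> int:
    """Return the bytes left after all quotas; a negative value means overcommitted."""
    return parse_size(disk) - sum(parse_size(q) for q in quotas.values())


# Assumed example figures for one node's local storage.
quotas = {
    "data_cache": "400GB",
    "scratch": "100GB",
    "intermediate_results_cache": "50GB",
}
remaining = check_budget("600GB", quotas)
print(f"remaining: {remaining / 1024**3:.0f} GB")
```

If the result is negative, one of the existing quotas likely needs to be reduced before the intermediate results cache is given space.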
- Log in to the Cloudera web interface and navigate to the Cloudera Data Warehouse service.
- In the Cloudera Data Warehouse service, go to Virtual Warehouses in the left navigation panel.
- In the Impala Virtual Warehouse, select the Details option from the drop-down list.
- Go to Details and then to the Configurations tab.
- Configure cache for Impala Coordinator.
- Configure cache for Impala Executor.
- Click Apply Changes and restart Impala.
The Virtual Warehouse now stores intermediate query results in the specified local directory. Subsequent queries with matching plan fragments can retrieve data from the cache, which reduces execution time and resource consumption.
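The idea of reusing results for matching plan fragments can be illustrated with a small sketch. This is a conceptual model only, not Impala's implementation: the `FragmentCache` class, the dictionary representation of a plan fragment, and the SHA-256 fingerprint are all assumptions chosen for illustration.

```python
import hashlib
import json


class FragmentCache:
    """Toy cache keyed by a fingerprint of a plan fragment description."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(fragment: dict) -> str:
        # Serialize with sorted keys so equivalent fragments hash identically.
        canonical = json.dumps(fragment, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_compute(self, fragment: dict, compute):
        key = self.key_for(fragment)
        if key in self._store:
            self.hits += 1  # matching fragment: reuse the stored rows
            return self._store[key]
        self.misses += 1  # first time seen: run the work and store the rows
        rows = compute()
        self._store[key] = rows
        return rows


cache = FragmentCache()
scan = {"op": "scan", "table": "sales", "predicate": "year = 2024"}
same_scan = {"predicate": "year = 2024", "table": "sales", "op": "scan"}

cache.get_or_compute(scan, lambda: [("a", 1), ("b", 2)])
cache.get_or_compute(same_scan, lambda: [])  # served from the cache
print(cache.hits, cache.misses)  # 1 1
```

The second lookup never runs its `compute` callback, which is the source of the savings in execution time and resources described above.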
You can monitor cache hits and performance by checking the Impala Query Profile. The profile displays metrics for tuple cache hits under the relevant plan nodes.
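If you save a query profile as text, a short script can pull out the lines that mention the tuple cache. The sketch below assumes nothing about the actual profile layout: the sample text is an invented placeholder, the helper `find_cache_lines` is hypothetical, and the search pattern should be adjusted to match the counter names that appear in your own profiles.

```python
import re


def find_cache_lines(profile_text: str, pattern: str = r"tuple[\s_]*cache") -> list[str]:
    """Return the stripped lines of a profile that match the given pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [line.strip() for line in profile_text.splitlines() if rx.search(line)]


# Placeholder text only: real Impala profiles use their own node and counter names.
sample = """\
PLAN NODE A (placeholder)
  - rows read: 1000
PLAN NODE B (placeholder)
  - tuple cache hits: 1
  - tuple_cache misses: 0
"""

matches = find_cache_lines(sample)
for line in matches:
    print(line)
```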
