Using Impala with Hue
Following are some recommended configurations that give you the best performance when you use Hue with Impala.
- Always connect to an Impala load balancer, such as HAProxy, rather than to a single coordinator. This avoids hotspotting on that coordinator. Hotspotting occurs when too many tasks touch the same data, potentially overloading the node or coordinator.
- Set the querycache_rows configuration property to a value lower than its default of 50000, according to your requirements. This property sets the number of initial rows of a result set that Impala caches to support re-fetching them.
- Set the close_queries configuration property to true. With this setting, Hue tries to close an Impala query when the user leaves the editor page, which frees all Impala resources held by that query. However, the query results then become inaccessible to that user.
- Set the query_timeout_s and session_timeout_s configuration properties. These properties set timeouts, in seconds, for query execution and for sessions in Hue. When a timeout period expires, the affected queries and sessions are cancelled.
To set these configuration properties in Cloudera Manager:
- On the Cloudera Manager home page, select .
- Search for the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini, and then append the following configuration to the text box, after any other configurations listed there:
  [impala]
  querycache_rows=<number_of_rows>
  close_queries=true
  query_timeout_s=<number_of_seconds>
  session_timeout_s=<number_of_seconds>
- Click Save Changes and restart the service.
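If you prefer to generate the snippet rather than type it by hand, the following is a minimal Python sketch that renders the [impala] section and rejects obviously wrong values. The helper name build_impala_snippet and the example numbers (10000 rows, 300 and 900 seconds) are illustrative assumptions, not recommended settings, and the query timeout key is assumed to be spelled query_timeout_s.

```python
import configparser
import io


def build_impala_snippet(querycache_rows, query_timeout_s, session_timeout_s,
                         close_queries=True):
    """Render an [impala] section for hue_safety_valve.ini (hypothetical helper)."""
    # 50000 is the default mentioned in the text; the advice is to go lower.
    if not 0 < querycache_rows < 50000:
        raise ValueError("querycache_rows should be positive and below 50000")
    # Both timeouts are expressed in seconds.
    if query_timeout_s <= 0 or session_timeout_s <= 0:
        raise ValueError("timeouts must be a positive number of seconds")

    parser = configparser.ConfigParser()
    parser["impala"] = {
        "querycache_rows": str(querycache_rows),
        "close_queries": "true" if close_queries else "false",
        "query_timeout_s": str(query_timeout_s),
        "session_timeout_s": str(session_timeout_s),
    }

    # Write key=value pairs without spaces around "=", as in the snippet above.
    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue().strip()


# Example values only; pick numbers that suit your own workload.
print(build_impala_snippet(querycache_rows=10000,
                           query_timeout_s=300,
                           session_timeout_s=900))
```

The printed text can be pasted into the safety valve text box after any existing entries. Validating the values up front is a design convenience only; Cloudera Manager and Hue apply their own checks when the service restarts.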