Load balancing between Hue and Impala
Condition
Queries fail with an "Invalid query handle" error or with "Results have expired, rerun the query if needed". You also see either of the following errors in the runcpserver.log file:
Invalid query handle
Invalid session id
Cause
Hue uses a pool of 10 TCP connections for all Thrift traffic to Impala, so calls that belong to the same Impala session are not guaranteed to use the same TCP connection. A load balancer sends each TCP connection to a single impalad, but without correct persistence, calls for one Impala session can be sent to a backend server that does not own that session, which causes the errors you see.
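The cause described above can be illustrated with a small simulation. This is a toy model, not Hue or load balancer code: the backend names, the client IP address, and the two balancer classes are hypothetical, and the sticky table stands in for whatever Source IP persistence mechanism your load balancer actually uses. It opens a pool of 10 connections through a load balancer, creates a session on the first connection, and counts how many pooled connections lead to a backend that does not know the session.

```python
import itertools

# Hypothetical backend names, used only for illustration.
BACKENDS = ["impalad-1", "impalad-2", "impalad-3"]


class RoundRobinBalancer:
    """Sends each new TCP connection to the next backend, ignoring the client."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def backend_for(self, client_ip):
        return next(self._cycle)


class SourceIpBalancer:
    """Sends every connection from the same client IP to the same backend."""

    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)
        self._sticky = {}  # client IP -> backend chosen on first contact

    def backend_for(self, client_ip):
        if client_ip not in self._sticky:
            self._sticky[client_ip] = next(self._cycle)
        return self._sticky[client_ip]


def open_pool(balancer, client_ip, size=10):
    """Open `size` connections; each is pinned to one backend when it is opened."""
    return [balancer.backend_for(client_ip) for _ in range(size)]


def count_wrong_backends(balancer):
    """Create a session on the first pooled connection, then count how many
    pooled connections reach a backend that does not own that session."""
    pool = open_pool(balancer, "10.0.0.5")  # hypothetical Hue host address
    session_owner = pool[0]
    return sum(1 for backend in pool if backend != session_owner)


if __name__ == "__main__":
    print("round robin:", count_wrong_backends(RoundRobinBalancer(BACKENDS)))
    print("source IP:  ", count_wrong_backends(SourceIpBalancer(BACKENDS)))
```

In this model, 6 of the 10 pooled connections reach the wrong backend under round robin, and none do with Source IP persistence. Any call sent over one of those mismatched connections would correspond to an "Invalid session id" or "Invalid query handle" error.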
Solution
To solve this issue, you must configure the load balancer that is between Hue and Impala to use Source IP persistence. This is not the load balancer in front of Hue on port 8888/8889. It is the load balancer for Impala, defined in the Impala configuration in Cloudera Manager as Impala Daemons Load Balancer. In addition to Source IP persistence, you must also increase the load balancer timeout for these connections. Otherwise, the load balancer can close connections that Hue is still using and considers active. Cloudera recommends a minimum timeout of 6 hours. 12 hours is ideal.
Cloudera also recommends that you split the VIP configuration across 3 different ports: 21000 for impala-shell users, 21050 for JDBC users, and 21051 for Hue instances. This way, you only have to configure the high timeout and Source IP persistence for the Hue port, 21051.
Configure HAProxy as follows: