Configuring HiveServer high availability using a load balancer

To enable high availability for multiple HiveServer (HS2) hosts, you configure a load balancer to manage them. First, you configure the Hive Delegation Token Store; next, you add HiveServer roles; finally, you configure the load balancer.

HiveServer HA does not automatically fail over and retry long-running Hive queries. If a HiveServer instance fails, all queries running on that instance fail and are not retried. The client application must resubmit the queries.
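
Because resubmission is left to the client, an application typically wraps query execution in its own retry logic. The following is a minimal Python sketch of that idea, not a supported client. It assumes a hypothetical connect callable that returns a DB-API 2.0 (PEP 249) style connection opened through the load balancer; the name run_with_resubmit, the attempt count, and the backoff values are illustrative assumptions. Each attempt opens a new connection so that the load balancer can route it to a surviving instance.

```python
import time
from typing import Any, Callable, List


def run_with_resubmit(
    connect: Callable[[], Any],
    sql: str,
    max_attempts: int = 3,
    backoff_seconds: float = 5.0,
) -> List[Any]:
    """Run sql, resubmitting it from scratch on a new connection after a failure.

    connect is a hypothetical factory returning a DB-API 2.0 style connection.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        conn = None
        try:
            # A fresh connection per attempt lets the load balancer pick
            # a HiveServer instance that is still running.
            conn = connect()
            cursor = conn.cursor()
            cursor.execute(sql)
            return cursor.fetchall()
        except Exception as exc:  # a real client should catch narrower errors
            last_error = exc
            if attempt < max_attempts:
                # Linear backoff before the next resubmission.
                time.sleep(backoff_seconds * attempt)
        finally:
            if conn is not None:
                try:
                    conn.close()
                except Exception:
                    pass  # the failed instance may already have dropped the session
    raise RuntimeError(f"query failed after {max_attempts} attempts") from last_error
```

The query restarts from the beginning on every attempt, so this pattern is only safe for statements that can be run more than once without side effects.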

After you enable HiveServer2 high availability, you must update existing Oozie jobs to reflect the HS2 address. On Kerberos-enabled clusters, you must use the load balancer's principal to connect to HS2 directly; otherwise, direct connections to HiveServer2 instances fail.
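
To illustrate the Kerberos requirement, the sketch below builds a Hive JDBC connection string in which the principal names the load balancer host even when the client connects to an individual instance. It is an illustration under assumptions, not a definitive procedure: the helper name hs2_jdbc_url and the host names are hypothetical, the principal is assumed to follow the conventional hive/<host>@<REALM> form, and port 10000 is the usual HS2 default, which your load balancer or instances may not use.

```python
def hs2_jdbc_url(connect_host, lb_host, realm, port=10000, database="default"):
    """Build a Kerberos-enabled Hive JDBC URL (hypothetical helper).

    connect_host: host the client connects to (the load balancer, or an
                  individual HS2 instance for a direct connection).
    lb_host:      load balancer host used in the Kerberos principal.
    realm:        Kerberos realm, for example EXAMPLE.COM.
    """
    # Assumption: the load balancer principal has the form hive/<lb_host>@<realm>.
    principal = f"hive/{lb_host}@{realm}"
    return f"jdbc:hive2://{connect_host}:{port}/{database};principal={principal}"


# Connection through the load balancer.
via_lb = hs2_jdbc_url("hs2-lb.example.com", "hs2-lb.example.com", "EXAMPLE.COM")

# Direct connection to one instance, still using the load balancer's principal.
direct = hs2_jdbc_url("hs2-node1.example.com", "hs2-lb.example.com", "EXAMPLE.COM")
```

In both cases only the host part of the URL changes; the principal stays tied to the load balancer.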

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)