Configuring Impala for High Availability
You can make an Impala cluster highly available by deploying the StateStore and the Catalog each as a pair of instances in primary/standby mode. This setup reduces single points of failure and shortens outages, because a standby instance takes over when its primary instance fails.
How configuring High Availability helps
The Impala StateStore checks the health of all Impala daemons in a cluster and continuously relays its findings to each of the daemons. The Catalog stores metadata for databases, tables, partitions, resource usage information, configuration settings, and other objects managed by Impala. If the StateStore and Catalog daemons each run as a single instance, each of them is a single point of failure for the cluster. Impala coordinators and executors continue to execute queries when the StateStore node is down, but they no longer receive state updates, which degrades admission control and cluster membership updates. To mitigate this, you can deploy a pair of StateStore instances and a pair of Catalog instances so that the Impala cluster survives the failure of a StateStore or Catalog instance.
How High Availability works in an Impala cluster
With a pair of StateStore instances in primary/standby mode, the primary StateStore instance sends the cluster's state updates and propagates metadata updates. It periodically sends heartbeats to the standby StateStore instance, the Catalog, the coordinators, and the executors. The standby StateStore instance also sends heartbeats to the Catalog, the coordinators, and the executors. RPC connections between the daemons and the StateStore instances are kept alive, so broken connections usually do not result in false failure reports between nodes. When the primary instance goes down, the standby StateStore instance takes over the primary role so that the StateStore service continues to operate.
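The takeover described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Impala's implementation: the class names, the instance names, and the rule "promote the standby after three consecutive missed heartbeats" are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class StateStoreInstance:
    """Hypothetical stand-in for one StateStore instance."""
    name: str
    role: str          # "primary", "standby", or "failed"
    alive: bool = True


class HeartbeatMonitor:
    """Toy view of the pair: counts consecutive missed primary heartbeats."""

    def __init__(self, primary, standby, max_missed=3):
        self.primary = primary
        self.standby = standby
        self.max_missed = max_missed  # assumed threshold, not an Impala default
        self.missed = 0

    def tick(self):
        """Simulate one heartbeat interval; return the current primary's name."""
        if self.primary.alive:
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                self._fail_over()
        return self.primary.name

    def _fail_over(self):
        # The standby takes over the primary role; the old primary is marked failed.
        self.primary.role = "failed"
        self.standby.role = "primary"
        self.primary, self.standby = self.standby, self.primary
        self.missed = 0


# Demo: the primary answers once, then stops responding.
first = StateStoreInstance("statestore-1", "primary")
second = StateStoreInstance("statestore-2", "standby")
monitor = HeartbeatMonitor(first, second, max_missed=3)

history = [monitor.tick()]
first.alive = False
history += [monitor.tick() for _ in range(3)]
print(history)
```

In this model the standby is promoted only after several consecutive misses rather than on the first one, which mirrors the idea that a single broken connection should not be reported as a failure.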
For each new query request, the Impala coordinator sends metadata requests and metadata updates to the Catalog service, which in turn propagates the metadata updates to the Hive Metastore. With a pair of primary/standby Catalog instances, the standby instance is promoted to primary when the primary instance goes down, so the Catalog service stays available to the Impala cluster. The active catalogd acts as the source of metadata and provides the Catalog service for the cluster. This High Availability (HA) mode reduces the outage duration of the Impala cluster when the primary Catalog instance fails. To support Catalog HA, you can add two catalogd instances to an Impala cluster as an Active-Passive HA pair by choosing the High Availability option.
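The Active-Passive behavior of the Catalog pair can be sketched in the same spirit. The names below (`CatalogInstance`, `CatalogPair`, `coordinator_fetch`) are hypothetical, and in this toy model the promotion is triggered by a failed request; the text above does not say what triggers promotion in Impala, so treat that part as an assumption.

```python
class CatalogInstance:
    """Hypothetical stand-in for one catalogd instance."""

    def __init__(self, name, active=False):
        self.name = name
        self.active = active
        self.up = True

    def get_metadata(self, table):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{table} metadata served by {self.name}"


class CatalogPair:
    """Two catalogd instances, exactly one of which is active at a time."""

    def __init__(self, primary, standby):
        self.instances = [primary, standby]

    def active(self):
        return next(c for c in self.instances if c.active)

    def promote_standby(self):
        old = self.active()
        new = next(c for c in self.instances if c is not old)
        if not new.up:
            raise RuntimeError("both catalog instances are down")
        old.active = False
        new.active = True
        return new


def coordinator_fetch(pair, table):
    """Ask the active catalogd for metadata; fail over once if it is down."""
    try:
        return pair.active().get_metadata(table)
    except ConnectionError:
        pair.promote_standby()  # assumed trigger, for illustration only
        return pair.active().get_metadata(table)


pair = CatalogPair(CatalogInstance("catalogd-1", active=True),
                   CatalogInstance("catalogd-2"))
print(coordinator_fetch(pair, "sales"))   # served by the active instance
pair.instances[0].up = False              # the primary goes down
print(coordinator_fetch(pair, "sales"))   # served by the promoted standby
```

The point of the sketch is that only one instance is the source of metadata at any time, and the coordinator's request succeeds after promotion without the caller needing to know which instance answered.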