Catalog Service High Availability

By default, an Impala Virtual Warehouse runs the Catalog service in Active-Passive HA mode, which runs two Impala catalog instances. When HA mode is enabled for the Impala Virtual Warehouse, the statestore designates one of the catalog instances as active and the other, paired catalog instance as the standby.

For each new query request, the Impala coordinator sends metadata requests to the catalog service. It also sends metadata updates to the catalog, which in turn propagates them to the Hive Metastore (HMS). With a primary/standby pair of Catalog instances, the standby instance is promoted to primary when the primary instance goes down, so that queries can continue to execute. This High Availability (HA) mode of the catalog service in CDW reduces the outage duration of the Impala cluster when the primary catalog service fails.

Catalog Failure Detection

The Statestore instance continuously sends heartbeats to its registered clients, including the primary and standby Catalog instances, to determine whether they are healthy. If the Statestore finds that the primary Catalog instance is not healthy but the standby Catalog instance is healthy, it promotes the standby Catalog instance to primary and notifies all coordinators and catalogs of the change. Coordinators then switch over to the new primary Catalog instance.
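
The promotion rule described above can be illustrated with a small, self-contained Python model. This is only a conceptual sketch, not Impala's implementation: the class names (CatalogInstance, Coordinator, Statestore), the heartbeat_round method, and the boolean healthy flag are all hypothetical. It also assumes that health is already known, whereas the real Statestore infers it from heartbeats, and it notifies only coordinators to keep the example short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogInstance:
    """Hypothetical stand-in for one catalog instance."""
    name: str
    healthy: bool = True


@dataclass
class Coordinator:
    """Hypothetical coordinator that remembers which catalog is primary."""
    name: str
    active_catalog: str = ""

    def on_catalog_change(self, new_primary: str) -> None:
        # Send subsequent metadata requests to the new primary.
        self.active_catalog = new_primary


@dataclass
class Statestore:
    """Hypothetical model of the promotion decision only."""
    primary: CatalogInstance
    standby: CatalogInstance
    coordinators: List[Coordinator] = field(default_factory=list)

    def heartbeat_round(self) -> bool:
        """Evaluate one health check; return True if a failover occurred."""
        # Fail over only when the primary is unhealthy AND the standby is healthy.
        if self.primary.healthy or not self.standby.healthy:
            return False
        # Swap roles: the standby becomes the new primary.
        self.primary, self.standby = self.standby, self.primary
        # Notify every registered coordinator of the change.
        for coordinator in self.coordinators:
            coordinator.on_catalog_change(self.primary.name)
        return True


if __name__ == "__main__":
    first = CatalogInstance("catalog-0")
    second = CatalogInstance("catalog-1")
    coord = Coordinator("coordinator-0", active_catalog=first.name)
    statestore = Statestore(first, second, [coord])

    first.healthy = False  # simulate a failure of the primary
    if statestore.heartbeat_round():
        print("new primary:", statestore.primary.name)
```

In this model, no failover takes place when both instances are unhealthy, which matches the condition stated above that the standby must be healthy before it is promoted.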