4. Deploy Oozie with an HA Cluster
You can configure multiple Oozie servers against the same database to provide High Availability (HA) for the Oozie service. You need the following prerequisites:
A database that supports multiple concurrent connections. In order to have full HA, the database should also have HA support, or it becomes a single point of failure.
Note: The default Derby database does not support multiple concurrent connections.
A ZooKeeper ensemble. Apache ZooKeeper is a distributed, open-source coordination service for distributed applications; the Oozie servers use it to coordinate access to the database and to communicate with each other. For full HA, the ensemble should have at least three ZooKeeper servers. For more information, see the Apache ZooKeeper documentation.
Multiple Oozie servers.
Important: While not strictly required, you should configure all Oozie servers to have identical properties.
A load balancer, virtual IP, or round-robin DNS. This provides a single entry point for users and for callbacks from the JobTracker. Configure the load balancer to distribute requests round-robin across the Oozie servers. Users should connect through the load balancer, whether they use the Oozie client, a web browser, or the REST API. For full HA, the load balancer should also have HA support, or it becomes a single point of failure. For information about how to set up your Oozie servers to handle failover, see Configuring Oozie Failover.
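Two of the prerequisites above can be illustrated with a short sketch. ZooKeeper stays available only while a majority of its servers are running, which is why three servers is the smallest ensemble that survives the loss of one. The second function shows what round-robin distribution across Oozie servers means in practice. This is an illustration only, not part of Oozie or ZooKeeper: the function names and the host names are hypothetical, and a real deployment would rely on the load balancer itself rather than on code like this.

```python
from itertools import cycle


def tolerated_failures(ensemble_size: int) -> int:
    """Return how many ZooKeeper servers can fail while a majority remains."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    # A majority of n servers is n // 2 + 1, so (n - 1) // 2 may be lost.
    return (ensemble_size - 1) // 2


def round_robin(servers):
    """Return an endless iterator that visits each server in turn."""
    if not servers:
        raise ValueError("at least one server is required")
    return cycle(servers)


# Hypothetical Oozie server URLs, for illustration only.
oozie_servers = [
    "http://oozie1.example.com:11000/oozie",
    "http://oozie2.example.com:11000/oozie",
]

picker = round_robin(oozie_servers)
# Four consecutive requests alternate between the two servers.
first_four = [next(picker) for _ in range(4)]

for size in (1, 2, 3, 5):
    print(f"{size} ZooKeeper server(s): tolerates {tolerated_failures(size)} failure(s)")
```

Note that a two-server ensemble tolerates no more failures than a single server, so adding a second ZooKeeper server alone does not provide HA.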