3.4. Deploy Oozie with HA Cluster

You can configure multiple Oozie servers against the same database to provide High Availability (HA) of the Oozie service. You need the following prerequisites:

  • A database that supports multiple concurrent connections. In order to have full HA, the database should also have HA support, or it becomes a single point of failure.

    Note

    The default Derby database does not support multiple concurrent connections, so it cannot be used for HA.

  • A ZooKeeper ensemble. Apache ZooKeeper is a distributed, open-source coordination service for distributed applications; the Oozie servers use it to coordinate access to the database and to communicate with each other. For full HA, the ensemble should have at least 3 ZooKeeper servers, so that a majority remains available if one server fails. For more information, see the Apache ZooKeeper documentation.

  • Multiple Oozie servers.

    Important

    While not strictly required, you should configure all Oozie servers to have identical properties.

  • A load balancer, virtual IP, or round-robin DNS. This provides a single entry point for users and for callbacks from the JobTracker. Configure the load balancer to distribute requests round-robin across the Oozie servers. Users should connect through the load balancer, whether they use the Oozie client, a web browser, or the REST API. For full HA, the load balancer should also have HA support; otherwise it becomes a single point of failure.

    For information about how to set up your Oozie servers to handle failover, see Configuring Oozie Failover.
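
As a rough illustration of how these prerequisites fit together, the following oozie-site.xml fragment sketches the ZooKeeper-related settings that each Oozie server might carry. The property and service names are taken from the Apache Oozie HA documentation and may differ between releases. The host names (zk1.example.com, oozie-lb.example.com) and the ports are placeholders, not values from this guide. Treat Configuring Oozie Failover as the authoritative procedure.

```xml
<!-- Sketch only: host names and ports below are placeholders. -->

<!-- Enable the ZooKeeper-backed services used for HA coordination. -->
<property>
  <name>oozie.services.ext</name>
  <value>org.apache.oozie.service.ZKLocksService,org.apache.oozie.service.ZKXLogStreamingService,org.apache.oozie.service.ZKJobsConcurrencyService</value>
</property>

<!-- Comma-separated host:port list of the ZooKeeper ensemble (3 servers assumed here). -->
<property>
  <name>oozie.zookeeper.connection.string</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>

<!-- Oozie servers that share a namespace are treated as one HA group. -->
<property>
  <name>oozie.zookeeper.namespace</name>
  <value>oozie</value>
</property>

<!-- Point the base URL at the load balancer so that callbacks reach any live server. -->
<property>
  <name>oozie.base.url</name>
  <value>http://oozie-lb.example.com:11000/oozie</value>
</property>
```

Using the same fragment on every Oozie server is consistent with the recommendation above to keep the servers' properties identical.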

