Installing the ZooKeeper Packages
There are two ZooKeeper server packages:
- The zookeeper base package provides the basic libraries and scripts that are necessary to run ZooKeeper servers and clients. The documentation is also included in this package.
- The zookeeper-server package contains the init.d scripts necessary to run ZooKeeper as a daemon process. Because zookeeper-server depends on zookeeper, installing the server package automatically installs the base package.
Installing the ZooKeeper Base Package
To install ZooKeeper on Red Hat-compatible systems:
$ sudo yum install zookeeper
To install ZooKeeper on Ubuntu and other Debian systems:
$ sudo apt-get install zookeeper
To install ZooKeeper on SLES systems:
$ sudo zypper install zookeeper
Installing the ZooKeeper Server Package and Starting ZooKeeper on a Single Server
The instructions provided here deploy a single ZooKeeper server in "standalone" mode. This is appropriate for evaluation, testing and development purposes, but may not provide sufficient reliability for a production application. See Installing ZooKeeper in a Production Environment for more information.
To install the ZooKeeper server on Red Hat-compatible systems:
$ sudo yum install zookeeper-server
To install a ZooKeeper server on Ubuntu and other Debian systems:
$ sudo apt-get install zookeeper-server
To install the ZooKeeper server on SLES systems:
$ sudo zypper install zookeeper-server
To create /var/lib/zookeeper and set permissions:
$ sudo mkdir -p /var/lib/zookeeper
$ sudo chown -R zookeeper /var/lib/zookeeper/
To start ZooKeeper
- To start ZooKeeper after an upgrade:
$ sudo service zookeeper-server start
- To start ZooKeeper after a first-time install:
$ sudo service zookeeper-server init
$ sudo service zookeeper-server start
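One way to confirm that the server came up is to send it the four-letter command ruok on the client port; a healthy server replies imok. The sketch below assumes that nc (netcat) is installed, that the server listens on the default client port 2181, and that four-letter commands are enabled (newer ZooKeeper releases restrict them with a whitelist setting). The function name zk_is_ok is made up for this example.

```shell
# zk_is_ok HOST PORT
# Succeeds if the server at HOST:PORT answers "imok" to the "ruok" command.
# Hypothetical helper; assumes nc is available and four-letter commands are enabled.
zk_is_ok() {
    reply=$(echo ruok | nc -w 2 "$1" "$2" 2>/dev/null) || reply=""
    [ "$reply" = "imok" ]
}

# Example: report the state of a local standalone server on the default port.
if zk_is_ok localhost 2181; then
    echo "ZooKeeper is running"
else
    echo "ZooKeeper did not answer on localhost:2181"
fi
```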
Installing ZooKeeper in a Production Environment
In a production environment, you should deploy ZooKeeper as an ensemble with an odd number of servers. As long as a majority of the servers in the ensemble are available, the ZooKeeper service will be available. The minimum recommended ensemble size is three ZooKeeper servers, and Cloudera recommends that each server run on a separate machine. In addition, the ZooKeeper server process should have its own dedicated disk storage if possible.
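The majority rule can be made concrete with a little arithmetic: an ensemble of N servers needs floor(N/2) + 1 of them to be available, so it tolerates N minus that many failures. The sketch below prints both figures for a few ensemble sizes; the function names are illustrative only.

```shell
# quorum_size N: smallest majority of an N-server ensemble.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# tolerated_failures N: servers that can be down while a majority remains.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
    echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# servers=3 quorum=2 tolerated_failures=1
# servers=4 quorum=3 tolerated_failures=1
# servers=5 quorum=3 tolerated_failures=2
```

A four-server ensemble tolerates one failure, the same as a three-server ensemble, which is why an odd number of servers is recommended.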
- Use the commands under Installing the ZooKeeper Server Package and Starting ZooKeeper on a Single Server to install zookeeper-server on each host.
- Test with the expected load, then set the Java heap size low enough to avoid swapping. Stay well below the threshold at which the system would start swapping; for example, a 12 GB heap on a machine with 16 GB of RAM.
- Create a configuration file. This file can be called anything you like, and must specify settings for at least the parameters shown under "Minimum Configuration" in the ZooKeeper Administrator's Guide. You should also configure values for initLimit, syncLimit, and server.n; see the explanations in the administrator's guide. For example:
tickTime=2000
dataDir=/var/lib/zookeeper/
clientPort=2181
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
In this example, the final three lines are in the form server.id=hostname:port:port. Followers use the first port to connect to the leader; the second port is used for leader election. You set id for each server in the next step.
- Create a file named myid in the server's dataDir; in this example, /var/lib/zookeeper/myid. The file must contain only a single line, and that line must consist of a single unique number between 1 and 255; this is the id component mentioned in the previous step. In this example, the server whose hostname is zoo1 must have a myid file that contains only 1.
- Start each server as described in the previous section.
- Test the deployment by running a ZooKeeper client:
zookeeper-client -server hostname:port
For example:
zookeeper-client -server zoo1:2181
For more information on configuring a multi-server deployment, see Clustered (Multi-Server) Setup in the ZooKeeper Administrator's Guide.
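The configuration file and the myid file have to agree on each host, so it can help to generate both from one list of hostnames. The following is a minimal sketch under stated assumptions: the function name write_ensemble_files and its arguments are hypothetical, the settings are the example values shown above, and the files are written to a scratch directory of your choice rather than to the live configuration or data directory.

```shell
# write_ensemble_files OUT_DIR THIS_HOST HOST...
# Writes OUT_DIR/zoo.cfg with one server.N line per HOST (N counts from 1, in
# argument order) and OUT_DIR/myid containing the N that matches THIS_HOST.
# Hypothetical helper; returns non-zero if THIS_HOST is not in the host list.
write_ensemble_files() {
    out_dir=$1
    this_host=$2
    shift 2

    mkdir -p "$out_dir" || return 1
    {
        echo "tickTime=2000"
        echo "dataDir=/var/lib/zookeeper/"
        echo "clientPort=2181"
        echo "initLimit=5"
        echo "syncLimit=2"
    } > "$out_dir/zoo.cfg"

    id=1
    found=0
    for host in "$@"; do
        echo "server.$id=$host:2888:3888" >> "$out_dir/zoo.cfg"
        if [ "$host" = "$this_host" ]; then
            echo "$id" > "$out_dir/myid"
            found=1
        fi
        id=$((id + 1))
    done
    [ "$found" -eq 1 ]
}
```

For example, write_ensemble_files /tmp/zk-zoo2 zoo2 zoo1 zoo2 zoo3 would produce a zoo.cfg listing all three servers and a myid file containing 2. Review the generated files before copying them into place.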
Setting Up a Supervisory Process for the ZooKeeper Server
The ZooKeeper server is designed to be both highly reliable and highly available. This means that:
- If a ZooKeeper server encounters an error it cannot recover from, it will "fail fast" (the process will exit immediately).
- When the server shuts down, the ensemble remains active and continues serving requests.
- Once restarted, the server rejoins the ensemble without any further manual intervention.
Cloudera recommends that you fully automate this process by configuring a supervisory service to manage each server, and restart the ZooKeeper server process automatically if it fails. See the ZooKeeper Administrator's Guide for more information.
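To illustrate what such a supervisor does, here is a minimal shell sketch of a restart loop. It is not a substitute for a dedicated supervisory service; the function name supervise and its arguments are invented for this example, and the command you pass in must run the server in the foreground so that the loop can observe it exiting.

```shell
# supervise MAX_RESTARTS DELAY_SECONDS COMMAND [ARG...]
# Runs COMMAND in the foreground and reruns it each time it exits with a
# failure status, up to MAX_RESTARTS times. Hypothetical illustration only.
supervise() {
    max=$1
    delay=$2
    shift 2

    restarts=0
    while :; do
        if "$@"; then
            return 0                # clean exit: stop supervising
        fi
        if [ "$restarts" -ge "$max" ]; then
            echo "giving up after $restarts restarts" >&2
            return 1
        fi
        restarts=$((restarts + 1))
        echo "process exited; restart $restarts of $max" >&2
        sleep "$delay"
    done
}

# Usage (assumption: the path to a foreground start command on your system):
#   supervise 10 5 /path/to/zkServer.sh start-foreground
```

Because a failed server exits immediately and rejoins the ensemble on restart, a loop like this is all the recovery logic needed on each host; a production supervisor adds logging, backoff, and start at boot.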