Managing ZooKeeper
This topic describes how to add, remove, and replace ZooKeeper roles.
Continue reading:
- Using Multiple ZooKeeper Services
- Adding a ZooKeeper Service Using Cloudera Manager
- Replacing a ZooKeeper Disk Using Cloudera Manager
- Replacing a ZooKeeper Role Using Cloudera Manager with ZooKeeper Service Downtime
- Replacing a ZooKeeper Role Using Cloudera Manager Without ZooKeeper Service Downtime
- Adding or Deleting a ZooKeeper Role on an Unmanaged Cluster
- Replacing a ZooKeeper Role on an Unmanaged Cluster
Using Multiple ZooKeeper Services
Cloudera Manager requires dependent services within CDH to use the same ZooKeeper service. If you configure dependent CDH services to use different ZooKeeper services, Cloudera Manager reports the following error:
com.cloudera.cmf.command.CmdExecException:java.lang.RuntimeException: java.lang.IllegalStateException: Assumption violated: getAllDependencies returned multiple distinct services of the same type at SeqFlowCmd.java line 120 in com.cloudera.cmf.command.flow.SeqFlowCmd run()
CDH services that are not dependent can use different ZooKeeper services. For example, Kafka does not depend on any services other than ZooKeeper. You might have one ZooKeeper service for Kafka, and one ZooKeeper service for the rest of your CDH services.
Adding a ZooKeeper Service Using Cloudera Manager
Minimum Required Role: Full Administrator
When adding the ZooKeeper service, the Add Service wizard automatically initializes the data directories.
When you add ZooKeeper servers to an existing ensemble, a rolling restart of all ZooKeeper servers is required so that every server has the same configuration.
- Go to the ZooKeeper service.
- Select Actions > Initialize.
- Click Initialize again to confirm.
In a production environment, you should deploy ZooKeeper as an ensemble with an odd number of servers. As long as a majority of the servers in the ensemble are available, the ZooKeeper service will be available. The minimum recommended ensemble size is three ZooKeeper servers, and Cloudera recommends that each server run on a separate machine. In addition, the ZooKeeper server process should have its own dedicated disk storage if possible.
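The majority rule above can be made concrete with a little arithmetic. The following is only an illustrative sketch; `zk_quorum_size` and `zk_tolerated_failures` are hypothetical helper names, not part of ZooKeeper or Cloudera Manager.

```shell
# Hypothetical helpers illustrating the majority rule; not part of ZooKeeper.

# Smallest number of servers that forms a majority of an ensemble of size $1.
zk_quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of servers that can be unavailable while a majority remains.
zk_tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

# Compare a few ensemble sizes.
for n in 3 4 5; do
  echo "ensemble=$n quorum=$(zk_quorum_size "$n") tolerated_failures=$(zk_tolerated_failures "$n")"
done
```

By this arithmetic, a four-server ensemble tolerates the same single failure as a three-server ensemble, which is why an odd number of servers is recommended.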
Replacing a ZooKeeper Disk Using Cloudera Manager
Minimum Required Role: Full Administrator
- In Cloudera Manager, update the Data Directory and Transaction Log Directory settings.
- Stop a single ZooKeeper role.
- Move the contents of the data and transaction log directories to the new disk location (modify mounts as needed), and make sure the permissions and ownership are correct.
- Start the ZooKeeper role.
- Repeat the stop, move, and start steps for each remaining ZooKeeper role.
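The move step could be sketched as follows for a single directory. This is a minimal sketch under stated assumptions: `zk_copy_dir` is a hypothetical helper, the new location `/data/1/zookeeper` in the usage comment is an invented example path, and the `zookeeper:zookeeper` owner is an assumption to check against your own installation.

```shell
# Hypothetical helper: copy a ZooKeeper directory's contents to a new location.
# Run it only while the ZooKeeper role on this host is stopped.
zk_copy_dir() {
  local old_dir="$1" new_dir="$2"
  mkdir -p "$new_dir" || return 1
  # -a preserves mode, timestamps and (when run as root) ownership;
  # the trailing "/." copies the directory's contents, including dotfiles.
  cp -a "$old_dir"/. "$new_dir"/ || return 1
  # Compare the two trees; diff exits non-zero if anything differs.
  diff -r "$old_dir" "$new_dir"
}

# Example usage (paths and owner are assumptions, adjust to your cluster):
#   zk_copy_dir /var/lib/zookeeper /data/1/zookeeper
#   chown -R zookeeper:zookeeper /data/1/zookeeper
```

Copying and verifying before deleting the old directory keeps the original data available if the role fails to start from the new location.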
Replacing a ZooKeeper Role Using Cloudera Manager with ZooKeeper Service Downtime
Minimum Required Role: Full Administrator
- Go to ZooKeeper Instances.
- Stop the ZooKeeper role on the old host.
- Remove the ZooKeeper role from the old host on the ZooKeeper Instances page.
- Add a new ZooKeeper role on the new host.
- Restart the old ZooKeeper servers that have outdated configuration.
- Restart the newly added ZooKeeper server.
- Restart, or perform a rolling restart of, any dependent services such as HBase, HDFS, YARN, Hive, or other services that are marked as having a stale configuration.
Replacing a ZooKeeper Role Using Cloudera Manager Without ZooKeeper Service Downtime
Minimum Required Role: Full Administrator
- Go to ZooKeeper Instances.
- Stop the ZooKeeper role on the old host.
- Confirm the ZooKeeper service has elected one of the remaining hosts as a leader on the ZooKeeper Status page. See Confirming the Election Status of a ZooKeeper Service.
- On the ZooKeeper Instances page, remove the ZooKeeper role from the old host.
- Add a new ZooKeeper role on the new host.
- Change the individual configuration of the newly added ZooKeeper role so that it has the highest ZooKeeper Server ID in the cluster.
- Go to the ZooKeeper Instances page and click the newly added Server instance.
- In the individual Server page, select Start this Server from the Actions dropdown menu to start the new ZooKeeper role.
- On the ZooKeeper Status page, confirm that there is a leader and all other hosts are followers.
- Restart the ZooKeeper server that has an outdated configuration and is a follower.
- Restart the leader ZooKeeper server that has an outdated configuration.
- Confirm that a leader has been elected after the restart, and that the whole ZooKeeper service is in a green state.
- Restart, or perform a rolling restart of, any dependent services such as HBase, HDFS, YARN, Hive, or other services that are marked as having a stale configuration.
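One way to choose a Server ID that is higher than every existing one is to take the current maximum and add one. The sketch below assumes you have already read the existing IDs from the configuration of each ZooKeeper role; `zk_next_server_id` is a hypothetical helper, not a Cloudera Manager or ZooKeeper command.

```shell
# Hypothetical helper: print an ID one greater than the largest ID given.
# Usage: zk_next_server_id ID [ID ...]
zk_next_server_id() {
  local max=0 id
  for id in "$@"; do
    if [ "$id" -gt "$max" ]; then
      max=$id
    fi
  done
  echo $(( max + 1 ))
}

# Example: remaining roles have IDs 1 and 2, and the removed role had ID 3.
# Passing all three avoids reusing the removed role's ID.
zk_next_server_id 1 2 3
```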
Adding or Deleting a ZooKeeper Role on an Unmanaged Cluster
Minimum Required Role: Full Administrator
For information on administering ZooKeeper from the command line, see the ZooKeeper Getting Started Guide.
Replacing a ZooKeeper Role on an Unmanaged Cluster
Minimum Required Role: Full Administrator
These instructions assume you are using ZooKeeper from the command line. For more information, see the ZooKeeper Getting Started Guide.
- Stop the ZooKeeper role on the old host.
- Confirm the ZooKeeper Quorum has elected a leader. See Confirming the Election Status of a ZooKeeper Service.
- Add a new ZooKeeper role on the new server.
- Identify the dataDir location from the zoo.cfg file. This defaults to /var/lib/zookeeper.
- Identify the ID number for the ZooKeeper Server from the myid file in the configuration: cat /var/lib/zookeeper/myid
- On all the ZooKeeper hosts, edit the zoo.cfg file so the server ID references the new server hostname. For example:
server.1=zk1.example.org:3181:4181
server.2=zk2.example.org:3181:4181
server.4=zk4.example.org:3181:4181
- Restart the ZooKeeper hosts.
- Confirm the ZooKeeper Quorum has elected a leader and the other hosts are followers. See Confirming the Election Status of a ZooKeeper Service.
- Restart any dependent services, such as HBase, the HDFS Failover Controllers (with HDFS High Availability), or YARN or MapReduce v1 with High Availability.
- Perform a failover to make one HDFS NameNode active. See Manually Failing Over to the Standby NameNode.
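The zoo.cfg edit in the steps above has to be made identically on every host, so it can help to script it. The following is a sketch only: `zk_replace_server` is a hypothetical helper, and the old entry `server.3=zk3.example.org` in the usage comment is an assumed example of the server being replaced. The helper reads a zoo.cfg on standard input and writes the updated file to standard output, so the original is never modified in place.

```shell
# Hypothetical helper: rewrite one server.N entry of a zoo.cfg read from stdin.
# Usage: zk_replace_server OLD_ID NEW_ID NEW_HOST < zoo.cfg > zoo.cfg.new
zk_replace_server() {
  awk -v old="$1" -v new="$2" -v host="$3" '
    BEGIN { FS = "=" }
    $1 == "server." old {
      # Keep the original ":port:port" suffix of the entry being replaced.
      ports = substr($2, index($2, ":"))
      print "server." new "=" host ports
      next
    }
    # Every other line passes through unchanged.
    { print }
  '
}

# Example (the server.3 entry is an assumed old server):
#   zk_replace_server 3 4 zk4.example.org < zoo.cfg > zoo.cfg.new
```

Review the generated file before putting it in place of zoo.cfg and restarting the ZooKeeper hosts.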
Confirming the Election Status of a ZooKeeper Service
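To check the election status of a ZooKeeper host, send the stat command to its client port (2181 by default) using nc:
echo "stat" | nc server.example.org 2181 | grep Mode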
For example, a follower host would return the message:
Mode: follower
Alternatively, connect with telnet and type stat:
telnet server.example.org 2181
Trying 10.1.2.154...
Connected to server.example.org.
Escape character is '^]'.
stat
Zookeeper version: 3.4.5-cdh5.4.4--1, built on 07/06/2015 23:54 GMT
...
Latency min/avg/max: 0/1/40
Received: 631
Sent: 677
Connections: 7
Outstanding: 0
Zxid: 0x30000011a
Mode: follower <----
Node count: 40
Connection closed by foreign host.
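To run the nc check against every host in the ensemble, the command can be wrapped in a few small functions. This is a sketch under assumptions: the function names are hypothetical, the host names in the usage comment are taken from the zoo.cfg example in this topic, and the check assumes a healthy ensemble reports exactly one leader with all other hosts as followers.

```shell
# Print the value of the "Mode:" line from stat output read on stdin.
zk_mode() {
  awk -F': ' '$1 == "Mode" { print $2 }'
}

# Query one host with the stat command; the port defaults to 2181.
zk_host_mode() {
  echo "stat" | nc "$1" "${2:-2181}" | zk_mode
}

# Succeed only if the modes given as arguments contain exactly one leader
# and every other mode is follower.
zk_quorum_ok() {
  local leaders=0 mode
  for mode in "$@"; do
    case "$mode" in
      leader)   leaders=$(( leaders + 1 )) ;;
      follower) ;;
      *)        return 1 ;;
    esac
  done
  [ "$leaders" -eq 1 ]
}

# Example usage (requires network access to the ZooKeeper hosts):
#   zk_quorum_ok "$(zk_host_mode zk1.example.org)" \
#                "$(zk_host_mode zk2.example.org)" \
#                "$(zk_host_mode zk4.example.org)" && echo "quorum healthy"
```

A host that does not answer yields an empty mode, which `zk_quorum_ok` treats as a failure rather than silently ignoring it.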