Step 2: Installing and Configuring Cloudera Manager Server for High Availability

You can use an existing Cloudera Manager installation and extend it to a high-availability configuration, as long as you are not using the embedded PostgreSQL database.

This section describes how to install and configure a failover secondary for Cloudera Manager Server that can take over if the primary fails.

Setting up NFS Mounts for Cloudera Manager Server

  1. Create the following directory on the NFS server you created in a previous step:
    $ mkdir -p /media/cloudera-scm-server
  2. Mark these mounts by adding these lines to the /etc/exports file on the NFS server:
    /media/cloudera-scm-server CMS1(rw,sync,no_root_squash,no_subtree_check)
    /media/cloudera-scm-server CMS2(rw,sync,no_root_squash,no_subtree_check)
  3. Export the mounts by running the following command on the NFS server:
    $ exportfs -a
  4. Set up the filesystem mounts on CMS1 and CMS2 hosts:
    1. If you are updating an existing installation for high availability, stop the Cloudera Manager Server if it is running on either of the CMS1 or CMS2 hosts by running the following command:
      $ service cloudera-scm-server stop
    2. Make sure that the NFS mount helper is installed:
      RHEL/CentOS:
      $ yum install nfs-utils-lib
      Ubuntu:
      $ apt-get install nfs-common
      SUSE:
      $ zypper install nfs-client
    3. Make sure that the rpcbind service is running by restarting it:
      $ service rpcbind restart
      
  5. Create the mount points on both CMS1 and CMS2:
    1. If you are updating an existing installation for high availability, copy the /var/lib/cloudera-scm-server directory from your existing Cloudera Manager Server host to the NFS server with the following command (NFS refers to the NFS server you created in a previous step):
      $ scp -r /var/lib/cloudera-scm-server/ NFS:/media/cloudera-scm-server
      
    2. Set up the /var/lib/cloudera-scm-server directory on the CMS1 and CMS2 hosts:
      $ rm -rf /var/lib/cloudera-scm-server
      $ mkdir -p /var/lib/cloudera-scm-server
      
    3. Mount the NFS export on the /var/lib/cloudera-scm-server directory, on both CMS1 and CMS2:
      $ mount -t nfs NFS:/media/cloudera-scm-server /var/lib/cloudera-scm-server
      
    4. Set up fstab to persist the mounts across restarts by editing the /etc/fstab file on CMS1 and CMS2 and adding the following line:
      NFS:/media/cloudera-scm-server /var/lib/cloudera-scm-server nfs auto,noatime,nolock,intr,tcp,actimeo=1800 0 0
      
      
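Before continuing, you can verify the NFS setup from CMS1 and CMS2. This is an optional sanity check, assuming the NFS server hostname is NFS as in the steps above:

$ showmount -e NFS
$ mount -a
$ mount | grep /var/lib/cloudera-scm-server

The showmount command should list /media/cloudera-scm-server among the exports, and the final command should show /var/lib/cloudera-scm-server mounted over NFS.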

Installing the Primary

Updating an Existing Installation for High Availability

You can retain your existing Cloudera Manager Server as-is, if the deployment meets the following conditions:
  • The Cloudera Management Service is located on a single host that is not the host where Cloudera Manager Server runs.
  • The data directories for the roles of the Cloudera Management Service are located on a remote storage device (such as an NFS store), and they can be accessed from both primary and secondary installations of the Cloudera Management Service.
If your deployment does not meet these conditions, Cloudera recommends that you remove the existing Cloudera Management Service by stopping it and then deleting it.
To delete and remove the Cloudera Management Service:
  1. Open the Cloudera Manager Admin Console and go to the Home page.
  2. Click Cloudera Management Service > Stop.
  3. Click Cloudera Management Service > Delete.

Fresh Installation

Follow the instructions in Installing Cloudera Manager, CDH, and Managed Services to install Cloudera Manager Server, but do not add “Cloudera Management Service” to your deployment until you complete Step 3: Installing and Configuring Cloudera Management Service for High Availability, which describes how to set up the Cloudera Management Service.

You can now start the freshly-installed Cloudera Manager Server on CMS1:
$ service cloudera-scm-server start

Before proceeding, verify that you can access the Cloudera Manager Admin Console at http://CMS1:7180.
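If you prefer to check from the command line first, a quick probe with curl (assuming curl is installed on the host you are checking from) should return an HTTP response, typically a redirect to the login page:

$ curl -I http://CMS1:7180

You can repeat the same check later against the load balancer address.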

If you have just installed Cloudera Manager, click the Cloudera Manager logo to skip adding new hosts and to gain access to the Administration menu, which you need for the following steps.

HTTP Referer Configuration

Cloudera recommends that you disable the HTTP Referer check because it causes problems for some proxies and load balancers. Check the configuration manual of your proxy or load balancer to determine if this is necessary.

To disable HTTP Referer in the Cloudera Manager Admin Console:
  1. Select Administration > Settings.
  2. Select Category > Security.
  3. Clear the HTTP Referer Check property.

Before proceeding, verify that you can access the Cloudera Manager Admin Console through the load balancer at http://CMSHostname:7180.

TLS and Kerberos Configuration

To configure Cloudera Manager to use TLS encryption or authentication, or to use Kerberos authentication, see TLS and Kerberos Configuration for Cloudera Manager High Availability.

Installing the Secondary

Setting up the Cloudera Manager Server secondary requires copying certain files from the primary to ensure that they are consistently initialized.

  1. On the CMS2 host, install the cloudera-manager-server package by following the instructions in Installing Cloudera Manager, CDH, and Managed Services.
  2. When setting up the database on the secondary, copy the /etc/cloudera-scm-server/db.properties file from host CMS1 to host CMS2 at /etc/cloudera-scm-server/db.properties. For example:
    $ mkdir -p /etc/cloudera-scm-server
    $ scp [<ssh-user>@]CMS1:/etc/cloudera-scm-server/db.properties /etc/cloudera-scm-server/db.properties
  3. If you configured Cloudera Manager TLS encryption or authentication, or Kerberos authentication in your primary installation, see TLS and Kerberos Configuration for Cloudera Manager High Availability for additional configuration steps.
  4. Do not start the cloudera-scm-server service on this host yet, and disable autostart on the secondary so that the service does not start automatically:

    RHEL/CentOS/SUSE:

    $ chkconfig cloudera-scm-server off
    Ubuntu:
    $ update-rc.d -f cloudera-scm-server remove
    (You will also disable autostart on the primary when you configure automatic failover in a later step.) Running the primary and secondary Cloudera Manager Server instances at the same time is not supported and can result in data corruption.
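    If the CMS2 host uses systemd rather than SysV init scripts (for example, RHEL/CentOS 7 or later), the equivalent command to disable autostart is:

    $ systemctl disable cloudera-scm-server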

Testing Failover

Test failover manually by using the following steps:

  1. Stop cloudera-scm-server on your primary host (CMS1):
    $ service cloudera-scm-server stop
  2. Start cloudera-scm-server on your secondary host (CMS2):
    $ service cloudera-scm-server start
  3. Wait a few minutes for the service to load, and then access the Cloudera Manager Admin Console through a web browser, using the load-balanced hostname (for example: http://CMSHostname:CMS_port).

Now, fail back to the primary before configuring the Cloudera Management Service on your installation:

  1. Stop cloudera-scm-server on your secondary machine (CMS2):
    $ service cloudera-scm-server stop
  2. Start cloudera-scm-server on your primary machine (CMS1):
    $ service cloudera-scm-server start
  3. Wait a few minutes for the service to load, and then access the Cloudera Manager Admin Console through a web browser, using the load-balanced hostname (for example: http://CMSHostname:7180).
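To confirm that failback succeeded, you can check the service status on each host; this is an optional check using the same init script as above. CMS1 should report that the server is running and CMS2 should report that it is stopped:

$ service cloudera-scm-server status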

Updating Cloudera Manager Agents to use the Load Balancer

After completing the primary and secondary installation steps listed previously, update the Cloudera Manager Agent configuration on all of the hosts associated with this Cloudera Manager installation, except the MGMT1, MGMT2, CMS1, and CMS2 hosts, to use the load balancer address:

  1. Connect to a shell on each host where CDH processes are installed and running. (The MGMT1, MGMT2, CMS1, and CMS2 hosts do not need to be modified as part of this step.)
  2. Update the /etc/cloudera-scm-agent/config.ini file and change the server_host line:
    server_host = <CMSHostname>
  3. Restart the agent (this command also starts the agent if it is not running):
    $ service cloudera-scm-agent restart
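
If you have many agent hosts, steps 2 and 3 can be scripted. The following loop is a sketch only: it assumes passwordless SSH as root to each agent host, the default config.ini location, and placeholder names host1, host2, and CMSHostname:

$ for host in host1 host2; do
    ssh root@${host} "sed -i 's/^server_host *=.*/server_host=CMSHostname/' /etc/cloudera-scm-agent/config.ini; service cloudera-scm-agent restart"
  done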