This is the documentation for Cloudera Manager 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Using Whirr to Launch Cloudera Manager

Cloudera Manager provides an installation wizard that installs Cloudera Manager, CDH, and Impala on a cluster of Amazon Web Services (AWS) EC2 instances. See Installing Cloudera Manager and CDH on EC2. Alternatively, you can use Whirr to start a cluster on Amazon Elastic Compute Cloud (EC2) running Cloudera Manager by following the instructions here.

This method uses Whirr to start a cluster with:

  • One host running the Cloudera Manager Admin Console
  • A user-selectable number of hosts for the Hadoop cluster itself

Once Whirr has started the cluster, you use Cloudera Manager in the usual way.

Step 1: Set your AWS credentials as environment variables

Run the following commands from your local host:

$ export AWS_ACCESS_KEY_ID=...
$ export AWS_SECRET_ACCESS_KEY=...
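
Before moving on, it can help to confirm that both variables are actually set in the current shell. The following is a minimal sketch, not part of Whirr or Cloudera Manager; the function name check_aws_credentials is hypothetical.

```shell
# Hypothetical helper (not part of Whirr): report which of the two
# variables from Step 1 are unset or empty in the current shell.
check_aws_credentials() {
  missing=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; do
    # Indirect lookup through eval keeps this portable across POSIX shells.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      missing=1
    fi
  done
  # Returns 0 when both variables are set, 1 otherwise.
  return "$missing"
}
```

Calling check_aws_credentials prints one message per missing variable and returns a non-zero status if either is missing. It never prints the values themselves.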

Step 2: Install Whirr

Install the CDH repositories and the whirr package. For CDH 4, see the CDH 4 Installation Guide. For CDH 5, see the CDH 5 Installation Guide.

Create environment variables:

$ export WHIRR_HOME=/usr/lib/whirr
$ export PATH=$WHIRR_HOME/bin:$PATH
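
To confirm that the PATH change took effect, you can check whether the shell resolves the whirr command. This is a sketch only; the function name check_whirr_on_path is hypothetical, and it relies solely on the standard command -v shell builtin.

```shell
# Hypothetical check (not part of Whirr): confirm that the whirr launcher
# is reachable through PATH after the exports above.
check_whirr_on_path() {
  if command -v whirr >/dev/null 2>&1; then
    echo "whirr found at $(command -v whirr)"
  else
    echo "whirr not found on PATH; check WHIRR_HOME (${WHIRR_HOME:-unset})" >&2
    return 1
  fi
}
```

If the function reports that whirr was not found, verify that the whirr package is installed and that WHIRR_HOME points to its installation directory.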

Step 3: Create a password-less SSH Key Pair

Create a password-less SSH key pair for Whirr to use:

$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa_cm
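
If a key already exists at that path, ssh-keygen asks whether to overwrite it. The following sketch wraps the same command so that an existing key is left untouched; the function name ensure_whirr_key and its optional path argument are hypothetical and only for illustration.

```shell
# Hypothetical helper (not part of Whirr): create the key pair from Step 3
# only if it does not exist yet. The key path defaults to ~/.ssh/id_rsa_cm.
ensure_whirr_key() {
  key=${1:-$HOME/.ssh/id_rsa_cm}
  if [ -f "$key" ]; then
    # Leave an existing key alone instead of triggering the overwrite prompt.
    echo "Key already exists: $key"
  else
    mkdir -p "$(dirname "$key")"
    # Same options as the command above: RSA key with an empty passphrase.
    ssh-keygen -t rsa -P '' -f "$key"
  fi
}
```

Running ensure_whirr_key with no arguments uses the same key path as the command above.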

Step 4: Get your Whirr-Cloudera-Manager Configuration

You can download a sample Whirr EC2 Cloudera Manager configuration as follows:

$ curl -O https://raw.github.com/cloudera/whirr-cm/master/cm-ec2.properties

To upload a Cloudera Manager license as part of the installation (Cloudera can provide one if you do not have one), place the license in a file called cm-license.txt on the Whirr classpath (for example, in $WHIRR_HOME/conf), using a command such as the following:

$ mv -v eval_acme_20120925_cloudera_enterprise_license.txt $WHIRR_HOME/conf/cm-license.txt

To upload a Cloudera Manager configuration as part of the installation, place the configuration in a file called cm-config.json on the Whirr classpath (for example in $WHIRR_HOME/conf). The format of this file should match the JSON as downloaded from the Cloudera Manager UI. For example:

$ curl -O https://raw.github.com/cloudera/whirr-cm/master/cm-config.json
$ mv -v cm-config.json $WHIRR_HOME/conf/cm-config.json
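
Because both optional files must use the exact names given above, a quick check of the conf directory before launching can catch a misnamed file. This is a sketch under the assumption that $WHIRR_HOME/conf is the classpath location you chose; the function name check_optional_files is hypothetical.

```shell
# Hypothetical helper (not part of Whirr): report which of the optional
# files from Step 4 are present in the given directory.
# The directory defaults to $WHIRR_HOME/conf, as in the examples above.
check_optional_files() {
  conf_dir=${1:-$WHIRR_HOME/conf}
  for f in cm-license.txt cm-config.json; do
    if [ -f "$conf_dir/$f" ]; then
      echo "found $f"
    else
      echo "missing $f (optional)"
    fi
  done
}
```

Both files are optional, so a "missing" line is informational rather than an error.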

Step 5: Launch a Cloudera Manager Cluster

The following command starts a cluster with five Hadoop hosts:

$ whirr launch-cluster --config cm-ec2.properties

  Note:
  • To change the number of hosts, edit the whirr.instance-templates line in the cm-ec2.properties file. For example, to launch a cluster with 20 hosts: whirr.instance-templates=1 cmserver,20 cmagent
  • To add a no-op host to use as a gateway host: whirr.instance-templates=1 cmserver,20 cmagent,1 noop
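
The edit to whirr.instance-templates described in the note can also be scripted. The following is a minimal sketch, assuming the line in your properties file has the "N cmagent" form shown in the note; the function name set_agent_count is hypothetical.

```shell
# Hypothetical helper (not part of Whirr): set the number of cmagent hosts
# on the whirr.instance-templates line of a properties file.
# Assumes the line already contains an entry of the form "<number> cmagent".
set_agent_count() {
  count=$1
  file=$2
  # Change only the number in front of "cmagent" on the matching line.
  # A backup of the original file is kept next to it with a .bak suffix.
  sed -i.bak -E "/^whirr\.instance-templates=/ s/[0-9]+ cmagent/${count} cmagent/" "$file"
}
```

For example, set_agent_count 20 cm-ec2.properties rewrites the agent count to 20 and leaves the cmserver entry and any noop entry unchanged. Run it before whirr launch-cluster, because the setting is read when the cluster is launched.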

Whirr reports progress to the console as it runs. The command exits when the cluster is ready to be used.

Using the Cluster

Once the Hadoop cluster is up and running, you can run jobs from any Cloudera Manager Agent host or from a Cloudera Manager gateway host.

Using a Gateway Host (Optional)

In most cases, you will not need a gateway host, but you may want to consider using one if you want to run jobs on a host that is not also running CDH TaskTracker and DataNode processes. In that case, edit whirr.instance-templates to use the noop option shown in the previous section and launch the cluster. Then follow the Cloudera Manager instructions to add a gateway role on the no-op host. You can find these instructions in the documentation for your version of Cloudera Manager, for example at Role Instances.

Then SSH to the gateway host. Now you can interact with the cluster; for example, to list files in HDFS:

$ hadoop fs -ls /tmp

Shutting Down the Cluster

When you want to shut down the cluster, run the following command.

  Important: All data and state stored on the cluster will be lost.

$ whirr destroy-cluster --config cm-ec2.properties

Page generated September 3, 2015.