Installing CDH 5
To install the latest CDH 5 release, perform the following steps.
Add or Build the CDH 5 Repository or Download the "1-click Install" package.
- If you are installing CDH 5 on a Red Hat system, you can download Cloudera packages using yum or your web browser.
- If you are installing CDH 5 on a SLES system, you can download the Cloudera packages using zypper or YaST or your web browser.
- If you are installing CDH 5 on an Ubuntu or Debian system, you can download the Cloudera packages using apt or your web browser.
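The choice of tool above can be sketched as a small shell helper. This is only an illustration, not part of the Cloudera packages: the function name cdh5_pkg_tool is hypothetical, and it assumes the usual release marker files (/etc/redhat-release, /etc/SuSE-release, /etc/debian_version) are present on the respective systems.

```shell
# Hypothetical helper: guess which package tool applies to this host
# by looking for distribution marker files (an assumption, not a Cloudera tool).
# The optional argument is an alternate root directory, useful for trying it out.
cdh5_pkg_tool() {
  root="${1:-}"
  if [ -f "$root/etc/redhat-release" ]; then
    echo "yum"            # Red Hat-compatible systems
  elif [ -f "$root/etc/SuSE-release" ]; then
    echo "zypper"         # SLES
  elif [ -f "$root/etc/debian_version" ]; then
    echo "apt-get"        # Ubuntu or Debian
  else
    echo "unknown"
    return 1
  fi
}

# Report the tool for the current host; never aborts the script.
echo "Detected package tool: $(cdh5_pkg_tool || true)"
```

If the helper prints unknown, fall back to the section for your operating system below.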
On Red Hat-compatible Systems
Use only one of the three methods.
- Download and install the CDH 5 "1-click Install" package OR
- Add the CDH 5 repository OR
- Build a Yum Repository
Do this on all the systems in the cluster.
To download and install the CDH 5 "1-click Install" package:
- Click the entry in the table below that matches your Red Hat or CentOS system, choose Save File, and save the file to a directory to which you have write access (it can be your home directory).
OS Version | Click this Link |
---|---|
Red Hat/CentOS/Oracle 5 | Red Hat/CentOS/Oracle 5 link |
Red Hat/CentOS/Oracle 6 | Red Hat/CentOS/Oracle 6 link |
- Install the RPM. For Red Hat/CentOS/Oracle 5:
$ sudo yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
For Red Hat/CentOS/Oracle 6 (64-bit):
$ sudo yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To add the CDH 5 repository:
Click the entry in the table below that matches your Red Hat or CentOS system, navigate to the repo file for your system and save it in the /etc/yum.repos.d/ directory.
For OS Version | Click this Link |
---|---|
Red Hat/CentOS/Oracle 5 | |
Red Hat/CentOS/Oracle 6 (64-bit) | |
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To build a Yum repository:
If you want to create your own yum repository, download the appropriate repo file, create the repo, distribute the repo file and set up a web server, as described under Creating a Local Yum Repository.
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
On SLES Systems
Use only one of the three methods.
- Download and install the CDH 5 "1-click Install" package OR
- Add the CDH 5 repository OR
- Build a SLES Repository
To download and install the CDH 5 "1-click Install" package:
- Download the CDH 5 "1-click Install" package.
Click this link, choose Save File, and save it to a directory to which you have write access (it can be your home directory).
- Install the RPM:
$ sudo rpm -i cloudera-cdh-5-0.x86_64.rpm
- Update your system package index by running:
$ sudo zypper refresh
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To add the CDH 5 repository:
- Run the following command:
$ sudo zypper addrepo -f http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/cloudera-cdh5.repo
- Update your system package index by running:
$ sudo zypper refresh
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To build a SLES repository:
If you want to create your own SLES repository, create a mirror of the CDH SLES directory by following these instructions that explain how to create a SLES repository from the mirror.
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
On Ubuntu or Debian Systems
Use only one of the three methods.
- Download and install the CDH 5 "1-click Install" package OR
- Add the CDH 5 repository OR
- Build a Debian Repository
To download and install the CDH 5 "1-click Install" package:
- Download the CDH 5 "1-click Install" package:
OS Version | Click this Link |
---|---|
Wheezy | Wheezy link |
Precise | Precise link |
- Install the package. Do one of the following:
- Choose Open with in the download window to use the package manager.
- Choose Save File, save the package to a directory to which you have write access (it can be your home directory) and install it from the command line, for example:
sudo dpkg -i cdh5-repository_1.0_all.deb
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To add the CDH 5 repository:
Create a new file /etc/apt/sources.list.d/cloudera.list with the following contents:
- For Ubuntu systems:
deb [arch=amd64] http://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
deb-src http://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
- For Debian systems:
deb http://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
deb-src http://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
where: <OS-release-arch> is debian/wheezy/amd64/cdh or ubuntu/precise/amd64/cdh, and <RELEASE> is the name of your distribution, which you can find by running lsb_release -c.
For example, to install CDH 5 for 64-bit Ubuntu Precise:
deb [arch=amd64] http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh precise-cdh5 contrib
deb-src http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh precise-cdh5 contrib
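As a rough sketch, the contents of cloudera.list can be generated from the OS name and release codename instead of being typed by hand. The function name cdh5_apt_source is hypothetical; the URL layout and the [arch=amd64] option for Ubuntu are taken from the entries above, and only the amd64 architecture is assumed.

```shell
# Hypothetical helper: print the two cloudera.list lines for a given
# OS ("ubuntu" or "debian") and release codename (for example, the
# output of: lsb_release -c -s).
cdh5_apt_source() {
  os="$1"
  release="$2"
  base="http://archive.cloudera.com/cdh5/${os}/${release}/amd64/cdh"
  if [ "$os" = "ubuntu" ]; then
    opts="[arch=amd64] "   # the Ubuntu entry above carries this option
  else
    opts=""
  fi
  printf 'deb %s%s %s-cdh5 contrib\n' "$opts" "$base" "$release"
  printf 'deb-src %s %s-cdh5 contrib\n' "$base" "$release"
}

# Print the entry for 64-bit Ubuntu Precise.
cdh5_apt_source ubuntu precise
```

To write the file rather than print it, the output could be piped to sudo tee /etc/apt/sources.list.d/cloudera.list.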
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To build a Debian repository:
If you want to create your own apt repository, create a mirror of the CDH Debian directory and then create an apt repository from the mirror.
Now continue with Step 1a: Optionally Add a Repository Key, and then choose Install CDH 5 with YARN, or Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
Optionally Add a Repository Key
Before installing YARN or MRv1: (Optionally) add a repository key on each system in the cluster. Add the Cloudera Public GPG Key to your repository by executing one of the following commands:
- For Red Hat/CentOS/Oracle 5 systems:
$ sudo rpm --import http://archive.cloudera.com/cdh5/redhat/5/x86_64/cdh/RPM-GPG-KEY-cloudera
- For Red Hat/CentOS/Oracle 6 systems:
$ sudo rpm --import http://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
- For all SLES systems:
$ sudo rpm --import http://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera
- For Ubuntu Precise systems:
$ curl -s http://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -
- For Debian Wheezy systems:
$ curl -s http://archive.cloudera.com/cdh5/debian/wheezy/amd64/cdh/archive.key | sudo apt-key add -
This key enables you to verify that you are downloading genuine packages.
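When the same step is scripted across many hosts, the key URLs listed above can be looked up by platform. The following is a minimal sketch; the function name cdh5_key_url and the platform labels (redhat5, redhat6, sles, precise, wheezy) are hypothetical, while the URLs are the ones given in the commands above.

```shell
# Hypothetical helper: print the Cloudera public key URL for a platform label.
cdh5_key_url() {
  base="http://archive.cloudera.com/cdh5"
  case "$1" in
    redhat5) echo "$base/redhat/5/x86_64/cdh/RPM-GPG-KEY-cloudera" ;;
    redhat6) echo "$base/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera" ;;
    sles)    echo "$base/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera" ;;
    precise) echo "$base/ubuntu/precise/amd64/cdh/archive.key" ;;
    wheezy)  echo "$base/debian/wheezy/amd64/cdh/archive.key" ;;
    *)       echo "unsupported platform: $1" >&2; return 1 ;;
  esac
}

# Print the key URL for Red Hat/CentOS/Oracle 6.
cdh5_key_url redhat6
```

On an RPM-based system the result could then be passed to the import command, for example sudo rpm --import "$(cdh5_key_url redhat6)".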
Install CDH 5 with YARN
Skip this step if you intend to use only MRv1. Directions for installing MRv1 are in Step 3.
To install CDH 5 with YARN:
If you decide to configure HA for the NameNode, do not install hadoop-hdfs-secondarynamenode. After completing the HA software configuration, follow the installation instructions under Deploying HDFS High Availability.
- Install and deploy ZooKeeper.
Important: Cloudera recommends that you install (or update) and start a ZooKeeper cluster before proceeding. This is a requirement if you are deploying high availability (HA) for the NameNode.
Follow instructions under ZooKeeper Installation.
- Install each type of daemon package on the appropriate system(s), as follows.
Where to install | Host OS | Install commands |
---|---|---|
Resource Manager host (analogous to MRv1 JobTracker) | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-yarn-resourcemanager |
Resource Manager host (analogous to MRv1 JobTracker) | SLES | sudo zypper clean --all; sudo zypper install hadoop-yarn-resourcemanager |
Resource Manager host (analogous to MRv1 JobTracker) | Ubuntu or Debian | sudo apt-get update; sudo apt-get install hadoop-yarn-resourcemanager |
NameNode host | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-namenode |
NameNode host | SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-namenode |
NameNode host | Ubuntu or Debian | sudo apt-get install hadoop-hdfs-namenode |
Secondary NameNode host (if used) | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-secondarynamenode |
Secondary NameNode host (if used) | SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-secondarynamenode |
Secondary NameNode host (if used) | Ubuntu or Debian | sudo apt-get install hadoop-hdfs-secondarynamenode |
All cluster hosts except the Resource Manager | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce |
All cluster hosts except the Resource Manager | SLES | sudo zypper clean --all; sudo zypper install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce |
All cluster hosts except the Resource Manager | Ubuntu or Debian | sudo apt-get install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce |
One host in the cluster | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver |
One host in the cluster | SLES | sudo zypper clean --all; sudo zypper install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver |
One host in the cluster | Ubuntu or Debian | sudo apt-get install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver |
All client hosts | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-client |
All client hosts | SLES | sudo zypper clean --all; sudo zypper install hadoop-client |
All client hosts | Ubuntu or Debian | sudo apt-get install hadoop-client |
The hadoop-yarn and hadoop-hdfs packages are installed on each system automatically as dependencies of the other packages.
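For scripted installs, the role-to-package mapping above can be captured in one place. This is a minimal sketch under stated assumptions: the function name cdh5_yarn_packages and the role labels (resourcemanager, namenode, secondarynamenode, worker, history, client) are hypothetical; the package names are the ones listed above.

```shell
# Hypothetical helper: print the CDH 5 YARN packages for a host role.
cdh5_yarn_packages() {
  case "$1" in
    resourcemanager)   echo "hadoop-yarn-resourcemanager" ;;
    namenode)          echo "hadoop-hdfs-namenode" ;;
    secondarynamenode) echo "hadoop-hdfs-secondarynamenode" ;;
    worker)            echo "hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce" ;;
    history)           echo "hadoop-mapreduce-historyserver hadoop-yarn-proxyserver" ;;
    client)            echo "hadoop-client" ;;
    *)                 echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the yum command for a worker host.
echo "sudo yum install $(cdh5_yarn_packages worker)"
```

Printing the command first makes it easy to review what will be installed on each host before running it.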
Install CDH 5 with MRv1
If you are also installing YARN, you can skip any packages you have already installed in Install CDH 5 with YARN.
Skip this step and go to Install CDH 5 with YARN if you intend to use only YARN.
Before proceeding, you need to decide:
- Whether to configure High Availability (HA) for the NameNode and/or JobTracker; see the CDH 5 High Availability Guide for more information and instructions.
- Where to deploy the NameNode, Secondary NameNode, and JobTracker daemons. As a general rule:
- The NameNode and JobTracker run on the same "master" host unless the cluster is large (more than a few tens of nodes), and the master host (or hosts) should not run the Secondary NameNode (if used), DataNode or TaskTracker services.
- In a large cluster, it is especially important that the Secondary NameNode (if used) runs on a separate machine from the NameNode.
- Each node in the cluster except the master host(s) should run the DataNode and TaskTracker services.
If you decide to configure HA for the NameNode, do not install hadoop-hdfs-secondarynamenode. After completing the HA software configuration, follow the installation instructions under Deploying HDFS High Availability.
- Install and deploy ZooKeeper.
Important: Cloudera recommends that you install (or update) and start a ZooKeeper cluster before proceeding. This is a requirement if you are deploying high availability (HA) for the NameNode or JobTracker.
Follow instructions under ZooKeeper Installation.
- Install each type of daemon package on the appropriate system(s), as follows.
Where to install | Host OS | Install commands |
---|---|---|
JobTracker host | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-0.20-mapreduce-jobtracker |
JobTracker host | SLES | sudo zypper clean --all; sudo zypper install hadoop-0.20-mapreduce-jobtracker |
JobTracker host | Ubuntu or Debian | sudo apt-get update; sudo apt-get install hadoop-0.20-mapreduce-jobtracker |
NameNode host | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-namenode |
NameNode host | SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-namenode |
NameNode host | Ubuntu or Debian | sudo apt-get install hadoop-hdfs-namenode |
Secondary NameNode host (if used) | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-secondarynamenode |
Secondary NameNode host (if used) | SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-secondarynamenode |
Secondary NameNode host (if used) | Ubuntu or Debian | sudo apt-get install hadoop-hdfs-secondarynamenode |
All cluster hosts except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode |
All cluster hosts except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts | SLES | sudo zypper clean --all; sudo zypper install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode |
All cluster hosts except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts | Ubuntu or Debian | sudo apt-get install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode |
All client hosts | Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-client |
All client hosts | SLES | sudo zypper clean --all; sudo zypper install hadoop-client |
All client hosts | Ubuntu or Debian | sudo apt-get install hadoop-client |
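The MRv1 install commands follow one pattern per package tool, so a dry-run helper can assemble the command line for a given host. This is a sketch only: the function name cdh5_mrv1_install_cmd and the role labels are hypothetical, and the helper prints the command rather than running it. For apt-get it always includes apt-get update, which the listing above shows only for the first package.

```shell
# Hypothetical helper: print the MRv1 install command for a package tool
# ("yum", "zypper" or "apt-get") and a host role. Nothing is installed.
cdh5_mrv1_install_cmd() {
  tool="$1"
  role="$2"
  case "$role" in
    jobtracker)        pkgs="hadoop-0.20-mapreduce-jobtracker" ;;
    namenode)          pkgs="hadoop-hdfs-namenode" ;;
    secondarynamenode) pkgs="hadoop-hdfs-secondarynamenode" ;;
    worker)            pkgs="hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode" ;;
    client)            pkgs="hadoop-client" ;;
    *)                 echo "unknown role: $role" >&2; return 1 ;;
  esac
  case "$tool" in
    yum)     echo "sudo yum clean all; sudo yum install $pkgs" ;;
    zypper)  echo "sudo zypper clean --all; sudo zypper install $pkgs" ;;
    apt-get) echo "sudo apt-get update; sudo apt-get install $pkgs" ;;
    *)       echo "unknown package tool: $tool" >&2; return 1 ;;
  esac
}

# Show the command for a worker (DataNode and TaskTracker) host on SLES.
cdh5_mrv1_install_cmd zypper worker
```

If you configure HA for the NameNode, skip the secondarynamenode role, as noted earlier in this step.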
(Optional) Install LZO
yum remove hadoop-lzo
- Add the repository on each node in the cluster. Follow the instructions for your OS version:
For OS Version
Do this
Red Hat/CentOS/Oracle 5
Navigate to this link and save the file in the /etc/yum.repos.d/ directory.
Red Hat/CentOS 6
Navigate to this link and save the file in the /etc/yum.repos.d/ directory.
SLES
- Run the following command:
$ sudo zypper addrepo -f http://archive.cloudera.com/gplextras5/sles/11/x86_64/gplextras/cloudera-gplextras5.repo
- Update your system package index by running:
$ sudo zypper refresh
Ubuntu or Debian
Navigate to this link and save the file as /etc/apt/sources.list.d/gplextras.list.
Important: Make sure you do not let the file name default to cloudera.list, as that will overwrite your existing cloudera.list.
- Run the following command:
- Install the package on each node as follows:
For OS version | Install commands |
---|---|
Red Hat/CentOS compatible | sudo yum install hadoop-lzo |
SLES | sudo zypper install hadoop-lzo |
Ubuntu or Debian | sudo apt-get install hadoop-lzo |
- Continue with installing and deploying CDH. As part of the deployment, you will need to do some additional configuration for LZO, as shown under Configuring LZO.
Important: Make sure you do this configuration after you have copied the default configuration files to a custom location and set alternatives to point to it.
Deploy CDH and Install Components
Now proceed with: