Installing the Latest CDH 5 Release
To install the latest CDH 5 release, use the following topics.
Ways To Install CDH 5
You can install CDH 5 in any of the following ways:
- Install Cloudera Manager, CDH, and managed services in a Cloudera Manager Deployment.
- Or use one of the manual methods described below:
- Download and install the CDH 5 "1-click Install" package; OR
- Add the CDH 5 repository; OR
- Build your own CDH 5 repository
If you use one of these manual methods rather than Cloudera Manager, the first (downloading and installing the "1-click Install" package) is recommended in most cases because it is simpler than building or adding a repository.
- Install from a CDH 5 tarball; see the next topic, "How Packaging Affects CDH 5 Deployment".
How Packaging Affects CDH 5 Deployment
Installing from Packages
- To install and deploy YARN, follow the directions on this page and proceed with Deploying MapReduce v2 (YARN) on a Cluster.
- To install and deploy MRv1, follow the directions on this page and then proceed with Deploying MapReduce v1 (MRv1) on a Cluster.
Installing from a Tarball
- If you install CDH 5 from a tarball, you will install YARN.
- In CDH 5, there is no separate tarball for MRv1. Instead, the MRv1 binaries, examples, etc., are delivered in the Hadoop tarball itself. The scripts for running MRv1 are in the bin-mapreduce1 directory in the tarball, and the MRv1 examples are in the examples-mapreduce1 directory.
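After unpacking the tarball, a quick check along the following lines could confirm that the two MRv1 directories named above are present. This is only a sketch: the function name has_mrv1_layout is hypothetical, and it assumes both directories sit directly under the top-level directory of the unpacked tarball.

```shell
#!/bin/sh
# Sketch: check that an unpacked Hadoop tarball contains the MRv1
# directories mentioned above. The function name is hypothetical, and
# the directories are assumed to be at the top level of the tarball.
has_mrv1_layout() {
    # $1: path to the top-level directory of the unpacked tarball
    [ -d "$1/bin-mapreduce1" ] && [ -d "$1/examples-mapreduce1" ]
}
```

For example, `has_mrv1_layout /path/to/unpacked/hadoop && echo "MRv1 scripts and examples found"`.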
Before You Begin Installing CDH 5 Manually
- For a list of supported operating systems, see Requirements and Supported Versions.
- These instructions assume that the sudo command is configured on the hosts where you will be doing the installation. If this is not the case, you will need the root user (superuser) to configure it.
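One way to check the sudo assumption on a host before you start is sketched below. The function name sudo_status and its three result labels are illustrative only; the check relies on `sudo -n`, which fails instead of prompting for a password.

```shell
#!/bin/sh
# Sketch: report whether sudo looks usable for the current user.
# "sudo -n" runs non-interactively: it fails instead of prompting.
sudo_status() {
    if ! command -v sudo >/dev/null 2>&1; then
        echo "missing"
    elif sudo -n true >/dev/null 2>&1; then
        echo "ok"
    else
        # Either a password is required or this user is not permitted.
        echo "needs-attention"
    fi
}

sudo_status
```

A result of "needs-attention" does not necessarily mean sudo is misconfigured; it may only mean that a password prompt is required.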
High Availability
- For more information and instructions on setting up a new HA configuration, see High Availability.
Steps to Install CDH 5 Manually
Step 1: Add or Build the CDH 5 Repository or Download the "1-click Install" Package
- If you are installing CDH 5 on a Red Hat system, you can download Cloudera packages using yum or your web browser.
- If you are installing CDH 5 on a SLES system, you can download the Cloudera packages using zypper or YaST or your web browser.
- If you are installing CDH 5 on an Ubuntu or Debian system, you can download the Cloudera packages using apt or your web browser.
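If you are not sure which of the three cases applies to a host, a small probe such as the following may help. It is a sketch under the assumption that only one of the tools is installed; detect_pkg_tool is a hypothetical name, and apt-get stands in for apt because it is the command used later on this page.

```shell
#!/bin/sh
# Sketch: pick the package tool for Step 1 based on what is on PATH.
# The function name is hypothetical; the tools are those listed above.
detect_pkg_tool() {
    for tool in yum zypper apt-get; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            return 0
        fi
    done
    echo "unknown"
    return 1
}

detect_pkg_tool || echo "No supported package tool found" >&2
```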
On Red Hat-compatible Systems
- Download and install the CDH 5 "1-click Install" package OR
- Add the CDH 5 repository OR
- Build a Yum Repository
Do this on all the systems in the cluster.
To download and install the CDH 5 "1-click Install" package:
- Click the entry in the table below that matches your Red Hat or CentOS system, choose Save File, and save the file to a directory to which you have write access (it can be your home directory).
OS Version | Click this Link |
---|---|
Red Hat/CentOS/Oracle 5 | Red Hat/CentOS/Oracle 5 link |
Red Hat/CentOS/Oracle 6 | Red Hat/CentOS/Oracle 6 link |
- Install the RPM. For Red Hat/CentOS/Oracle 5:
$ sudo yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
For Red Hat/CentOS/Oracle 6 (64-bit):
$ sudo yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To add the CDH 5 repository:
Click the entry in the table below that matches your RHEL or CentOS system, navigate to the repo file for your system and save it in the /etc/yum.repos.d/ directory.
For OS Version | Click this Link |
---|---|
RHEL/CentOS/Oracle 5 | |
RHEL/CentOS/Oracle 6 (64-bit) | |
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
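To confirm that the repo file landed in the right place, you could look for files in /etc/yum.repos.d/ that reference the Cloudera archive. The sketch below assumes the repo file contains a URL on archive.cloudera.com, as the URLs elsewhere on this page do; the function name list_cloudera_repo_files is hypothetical.

```shell
#!/bin/sh
# Sketch: list repo files in a directory that point at archive.cloudera.com.
# Defaults to /etc/yum.repos.d, the directory named above.
list_cloudera_repo_files() {
    repo_dir=${1:-/etc/yum.repos.d}
    # -l prints only the names of matching files; errors (for example,
    # no *.repo files at all) are silenced.
    grep -l "archive.cloudera.com" "$repo_dir"/*.repo 2>/dev/null
}
```

The function prints nothing and returns a non-zero status when no matching repo file is found.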
OR: To build a Yum repository:
If you want to create your own yum repository, download the appropriate repo file, create the repo, distribute the repo file and set up a web server, as described under Creating a Local Yum Repository.
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
On SLES Systems
- Download and install the CDH 5 "1-click Install" Package OR
- Add the CDH 5 repository OR
- Build a SLES Repository
To download and install the CDH 5 "1-click Install" package:
- Download the CDH 5 "1-click Install" package.
Click this link, choose Save File, and save it to a directory to which you have write access (for example, your home directory).
- Install the RPM:
$ sudo rpm -i cloudera-cdh-5-0.x86_64.rpm
- Update your system package index by running:
$ sudo zypper refresh
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To add the CDH 5 repository:
- Run the following command:
$ sudo zypper addrepo -f https://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/cloudera-cdh5.repo
- Update your system package index by running:
$ sudo zypper refresh
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To build a SLES repository:
If you want to create your own SLES repository, create a mirror of the CDH SLES directory, and then follow these instructions to create a SLES repository from the mirror.
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
On Ubuntu or Debian Systems
Use one of the following methods to download the CDH 5 repository or package.
- Download and install the CDH 5 "1-click Install" Package OR
- Add the CDH 5 repository OR
- Build a Debian Repository
To download and install the CDH 5 "1-click Install" package:
- Download the CDH 5 "1-click Install" package:
OS Version | Click this Link |
---|---|
Wheezy | Wheezy link |
Precise | Precise link |
Trusty | Trusty link |
- Install the package by doing one of the following:
- Choose Open with in the download window to use the package manager.
- Choose Save File, save the package to a directory to which you have write access (for example, your home directory), and install it from the command line.
For example:
sudo dpkg -i cdh5-repository_1.0_all.deb
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To add the CDH 5 repository:
Create a new file /etc/apt/sources.list.d/cloudera.list with the following contents:
- For Ubuntu systems:
deb [arch=amd64] https://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
deb-src https://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
- For Debian systems:
deb https://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
deb-src https://archive.cloudera.com/cdh5/<OS-release-arch> <RELEASE>-cdh5 contrib
where <OS-release-arch> is debian/wheezy/amd64/cdh or ubuntu/precise/amd64/cdh, and <RELEASE> is the name of your distribution, which you can find by running lsb_release -c.
For example, on Ubuntu Precise:
deb [arch=amd64] https://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh precise-cdh5 contrib
deb-src https://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh precise-cdh5 contrib
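The substitution described above can be sketched as a small helper that prints the file contents for a given distribution. The function name cdh5_sources_list and its argument order are hypothetical; the sketch also assumes the amd64 paths shown above.

```shell
#!/bin/sh
# Sketch: print the contents of cloudera.list for a given distribution.
# Function name and argument order are hypothetical.
cdh5_sources_list() {
    os=$1        # "ubuntu" or "debian"
    release=$2   # codename, e.g. the output of: lsb_release -cs
    base="https://archive.cloudera.com/cdh5/$os/$release/amd64/cdh"
    if [ "$os" = "ubuntu" ]; then
        # The Ubuntu entry restricts the binary line to amd64.
        echo "deb [arch=amd64] $base $release-cdh5 contrib"
    else
        echo "deb $base $release-cdh5 contrib"
    fi
    echo "deb-src $base $release-cdh5 contrib"
}

cdh5_sources_list ubuntu precise
```

To write the file directly, the output could be piped through `sudo tee /etc/apt/sources.list.d/cloudera.list`.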
Additional step for Trusty
This step ensures that you get the right ZooKeeper package for the current CDH release. You need to prioritize the Cloudera repository you have just added, so that you install the CDH version of ZooKeeper rather than the version that is bundled with Ubuntu Trusty. To do this, add the following entry to an apt preferences file (a file under /etc/apt/preferences.d/):
Package: *
Pin: release o=Cloudera, l=Cloudera
Pin-Priority: 501
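A minimal sketch of writing the pin entry to a file follows. The default file name cloudera.pref is an assumption made for illustration, as is the function name write_cloudera_pin; apt reads preference files from /etc/apt/preferences.d/.

```shell
#!/bin/sh
# Sketch: write the pin entry shown above to a preferences file.
# The default file name "cloudera.pref" is an assumption; apt reads
# preference files from /etc/apt/preferences.d/.
write_cloudera_pin() {
    target=${1:-/etc/apt/preferences.d/cloudera.pref}
    cat > "$target" <<'EOF'
Package: *
Pin: release o=Cloudera, l=Cloudera
Pin-Priority: 501
EOF
}
```

Writing to the default location requires root privileges. Afterwards, `apt-cache policy zookeeper` shows which repository the candidate version comes from.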
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
OR: To build a Debian repository:
If you want to create your own apt repository, create a mirror of the CDH Debian directory and then create an apt repository from the mirror.
Now continue with Step 2: Optionally Add a Repository Key, and then choose Step 3: Install CDH 5 with YARN, or Step 4: Install CDH 5 with MRv1; or do both steps if you want to install both implementations.
Step 2: Optionally Add a Repository Key
Before installing YARN or MRv1, you can optionally add a repository key on each system in the cluster. Add the Cloudera Public GPG Key to your repository by running the command for your system:
- For Red Hat/CentOS/Oracle 5 systems:
$ sudo rpm --import https://archive.cloudera.com/cdh5/redhat/5/x86_64/cdh/RPM-GPG-KEY-cloudera
- For Red Hat/CentOS/Oracle 6 systems:
$ sudo rpm --import https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
- For all SLES systems:
$ sudo rpm --import https://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera
- For Ubuntu Precise systems:
$ curl -s https://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -
- For Debian Wheezy systems:
$ curl -s https://archive.cloudera.com/cdh5/debian/wheezy/amd64/cdh/archive.key | sudo apt-key add -
This key enables you to verify that you are downloading genuine packages.
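When scripting this step across mixed hosts, the key URLs above could be looked up by platform, roughly as follows. The platform labels (rhel5, rhel6, sles11, precise, wheezy) and the function name cdh5_key_url are hypothetical shorthand; the URLs are the ones listed above.

```shell
#!/bin/sh
# Sketch: print the Cloudera GPG key URL for a platform label.
# The labels (rhel5, rhel6, sles11, precise, wheezy) are hypothetical
# shorthand; the URLs are the ones listed above.
cdh5_key_url() {
    base="https://archive.cloudera.com/cdh5"
    case "$1" in
        rhel5)   echo "$base/redhat/5/x86_64/cdh/RPM-GPG-KEY-cloudera" ;;
        rhel6)   echo "$base/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera" ;;
        sles11)  echo "$base/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera" ;;
        precise) echo "$base/ubuntu/precise/amd64/cdh/archive.key" ;;
        wheezy)  echo "$base/debian/wheezy/amd64/cdh/archive.key" ;;
        *)       echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

cdh5_key_url rhel6
```

On an RPM-based host the result can be passed straight to the import command, for example `sudo rpm --import "$(cdh5_key_url rhel6)"`.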
Step 3: Install CDH 5 with YARN
To install CDH 5 with YARN:
- Install and deploy ZooKeeper.
Follow instructions under ZooKeeper Installation.
- Install each type of daemon package on the appropriate system(s), as follows.
Where to install | Install commands |
---|---|
Resource Manager host (analogous to MRv1 JobTracker) running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-yarn-resourcemanager |
SLES | sudo zypper clean --all; sudo zypper install hadoop-yarn-resourcemanager |
Ubuntu or Debian | sudo apt-get update; sudo apt-get install hadoop-yarn-resourcemanager |
NameNode host running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-namenode |
SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-namenode |
Ubuntu or Debian | sudo apt-get install hadoop-hdfs-namenode |
Secondary NameNode host (if used) running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-secondarynamenode |
SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-secondarynamenode |
Ubuntu or Debian | sudo apt-get install hadoop-hdfs-secondarynamenode |
All cluster hosts except the Resource Manager running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce |
SLES | sudo zypper clean --all; sudo zypper install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce |
Ubuntu or Debian | sudo apt-get install hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce |
One host in the cluster running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver |
SLES | sudo zypper clean --all; sudo zypper install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver |
Ubuntu or Debian | sudo apt-get install hadoop-mapreduce-historyserver hadoop-yarn-proxyserver |
All client hosts running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-client |
SLES | sudo zypper clean --all; sudo zypper install hadoop-client |
Ubuntu or Debian | sudo apt-get install hadoop-client |
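The role-to-package mapping for YARN can be captured in a small lookup, which may be convenient if you script the installation. The sketch below only prints the install command and does not run it; the role labels and function names are hypothetical, while the package names are the ones given above.

```shell
#!/bin/sh
# Sketch: map a host role to the YARN packages listed above and print
# (not run) the install command. Role labels are hypothetical.
yarn_packages() {
    case "$1" in
        resourcemanager)   echo "hadoop-yarn-resourcemanager" ;;
        namenode)          echo "hadoop-hdfs-namenode" ;;
        secondarynamenode) echo "hadoop-hdfs-secondarynamenode" ;;
        worker)            echo "hadoop-yarn-nodemanager hadoop-hdfs-datanode hadoop-mapreduce" ;;
        history)           echo "hadoop-mapreduce-historyserver hadoop-yarn-proxyserver" ;;
        client)            echo "hadoop-client" ;;
        *)                 echo "unknown role: $1" >&2; return 1 ;;
    esac
}

# $1: role, $2: package tool (yum, zypper, or apt-get)
yarn_install_cmd() {
    pkgs=$(yarn_packages "$1") || return 1
    echo "sudo $2 install $pkgs"
}

yarn_install_cmd worker yum
```

The cache-cleaning commands that precede each install above (for example, sudo yum clean all) are left out of the sketch for brevity.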
Step 4: Install CDH 5 with MRv1
First, install and deploy ZooKeeper. Follow the instructions under ZooKeeper Installation. If you are starting a ZooKeeper ensemble after a fresh install, make sure you create the myid file in the data directory, as instructed.
Next, install packages.
Where to install | Install commands |
---|---|
JobTracker host running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-0.20-mapreduce-jobtracker |
SLES | sudo zypper clean --all; sudo zypper install hadoop-0.20-mapreduce-jobtracker |
Ubuntu or Debian | sudo apt-get update; sudo apt-get install hadoop-0.20-mapreduce-jobtracker |
NameNode host running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-namenode |
SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-namenode |
Ubuntu or Debian | sudo apt-get install hadoop-hdfs-namenode |
Secondary NameNode host (if used) running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-hdfs-secondarynamenode |
SLES | sudo zypper clean --all; sudo zypper install hadoop-hdfs-secondarynamenode |
Ubuntu or Debian | sudo apt-get install hadoop-hdfs-secondarynamenode |
All cluster hosts except the JobTracker, NameNode, and Secondary (or Standby) NameNode hosts running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode |
SLES | sudo zypper clean --all; sudo zypper install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode |
Ubuntu or Debian | sudo apt-get install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode |
All client hosts running: | |
Red Hat/CentOS compatible | sudo yum clean all; sudo yum install hadoop-client |
SLES | sudo zypper clean --all; sudo zypper install hadoop-client |
Ubuntu or Debian | sudo apt-get install hadoop-client |
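For a cluster with many hosts, it can help to generate the per-host commands from an inventory before running anything. The following dry-run sketch reads "host role" pairs and prints one ssh command per host for the MRv1 packages above. The host names, role labels, function names, and the choice of yum are all illustrative assumptions; nothing is executed on any host.

```shell
#!/bin/sh
# Sketch: read "host role" pairs on stdin and print one ssh command per
# host for the MRv1 packages listed above. Nothing is executed.
# Host names, role labels, and the use of yum are illustrative assumptions.
mrv1_packages() {
    case "$1" in
        jobtracker)        echo "hadoop-0.20-mapreduce-jobtracker" ;;
        namenode)          echo "hadoop-hdfs-namenode" ;;
        secondarynamenode) echo "hadoop-hdfs-secondarynamenode" ;;
        worker)            echo "hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode" ;;
        client)            echo "hadoop-client" ;;
        *)                 return 1 ;;
    esac
}

mrv1_plan() {
    while read -r host role; do
        pkgs=$(mrv1_packages "$role") || continue   # skip unknown roles
        echo "ssh $host 'sudo yum install $pkgs'"
    done
}

printf '%s\n' "jt01 jobtracker" "dn01 worker" | mrv1_plan
```

Reviewing the printed plan first makes it easier to spot a host that was assigned the wrong role. Whether sudo works over a non-interactive ssh session depends on how sudo is configured on each host.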
Step 5: (Optional) Install LZO
- Add the repository on each host in the cluster. Follow the instructions for your OS version:
For Red Hat/CentOS/Oracle 5: Navigate to this link and save the file in the /etc/yum.repos.d/ directory.
For Red Hat/CentOS 6: Navigate to this link and save the file in the /etc/yum.repos.d/ directory.
For SLES:
- Run the following command:
$ sudo zypper addrepo -f https://archive.cloudera.com/gplextras5/sles/11/x86_64/gplextras/cloudera-gplextras5.repo
- Update your system package index by running:
$ sudo zypper refresh
For Ubuntu or Debian:
- Navigate to this link and save the file as /etc/apt/sources.list.d/gplextras.list.
- Update your system package index by running:
$ sudo apt-get update
- Install the package on each host as follows:
For OS version | Install commands |
---|---|
Red Hat/CentOS compatible | sudo yum install hadoop-lzo |
SLES | sudo zypper install hadoop-lzo |
Ubuntu or Debian | sudo apt-get install hadoop-lzo |
- Continue with installing and deploying CDH. As part of the deployment, you will need to do some additional configuration for LZO, as shown under Configuring LZO.
Step 6: Deploy CDH and Install Components
Now proceed with: