This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Before You Install CDH 5 on a Cluster

Note: Running Services

When starting, stopping, and restarting CDH components, always use the service(8) command rather than running the scripts in /etc/init.d directly. The service command sets the current working directory to / and removes most environment variables (passing only LANG and TERM), which creates a predictable environment in which to administer the service. If you run the scripts in /etc/init.d directly, any environment variables you have set remain in force and could produce unpredictable results. (If you install CDH from packages, service is installed as part of the Linux Standard Base (LSB).)
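
The effect described in the note above can be illustrated with a small shell sketch. It is not the real implementation of service(8); the helper name run_clean and the variable HADOOP_OPTS_DEMO are hypothetical and exist only for this demonstration. The sketch assumes a POSIX shell and the standard env utility.

```shell
#!/bin/sh
# run_clean: hypothetical helper that approximates the sanitized environment
# described above. It is an illustration, not the actual service(8) code.
run_clean() {
    # Run from / with an empty environment, passing through only LANG and TERM.
    ( cd / && env -i LANG="${LANG:-C}" TERM="${TERM:-dumb}" "$@" )
}

# A variable exported in the administrator's shell (hypothetical name).
export HADOOP_OPTS_DEMO="leaked-value"

# A command started directly inherits the variable.
direct=$(/bin/sh -c 'echo "${HADOOP_OPTS_DEMO:-unset}"')

# A command started through the sanitized wrapper does not.
clean=$(run_clean /bin/sh -c 'echo "${HADOOP_OPTS_DEMO:-unset}"')

# The wrapper also fixes the working directory to /.
clean_pwd=$(run_clean /bin/sh -c 'pwd')

echo "direct: $direct"
echo "clean:  $clean"
echo "cwd:    $clean_pwd"
```

In day-to-day administration you would not need such a wrapper. A command of the form sudo service hadoop-hdfs-namenode start (the service name is given only as an example) starts the component in the sanitized environment.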

Important:

Upgrading from CDH 4: If you are upgrading from CDH 4, you must first uninstall CDH 4, then install CDH 5; see Upgrading from CDH 4 to CDH 5.

Before you install CDH 5 on a cluster, complete the following steps to prepare your system:

Verify you are using a supported operating system for CDH 5. See CDH 5 Requirements and Supported Versions.
If you haven't already done so, install the Oracle Java Development Kit. For instructions and recommendations, see Java Development Kit Installation.

Important:

On SLES 11 platforms, do not install or try to use the IBM Java version bundled with the SLES distribution; Hadoop will not run correctly with that version. Install the Oracle JDK by following the directions under Java Development Kit Installation.
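
As a rough pre-installation check, the following sketch classifies the JVM found on the PATH by looking for vendor markers in the output of java -version. The helper name classify_jvm is hypothetical, and the marker strings ("IBM J9" and "HotSpot") are assumptions about how these JVMs typically identify themselves. Treat the result as a hint and confirm it against Java Development Kit Installation.

```shell
#!/bin/sh
# classify_jvm: hypothetical helper that labels the text printed by
# `java -version` using assumed vendor markers.
classify_jvm() {
    case "$1" in
        *"IBM J9"*)  echo "ibm" ;;      # assumed marker for the IBM JVM
        *"HotSpot"*) echo "hotspot" ;;  # assumed marker for the Oracle JDK
        *)           echo "unknown" ;;  # anything else, including other JVMs
    esac
}

# java -version writes to stderr, so redirect it to capture the text.
if command -v java >/dev/null 2>&1; then
    jvm=$(classify_jvm "$(java -version 2>&1)")
else
    jvm="missing"
fi

echo "Detected JVM: $jvm"
```

A result of ibm on a SLES 11 host suggests that the bundled IBM Java is the one on the PATH, which is the situation the warning above describes.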

Note:

If you are migrating from MapReduce v1 (MRv1) to MapReduce v2 (MRv2, YARN), see Migrating from MapReduce v1 (MRv1) to MapReduce v2 (MRv2, YARN) for important information and instructions.

High Availability

In CDH 5 you can configure high availability for both the NameNode and the JobTracker (MRv1) or ResourceManager (YARN).

For more information and instructions on setting up a new HA configuration, see the CDH 5 High Availability Guide.

Important:

If you decide to configure HA for the NameNode, do not install hadoop-hdfs-secondarynamenode. After completing the HDFS HA software configuration, follow the installation instructions under Deploying HDFS High Availability.

To upgrade an existing configuration, follow the instructions under Upgrading to CDH 5.
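
Before configuring NameNode HA, you may want to confirm that hadoop-hdfs-secondarynamenode is not already present on the host. The following is a minimal sketch under the assumption that packages are managed with rpm or dpkg; the helper name pkg_installed is hypothetical.

```shell
#!/bin/sh
# pkg_installed: hypothetical helper; returns 0 if the named package is
# reported as installed by rpm or dpkg, non-zero otherwise.
pkg_installed() {
    if command -v rpm >/dev/null 2>&1 && rpm -q "$1" >/dev/null 2>&1; then
        return 0
    fi
    if command -v dpkg-query >/dev/null 2>&1; then
        dpkg-query -W -f='${Status}' "$1" 2>/dev/null \
            | grep -q "install ok installed" && return 0
    fi
    return 1
}

if pkg_installed hadoop-hdfs-secondarynamenode; then
    # The package is present, which conflicts with the guidance for NameNode HA.
    ha_check="conflict"
else
    ha_check="ok"
fi

echo "SecondaryNameNode package check: $ha_check"
```

If neither package tool is available, the helper reports the package as not installed, so the result is only meaningful on hosts that use rpm or dpkg.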

Page generated September 3, 2015.