Installing Lower Versions of Cloudera Manager 5

When you install Cloudera Manager (for example, by using the installer downloadable from the Cloudera Downloads website), the most recent version is installed by default. This ensures that you install the latest features and bug fixes. In some cases, however, you may want to install a lower version.

For example, you might install a lower version if you want to expand an existing cluster. In this case, follow the instructions in Adding a Host to the Cluster.

You can also add a cluster to be managed by the same instance of Cloudera Manager by using the Add Cluster feature from the Services page in the Cloudera Manager Admin Console. Follow the instructions in Adding a Cluster.

You may also want to install a lower version of the Cloudera Manager Server on a new cluster if, for example, you have validated a specific version and want to deploy that version on additional clusters. Installing an older version of Cloudera Manager requires several manual steps to install and configure the database and the correct version of the Cloudera Manager Server. After completing these steps, run the Installation wizard to complete the installation of Cloudera Manager and CDH.

Before You Begin

Install and Configure Databases

Cloudera Manager Server, Cloudera Management Service, and the Hive metastore data are stored in a database. Install and configure required databases following the instructions in Cloudera Manager and Managed Service Datastores.

(CDH 5 only) On RHEL 5 and CentOS 5, Install Python 2.6 or 2.7

CDH 5 Hue will only work with the default system Python version of the operating system it is being installed on. For example, on RHEL/CentOS 6 you will need Python 2.6 to start Hue.
To install packages from the EPEL repository, download the appropriate repository rpm packages to your machine and then install Python using yum. For example, use the following commands for RHEL 5 or CentOS 5:
$ su -c 'rpm -Uvh http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm'
...
$ yum install python26
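
Before starting Hue, it can help to confirm which Python version a host reports. The following is a minimal sketch, not part of the Cloudera procedure: the function name is_supported_python is hypothetical, and it only checks that a version banner of the form printed by python -V names 2.6 or 2.7, as the heading of this section requires.

```shell
# Hypothetical helper: succeed only if the banner names Python 2.6 or 2.7.
# The banner is expected in the form printed by "python -V", e.g. "Python 2.6.6".
is_supported_python() {
  case "$1" in
    "Python 2.6"|"Python 2.6."*|"Python 2.7"|"Python 2.7."*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (Python 2 prints its version banner on stderr, hence 2>&1).
# Substitute the interpreter you actually installed:
#   is_supported_python "$(python -V 2>&1)" && echo "Python version OK"
```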

Establish Your Cloudera Manager Repository Strategy

  • Download and Edit the Repo File for RHEL-compatible OSs or SLES
    1. Download the Cloudera Manager repo file (cloudera-manager.repo) for your OS version using the links provided on the Cloudera Manager Version and Download Information page. For example, for Red Hat/CentOS 6, the file is located at https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo.
    2. Edit the file to change baseurl to point to the version of Cloudera Manager you want to download. For example, to install Cloudera Manager version 5.0.1, change: baseurl=https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/ to baseurl=https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.0.1/.
    3. Save the edited file:
      • For RHEL or CentOS, save it in /etc/yum.repos.d/.
      • For SLES, save it in /etc/zypp/repos.d.
  • Download and Edit the cloudera.list file for Debian or Apt
    1. Download the Cloudera Manager list file (cloudera.list) using the links provided at Cloudera Manager Version and Download Information. For example, for Ubuntu 10.04 (lucid), this file is located at https://archive.cloudera.com/cm5/ubuntu/lucid/amd64/cm/cloudera.list.
    2. Edit the file to change the second-to-last element to specify the version of Cloudera Manager you want to install. For example, with Ubuntu lucid, if you want to install Cloudera Manager version 5.0.1, change: deb https://archive.cloudera.com/cm5/ubuntu/lucid/amd64/cm lucid-cm5 contrib to deb https://archive.cloudera.com/cm5/ubuntu/lucid/amd64/cm lucid-cm5.0.1 contrib.
    3. Save the edited file in the directory /etc/apt/sources.list.d/.
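
The edits in step 2 of each procedure can be scripted. The following sketch assumes the repo and list files have the layout shown in the examples above; the function names are hypothetical, and 5.0.1 is the example version used in this section. Both functions read the downloaded file on standard input and write the edited file to standard output, so you can review the result before saving it.

```shell
# Hypothetical helper: rewrite the baseurl line of cloudera-manager.repo
# so that the trailing ".../cm/5/" becomes ".../cm/<version>/".
pin_cm_repo_version() {
  version="$1"
  sed "s|^\(baseurl=.*/cm\)/5[^/]*/*\$|\1/${version}/|"
}

# Hypothetical helper: rewrite the second-to-last element of cloudera.list,
# for example "lucid-cm5" becomes "lucid-cm5.0.1".
pin_cm_list_version() {
  version="$1"
  sed "s|-cm5[^ ]* contrib|-cm${version} contrib|"
}

# Example use (paths are the save locations named in the steps above):
#   pin_cm_repo_version 5.0.1 < cloudera-manager.repo | sudo tee /etc/yum.repos.d/cloudera-manager.repo
#   pin_cm_list_version 5.0.1 < cloudera.list | sudo tee /etc/apt/sources.list.d/cloudera.list
```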

Install the Oracle JDK on the Cloudera Manager Server Host

Install the Oracle Java Development Kit (JDK) on the Cloudera Manager Server host. You can install the JDK from a repository, or you can download the JDK from Oracle and install it yourself:

  • Install the JDK from a repository
    The JDK is included in the Cloudera Manager 5 repositories. After downloading and editing the repo or list file, install the JDK as follows:
    OS Command
    RHEL
    $ sudo yum install oracle-j2sdk1.7
    SLES
    $ sudo zypper install oracle-j2sdk1.7
    Ubuntu or Debian
    $ sudo apt-get install oracle-j2sdk1.7
  • Install the JDK manually

    See Java Development Kit Installation.

Install the Cloudera Manager Server Packages

  1. Install the Cloudera Manager Server packages either on the host where the database is installed, or on a host that has access to the database. This host need not be a host in the cluster that you want to manage with Cloudera Manager. On the Cloudera Manager Server host, type the following commands to install the Cloudera Manager packages.
    OS Command
    RHEL, if you have a yum repo configured
    $ sudo yum install cloudera-manager-daemons cloudera-manager-server
    RHEL, if you're manually transferring RPMs
    $ sudo yum --nogpgcheck localinstall cloudera-manager-daemons-*.rpm
    $ sudo yum --nogpgcheck localinstall cloudera-manager-server-*.rpm
    SLES
    $ sudo zypper install cloudera-manager-daemons cloudera-manager-server 
    Ubuntu or Debian
    $ sudo apt-get install cloudera-manager-daemons cloudera-manager-server 
  2. If you choose an Oracle database for use with Cloudera Manager, edit the /etc/default/cloudera-scm-server file on the Cloudera Manager server host. Locate the line that begins with export CM_JAVA_OPTS and change the -Xmx2G option to -Xmx4G.
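
The heap change in step 2 can also be made with sed rather than a text editor. This is a sketch only: the function name raise_cm_heap is hypothetical, and it assumes the file contains a line beginning with export CM_JAVA_OPTS that includes -Xmx2G, as described above. Back up the file before replacing it.

```shell
# Hypothetical helper: on the CM_JAVA_OPTS line only, raise the heap from 2G to 4G.
# Reads the file on stdin and writes the edited content to stdout.
raise_cm_heap() {
  sed '/^export CM_JAVA_OPTS/ s/-Xmx2G/-Xmx4G/'
}

# Example use:
#   sudo cp /etc/default/cloudera-scm-server /etc/default/cloudera-scm-server.bak
#   raise_cm_heap < /etc/default/cloudera-scm-server.bak | sudo tee /etc/default/cloudera-scm-server
```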

Set up a Database for the Cloudera Manager Server

Depending on whether you are using an external database, or the embedded PostgreSQL database, do one of the following:

(Optional) Manually Install the Oracle JDK, Cloudera Manager Agent, and CDH and Managed Service Packages

You can use Cloudera Manager to install the Oracle JDK, Cloudera Manager Agent packages, CDH, and managed service packages or you can install any of these packages manually. To use Cloudera Manager to install the packages, you must meet the requirements described in Cloudera Manager Deployment.

If you are going to use Cloudera Manager to install all of the software, skip this section and continue with Start the Cloudera Manager Server. Otherwise, to manually install the Oracle JDK, Cloudera Manager Agent, and CDH and Managed Services, continue with the procedures linked below and then return to this page to continue the installation. You can choose to manually install any of the following software and, in a later step, Cloudera Manager installs any software that you do not install manually:

Manually Install the Oracle JDK

You can use Cloudera Manager to install the Oracle JDK on all cluster hosts or you can install the JDKs manually. If you choose to have Cloudera Manager install the JDKs, skip this section. To use Cloudera Manager to install the JDK, you must meet the requirements described in Cloudera Manager Deployment.

Install the Oracle JDK on every cluster host. For more information, see Java Development Kit Installation.

Manually Install Cloudera Manager Agent Packages

The Cloudera Manager Agent is responsible for starting and stopping processes, unpacking configurations, triggering installations, and monitoring all hosts in a cluster. You can install the Cloudera Manager agent manually on all hosts, or Cloudera Manager can install the Agents in a later step. To use Cloudera Manager to install the agents, skip this section and continue with

To install the Cloudera Manager Agent packages manually, do the following on every cluster host (including those that will run one or more of the Cloudera Management Service roles: Service Monitor, Activity Monitor, Event Server, Alert Publisher, or Reports Manager):
  1. Use one of the following commands to install the Cloudera Manager Agent packages:
    OS Command
    RHEL, if you have a yum repo configured:
    $ sudo yum install cloudera-manager-agent cloudera-manager-daemons
    RHEL, if you're manually transferring RPMs:
    $ sudo yum --nogpgcheck localinstall cloudera-manager-agent-package.*.x86_64.rpm cloudera-manager-daemons
    SLES
    $ sudo zypper install cloudera-manager-agent cloudera-manager-daemons
    Ubuntu or Debian
    $ sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
  2. On every cluster host, configure the Cloudera Manager Agent to point to the Cloudera Manager Server by setting the following properties in the /etc/cloudera-scm-agent/config.ini configuration file:
    Property     Description
    server_host  Name of the host where Cloudera Manager Server is running.
    server_port  Port on the host where Cloudera Manager Server is running.
    For more information on Agent configuration options, see Agent Configuration File.
  3. Start the Agents by running the following command on all hosts:
    sudo service cloudera-scm-agent start

When the Agent starts, it contacts the Cloudera Manager Server. If communication fails between a Cloudera Manager Agent and Cloudera Manager Server, see Troubleshooting Installation and Upgrade Problems. When the Agent hosts reboot, cloudera-scm-agent starts automatically.
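
When there are many hosts, the config.ini change in step 2 can be applied with sed. The following is a sketch under stated assumptions: the function name set_agent_server is hypothetical, cm01.example.com is a placeholder hostname, and the file is assumed to already contain server_host= and server_port= lines that start at the beginning of a line.

```shell
# Hypothetical helper: set server_host and server_port in an Agent config.ini.
# Reads the file on stdin and writes the edited content to stdout.
set_agent_server() {
  host="$1"
  port="$2"
  sed -e "s|^server_host=.*|server_host=${host}|" \
      -e "s|^server_port=.*|server_port=${port}|"
}

# Example use (keep a backup, then restart the Agent so it rereads the file):
#   sudo cp /etc/cloudera-scm-agent/config.ini /etc/cloudera-scm-agent/config.ini.bak
#   set_agent_server cm01.example.com 7182 < /etc/cloudera-scm-agent/config.ini.bak \
#     | sudo tee /etc/cloudera-scm-agent/config.ini
#   sudo service cloudera-scm-agent restart
```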

Manually Install CDH and Managed Service Packages

The CDH and Managed Service Packages contain all of the CDH software. You can choose to manually install CDH and the Managed Service Packages, or you can choose to let Cloudera Manager perform this installation in a later step. To have Cloudera Manager perform the installation, continue with Start the Cloudera Manager Server. Otherwise, follow the steps in (Optional) Manually Install CDH and Managed Service Packages and then return to this page to continue the installation.

Install CDH and Managed Service Packages

Choose a Repository Strategy

To install CDH and Managed Service Packages, choose one of the following repository strategies:

  • Standard Cloudera repositories. For this method, ensure you have added the required repository information to your systems.
  • Internally hosted repositories. You might use internal repositories for environments where hosts do not have access to the Internet. For information about preparing your environment, see Understanding Custom Installation Solutions. When using an internal repository, you must copy the repo or list file to the Cloudera Manager Server host and update the repository properties to point to internal repository URLs.

Install CDH 5 and Managed Service Packages

Install the packages on all cluster hosts using the following steps:

  • Red Hat
    1. Download and install the "1-click Install" package.
      1. Download the CDH 5 "1-click Install" package (or RPM).

        Click the appropriate RPM and Save File to a directory with write access (for example, your home directory).

        OS Version Link to CDH 5 RPM
        RHEL/CentOS/Oracle 5 RHEL/CentOS/Oracle 5 link
        RHEL/CentOS/Oracle 6 RHEL/CentOS/Oracle 6 link
        RHEL/CentOS/Oracle 7 RHEL/CentOS/Oracle 7 link
      2. Install the RPM for all RHEL versions:
        $ sudo yum --nogpgcheck localinstall cloudera-cdh-5-0.x86_64.rpm 
    2. (Optionally) add a repository key:
      • Red Hat/CentOS/Oracle 5
        $ sudo rpm --import https://archive.cloudera.com/cdh5/redhat/5/x86_64/cdh/RPM-GPG-KEY-cloudera
      • Red Hat/CentOS/Oracle 6
        $ sudo rpm --import https://archive.cloudera.com/cdh5/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
    3. Install the CDH packages:
      $ sudo yum clean all
      $ sudo yum install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-core spark-master spark-worker spark-python sqoop sqoop2 whirr
  • SLES
    1. Download and install the "1-click Install" package.
      1. Download the CDH 5 "1-click Install" package.

        Download the rpm file, choose Save File, and save it to a directory to which you have write access (for example, your home directory).

      2. Install the RPM:
        $ sudo rpm -i cloudera-cdh-5-0.x86_64.rpm
      3. Update your system package index by running:
        $ sudo zypper refresh
    2. (Optionally) add a repository key:
      $ sudo rpm --import https://archive.cloudera.com/cdh5/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera
    3. Install the CDH packages:
      $ sudo zypper clean --all
      $ sudo zypper install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-core spark-master spark-worker spark-python sqoop sqoop2 whirr
  • Ubuntu and Debian
    1. Download and install the "1-click Install" package
      1. Download the CDH 5 "1-click Install" package:
        OS Version Package Link
        Jessie Jessie package
        Wheezy Wheezy package
        Precise Precise package
        Trusty Trusty package
      2. Install the package by doing one of the following:
        • Choose Open with in the download window to use the package manager.
        • Choose Save File, save the package to a directory to which you have write access (for example, your home directory), and install it from the command line. For example:
          sudo dpkg -i cdh5-repository_1.0_all.deb
    2. Optionally add a repository key:
      • Debian Wheezy
        $ curl -s https://archive.cloudera.com/cdh5/debian/wheezy/amd64/cdh/archive.key | sudo apt-key add -
      • Ubuntu Precise
        $ curl -s https://archive.cloudera.com/cdh5/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -
    3. Install the CDH packages:
      $ sudo apt-get update
      $ sudo apt-get install avro-tools crunch flume-ng hadoop-hdfs-fuse hadoop-hdfs-nfs3 hadoop-httpfs hadoop-kms hbase-solr hive-hbase hive-webhcat hue-beeswax hue-hbase hue-impala hue-pig hue-plugins hue-rdbms hue-search hue-spark hue-sqoop hue-zookeeper impala impala-shell kite llama mahout oozie pig pig-udf-datafu search sentry solr-mapreduce spark-core spark-master spark-worker spark-python sqoop sqoop2 whirr
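
The three procedures above follow the same pattern: refresh the package metadata, then install. As a rough sketch, a helper can print the matching pair of commands so that you can review them before running them on each host. The function name and the family labels rhel, sles, and debian are hypothetical; the commands themselves are the ones listed in the steps above.

```shell
# Hypothetical helper: print the refresh and install commands for an OS family.
# Usage: cdh_install_commands <rhel|sles|debian> <package> [<package> ...]
cdh_install_commands() {
  family="$1"
  shift
  case "$family" in
    rhel)   printf 'sudo yum clean all\nsudo yum install %s\n' "$*" ;;
    sles)   printf 'sudo zypper clean --all\nsudo zypper install %s\n' "$*" ;;
    debian) printf 'sudo apt-get update\nsudo apt-get install %s\n' "$*" ;;
    *)      echo "unknown OS family: $family" >&2; return 1 ;;
  esac
}

# Example use:
#   cdh_install_commands sles impala impala-shell
```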

Install CDH 4, Impala, and Solr Managed Service Packages

Install the packages on all cluster hosts using the following steps:

  • RHEL-compatible
    1. Click the entry in the table at CDH Download Information that matches your RHEL or CentOS system.
    2. Go to the repo file (cloudera-cdh4.repo) for your system and save it in the /etc/yum.repos.d/ directory.
    3. Optionally add a repository key:
      • RHEL/CentOS/Oracle 5
        $ sudo rpm --import https://archive.cloudera.com/cdh4/redhat/5/x86_64/cdh/RPM-GPG-KEY-cloudera
      • RHEL/CentOS 6
        $ sudo rpm --import https://archive.cloudera.com/cdh4/redhat/6/x86_64/cdh/RPM-GPG-KEY-cloudera
    4. Install packages on every host in your cluster:
      1. Install CDH 4 packages:
        $ sudo yum -y install bigtop-utils bigtop-jsvc bigtop-tomcat hadoop hadoop-hdfs hadoop-httpfs hadoop-mapreduce hadoop-yarn hadoop-client hadoop-0.20-mapreduce hue-plugins hbase hive oozie oozie-client pig zookeeper
      2. To install the hue-common package and all Hue applications on the Hue host, install the hue meta-package:
        $ sudo yum install hue 
    5. (Requires CDH 4.2 and higher) Install Impala
      1. In the table at Cloudera Impala Version and Download Information, click the entry that matches your RHEL or CentOS system.
      2. Go to the repo file for your system and save it in the /etc/yum.repos.d/ directory.
      3. Install Impala and the Impala Shell on Impala machines:
        $ sudo yum -y install impala impala-shell
    6. (Requires CDH 4.3 and higher) Install Search
      1. In the table at Cloudera Search Version and Download Information, click the entry that matches your RHEL or CentOS system.
      2. Go to the repo file for your system and save it in the /etc/yum.repos.d/ directory.
      3. Install the Solr Server on machines where you want Cloudera Search.
        $ sudo yum -y install solr-server
  • SLES
    1. Run the following command:
      $ sudo zypper addrepo -f https://archive.cloudera.com/cdh4/sles/11/x86_64/cdh/cloudera-cdh4.repo
    2. Update your system package index by running:
      $ sudo zypper refresh
    3. Optionally add a repository key:
      $ sudo rpm --import https://archive.cloudera.com/cdh4/sles/11/x86_64/cdh/RPM-GPG-KEY-cloudera  
    4. Install packages on every host in your cluster:
      1. Install CDH 4 packages:
        $ sudo zypper install bigtop-utils bigtop-jsvc bigtop-tomcat hadoop hadoop-hdfs hadoop-httpfs hadoop-mapreduce hadoop-yarn hadoop-client hadoop-0.20-mapreduce hue-plugins hbase hive oozie oozie-client pig zookeeper
      2. To install the hue-common package and all Hue applications on the Hue host, install the hue meta-package:
        $ sudo zypper install hue 
      3. (Requires CDH 4.2 and higher) Install Impala
        1. Run the following command:
          $ sudo zypper addrepo -f https://archive.cloudera.com/impala/sles/11/x86_64/impala/cloudera-impala.repo
        2. Install Impala and the Impala Shell on Impala machines:
          $ sudo zypper install impala impala-shell
      4. (Requires CDH 4.3 and higher) Install Search
        1. Run the following command:
          $ sudo zypper addrepo -f https://archive.cloudera.com/search/sles/11/x86_64/search/cloudera-search.repo
        2. Install the Solr Server on machines where you want Cloudera Search.
          $ sudo zypper install solr-server
  • Ubuntu or Debian
    1. In the table at CDH Version and Packaging Information, click the entry that matches your Ubuntu or Debian system.
    2. Go to the list file (cloudera.list) for your system and save it in the /etc/apt/sources.list.d/ directory. For example, to install CDH 4 for 64-bit Ubuntu Lucid, your cloudera.list file should look like:
      deb [arch=amd64] https://archive.cloudera.com/cdh4/ubuntu/lucid/amd64/cdh lucid-cdh4 contrib
      deb-src https://archive.cloudera.com/cdh4/ubuntu/lucid/amd64/cdh lucid-cdh4 contrib
    3. Optionally add a repository key:
      • Ubuntu Lucid
        $ curl -s https://archive.cloudera.com/cdh4/ubuntu/lucid/amd64/cdh/archive.key | sudo apt-key add -
      • Ubuntu Precise
        $ curl -s https://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -
      • Debian Squeeze
        $ curl -s https://archive.cloudera.com/cdh4/debian/squeeze/amd64/cdh/archive.key | sudo apt-key add -
    4. Install packages on every host in your cluster:
      1. Install CDH 4 packages:
        $ sudo apt-get install bigtop-utils bigtop-jsvc bigtop-tomcat hadoop hadoop-hdfs hadoop-httpfs hadoop-mapreduce hadoop-yarn hadoop-client hadoop-0.20-mapreduce hue-plugins hbase hive oozie oozie-client pig zookeeper
      2. To install the hue-common package and all Hue applications on the Hue host, install the hue meta-package:
        $ sudo apt-get install hue 
      3. (Requires CDH 4.2 and higher) Install Impala
        1. In the table at Cloudera Impala Version and Download Information, click the entry that matches your Ubuntu or Debian system.
        2. Go to the list file for your system and save it in the /etc/apt/sources.list.d/ directory.
        3. Install Impala and the Impala Shell on Impala machines:
          $ sudo apt-get install impala impala-shell
      4. (Requires CDH 4.3 and higher) Install Search
        1. In the table at Cloudera Search Version and Download Information, click the entry that matches your Ubuntu or Debian system.
        2. Install Solr Server on machines where you want Cloudera Search:
          $ sudo apt-get install solr-server

Start the Cloudera Manager Server

  1. Run this command on the Cloudera Manager Server host:
    sudo service cloudera-scm-server start

    If the Cloudera Manager Server does not start, see Troubleshooting Installation and Upgrade Problems.

Start the Cloudera Manager Agents

If you are using Cloudera Manager to install the Cloudera Manager Agent packages, skip this section. Otherwise, run the following command on each Agent host:
sudo service cloudera-scm-agent start
When the Agent starts, it contacts the Cloudera Manager Server. If communication fails between a Cloudera Manager Agent and Cloudera Manager Server, see Troubleshooting Installation and Upgrade Problems.

When the Agent hosts reboot, cloudera-scm-agent starts automatically.

Start and Log into the Cloudera Manager Admin Console

The Cloudera Manager Server URL takes the following form: http://Server host:port, where Server host is the fully qualified domain name or IP address of the host where the Cloudera Manager Server is installed, and port is the port configured for the Cloudera Manager Server. The default port is 7180.
  1. Wait several minutes for the Cloudera Manager Server to start. To observe the startup process, run tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log on the Cloudera Manager Server host. If the Cloudera Manager Server does not start, see Troubleshooting Installation and Upgrade Problems.
  2. In a web browser, enter http://Server host:7180, where Server host is the fully qualified domain name or IP address of the host where the Cloudera Manager Server is running.

    The login screen for Cloudera Manager Admin Console displays.

  3. Log into Cloudera Manager Admin Console. The default credentials are: Username: admin Password: admin. Cloudera Manager does not support changing the admin username for the installed account. You can change the password using Cloudera Manager after you run the installation wizard. Although you cannot change the admin username, you can add a new user, assign administrative privileges to the new user, and then delete the default admin account.
  4. After logging in, the Cloudera Manager End User License Terms and Conditions page displays. Read the terms and conditions and then select Yes to accept them.
  5. Click Continue.

    The Welcome to Cloudera Manager page displays.
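
Instead of watching the log in step 1, you can poll the Admin Console URL until it answers. The sketch below is illustrative only: the function names, the 10-second interval, and the limit of 60 attempts are assumptions, and cm01.example.com is a placeholder. It builds the URL in the form described above, using 7180 when no port is given.

```shell
# Hypothetical helper: build the Admin Console URL; the port defaults to 7180.
cm_admin_url() {
  printf 'http://%s:%s\n' "$1" "${2:-7180}"
}

# Hypothetical helper: poll the URL until the server answers or attempts run out.
wait_for_cm() {
  url="$(cm_admin_url "$@")"
  tries=0
  until curl -s -o /dev/null --max-time 5 "$url"; do
    tries=$((tries + 1))
    if [ "$tries" -ge 60 ]; then
      echo "Cloudera Manager Server did not answer at $url" >&2
      return 1
    fi
    sleep 10
  done
  echo "Cloudera Manager Server is answering at $url"
}

# Example use:
#   wait_for_cm cm01.example.com
```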

Choose Cloudera Manager Edition

From the Welcome to Cloudera Manager page, you can select the edition of Cloudera Manager to install and, optionally, install a license:

  1. Choose which edition to install:
    • Cloudera Express, which does not require a license, but provides a limited set of features.
    • Cloudera Enterprise Enterprise Data Hub Edition Trial, which does not require a license, but expires after 60 days and cannot be renewed.
    • Cloudera Enterprise with one of the following license types:
      • Basic Edition
      • Flex Edition
      • Enterprise Data Hub Edition
    If you choose Cloudera Express or Cloudera Enterprise Enterprise Data Hub Edition Trial, you can upgrade the license at a later time. See Managing Licenses.
  2. If you choose Cloudera Enterprise, install a license:
    1. Click Upload License.
    2. Click the document icon to the left of the Select a License File text field.
    3. Go to the location of your license file, click the file, and click Open.
    4. Click Upload.
  3. Information is displayed indicating what the CDH installation includes. At this point, you can click the Support drop-down menu to access online Help or the Support Portal.
  4. Click Continue to proceed with the installation.

Choose Cloudera Manager Hosts

Choose which hosts will run CDH and managed services.

  1. Do one of the following depending on whether you are using Cloudera Manager to install software:
    • If you are using Cloudera Manager to install software, search for and choose hosts:
      1. To enable Cloudera Manager to automatically discover hosts on which to install CDH and managed services, enter the cluster hostnames or IP addresses. You can also specify hostname and IP address ranges. For example:
        Range Definition          Matching Hosts
        10.1.1.[1-4]              10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
        host[1-3].company.com     host1.company.com, host2.company.com, host3.company.com
        host[07-10].company.com   host07.company.com, host08.company.com, host09.company.com, host10.company.com

        You can specify multiple addresses and address ranges by separating them with commas, semicolons, tabs, or blank spaces, or by placing them on separate lines. Use this technique to make more specific searches instead of searching overly wide ranges. The scan results will include all addresses scanned, but only scans that reach hosts running SSH will be selected for inclusion in your cluster by default. If you do not know the IP addresses of all of the hosts, you can enter an address range that spans over unused addresses and then clear the hosts that do not exist (and are not discovered) later in this procedure. However, keep in mind that wider ranges will require more time to scan.

      2. Click Search. Cloudera Manager identifies the hosts on your cluster to allow you to configure them for services. If there are a large number of hosts on your cluster, wait a few moments to allow them to be discovered and shown in the wizard. If the search is taking too long, you can stop the scan by clicking Abort Scan. To find additional hosts, click New Search, add the host names or IP addresses and click Search again. Cloudera Manager scans hosts by checking for network connectivity. If there are some hosts where you want to install services that are not shown in the list, make sure you have network connectivity between the Cloudera Manager Server host and those hosts. Common causes of loss of connectivity are firewalls and interference from SELinux.
      3. Verify that the number of hosts shown matches the number of hosts where you want to install services. Clear host entries that do not exist and clear the hosts where you do not want to install services.
    • If you installed Cloudera Manager Agent packages in Manually Install Cloudera Manager Agent Packages, choose from among hosts with the packages installed:
      1. Click the Currently Managed Hosts tab.
      2. Choose the hosts to add to the cluster.
  2. Click Continue.

    The Cluster Installation Select Repository screen displays.
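
To preview which hosts a range definition covers before entering it in the wizard, you can expand it locally. The following sketch mimics the behavior shown in the range table in step 1; it is not Cloudera Manager code. The function name is hypothetical, it handles a single [start-end] range per pattern, and it assumes zero padding follows the width of the start value, which is consistent with the host[07-10] example.

```shell
# Hypothetical helper: expand one [start-end] range in a host pattern.
# Patterns without a range are printed unchanged.
expand_host_range() {
  pattern="$1"
  case "$pattern" in
    *\[*-*\]*) ;;
    *) printf '%s\n' "$pattern"; return 0 ;;
  esac
  prefix=${pattern%%\[*}   # text before the first "["
  rest=${pattern#*\[}      # text after the first "["
  range=${rest%%\]*}       # "start-end"
  suffix=${rest#*\]}       # text after the first "]"
  first=${range%%-*}
  last=${range#*-}
  width=${#first}          # pad numbers to the width of the start value
  # Strip leading zeros so the shell does not treat the numbers as octal.
  i=$(printf '%s\n' "$first" | sed 's/^0*//')
  end=$(printf '%s\n' "$last" | sed 's/^0*//')
  i=${i:-0}
  end=${end:-0}
  while [ "$i" -le "$end" ]; do
    printf "%s%0${width}d%s\n" "$prefix" "$i" "$suffix"
    i=$((i + 1))
  done
}

# Example use:
#   expand_host_range 'host[07-10].company.com'
```

For the example shown, the function prints host07.company.com through host10.company.com, one per line.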

Choose the Software Installation Type and Install Software

Choose a software installation type (parcels or packages) and install the software. If you have already installed the CDH and Managed Service packages, you cannot choose Parcel installation.

  1. Choose the software installation type and CDH and managed service version:
    • Use Parcels
      1. Choose the parcels to install. The choices depend on the repositories you have chosen; a repository can contain multiple parcels. Only the parcels for the latest supported service versions are configured by default.
        You can add additional parcels for lower versions by specifying custom repositories. For example, you can find the locations of the lower CDH 4 parcels at https://username:password@archive.cloudera.com/p/cdh4/parcels/. Or, if you are installing CDH 4.3 and want to use policy-file authorization, you can add the Sentry parcel using this mechanism.
        1. To specify the parcel directory, specify the local parcel repository, add a parcel repository, or specify the properties of a proxy server through which parcels are downloaded, click the More Options button and do one or more of the following:
          • Parcel Directory and Local Parcel Repository Path - Specify the location of parcels on cluster hosts and the Cloudera Manager Server host. If you change the default value for Parcel Directory and have already installed and started Cloudera Manager Agents, restart the Agents:
            sudo service cloudera-scm-agent restart
          • Parcel Repository - In the Remote Parcel Repository URLs field, click the button and enter the URL of the repository. The URL you specify is added to the list of repositories listed in the Configuring Cloudera Manager Server Parcel Settings page and a parcel is added to the list of parcels on the Select Repository page. If you have multiple repositories configured, you see all the unique parcels contained in all your repositories.
          • Proxy Server - Specify the properties of a proxy server.
        2. Click OK.
      2. If you are using Cloudera Manager to install software, select the release of Cloudera Manager Agent. You can choose either the version that matches the Cloudera Manager Server you are currently using or specify a version in a custom repository. If you opted to use custom repositories for installation files, you can provide a GPG key URL that applies for all repositories.
    • Use Packages (not supported when using DSSD DataNodes) - Do one of the following:
      • If Cloudera Manager is installing the packages:
        1. Click the package version.
        2. If you are using Cloudera Manager to install software, select the release of Cloudera Manager Agent. You can choose either the version that matches the Cloudera Manager Server you are currently using or specify a version in a custom repository. If you opted to use custom repositories for installation files, you can provide a GPG key URL that applies for all repositories.
      • If you manually installed packages in Manually Install CDH and Managed Service Packages, select the CDH version (CDH 4 or CDH 5) that matches the packages you installed manually.
  2. If you installed the Agent and JDK manually on all cluster hosts:
    • Click Continue.

      The Host Inspector runs to validate the installation and provides a summary of what it finds, including all the versions of the installed components. If the validation is successful, click Finish.

    • Skip the remaining steps in this section and continue with Add Services.
  3. Select Install Oracle Java SE Development Kit (JDK) to allow Cloudera Manager to install the JDK on each cluster host. If you have already installed the JDK, do not select this option. If your local laws permit you to deploy unlimited strength encryption, and you are running a secure cluster, select the Install Java Unlimited Strength Encryption Policy Files checkbox.
  4. (Optional) Select Single User Mode to configure the Cloudera Manager Agent and all service processes to run as the same user. This mode requires extra configuration steps that must be done manually on all hosts in the cluster. If you have not performed the steps, directory creation will fail in the installation wizard. In most cases, you can create the directories but the steps performed by the installation wizard may have to be continued manually. Click Continue.
  5. If you chose to have Cloudera Manager install software, specify host installation properties:
    • Select root or enter the username for an account that has password-less sudo permission.
    • Select an authentication method:
      • If you choose password authentication, enter and confirm the password.
      • If you choose public-key authentication, provide a passphrase and path to the required key files.
    • You can specify an alternate SSH port. The default value is 22.
    • You can specify the maximum number of host installations to run at once. The default value is 10.

    The root password (or any password used at this step) is not saved in Cloudera Manager or CDH. You can change these passwords after install without any impact to Cloudera Manager or CDH.

  6. Click Continue. If you chose to have Cloudera Manager install software, Cloudera Manager installs the Oracle JDK, Cloudera Manager Agent packages, and CDH and managed service parcels or packages. During parcel installation, progress is indicated for the phases of the parcel installation process in separate progress bars. If you are installing multiple parcels, you see progress bars for each parcel. When the Continue button at the bottom of the screen turns blue, the installation process is completed.
  7. Click Continue.

    The Host Inspector runs to validate the installation and provides a summary of what it finds, including all the versions of the installed components. If the validation is successful, click Finish.

Add Services

  1. In the first page of the Add Services wizard, choose the combination of services to install and whether to install Cloudera Navigator:
    • Select the combination of services to install:
      CDH 4
      • Core Hadoop - HDFS, MapReduce, ZooKeeper, Oozie, Hive, and Hue
      • Core with HBase
      • Core with Impala
      • All Services - HDFS, MapReduce, ZooKeeper, HBase, Impala, Oozie, Hive, Hue, and Sqoop
      • Custom Services - Any combination of services.
      CDH 5
      • Core Hadoop - HDFS, YARN (includes MapReduce 2), ZooKeeper, Oozie, Hive, and Hue
      • Core with HBase
      • Core with Impala
      • Core with Search
      • Core with Spark
      • All Services - HDFS, YARN (includes MapReduce 2), ZooKeeper, Oozie, Hive, Hue, HBase, Impala, Solr, Spark, and Key-Value Store Indexer
      • Custom Services - Any combination of services.
      As you select services, keep the following in mind:
      • Some services depend on other services; for example, HBase requires HDFS and ZooKeeper. Cloudera Manager tracks dependencies and installs the correct combination of services.
      • In a Cloudera Manager deployment of a CDH 4 cluster, the MapReduce service is the default MapReduce computation framework. Choose Custom Services to install YARN, or use the Add Service functionality to add YARN after installation completes.
      • In a Cloudera Manager deployment of a CDH 5 cluster, the YARN service is the default MapReduce computation framework. Choose Custom Services to install MapReduce, or use the Add Service functionality to add MapReduce after installation completes.
      • The Flume service can be added only after your cluster has been set up.
    • If you have chosen Enterprise Data Hub Edition Trial or Cloudera Enterprise, optionally select the Include Cloudera Navigator checkbox to enable Cloudera Navigator. See Cloudera Navigator 2 Overview.
  2. Click Continue.
  3. Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to which the HDFS DataNode role is assigned. You can reassign role instances if necessary.

    Click a field below a role to display a dialog box containing a list of hosts. If you click a field containing multiple hosts, you can also select All Hosts to assign the role to all hosts, or Custom to display the pageable hosts dialog box.

    The following shortcuts for specifying hostname patterns are supported:
    • Range of hostnames (without the domain portion)
      Range Definition           Matching Hosts
      10.1.1.[1-4]               10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4
      host[1-3].company.com      host1.company.com, host2.company.com, host3.company.com
      host[07-10].company.com    host07.company.com, host08.company.com, host09.company.com, host10.company.com
    • IP addresses
    • Rack name

    Click the View By Host button for an overview of the role assignment by hostname ranges.

  4. When you are satisfied with the assignments, click Continue.
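
The hostname range shortcuts listed in step 3 can be previewed before you type them into the wizard. The following Python sketch expands a bracketed range the way the examples in the table suggest. It is an illustration only, not Cloudera Manager's own parser: the function name expand_host_range is hypothetical, and the rule that zero padding is kept only when the start value is written with a leading zero is an assumption inferred from the host[07-10] example.

```python
import re
from typing import List

# Matches one bracketed numeric range such as [1-4] or [07-10].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_host_range(pattern: str) -> List[str]:
    """Expand every [start-end] range in a hostname or IP pattern.

    Hypothetical helper for illustration; the real wizard may apply
    different rules for padding or invalid input.
    """
    match = _RANGE.search(pattern)
    if match is None:
        # No range left to expand: the pattern is a literal host.
        return [pattern]

    start_text, end_text = match.group(1), match.group(2)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError(f"range start exceeds end in {pattern!r}")

    # Assumption: pad with zeros only if the start was written padded (e.g. 07).
    padded = len(start_text) > 1 and start_text.startswith("0")
    width = len(start_text) if padded else 0

    hosts: List[str] = []
    for number in range(start, end + 1):
        candidate = (
            pattern[: match.start()] + str(number).zfill(width) + pattern[match.end():]
        )
        # Recurse so that patterns with more than one range are also expanded.
        hosts.extend(expand_host_range(candidate))
    return hosts
```

For example, expand_host_range("host[07-10].company.com") returns the four hosts host07.company.com through host10.company.com, matching the last row of the table.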

Configure Database Settings

On the Database Setup page, configure settings for required databases:
  1. Enter the database host, database type, database name, username, and password for the database that you created when you set up the database.
  2. Click Test Connection to confirm that Cloudera Manager can communicate with the database using the information you have supplied. If the test succeeds in all cases, click Continue; otherwise, check and correct the information you have provided for the database and then try the test again. (For some servers, if you are using the embedded database, you will see a message saying the database will be created at a later step in the installation process.)

    The Review Changes screen displays.
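
If Test Connection fails, it can help to first rule out a basic network problem between the Cloudera Manager Server host and the database host. The Python sketch below only checks whether a TCP connection to a given host and port can be opened. It does not validate the database type, name, username, or password, so it is not a substitute for Test Connection. The function name can_reach is hypothetical, and the port depends on your database (commonly 3306 for MySQL, 5432 for PostgreSQL, and 1521 for Oracle, unless your installation uses a different one).

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout.

    Hypothetical helper for illustration; it checks network reachability
    only and says nothing about database credentials or permissions.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run it from the Cloudera Manager Server host, for example can_reach("db01.example.com", 5432), where db01.example.com is a placeholder for your own database host. A result of False points to a hostname, firewall, or listener problem rather than to incorrect credentials.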

Review Configuration Changes and Start Services

  1. Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file paths required vary based on the services to be installed. If you chose to add the Sqoop service, indicate whether to use the default Derby database or the embedded PostgreSQL database. If the latter, type the database name, host, and user credentials that you specified when you created the database.
  2. Click Continue.

    The wizard starts the services.

  3. When all of the services are started, click Continue. A success message indicates that your cluster has started.
  4. Click Finish to proceed to the Cloudera Manager Admin Console Home Page.

Change the Default Administrator Password

As soon as possible, change the default administrator password:
  1. Click the logged-in username at the far right of the top navigation bar and select Change Password.
  2. Enter the current password, enter the new password twice, and then click OK.

Test the Installation

You can test the installation following the instructions in Testing the Installation.