Installing Cloudera Manager and CDH
You can install:
- Cloudera Manager, CDH, and managed services in a Cloudera Manager deployment. This is the recommended method for installing CDH and managed services.
- CDH 5 into an unmanaged deployment.
- Cloudera Manager and CDH using the EMC DSSD D5 storage appliance as the storage for Hadoop DataNodes. See Installation and Upgrade with the EMC DSSD D5.
Cloudera Manager Deployment
A Cloudera Manager deployment consists of the following software components:
- Oracle JDK
- Cloudera Manager Server and Agent packages
- Supporting database software
- CDH and managed service software
- Demonstration and proof of concept deployments - There are two installation options:
- Installation Path A - Automated Installation by Cloudera Manager (Non-Production Mode) - Cloudera Manager automates the installation of the Oracle JDK, Cloudera Manager Server, embedded PostgreSQL database, Cloudera Manager Agent, CDH, and managed service software on cluster hosts. Cloudera Manager also configures databases for the Cloudera Manager Server and Hive Metastore and optionally for Cloudera Management Service roles. This path is recommended for demonstration and proof-of-concept deployments, but is not recommended for production deployments because it is not intended to scale and may require database migration as your cluster grows. To use this method, server and cluster hosts must satisfy the following requirements:
- Provide the ability to log in to the Cloudera Manager Server host using a root account or an account that has password-less sudo permission.
- Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See CDH and Cloudera Manager Networking and Security Requirements for further information.
- All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the required installation files.
- Installation Path B - Installation Using Cloudera Manager Parcels or Packages - you install the Oracle JDK, Cloudera Manager
Server, and embedded PostgreSQL database packages on the Cloudera Manager Server host. You have two options for installing Oracle JDK, Cloudera Manager Agent, CDH, and managed service software on
cluster hosts: manually install it yourself or use Cloudera Manager to automate installation.
In order for Cloudera Manager to automate installation of Cloudera Manager Agent packages or CDH and managed service software, cluster hosts must satisfy the following requirements:
- Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See CDH and Cloudera Manager Networking and Security Requirements for further information.
- All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the required installation files.
- Production deployments - require you to first manually install and configure a production database for the Cloudera Manager Server and Hive Metastore. There are two installation options:
- Installation Path B - Installation Using Cloudera Manager Parcels or Packages - you install the Oracle JDK and Cloudera Manager
Server packages on the Cloudera Manager Server host. You have two options for installing Oracle JDK, Cloudera Manager Agent, CDH, and managed service software on cluster hosts: manually install it
yourself or use Cloudera Manager to automate installation.
In order for Cloudera Manager to automate installation of Cloudera Manager Agent packages or CDH and managed service software, cluster hosts must satisfy the following requirements:
- Allow the Cloudera Manager Server host to have uniform SSH access on the same port to all hosts. See CDH and Cloudera Manager Networking and Security Requirements for further information.
- All hosts must have access to standard package repositories and either archive.cloudera.com or a local repository with the required installation files.
- Installation Path C - Manual Installation Using Cloudera Manager Tarballs - you install the Oracle JDK, Cloudera Manager Server, and Cloudera Manager Agent software using tarballs and use Cloudera Manager to automate installation of CDH and managed service software as parcels.
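The SSH requirement listed for the automated parts of Paths A and B (uniform access on the same port to all hosts) can be checked before you start an installation. The following is a minimal Python sketch, not part of Cloudera Manager. It only tests whether a TCP connection to the chosen port succeeds from the machine where it runs, which should be the Cloudera Manager Server host. The function name, the default port 22, and the timeout are assumptions for illustration. The sketch does not verify credentials or sudo permissions.

```python
import socket


def unreachable_hosts(hosts, port=22, timeout=3.0):
    """Return the hosts that do not accept a TCP connection on `port`.

    Hypothetical helper: `port` should be the single SSH port that
    every cluster host is expected to expose.
    """
    failed = []
    for host in hosts:
        try:
            # Open and immediately close a connection; success means the
            # port is reachable, not that an SSH login would succeed.
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            # Covers refused connections, timeouts, and DNS failures.
            failed.append(host)
    return failed
```

Calling `unreachable_hosts(["node1.example.com", "node2.example.com"])` with your own host names returns the subset that could not be reached. An empty list suggests the port is uniformly reachable.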
Cloudera Manager Installation Phases
The following table describes the phases of installing Cloudera Manager and a Cloudera Manager deployment of CDH and managed services. Every phase is required, but you can accomplish each phase in multiple ways, depending on your organization's policies and requirements. The six phases are grouped into three installation paths based on how the Cloudera Manager Server and database software are installed on the Cloudera Manager Server and cluster hosts. The criteria for choosing an installation path are discussed in Cloudera Manager Deployment.
Phase | Options |
---|---|
Phase 1: Install JDK - Install the JDK required by Cloudera Manager Server, Management Service, and CDH. | There are two options: install the Oracle JDK manually, or use Cloudera Manager to install it. |
Phase 2: Set up Databases - Install, configure, and start the databases that are required by the Cloudera Manager Server and Cloudera Management Service and that are optional for some CDH services. | There are two options: use the embedded PostgreSQL database, or manually install and configure a database. |

Phase | Path A | Path B | Path C |
---|---|---|---|
Phase 3: Install Cloudera Manager Server - Install and start Cloudera Manager Server on one host. | Use the Cloudera Manager Installer to install its packages and the server. Requires Internet access and sudo privileges on the host. | Use Linux package install commands (like yum) to install Cloudera Manager Server. Update database properties. Use service commands to start Cloudera Manager Server. | Use Linux commands to unpack tarballs and service commands to start the server. |
Phase 4: Install Cloudera Manager Agents - Install and start the Cloudera Manager Agent on all hosts. | Use the Cloudera Manager Installation wizard to install the Agents on all hosts. | There are two options: install the Agent packages manually, or use the Cloudera Manager Installation wizard to install them. | Use Linux commands to unpack tarballs and service commands to start the agents on all hosts. |
Phase 5: Install CDH and Managed Service software - Install, configure, and start CDH and managed services on all hosts. | Use the Cloudera Manager Installation wizard to install CDH and other managed services. | There are two options: install the CDH and managed service software manually, or use the Cloudera Manager Installation wizard to install it. | Use Linux commands to unpack tarballs and service commands to start CDH and managed services on all hosts. |
Phase 6: Create, Configure and Start CDH and Managed Services - Configure and start CDH and managed services. | Use the Cloudera Manager Installation wizard to install CDH and other managed services, assign roles to hosts, and configure the cluster. Many configurations are automated. | Use the Cloudera Manager Installation wizard to install CDH and other managed services, assign roles to hosts, and configure the cluster. Many configurations are automated. | Use the Cloudera Manager Installation wizard to install CDH and other managed services, assign roles to hosts, and configure the cluster. Many configurations are automated. You can also use the Cloudera Manager API to manage a cluster, which can be useful for scripting preconfigured deployments. |
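
The Phase 6 description mentions the Cloudera Manager API as a way to script deployments. As a rough illustration only, the sketch below builds an authenticated request that lists the clusters known to a Cloudera Manager Server. Several details are assumptions that you should verify against the API documentation for your release: the Admin Console port 7180, the `/api/<version>/clusters` path, HTTP basic authentication, and a JSON response with an `items` list whose entries carry a `name`. The function names are hypothetical.

```python
import base64
import json
import urllib.request


def clusters_url(host, api_version, port=7180, scheme="http"):
    """Build the URL of the clusters collection (path layout is assumed)."""
    return f"{scheme}://{host}:{port}/api/{api_version}/clusters"


def basic_auth_header(user, password):
    """Return an HTTP basic authentication header for the given account."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return {"Authorization": "Basic " + token.decode("ascii")}


def list_cluster_names(host, api_version, user, password):
    """Fetch the cluster names from a running Cloudera Manager Server.

    Assumes the response body is JSON of the form {"items": [{"name": ...}]}.
    """
    request = urllib.request.Request(
        clusters_url(host, api_version),
        headers=basic_auth_header(user, password),
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return [item["name"] for item in payload.get("items", [])]
```

Only `list_cluster_names` touches the network. The two helpers can be reused for other endpoints once you have confirmed the API version that your server supports.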
Cloudera Manager Installation Software
- Installation path A (non-production) - A small self-executing Cloudera Manager installation program to install the
Cloudera Manager Server and other packages. The Cloudera Manager installer, which you install on the host where you want the Cloudera Manager Server to run, performs the following:
- Installs the package repositories for Cloudera Manager and the Oracle Java Development Kit (JDK).
- Installs the Cloudera Manager packages.
- Installs and configures an embedded PostgreSQL database for use by the Cloudera Manager Server, some Cloudera Management Service roles, some managed services, and Cloudera Navigator roles.
- Installation paths B and C - Cloudera Manager package repositories for manually installing the Cloudera Manager Server, Agent, and embedded database packages.
- Installation path B - The Cloudera Manager Installation wizard for automating installation of the Cloudera Manager Agent package.
- All installation paths - The Cloudera Manager Installation wizard for automating CDH and managed service installation
and configuration on the cluster hosts. Cloudera Manager provides two methods for installing CDH and managed services: parcels and packages. Parcels simplify the installation process and allow you to
download, distribute, and activate new versions of CDH and managed services from within Cloudera Manager. After you install Cloudera Manager and connect to the Cloudera Manager Admin Console for the
first time, use the Cloudera Manager Installation wizard to:
- Discover cluster hosts.
- Optionally install the Oracle JDK.
- Optionally install CDH, managed service, and Cloudera Manager Agent software on cluster hosts.
- Select services.
- Map service roles to hosts.
- Edit service configurations.
- Start services.
Unmanaged Deployment
In a deployment not managed by Cloudera Manager, you are responsible for managing all phases of the lifecycle of CDH and managed service components on each host: installation, configuration, and service lifecycle operations such as start and stop. This section describes alternatives for installing CDH 5 software in an unmanaged deployment.
- Command-line methods:
- Download and install the CDH 5 "1-click Install" package
- Add the CDH 5 repository
- Build your own CDH 5 repository
- Tarball - You can download a tarball from CDH downloads. Keep the following points in mind:
- Installing CDH 5 from a tarball installs YARN.
- In CDH 5, there is no separate tarball for MRv1. Instead, the MRv1 binaries, examples, and so on, are delivered in the Hadoop tarball. The scripts for running MRv1 are in the bin-mapreduce1 directory in the tarball, and the MRv1 examples are in the examples-mapreduce1 directory.
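To confirm that a downloaded Hadoop tarball contains the MRv1 directories named above, you can inspect it without unpacking it. This is a small sketch that uses Python's standard `tarfile` module. The directory names come from the text above, while the function name and the example file name are hypothetical.

```python
import tarfile

# Directory names taken from the tarball notes above.
MRV1_DIRS = ("bin-mapreduce1", "examples-mapreduce1")


def find_mrv1_dirs(tarball_path):
    """Return the MRv1 directory names that appear in the tarball."""
    found = set()
    with tarfile.open(tarball_path) as archive:
        for member_name in archive.getnames():
            # Compare whole path components so that, for example,
            # "bin" does not match "bin-mapreduce1".
            parts = member_name.split("/")
            for name in MRV1_DIRS:
                if name in parts:
                    found.add(name)
    return sorted(found)
```

For example, `find_mrv1_dirs("hadoop.tar.gz")` returns a sorted list of whichever of the two directory names occur in that archive.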