Installation Path C - Manual Installation Using Cloudera Manager Tarballs
Before proceeding with this path for a new installation, review Choosing an Installation Path. If you are upgrading an existing Cloudera Manager installation, see Upgrading Cloudera Manager.
To avoid using system packages, and to use tarballs and parcels instead, follow the instructions in this section.
- Before You Begin
- Install the Cloudera Manager Server and Agents
- Configure a Database for the Cloudera Manager Server
- Create a Parcel Repository Directory
- Start the Cloudera Manager Server
- Start the Cloudera Manager Agents
- Start the Cloudera Manager Admin Console
- Choose Cloudera Manager Edition and Hosts
- Choose Software Installation Method and Install Software
- Add Services
- (Optional) Change the Cloudera Manager User
- Change the Default Administrator Password
- Test the Installation
Before You Begin
Install and Configure Databases
Data for the Cloudera Manager Server, the Cloudera Management Service, and the Hive Metastore is stored in databases. Install and configure the required databases following the instructions in Cloudera Manager and Managed Service Databases.
(CDH 5 only) On RHEL and CentOS 5, Install Python 2.6 or 2.7
Python 2.6 or 2.7 is required to run Hue. RHEL 5 and CentOS 5, in particular, require the EPEL repository package.
$ su -c 'rpm -Uvh http://download.fedoraproject.org/pub/epel/5/i386/epel-release-5-4.noarch.rpm'
...
$ yum install python26
Install the Cloudera Manager Server and Agents
$ sudo mkdir /opt/cloudera-manager
$ sudo tar xzf cloudera-manager*.tar.gz -C /opt/cloudera-manager
The files are extracted to a subdirectory named according to the Cloudera Manager version being extracted. For example, files could extract to /opt/cloudera-manager/cm-5.0/. This full path is needed later and is referred to as the tarball_root directory.
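Because the extracted directory name changes with each release, it can help to look it up rather than type it. The following is a minimal sketch, assuming the extracted directory name starts with cm- as in the example above; the function name find_tarball_root is hypothetical and not part of Cloudera Manager.

```shell
# Print the first cm-* directory found directly under the given base directory.
# Assumption: the tarball extracts to a subdirectory named cm-<version>.
find_tarball_root() {
    base_dir=$1
    for dir in "$base_dir"/cm-*/; do
        # If nothing matches, the glob stays literal and this test skips it.
        [ -d "$dir" ] || continue
        printf '%s\n' "${dir%/}"
        return 0
    done
    echo "no cm-* directory under $base_dir" >&2
    return 1
}

# Example (not run here):
#   tarball_root=$(find_tarball_root /opt/cloudera-manager)
```

If more than one version has been extracted under the same base directory, this returns only the first match in sorted order, so check the result before using it.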
Create Users
The Cloudera Manager Server and managed services need a user account to complete tasks. When installing Cloudera Manager from tarballs, you must create this user account on all hosts manually. Because Cloudera Manager Server and managed services are configured to use the user account cloudera-scm by default, creating a user with this name is the simplest approach. Once created, this user is used automatically after installation is complete.
$ sudo useradd --system --home=/opt/cloudera-manager/cm-5.0/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm
For the preceding useradd command, ensure the --home argument path matches your environment. This argument varies according to where you place the tarball, and the version number varies among releases. For example, the --home location could be /opt/cm-5.0/run/cloudera-scm-server.
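One way to keep the --home path consistent with your environment is to derive it from the tarball_root directory. The sketch below only prints the useradd command so that it can be reviewed before it is run with sudo; the function name scm_useradd_command is hypothetical.

```shell
# Print (do not run) the useradd command for the default cloudera-scm user,
# with --home derived from the tarball_root directory passed as $1.
scm_useradd_command() {
    tarball_root=$1
    printf 'useradd --system --home=%s/run/cloudera-scm-server --no-create-home --shell=/bin/false --comment "Cloudera SCM User" cloudera-scm\n' "$tarball_root"
}

# Example (not run here):
#   scm_useradd_command /opt/cloudera-manager/cm-5.0
```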
Configure Cloudera Manager Agents
On every Cloudera Manager Agent host, configure the Cloudera Manager Agent to point to the Cloudera Manager Server by setting the following properties in the tarball_root/etc/cloudera-scm-agent/config.ini configuration file:
Property | Description |
---|---|
server_host | Name of host where the Cloudera Manager Server is running. |
server_port | Port on host where the Cloudera Manager Server is running. |
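When there are many Agent hosts, editing config.ini by hand on each one is error-prone. The following is a minimal sketch of a scripted edit, assuming the file already contains uncommented server_host= and server_port= lines; the function name set_agent_server and the host name in the example are hypothetical.

```shell
# Rewrite server_host and server_port in an Agent config.ini.
# Assumption: both keys already exist as uncommented "key=value" lines;
# if a key is missing, this sketch leaves the file unchanged for that key.
set_agent_server() {
    config_file=$1
    host=$2
    port=$3
    tmp_file="${config_file}.tmp"
    sed -e "s|^server_host[[:space:]]*=.*|server_host=${host}|" \
        -e "s|^server_port[[:space:]]*=.*|server_port=${port}|" \
        "$config_file" > "$tmp_file" || return 1
    # Write back with cat so the original file keeps its owner and mode.
    cat "$tmp_file" > "$config_file" && rm -f "$tmp_file"
}

# Example (not run here; the host name is a placeholder):
#   set_agent_server tarball_root/etc/cloudera-scm-agent/config.ini cm-server.example.com 7182
```

Check the file afterwards, for example with grep '^server_' on config.ini, to confirm that both values were updated.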
Custom Cloudera Manager Users and Directories
The Cloudera Management Service roles use the following directories by default. If you run Cloudera Manager as a custom user, or if any of these directories already exists with a different owner, the roles may be unable to use them:
- /var/log/cloudera-scm-headlamp
- /var/log/cloudera-scm-firehose
- /var/log/cloudera-scm-alertpublisher
- /var/log/cloudera-scm-eventserver
- /var/lib/cloudera-scm-headlamp
- /var/lib/cloudera-scm-firehose
- /var/lib/cloudera-scm-alertpublisher
- /var/lib/cloudera-scm-eventserver
There are two ways to resolve such situations: change the ownership of the existing directories, or specify alternate directories for the agents. You do not need to complete both procedures.
- Change the directory owner to the Cloudera Manager user. If the Cloudera Manager user and group are cloudera-scm and you needed to take ownership of the headlamp log directory, you would issue a command similar to the following:
$ sudo chown -R cloudera-scm:cloudera-scm /var/log/cloudera-scm-headlamp
- Repeat the process of using chown to change ownership for all existing directories to the Cloudera Manager user.
- To use alternate directories instead, create them now if they do not exist. For example, to create /var/cm_logs/cloudera-scm-headlamp for use by the cloudera-scm user, you might use the following commands:
$ sudo mkdir -p /var/cm_logs/cloudera-scm-headlamp
$ sudo chown cloudera-scm /var/cm_logs/cloudera-scm-headlamp
- Connect to the Cloudera Manager Admin Console.
- Under the Cloudera Managed Services, click the name of the service.
- In the service status page, click Configuration.
- In the settings page, enter a term in the Search field to find the settings to be changed. For example, you might enter "/var" or "directory".
- Update each value with the new locations for Cloudera Manager to use.
- Click Save Changes.
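Before choosing between the two procedures above, it can be useful to see which of the default directories already exist and who owns them. The following is a minimal sketch under that assumption; the function name report_foreign_dirs is hypothetical, and the owner is read from the third field of ls -ld output.

```shell
# For each directory that exists, report it if its owner differs from the
# expected Cloudera Manager user. Directories that do not exist are skipped.
report_foreign_dirs() {
    expected_owner=$1
    shift
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        owner=$(ls -ld "$dir" | awk '{print $3}')
        if [ "$owner" != "$expected_owner" ]; then
            printf '%s is owned by %s\n' "$dir" "$owner"
        fi
    done
}

# Example (not run here):
#   report_foreign_dirs cloudera-scm \
#       /var/log/cloudera-scm-headlamp /var/log/cloudera-scm-firehose \
#       /var/lib/cloudera-scm-headlamp /var/lib/cloudera-scm-firehose
```

Any directory that this prints is a candidate for either chown or replacement with an alternate location.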
Configure a Database for the Cloudera Manager Server
Set up the Cloudera Manager Server database as described in Setting up the Cloudera Manager Server Database.
Create a Parcel Repository Directory
- Create a parcel repository directory:
$ sudo mkdir -p /opt/cloudera/parcel-repo
- Change the directory ownership to be the username you are using to run Cloudera Manager:
$ sudo chown username:groupname /opt/cloudera/parcel-repo
where username and groupname are the user and group names (respectively) you are using to run Cloudera Manager. For example, if you use the default username cloudera-scm, you would give the command:
$ sudo chown cloudera-scm:cloudera-scm /opt/cloudera/parcel-repo
Start the Cloudera Manager Server
- If HDFS and MapReduce are already running, shut them down. See Stopping Services (for CDH 4) or Stopping Services (for CDH 5) for the commands to stop these services.
- Configure the init scripts not to start on boot. Use commands similar to those shown in Configuring init to Start Core Hadoop System Services (for CDH 4) or Configuring init to Start Core Hadoop System Services (for CDH 5), but disable the start on boot (for example, $ sudo chkconfig hadoop-hdfs-namenode off).
- As root:
$ sudo tarball_root/etc/init.d/cloudera-scm-server start
- As another user. If you run as another user, ensure the user you created for Cloudera Manager owns the location to which you extracted the tarball including the newly created database files. If you followed the earlier examples and created the directory /opt/cloudera-manager and the user cloudera-scm, you could use the following command to change ownership of the directory:
$ sudo chown -R cloudera-scm:cloudera-scm /opt/cloudera-manager
Once you have established proper ownership of directory locations, you can start Cloudera Manager Server using the user account you chose. For example, you might run the Cloudera Manager Server as cloudera-service. In that case, you have the following options:
- Run the following command:
$ sudo -u user tarball_root/etc/init.d/cloudera-scm-server start
- Edit the configuration files so the script internally changes the user, then run the script as root:
- Remove the following line from tarball_root/etc/default/cloudera-scm-server:
export CMF_SUDO_CMD=" "
- Change the user and group in tarball_root/etc/init.d/cloudera-scm-server to the user you want the server to run as. For example, to run as cloudera-service, change the user and group as follows:
USER=cloudera-service
GROUP=cloudera-service
- Run the server script as root:
$ sudo tarball_root/etc/init.d/cloudera-scm-server start
- To start the Cloudera Manager Server automatically after a reboot:
- Run the following commands on the Cloudera Manager Server host:
- RHEL-compatible and SLES
$ cp tarball_root/etc/init.d/cloudera-scm-server /etc/init.d/cloudera-scm-server
$ chkconfig cloudera-scm-server on
- Debian/Ubuntu
$ cp tarball_root/etc/init.d/cloudera-scm-server /etc/init.d/cloudera-scm-server
$ update-rc.d cloudera-scm-server defaults
- On the Cloudera Manager Server host, open the /etc/init.d/cloudera-scm-server file and change the value of CMF_DEFAULTS from ${CMF_DEFAULTS:-/etc/default} to tarball_root/etc/default.
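The CMF_DEFAULTS change to the copied init script can also be made with sed instead of a text editor. The following is a minimal sketch, assuming the script contains the literal text ${CMF_DEFAULTS:-/etc/default} as described above; the function name set_cmf_defaults is hypothetical.

```shell
# Replace the default CMF_DEFAULTS value in an init script with
# <tarball_root>/etc/default.
# Assumption: the script contains the literal text ${CMF_DEFAULTS:-/etc/default}.
set_cmf_defaults() {
    init_script=$1
    tarball_root=$2
    tmp_file="${init_script}.tmp"
    sed "s|\${CMF_DEFAULTS:-/etc/default}|${tarball_root}/etc/default|" \
        "$init_script" > "$tmp_file" || return 1
    # Write back with cat so the init script keeps its executable mode.
    cat "$tmp_file" > "$init_script" && rm -f "$tmp_file"
}

# Example (not run here; requires root for files under /etc/init.d):
#   set_cmf_defaults /etc/init.d/cloudera-scm-server /opt/cloudera-manager/cm-5.0
```

The result is written back with cat rather than mv so that the script's permissions are preserved; a replaced file that lost its executable bit would not start at boot.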
Start the Cloudera Manager Agents
- To start the Cloudera Manager Agent, run this command on each Agent host:
$ sudo tarball_root/etc/init.d/cloudera-scm-agent start
When the Agent starts, it contacts the Cloudera Manager Server.
- To start the Cloudera Manager Agents automatically after a reboot:
- Run the following commands on each Agent host:
- RHEL-compatible and SLES
$ cp tarball_root/etc/init.d/cloudera-scm-agent /etc/init.d/cloudera-scm-agent
$ chkconfig cloudera-scm-agent on
- Debian/Ubuntu
$ cp tarball_root/etc/init.d/cloudera-scm-agent /etc/init.d/cloudera-scm-agent
$ update-rc.d cloudera-scm-agent defaults
- On each Agent, open the tarball_root/etc/init.d/cloudera-scm-agent file and change the value of CMF_DEFAULTS from ${CMF_DEFAULTS:-/etc/default} to tarball_root/etc/default.
Start the Cloudera Manager Admin Console
- Wait several minutes for the Cloudera Manager Server to complete its startup. To observe the startup process, run tail -f /var/log/cloudera-scm-server/cloudera-scm-server.log on the Cloudera Manager Server host. If the Cloudera Manager Server does not start, see Troubleshooting Installation and Upgrade Problems.
- In a web browser, enter http://Server host:7180, where Server host is the fully-qualified domain name or IP address of the host where you installed the Cloudera Manager Server. The login screen for Cloudera Manager Admin Console displays.
- Log in to the Cloudera Manager Admin Console. The default credentials are: Username: admin Password: admin. Cloudera Manager does not support changing the admin username for the installed account. You can change the password using Cloudera Manager after you run the installation wizard. Although you cannot change the admin username, you can add a new user, assign administrative privileges to the new user, and then delete the default admin account.
Choose Cloudera Manager Edition and Hosts
- When you start the Cloudera Manager Admin Console, the install wizard starts up. Click Continue to get started.
- Choose which edition to install:
- Cloudera Express, which does not require a license but provides a limited set of features.
- Cloudera Enterprise Data Hub Edition Trial, which does not require a license but expires after 60 days and cannot be renewed.
- Cloudera Enterprise with one of the following license types:
- Basic Edition
- Flex Edition
- Data Hub Edition
- If you selected Cloudera Enterprise, install a license:
- Click Upload License.
- Click the document icon to the left of the Select a License File text field.
- Navigate to the location of your license file, click the file, and click Open.
- Click Upload.
- Click Continue in the next screen. The Specify Hosts page displays.
- Click the Currently Managed Hosts tab.
- Choose the hosts to add to the cluster.
- Click Continue. The Select Repository page displays.
Choose Software Installation Method and Install Software
- Click Use Parcels to install CDH and managed services using parcels and then do the following:
- Choose the parcels to install. The choices you see depend on the repositories you have chosen – a repository may contain multiple parcels. Only the parcels for the latest supported service versions are configured by default. You can add additional parcels for previous versions by specifying custom repositories. For example, you can find the locations of the previous CDH 4 parcels at http://archive.cloudera.com/cdh4/parcels/. Or, if you are installing CDH 4.3 and want to use Sentry for Hive Authorization, you can add the Sentry parcel using this mechanism. To add a custom parcel repository:
- Enter the URL of the repository into the More Options field, and click the + Add button. The URL you specify is added to the list of repositories listed in the Configuring Server Parcel Settings page and a parcel is added to the list of parcels on the Select Repository page. If you have multiple repositories configured, you will see all the unique parcels contained in all your repositories.
- Click Continue. Cloudera Manager installs the CDH and managed service parcels. During the parcel installation, progress is indicated for the two phases of the parcel installation process (Download and Distribution) in separate progress bars. If you are installing multiple parcels, you will see progress bars for each parcel. When the Continue button appears at the bottom of the screen, the installation process is complete. Click Continue.
- Click Continue. The Host Inspector runs to validate the installation, and provides a summary of what it finds, including all the versions of the installed components. If the validation is successful, click Finish. The Cluster Setup page displays.
Add Services
The following instructions describe how to use the Cloudera Manager wizard to configure and start CDH and managed services.
- In the first page of the Add Services wizard you choose the combination of services to install and whether to install Cloudera Navigator:
- Click the radio button next to the combination of services to install:
CDH 4
- Core Hadoop - HDFS, MapReduce, ZooKeeper, Oozie, Hive, and Hue
- Core with HBase
- Core with Impala
- All Services - HDFS, MapReduce, ZooKeeper, HBase, Impala, Oozie, Hive, Hue, and Sqoop
- Custom Services - Any combination of services.
CDH 5
- Core Hadoop - HDFS, YARN (includes MapReduce 2), ZooKeeper, Oozie, Hive, Hue, and Sqoop
- Core with HBase
- Core with Impala
- Core with Search
- Core with Spark
- All Services - HDFS, YARN (includes MapReduce 2), ZooKeeper, Oozie, Hive, Hue, Sqoop, HBase, Impala, Solr, Spark, and Key-Value Store Indexer
- Custom Services - Any combination of services.
As you select the services, keep the following in mind:
- Some services depend on other services; for example, HBase requires HDFS and ZooKeeper. Cloudera Manager tracks dependencies and installs the correct combination of services.
- In a CDH 4 cluster, the MapReduce service is the default MapReduce computation framework. Choose Custom Services to install YARN or use the Add Service functionality to add YARN after installation completes.
Important: You can create a YARN service in a CDH 4 cluster, but it is not considered production ready.
- In a CDH 5 cluster, the YARN service is the default MapReduce computation framework. Choose Custom Services to install MapReduce or use the Add Service functionality to add MapReduce after installation completes.
Important: In CDH 5 the MapReduce service has been deprecated. However, the MapReduce service is fully supported for backward compatibility through the CDH 5 life cycle.
- The Flume service can be added only after your cluster has been set up.
- If you have chosen Data Hub Edition Trial or Cloudera Enterprise, optionally check the Include Cloudera Navigator checkbox to enable Cloudera Navigator. See the Cloudera Navigator Documentation.
- Customize the assignment of role instances to hosts. The wizard evaluates the hardware configurations of the hosts to determine the best hosts for each role. The wizard assigns all worker roles to the same set of hosts to which the HDFS DataNode role is assigned. These assignments are typically acceptable, but you can reassign services to hosts of your choosing, if desired.
Click a field below a role to display a dialog containing a pageable list of hosts. If you click a field containing multiple hosts, you can also select All Hosts to assign the role to all hosts or Custom to display the pageable hosts dialog.
The following shortcuts for specifying host names are supported:
- Range of hostnames (without the domain portion)
Range Definition | Matching Hosts |
---|---|
10.1.1.[1-4] | 10.1.1.1, 10.1.1.2, 10.1.1.3, 10.1.1.4 |
host[1-3].company.com | host1.company.com, host2.company.com, host3.company.com |
host[07-10].company.com | host07.company.com, host08.company.com, host09.company.com, host10.company.com |
- IP addresses
- Rack name
Click the View By Host button for an overview of the role assignment by host ranges.
- When you are satisfied with the assignments, click Continue. The Database Setup page displays.
- On the Database Setup page, configure settings for required databases:
- Provide information for the Activity Monitor (only needed when using MapReduce), Reports Manager, Hive Metastore, and Cloudera Navigator databases. The value you enter as the database hostname must match the value you entered for the hostname (if any) when you created the database.
- Click Test Connection to confirm that Cloudera Manager can communicate with the databases using the information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct the information you have provided for the databases and then try the test again. (For Hive, if you are using the embedded database, you will see a message saying the database will be created at a later point in the installation process.) The Review Changes page displays.
- Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file paths required vary based on the services to be installed. For example, you might confirm the NameNode Data Directory and the DataNode Data Directory for HDFS.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
- When all of the services are started, click Continue. You will see a success message indicating that your cluster has been successfully started.
- Click Finish to proceed to the Home Page.
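The host range shortcuts accepted in the role assignment step, such as host[07-10].company.com, can be previewed before they are typed into the wizard. The following is a minimal sketch of how such a pattern could be expanded; it is not Cloudera Manager's own parser, which may behave differently, and the function names strip_zeros and expand_host_range are hypothetical. It handles a single [start-end] range and pads numbers to the width of the start value.

```shell
# Remove leading zeros so the shell does not read the number as octal.
strip_zeros() {
    n=$1
    while [ "${#n}" -gt 1 ] && [ "${n#0}" != "$n" ]; do
        n=${n#0}
    done
    printf '%s\n' "$n"
}

# Expand one [start-end] range in a host pattern, one host per line.
# Patterns without a range are printed unchanged.
expand_host_range() {
    pattern=$1
    case "$pattern" in
        *\[*-*\]*) ;;
        *) printf '%s\n' "$pattern"; return 0 ;;
    esac
    prefix=${pattern%%\[*}      # text before the opening bracket
    rest=${pattern#*\[}         # text after the opening bracket
    range=${rest%%\]*}          # "start-end"
    suffix=${rest#*\]}          # text after the closing bracket
    start=${range%-*}
    end=${range#*-}
    width=${#start}             # keep zero padding, e.g. 07 -> width 2
    i=$(strip_zeros "$start")
    last=$(strip_zeros "$end")
    fmt="%s%0${width}d%s\n"
    while [ "$i" -le "$last" ]; do
        printf "$fmt" "$prefix" "$i" "$suffix"
        i=$((i + 1))
    done
}

# Example (not run here):
#   expand_host_range 'host[07-10].company.com'
```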
(Optional) Change the Cloudera Manager User
- Connect to the Cloudera Manager Admin Console.
- Do one of the following:
- Select .
- On the Status tab of the Home page, in Cloudera Management Service table, click the mgmt link.
- Use the search box to find the property to be changed. For example, you might enter "system" to find the System User and System Group properties.
- Make any changes required to the System User and System Group to ensure Cloudera Manager uses the proper user accounts.
- Click Save Changes.
Change the Default Administrator Password
- Right-click the logged-in username at the far right of the top navigation bar and select Change Password.
- Enter the current password and a new password twice, and then click Submit.
Test the Installation
You can test the installation following the instructions in Testing the Installation.