Upgrading Cloudera Manager 4 to Cloudera Manager 5
This process applies to upgrading all versions of Cloudera Manager 4 to Cloudera Manager 5.
In most cases it is possible to complete the following upgrade without shutting down most CDH services, although you may need to stop some dependent services. CDH daemons can continue running, unaffected, while Cloudera Manager is upgraded. The upgrade process does not affect your CDH installation. However, to take advantage of Cloudera Manager 5 features, after the upgrade all services will have to be restarted. After upgrading Cloudera Manager you may also want to upgrade CDH 4 clusters to CDH 5.
Upgrading from a version of Cloudera Manager 4 to the latest version of Cloudera Manager involves the following broad steps.
- Review Warnings and Notes
- Perform Prerequisite Steps
- Stop Selected Services
- Stop Cloudera Manager Server, Database, and Agent
- (Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
- Upgrade Cloudera Manager Server Packages
- Start the Cloudera Manager Server
- Upgrade Cloudera Manager Agent Packages
- Verify the Upgrade Succeeded
- Add Hive Gateway Roles
- Configure Cluster Version for Package Installs
- Upgrade Impala
- (Optional) Hard Restart Cloudera Manager Agents
- (Optional) Restart All Services
- Restart Roles of Audited Services
- Start Selected Services
- Deploy Updated Client Configurations
- Test the Installation
- (Optional) Upgrade CDH
Review Warnings and Notes
- Cloudera Management Service databases
Cloudera Manager 5 stores Host Monitor and Service Monitor data in a local datastore instead of in an embedded PostgreSQL or external database. The Cloudera Manager upgrade process automatically migrates data from existing databases to the local datastore. For further information, see Data Storage for Monitoring Data.
The Host Monitor and Service Monitor databases are stored on the partition hosting /var. Ensure that you have at least 20 GB available on this partition.
If you have been storing the data in an external database, you can drop those databases after upgrade completes.
- Impala
Cloudera Manager 5 supports Impala 1.2.1 or later. If the version of your Impala service is 1.1 or earlier, the following upgrade instructions will work, but once the upgrade has completed, you will see a validation warning for your Impala service and you will not be able to restart your Impala (or Hue) services until you upgrade your Impala service to 1.2.1 or later. If you want to continue to use Impala 1.1 or earlier, do not upgrade to Cloudera Manager 5.
- Navigator
If you have enabled auditing with Cloudera Navigator, during the process of upgrading to Cloudera Manager 5 auditing is suspended and is only restarted when you restart the roles of audited services.
- Hard Restart of Cloudera Manager Agents
Certain circumstances will require that you hard restart the Cloudera Manager Agent on each host:
- To deploy a fix to an issue where Cloudera Manager didn't always correctly restart services
- To take advantage of the maximum file descriptor feature
- To enable HDFS DataNodes to start if you plan to perform the step (Optional) Upgrade CDH after upgrading Cloudera Manager
- Hive
Cloudera Manager 4.5 added support for Hive, which includes the Hive Metastore Server role type. This role manages the Metastore process when Hive is configured with a remote Metastore.
When upgrading from Cloudera Manager prior to 4.5, Cloudera Manager automatically creates new Hive service(s) to capture the previous implicit Hive dependency from Hue and Impala. Your previous services will continue to function without impact. If Hue was using a Hive Metastore backed by a Derby database, then the newly created Hive Metastore Server will also use Derby. Since Derby does not allow concurrent connections, Hue will continue to work, but the new Hive Metastore Server will fail to run. The failure is harmless (because nothing uses this new Hive Metastore Server at this point) and intentional, to preserve the set of cluster functionality as it was before upgrade. Cloudera discourages the use of a Derby backed Hive Metastore due to its limitations. You should consider switching to a different supported database.
Cloudera Manager provides a Hive configuration option to bypass the Hive Metastore Server. When this configuration is enabled, Hive clients, Hue, and Impala connect directly to the Hive Metastore database. Prior to Cloudera Manager 4.5, Hue and Impala connected directly to the Hive Metastore database, so the bypass mode is enabled by default when upgrading to Cloudera Manager 4.5 or later. This is to ensure the upgrade doesn't disrupt your existing setup. You should plan to disable the bypass mode, especially when using CDH 4.2 or later. Using the Hive Metastore Server is the recommended configuration and the WebHCat Server role requires the Hive Metastore Server to not be bypassed. To disable bypass mode, see Disabling Bypass Mode.
Cloudera Manager 4.5 or later also supports HiveServer2 with CDH 4.2. In CDH 4 HiveServer2 is not added by default, but can be added as a new role under the Hive service (see Role Instances). In CDH 5, HiveServer2 is a mandatory role.
- When you install on EC2 using the Cloud wizard, the wizard creates a security group that by default opens ports used by Cloudera Manager and CDH components. Before upgrading, you must manually open these ports:
- Upgrades from Cloudera Manager 4.7.2 or earlier - 7185 for the Cloudera Manager Event Server.
- Upgrades from Cloudera Manager 5.0.0 beta 2 or earlier - 18080 and 18081 for the Spark master and worker web UI ports.
- If you are upgrading from Cloudera Manager Free Edition (version 4.5 or earlier) you are upgraded to Cloudera Express, which includes a number of features that were previously available only with Cloudera Enterprise. Of those features, activity monitoring requires a database. Thus, upon upgrading to Cloudera Manager 5, you must specify activity monitor database information. You have the option to use the embedded PostgreSQL database, which Cloudera Manager can set up automatically.
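The 20 GB free-space requirement for the partition hosting /var, noted above for the Host Monitor and Service Monitor datastore, can be checked before you begin. The following is a minimal sketch, not part of the Cloudera tooling: the function name avail_gb and the variable names are hypothetical, and it assumes a POSIX-style df that supports the -P and -k options.

```shell
#!/bin/sh
# Hypothetical helper (not a Cloudera command): prints the space available,
# in whole GB (rounded down), on the filesystem that contains the given path.
avail_gb() {
    # -P forces one line per filesystem; -k reports 1024-byte blocks.
    # Column 4 of the second line is the "Available" figure.
    df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }'
}

required_gb=20                 # figure taken from the note above
have_gb=$(avail_gb /var)

if [ "$have_gb" -ge "$required_gb" ]; then
    echo "OK: ${have_gb} GB available on the partition hosting /var"
else
    echo "WARNING: only ${have_gb} GB available; at least ${required_gb} GB is needed"
fi
```

Run the check on the hosts that run the Host Monitor and Service Monitor roles, since that is where the local datastore is written.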
Perform Prerequisite Steps
- Upgrade Cloudera Manager 3.7.x to Cloudera Manager 4 - See Upgrading Cloudera Manager 3.7.x.
- Upgrade all CDH 3 clusters to CDH 4 - See Upgrading CDH 3. If you attempt to upgrade to Cloudera Manager 5 and Cloudera Manager 4 is managing a CDH 3 cluster, the Cloudera Manager 5 server will not start, and you will be notified that you must downgrade to Cloudera Manager 4. Instructions for downgrading may be found here: Reverting a Failed Cloudera Manager Upgrade. After downgrading, you must upgrade your CDH 3 cluster to CDH 4 before you can upgrade Cloudera Manager. See Upgrading CDH 3.
- Obtain host credentials - You must have SSH access and be able to log in using a root account or an account that has password-less sudo permission. See Cloudera Manager Requirements for more information.
- Stop running commands - Use the Admin Console to check for any running commands. You can either wait for commands to complete or abort any running commands. For more information on viewing and aborting running commands, see Viewing Running and Recent Commands.
- Prepare databases - See Database Considerations for Cloudera Manager Upgrades.
- Cloudera Manager 5 supports HDFS High Availability only with Automatic Failover. If your cluster has enabled High Availability without Automatic Failover, you must enable Automatic Failover before upgrading to Cloudera Manager 5. See Configuring HDFS High Availability.
Stop Selected Services
Condition | Procedure |
---|---|
Running a version of Cloudera Manager that has the Cloudera Management Service | Stop the Cloudera Management Service. |
Upgrading from Cloudera Manager 4.5 or later, and using the embedded PostgreSQL database for the Hive Metastore | Stop the services that have a dependency on the Hive Metastore (Hue, Impala, and Hive). You will not be able to stop the Cloudera Manager Server database while these services are running. If you attempt to upgrade while the embedded database is running, the upgrade will fail. Stop services that depend on the Hive Metastore in the following order: |
Running Cloudera Navigator | Stop any of the following roles whose service's Queue Policy configuration (navigator.batch.queue_policy) is set to SHUTDOWN: |
Stop Cloudera Manager Server, Database, and Agent
- On the host running the Cloudera Manager Server, stop the Cloudera Manager Server:
$ sudo service cloudera-scm-server stop
- If you are using the embedded PostgreSQL database for Cloudera Manager, stop the database:
$ sudo service cloudera-scm-server-db stop
Important: If you are not running the embedded database service and you attempt to stop it, you will get a message to the effect that the service cannot be found. If instead you get a message that the shutdown failed, this means the embedded database is still running, probably due to services connected to the Hive Metastore. Do not proceed with the installation until you have stopped all your Metastore-dependent services and the database successfully shuts down (restart the Cloudera Manager server to shut down services as necessary). If you continue without solving this, your upgrade will fail and you will be left with a non-functional Cloudera Manager installation.
- If the Cloudera Manager host is also running the Cloudera Manager Agent, stop the Cloudera Manager Agent:
$ sudo service cloudera-scm-agent stop
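The stop sequence above can be wrapped in a small script so that the order is always the same and a failed database shutdown halts the procedure. This is only a sketch under stated assumptions: the function names run and stop_cloudera_manager and the variables DRY_RUN, EMBEDDED_DB, and HAS_AGENT are hypothetical conveniences, not Cloudera features. Only the three service commands come from the steps above.

```shell
#!/bin/sh
# Hypothetical wrapper: with DRY_RUN=1 it only prints each command,
# otherwise it executes the command as given.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Stops the server, then (optionally) the embedded database and the agent.
# Set EMBEDDED_DB=1 if you use the embedded PostgreSQL database, and
# HAS_AGENT=1 if this host also runs the Cloudera Manager Agent.
stop_cloudera_manager() {
    run sudo service cloudera-scm-server stop || return 1

    if [ "${EMBEDDED_DB:-0}" = "1" ]; then
        if ! run sudo service cloudera-scm-server-db stop; then
            # Per the Important note: do not continue until the database stops.
            echo "Embedded database did not shut down; stop Metastore-dependent services first." >&2
            return 1
        fi
    fi

    if [ "${HAS_AGENT:-0}" = "1" ]; then
        run sudo service cloudera-scm-agent stop
    fi
}

# Example: preview the commands without running them.
# DRY_RUN=1 EMBEDDED_DB=1 HAS_AGENT=1 stop_cloudera_manager
```

Previewing with DRY_RUN=1 first is a cheap way to confirm which of the optional steps apply to the host before anything is actually stopped.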
(Optional) Upgrade JDK on Cloudera Manager Server Host and Agent Hosts
If you are manually upgrading the Cloudera Manager Agent packages in Upgrade Cloudera Manager Agent Packages, and you plan to upgrade to CDH 5, install JDK1.7u45 on the Agent hosts following the instructions in Java Development Kit Installation.
If you are not running Cloudera Manager Server on the same host as a Cloudera Manager Agent, and you want all hosts to run the same JDK version, optionally install JDK1.7u45 on that host.
Upgrade Cloudera Manager Server Packages
- To upgrade the Cloudera Manager Server Packages, you can upgrade from Cloudera's repository at http://archive.cloudera.com/cm5/ or you can create your own repository, as described in Understanding Custom Installation Solutions. Creating your own repository is necessary if you are upgrading a cluster that does not have access to the Internet.
- Find the Cloudera repo file for your distribution by starting at http://archive.cloudera.com/cm5/ and navigating to the directory that matches your operating system.
For example, for Red Hat or CentOS 6, you would navigate to http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/. Within that directory, find the repo file that contains information including the repository's base URL and GPG key. The contents of the cloudera-manager.repo file might appear as follows:
[cloudera-manager]
# Packages for Cloudera Manager, Version 5, on RedHat or CentOS 6 x86_64
name=Cloudera Manager
baseurl=http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5/
gpgkey = http://archive.cloudera.com/cm5/redhat/6/x86_64/cm/RPM-GPG-KEY-cloudera
gpgcheck = 1
For Ubuntu or Debian systems, the repo file can be found by navigating to the appropriate release directory, for example, http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm. The repo file, in this case cloudera.list, may appear as follows:
# Packages for Cloudera Manager, Version 5, on Debian 7.0 x86_64
deb http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
deb-src http://archive.cloudera.com/cm5/debian/wheezy/amd64/cm wheezy-cm5 contrib
- Replace the repo file in the configuration location for the package management software for your system.
Operating System | Commands |
---|---|
RHEL | Copy cloudera-manager.repo to /etc/yum.repos.d/. |
SLES | Copy cloudera-manager.repo to /etc/zypp/repos.d/. |
Ubuntu or Debian | Copy cloudera.list to /etc/apt/sources.list.d/. |
- Run the following commands:
RHEL:
$ sudo yum clean all
$ sudo yum upgrade 'cloudera-*'
Note:
- yum clean all cleans up yum's cache directories, ensuring that you download and install the latest versions of the packages.
- If your system is not up to date, underlying system components may need to be upgraded before this upgrade can succeed. yum will tell you what those are.
SLES:
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cm5/sles/11/x86_64/cm/5/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper rr cm
$ sudo zypper ar -t rpm-md http://myhost.example.com/<path_to_cm_repo>/cm
$ sudo zypper up -r http://myhost.example.com/<path_to_cm_repo>
Ubuntu or Debian:
Use the following commands to clean cached repository information and update Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-server cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration file version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ? Your options are:
    Y or I : install the package maintainer's version
    N or O : keep your currently-installed version
      D    : show the differences between the versions
      Z    : start a shell to examine the situation
 The default action is to keep your current version.
You will receive a similar prompt for /etc/cloudera-scm-server/db.properties. Answer N to both these prompts. Afterward, carefully inspect the config.ini file and merge the two versions so that the new entries are incorporated.
RPM-based distributions:
$ rpm -qa 'cloudera-manager-*'
cloudera-manager-agent-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-server-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-daemons-5.0.7-0.cm5.p0.932.el6.x86_64
Ubuntu or Debian:
~# dpkg-query -l 'cloudera-manager-*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                   Version                Description
+++-======================-======================-============================================================
ii  cloudera-manager-agent 5.0.7-0.cm5.p0.932~sq  The Cloudera Manager Agent
ii  cloudera-manager-daemo 5.0.7-0.cm5.p0.932~sq  Provides daemons for monitoring Hadoop and related tools.
ii  cloudera-manager-serve 5.0.7-0.cm5.p0.932~sq  The Cloudera Manager Server
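On RPM-based systems, the package listing above can be checked mechanically to confirm that the server, agent, and daemons packages all report the same version. The sketch below is illustrative only: the function name cm_versions is hypothetical, it assumes package names of the form shown in the rpm -qa output above, and it does not parse the dpkg-query table format. The sample input is the listing from this page, not live output.

```shell
#!/bin/sh
# Hypothetical helper: reads package names on stdin (one per line, as printed
# by rpm -qa 'cloudera-manager-*') and prints each distinct version once.
cm_versions() {
    sed -n 's/^cloudera-manager-[a-z]*-\([0-9][0-9.]*\)-.*/\1/p' | sort -u
}

# Sample input copied from the listing above; on a real host you would use:
#   rpm -qa 'cloudera-manager-*' | cm_versions
sample='cloudera-manager-agent-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-server-5.0.7-0.cm5.p0.932.el6.x86_64
cloudera-manager-daemons-5.0.7-0.cm5.p0.932.el6.x86_64'

versions=$(printf '%s\n' "$sample" | cm_versions)
count=$(printf '%s\n' "$versions" | grep -c .)

if [ "$count" -eq 1 ]; then
    echo "All Cloudera Manager packages are at version $versions"
else
    echo "Mixed or missing Cloudera Manager package versions:"
    echo "$versions"
fi
```

More than one version in the output would suggest that one of the packages was not upgraded, which is worth resolving before starting the server.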
Start the Cloudera Manager Server
- If you are using the embedded PostgreSQL database for Cloudera Manager, start the database:
$ sudo service cloudera-scm-server-db start
- Start the Cloudera Manager Server:
$ sudo service cloudera-scm-server start
You should see the following:
Starting cloudera-scm-server: [ OK ]
Upgrade Cloudera Manager Agent Packages
- Log in to the Cloudera Manager Admin Console.
- Upgrade hosts using one of the following methods:
- Cloudera Manager installs Agent software
- Select Yes, I would like to upgrade the Cloudera Manager Agent packages now and click Continue.
- Select the release of the Cloudera Manager Agent to install. Normally, this will be the Matched Release for this Cloudera Manager Server. However, if you used a custom repository (that is, a repository other than archive.cloudera.com) for the Cloudera Manager server, select Custom Repository and provide the required information. The custom repository allows you to use an alternative location, but that location must contain the matched Agent version. Click Continue to proceed to the Configure Java Encryption screen.
- If local laws permit you to deploy unlimited strength encryption, and you want to run a secure cluster, check the Install Java Unlimited Strength Encryption Policy Files checkbox to install unlimited strength policy files for Java. This is required because during upgrade Cloudera Manager installs a copy of the Java 7 JDK, which does not include the unlimited strength policy files. Click Continue to proceed to the Upgrade Cloudera Manager Agent Packages screen.
- Specify credentials and initiate Agent installation:
- Select root or enter the user name for an account that has password-less sudo permission.
- Select an authentication method:
- If you choose to use password authentication, enter and confirm the password.
- If you choose to use public-key authentication, provide a passphrase and path to the required key files.
- You can choose to specify an alternate SSH port. The default value is 22.
- You can specify the maximum number of host installations to run at once. The default value is 10.
- Click Continue. The Cloudera Manager Agent packages are installed.
- Manually install Agent software. On all cluster hosts except the Cloudera Manager server host:
- Select No, I would like to skip the agent upgrade now and click Continue.
- Copy the appropriate repo file as described in step 3 of Upgrade Cloudera Manager Server Packages.
- Run the following commands:
RHEL:
$ sudo yum clean all
$ sudo yum upgrade 'cloudera-*'
Note:
- yum clean all cleans up yum's cache directories, ensuring that you download and install the latest versions of the packages.
- If your system is not up to date, underlying system components may need to be upgraded before this upgrade can succeed. yum will tell you what those are.
SLES:
$ sudo zypper clean --all
$ sudo zypper up -r http://archive.cloudera.com/cm5/sles/11/x86_64/cm/5/
To download from your own repository:
$ sudo zypper clean --all
$ sudo zypper rr cm
$ sudo zypper ar -t rpm-md http://myhost.example.com/<path_to_cm_repo>/cm
$ sudo zypper up -r http://myhost.example.com/<path_to_cm_repo>
Ubuntu or Debian:
Use the following commands to clean cached repository information and update Cloudera Manager components:
$ sudo apt-get clean
$ sudo apt-get update
$ sudo apt-get dist-upgrade
$ sudo apt-get install cloudera-manager-agent cloudera-manager-daemons
As this process proceeds, you may be prompted concerning your configuration file version:
Configuration file `/etc/cloudera-scm-agent/config.ini'
 ==> Modified (by you or by a script) since installation.
 ==> Package distributor has shipped an updated version.
   What would you like to do about it ? Your options are:
    Y or I : install the package maintainer's version
    N or O : keep your currently-installed version
      D    : show the differences between the versions
      Z    : start a shell to examine the situation
 The default action is to keep your current version.
You will receive a similar prompt for /etc/cloudera-scm-server/db.properties. Answer N to both these prompts. Afterward, carefully inspect the config.ini file and merge the two versions so that the new entries are incorporated.
- If local laws permit you to deploy unlimited strength encryption, and you want to run a secure cluster, install the unlimited strength JCE policy files. This is required because during upgrade Cloudera Manager installs a copy of the Java 7 JDK, which does not include the unlimited strength policy files.
- Click Continue. The Host Inspector runs to inspect your managed hosts for correct versions and configurations. If there are problems, you can make changes and then re-run the inspector. When you are satisfied with the inspection results, click Continue and then click Finish.
- If you are upgrading from a free version of Cloudera Manager prior to 4.6, click Continue to assign the Cloudera Management Services roles to hosts.
- If you are upgrading from a free version of Cloudera Manager prior to 4.6 to Cloudera Enterprise, specify required databases:
- Configure settings for required databases:
- Choose the database type:
- Leave the default setting of Use Embedded Database to have Cloudera Manager create and configure all required databases. Make a note of the auto-generated passwords.
- Select Use Custom Databases to specify external databases. Enter the database host, database type, database name, username, and password for the databases that you created when you set up databases for Cloudera Manager.
- Provide information for the Activity Monitor (only needed when using MapReduce), Reports Manager, Hive Metastore, and Cloudera Navigator databases. The value you enter as the database hostname must match the value you entered for the hostname (if any) when you created the database.
- Click Test Connection to confirm that Cloudera Manager can communicate with the databases using the information you have supplied. If the test succeeds in all cases, click Continue; otherwise check and correct the information you have provided for the databases and then try the test again. (For Hive, if you are using the embedded database, you will see a message saying the database will be created at a later point in the installation process.) The Review Changes page displays.
- Review the configuration changes to be applied. Confirm the settings entered for file system paths. The file paths required vary based on the services to be installed. For example, you might confirm the NameNode Data Directory and the DataNode Data Directory for HDFS.
Warning: DataNode data directories should not be placed on NAS devices.
Click Continue. The wizard starts the services.
- Click Finish.
- If you are upgrading from Cloudera Manager prior to 4.5:
- Select the host for the Hive Metastore Server role.
- Review the configuration values and click Accept to continue.
Note:
- If Hue was using a Hive Metastore backed by a Derby database, then the newly created Hive Metastore Server will also use Derby. Since Derby does not allow concurrent connections, Hue will continue to work, but the new Hive Metastore Server will fail to run. The failure is harmless (because nothing uses this new Hive Metastore Server at this point) and intentional, to preserve the set of cluster functionality as it was before upgrade. Cloudera discourages the use of a Derby backed Hive Metastore due to its limitations. You should consider switching to a different supported database.
- Prior to Cloudera Manager 4.5, Hue and Impala connected directly to the Hive Metastore database, so the bypass mode is enabled by default when upgrading to Cloudera Manager 4.5 or later. This is to ensure the upgrade doesn't disrupt your existing setup. You should plan to disable the bypass mode, especially when using CDH 4.2 or later. Using the Hive Metastore Server is the recommended configuration and the WebHCat Server role requires the Hive Metastore Server to not be bypassed. To disable bypass mode, see Disabling Bypass Mode. After changing this configuration, you must re-deploy your client configurations, restart Hive, and restart any Hue or Impala services configured to use that Hive.
- If you are using CDH 4.0 or CDH 4.1, see known issues related to Hive in Known Issues and Workarounds in Cloudera Manager 5.
- If you are upgrading from Cloudera Manager prior to 4.8, select where the Impala Catalog Server role will run.
All services (except for the services you stopped in Stop Selected Services) should be running.
Verify the Upgrade Succeeded
- In the Cloudera Manager Admin Console, click the Hosts tab.
- Click Host Inspector. On large clusters, the host inspector may take some time to finish running. You must wait for the process to complete before proceeding to the next step.
- Click Show Inspector Results. All results from the host inspector process are displayed including the currently installed versions. If this includes listings of current component versions, the installation completed as expected.
Add Hive Gateway Roles
- In the Cloudera Manager Admin Console, from the Home page click the Hive service.
- Go to the Instances tab, and click the Add button. This opens the Add Role Instances page.
- Select the hosts on which you want a Hive Gateway role to run. This will ensure that the Hive client configurations are deployed on these hosts.
Configure Cluster Version for Package Installs
Because Cloudera Manager does not manage service software installed as packages, during certain upgrade scenarios Cloudera Manager assigns a default CDH version of a cluster. You must manually configure the cluster CDH version to match the package CDH version following the procedure in Configuring the CDH Version for a Cluster in Managing Clusters with Cloudera Manager. If you do not set the cluster CDH version to the package CDH version, Cloudera Manager will incorrectly enable and disable service features based on the configured CDH version.
Upgrade Impala
If your version of Impala was 1.1 or earlier, upgrade to Impala 1.2.1 or later.
(Optional) Hard Restart Cloudera Manager Agents
Certain circumstances require that you hard restart the Cloudera Manager Agent on each host:
- To deploy a fix to an issue where Cloudera Manager didn't always correctly restart services
- To take advantage of the maximum file descriptor feature
- To enable HDFS DataNodes to start if you plan to perform the step (Optional) Upgrade CDH after upgrading Cloudera Manager
To perform a hard restart:
- Stop all services, including the Cloudera Management Service.
- On all hosts with Cloudera Manager Agents, run the command:
$ sudo service cloudera-scm-agent hard_restart
- Start all services.
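On a cluster with many hosts, the hard restart command in the steps above can be issued over SSH from one machine. The following is a minimal sketch under several assumptions: the function name hard_restart_agents, the DRY_RUN variable, and the hosts file (one hostname per line, with blank lines and # comments ignored) are all hypothetical, and it presumes the same SSH access with root or password-less sudo described in the prerequisites. It covers only the Agent restart itself. Stopping and starting services are still done as described in the steps.

```shell
#!/bin/sh
# Hypothetical helper: runs the Agent hard restart on every host listed in
# the given file. With DRY_RUN=1 it prints the commands instead of running them.
hard_restart_agents() {
    hosts_file=$1
    while IFS= read -r host; do
        # Skip blank lines and comment lines.
        case "$host" in
            ''|'#'*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "+ ssh -n $host sudo service cloudera-scm-agent hard_restart"
        else
            # -n keeps ssh from consuming the rest of the hosts file on stdin.
            ssh -n "$host" sudo service cloudera-scm-agent hard_restart \
                || echo "hard restart failed on $host" >&2
        fi
    done < "$hosts_file"
}

# Example: preview the commands for the hosts listed in agents.txt.
# DRY_RUN=1 hard_restart_agents agents.txt
```

The loop continues past a failing host and reports it on standard error, so one unreachable machine does not block the rest. Review those messages before starting services again.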
(Optional) Restart All Services
- The Cloudera Manager Agent has been upgraded and restarted.
- The monitored roles have been restarted.
- Restart all services, including the Cloudera Management Service:
- From the Home page click next to the cluster name and select Restart.
- From the Home page click next to the Cloudera Management Service and select Restart.
Restart Roles of Audited Services
- HDFS - NameNode
- HBase - Master and RegionServers
- Hive - HiveServer2
- Hue - Beeswax Server
Start Selected Services
- Do one of the following, depending on which services you shut down:
- From the Home page click next to the cluster name and select Start.
- From the Home page click next to the name of each service you shut down and select Start.
- In the confirmation dialog that displays, click Start.
Deploy Updated Client Configurations
- From the Home page click next to the cluster name and select Deploy Client Configuration.
- In the confirmation dialog that displays, click Deploy Client Configuration.
Test the Installation
When you have finished the upgrade to Cloudera Manager, you can test the installation to verify that the monitoring features are working as expected; follow instructions under Testing the Installation.
(Optional) Upgrade CDH
Cloudera Manager 5 can manage both CDH 4 and CDH 5, so upgrading existing CDH 4 installations is not required, but you may want to upgrade to the latest version. For more information on upgrading CDH, see Upgrading CDH and Managed Services.