Chapter 2. Getting Ready to Upgrade Ambari and HDP
When preparing to upgrade Ambari and the HDP Cluster, we strongly recommend you review this checklist of items to confirm your cluster operation is healthy. Attempting to upgrade a cluster that is operating in an unhealthy state can produce unexpected results.
Important: Always upgrade Ambari to the latest version before upgrading the cluster.
Ensure all services in the cluster are running.
Run each Service Check (found under the Service Actions menu) and confirm that each one executes successfully.
Clear all alerts, or understand why they are being generated. Remediate as necessary.
Confirm that start and stop operations for all services execute successfully.
Time service starts and stops. The time needed to start and stop services is a major contributor to overall upgrade time, so it is useful to have this information on hand.
Download the software packages prior to the upgrade. Place them in a local repository and/or consider using a storage proxy since multi-gigabyte downloads will be required on all nodes in the cluster.
Ensure point-in-time backups are taken of all databases that support the cluster. This includes (among others) Ambari, Hive, Ranger, Druid, Superset, and Oozie.
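The first checklist item can also be checked from a script. The following is a minimal Python sketch, not an official Ambari tool. It assumes you have already fetched a JSON listing of services whose entries carry `ServiceInfo/service_name` and `ServiceInfo/state` fields (for example from the Ambari REST API, which is not covered in this chapter). The function name and the sample payload are hypothetical and purely illustrative.

```python
import json
from typing import List


def services_not_started(payload: dict) -> List[str]:
    """Return the names of services whose reported state is not STARTED.

    Assumes each entry in payload["items"] has a "ServiceInfo" object
    with "service_name" and "state" keys (an assumption about the
    payload shape, not something this guide specifies).
    """
    names = []
    for item in payload.get("items", []):
        info = item.get("ServiceInfo", {})
        if info.get("state") != "STARTED":
            names.append(info.get("service_name", "<unknown>"))
    return sorted(names)


# Illustrative payload only; not captured from a real cluster.
SAMPLE = json.loads("""
{
  "items": [
    {"ServiceInfo": {"service_name": "HDFS", "state": "STARTED"}},
    {"ServiceInfo": {"service_name": "YARN", "state": "INSTALLED"}},
    {"ServiceInfo": {"service_name": "HIVE", "state": "STARTED"}}
  ]
}
""")

if __name__ == "__main__":
    print(services_not_started(SAMPLE))
```

Services with no long-running components may report a state other than STARTED even when healthy, so treat the result as a list to review rather than a definitive failure report.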
For Large Clusters
In a large cluster, NameNode startup can take a long time. NameNode startup time depends not only on host properties, but also on data volume and network parameters. To ensure that Ambari requests to start the NameNode do not time out during an upgrade, configure the Ambari NameNode restart timeout parameter, upgrade.parameter.nn-restart.timeout, in /etc/ambari-server/conf/ambari.properties on the Ambari Server host. After a default installation, you may need to add the parameter and its value to this file yourself.
No standard way to determine an appropriate value exists, but as guidance for a large cluster, add ten percent to the usual time (in seconds) required to restart your NameNode. For example, record the time required to restart the active NameNode for your current Ambari Server version. If restarting takes 10 minutes (600 seconds), then add
upgrade.parameter.nn-restart.timeout=660
to the /etc/ambari-server/conf/ambari.properties file on the Ambari Server host.
After adding or resetting the Ambari NameNode restart parameter, restart your Ambari Server before starting the HDP upgrade:
ambari-server restart
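The ten percent guidance above can be expressed as a small calculation. This is a minimal Python sketch and not part of Ambari: the function names and the `margin_percent` argument are hypothetical, and rounding the margin up to a whole second is an assumption.

```python
def nn_restart_timeout(restart_seconds: int, margin_percent: int = 10) -> int:
    """Return the measured restart time plus a safety margin, in whole seconds.

    Integer arithmetic avoids floating-point rounding surprises; the
    margin is rounded up so the timeout is never shorter than intended.
    """
    if restart_seconds <= 0:
        raise ValueError("restart_seconds must be positive")
    margin = -(-restart_seconds * margin_percent // 100)  # ceiling division
    return restart_seconds + margin


def timeout_property(restart_seconds: int) -> str:
    """Format the line to place in ambari.properties."""
    value = nn_restart_timeout(restart_seconds)
    return f"upgrade.parameter.nn-restart.timeout={value}"


if __name__ == "__main__":
    # A 10-minute (600-second) restart, matching the example in the text.
    print(timeout_property(600))  # upgrade.parameter.nn-restart.timeout=660
```

The measured restart time should come from your own cluster; the sketch only applies the margin and formats the property line.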
For Ambari Upgrades
This (Ambari 2.7.x) Upgrade Guide will help you upgrade your existing Ambari Server to version 2.7.x. If you are upgrading to another Ambari version, use the Ambari Upgrade Guide for that version.
Be sure to review the Known Issues and Behavioral Changes for this Ambari 2.7.x release.
Review supported Ambari Server database versions using the Hortonworks Support Matrix. Plan any necessary resources required for a database upgrade. You will upgrade your Ambari Server database during the Ambari Upgrade procedure.
Ambari 2.7 supports only the following operations when running against an HDP 2.6 cluster:
Run Service Checks
Start, Stop, Restart a Service
Change Configuration
Enable & Disable Maintenance Mode
Disable Auto Start
Remove Services
Remove Components
Remove & Decommission Hosts
For HDP Cluster Upgrades
Ensure sufficient disk space on /usr/hdp/<version> (roughly 3 GB for each additional HDP release).
If you plan to add new services available with HDP to your cluster, the new services might include new service accounts. Any operational procedures required to support these new service accounts should be performed prior to the upgrade. The accounts will typically be required on all nodes in the cluster.
Additional components will be added to the cluster as part of the HDP 3.0 upgrade, including YARN ATSv2, YARN Registry DNS, and additional Hive Clients required for the Spark History Server. If your cluster has Kerberos enabled, you must configure Ambari to manage the Kerberos admin credentials prior to the upgrade so that the appropriate Kerberos principals can be created during the upgrade process.
If your cluster includes Storm, document any running Storm topologies, as they will need to be stopped during the upgrade process.
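The disk space item in this list can be checked on each node with a short script. This is a minimal Python sketch under stated assumptions: the function names are hypothetical, "roughly 3 GB" is treated as 3 GiB per additional release, and the check is run against the filesystem that holds /usr/hdp.

```python
import shutil

GIB = 1024 ** 3
# Assumption: "roughly 3 GB" per additional HDP release, taken as 3 GiB.
PER_RELEASE_BYTES = 3 * GIB


def has_room_for_releases(free_bytes: int, additional_releases: int = 1) -> bool:
    """Return True if free_bytes covers the estimated space for the releases."""
    return free_bytes >= additional_releases * PER_RELEASE_BYTES


def check_path(path: str = "/usr/hdp", additional_releases: int = 1) -> bool:
    """Check free space on the filesystem containing path.

    Raises FileNotFoundError if the path does not exist on this host.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return has_room_for_releases(usage.free, additional_releases)


if __name__ == "__main__":
    print("enough space:", check_path("/usr/hdp"))
```

Because the figure in the text is approximate, a passing result is only a rough indication; leave extra headroom where possible.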
More Information