Upgrade Checklist FAQ
During the preparation for an upgrade, Cloudera recommends carefully reviewing the questions and answers below.
- What is the length of the available maintenance window?
-
Currently, Data Lake backup and restore requires a maintenance window, where no metadata changes occur. Furthermore, Data Hubs need to be stopped during an upgrade.
The CDP Public Cloud environment does not need to be upgraded in one go: you may opt to upgrade the Data Lake and all attached Data Hubs together, or start with the Data Lake upgrade only and perform individual Data Hub upgrades in consecutive steps. However, after you perform a major/minor Data Lake upgrade, you must upgrade all attached Data Hub clusters, as the Data Hubs must run the same major/minor version of Runtime as the Data Lake.
- What type of upgrade is required?
- Currently, there are three types of upgrades available to Data Lake and Data Hub clusters: maintenance upgrades; minor/major version upgrades; and OS upgrades. Maintenance and minor/major version upgrades install a newer version of Cloudera Manager and/or Cloudera Runtime. OS upgrades for Data Lakes and Data Hubs are complementary and will bring the image of the cluster hosts to a newer version. If you plan to also perform an OS upgrade, plan the maintenance window accordingly.
- Are ephemeral disks used for user or workload-related persistent data?
-
Major/minor version upgrades as well as maintenance upgrades will bring Cloudera Manager and Cloudera Runtime to the selected version without impacting the underlying VM. However, OS upgrades will recreate the underlying VM with a fresh image, which results in the loss of any data stored on ephemeral disks.
If you are currently storing user or workload-related data on volumes using ephemeral disks, please reach out to Cloudera support while planning for the upgrade.
- What Data Hub cluster templates are in use? Are you using custom templates?
- Check whether in-place upgrade is supported for your built-in or custom data hub template. Depending on the type and version of the Data Hub, additional backup steps, manual configuration changes or post-upgrade steps may be required. Check specific steps for upgrading the OS if you use Flow Management. Operational Database clusters have a different upgrade process.
- What is the size of the SDX / Data Lake metadata?
- SDX metadata includes the Hive Metastore database, Ranger audit log index, as well as Atlas metadata. If you are planning to perform a Data Lake backup before an upgrade (which is recommended), prepare your maintenance window accordingly. CDP supports skipping the backup of certain metadata to reduce the time required for backup and restore operations.
- Are you using Data Services?
- If you have deployed Cloudera Data Engineering, Data Warehouse, Data Flow, or Machine Learning in your environment, it is recommended that you upgrade them to the latest version before upgrading your Data Lake to a more recent minor/major version.