Preparing for an upgrade

Upgrading CDP Public Cloud consists of two major steps: upgrading the Data Lake within an environment and then potentially upgrading the attached Data Hubs. If applicable, you can upgrade any data services after the Data Lake upgrade.

FreeIPA upgrades

You should periodically upgrade the environment (FreeIPA cluster) to ensure that you are running the latest OS-level security patches, but this is not required at the same time as upgrading the Data Lake and Data Hubs. While FreeIPA upgrade can be performed without a downtime, Cloudera recommends to perform this separately from other operations, preferably before upgrading the Data Lake and Data Hubs. See upgrade the FreeIPA cluster.

Data Lake upgrade workflow

Pre-upgrade tasks

  1. Review the Data Lake upgrade requirements and limitations.
  2. Carefully review the differences between the types of upgrades and what they entail.
  3. Check the Data Lake support matrix for upgrade to verify which Runtime versions you can upgrade to and from.
  4. If you have not configured your Data Lake for backup and restore, you will need to do so. The backup and restore process is integrated into the upgrade flow automatically, but successful upgrade requires that the correct IAM policies exist on the DATALAKE_ADMIN_ROLE and RANGER_AUDIT_ROLE (for backup), and the DATALAKE_ADMIN_ROLE, RANGER_AUDIT_ROLE, and LOG_ROLE (for restore).

    If your roles are not configured correctly, CDP will not be able to write the backup to the BACKUP_LOCATION_BASE path of your cloud storage.

  5. If you are performing the backup manually (as opposed to the integrated backup available during the upgrade process), you can launch the Data Lake backup from the UI or CLI. When using the CLI, you can specify to skip certain backup actions (skip HMS, Atlas metadata or Ranger audit log index backup). You can monitor the backup process using the CLI.
  6. From the Data Lake UI, run the Validate and Prepare option to check for any configuration issues and begin the Cloudera Runtime parcel download and distribution. Using the validate and prepare option does not require downtime and makes the maintenance window for an upgrade shorter. Validate and prepare also does not make any changes to your cluster and can be run independently of the upgrade itself. Although you can begin the upgrade without first running the validate and prepare option, using it will make the process smoother and the downtime shorter. (The parcels that are downloaded and distributed by the validate and prepare option are specific to the Runtime version that you have selected, so if you use validate and prepare and then decide to upgrade to a different Runtime version instead, you will need to re-run validate and prepare. Be aware that if you use validate and prepare for multiple major/minor Runtime versions, the parcels for different versions are not cleaned up and may saturate the disk. These parcels are cleaned up only once the upgrade is complete.)
Data Lake upgrade tasks
  1. Perform the upgrade, either through the UI or CLI. The type of upgrade that you perform will depend on whether a major/minor or service pack version of Runtime is available. A new OS image may also be available to upgrade to.
  2. If the upgrade fails, check the logs for manual troubleshooting info. You can also recover from failed upgrades.
  3. When the upgrade succeeds, proceed to upgrading the attached Data Hubs if required.

Data Services upgrades

Any data services in use should be upgraded after the Data Lake upgrade. Some data services have their own compatibility matrix with different versions of the Data Lake. Other data services contain features that may not be compatible with every Data Lake (Runtime) version. Refer to the data services documentation for information on Data Lake compatibility and upgrading these services.

Data Hub upgrade workflow

Pre-upgrade tasks

  1. Carefully review the differences between the types of upgrades and what they entail.
  2. Check the Data Hub upgrade support matrix to verify that the Runtime versions you want to upgrade to and from are supported for your clusters.
  3. From the Data Hub UI, run the Validate and Prepare option to check for any configuration issues and begin the Cloudera Runtime parcel download and distribution. Using the validate and prepare option does not require downtime and makes the maintenance window for an upgrade shorter. Validate and prepare also does not make any changes to your cluster and can be run independently of the upgrade itself. Although you can begin the upgrade without first running the validate and prepare option, using it will make the process smoother and the downtime shorter. (The parcels that are downloaded and distributed by the validate and prepare option are specific to the Runtime version that you have selected, so if you use validate and prepare and then decide to upgrade to a different Runtime version instead, you will need to re-run validate and prepare. Be aware that if you use validate and prepare for multiple major/minor Runtime versions, the parcels for different versions are not cleaned up and may saturate the disk. These parcels are cleaned up only once the upgrade is complete.)

Major/minor Runtime version upgrade tasks

  1. Backup the cluster data and CM configurations.
  2. Perform the upgrade, either through the UI or CLI. (Operational database clusters have a different process.)
  3. Complete any post-upgrade tasks for the type of cluster that you upgraded.
  4. For DE clusters, use the CM UI to add any configs that were not added during the upgrade.
  5. If the upgrade fails, check the Event log and the troubleshooting section.
  6. Complete any post-upgrade tasks for the type of cluster that you upgraded.
  7. For DE clusters, use the CM UI to add any configs that were not added during the upgrade.
  8. If the upgrade fails, check the Event log and the troubleshooting section.

Service pack upgrade tasks

  1. Backup the cluster data and CM configurations.
  2. Perform the upgrade, either through the UI or CLI.
  3. If the upgrade fails, check the Event log and the troubleshooting section.

OS upgrade tasks

  1. Review the Before you begin section to verify that there is no data belonging to NiFi or NiFi Registry on the root disk of the VM. Note that during an OS upgrade, any data on the root volume (parcels, service logs, custom software) will be lost.
  2. Unlike the Data Lake upgrade, OS upgrades are not integrated in the larger upgrade flow and must be performed separately, either through the UI or CLI.