Before you begin

Before you begin a Data Lake upgrade, note the requirements and limitations listed below.

Requirements

  • Required role to perform Data Lake upgrade: EnvironmentAdmin or Owner over the environment

  • The Data Lake must be running and in a healthy state.
  • Stop any Data Hubs and any data services (such as CDW or CDE) that are running. For the Cloudera Data Warehouse Experience, stop any running Virtual Warehouses before beginning any upgrade or backup/restore process. Stopping Experiences is not required for service pack upgrades, but any Data Hubs or data services that are not stopped will error out during the upgrade process.
  • If you use a custom image catalog and you don’t see upgrades available, you may need to update your custom image catalog with new images.
  • If the upgrade involves upgrading from CentOS to RHEL, review the Prerequisites for upgrading from CentOS to RHEL.
  • Expect at least two hours of downtime while the upgrade completes. Plan the upgrade during a time of low activity.
  • Optionally, you can take a backup of the Data Lake. The Data Lake upgrade process will automatically take a backup before the upgrade procedure begins, but you have the option of disabling the automatic backup if you would prefer to do this step separately. For instructions on performing a backup and restore, see Backup and restore for the Data Lake. If the upgrade fails for any reason, you can restore the Data Lake from the backup.

The upgrade requires 27 GB of free space on the CM server node and 20 GB of free space on every other instance. If any instance in your Data Lake has insufficient space, the upgrade will not be permitted.
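
To check the space requirement ahead of time, you could run something like the following on each instance. This is a minimal sketch, not part of the product tooling: the role names ("cm_server", "other"), the choice of "/" as the filesystem to inspect, and reading "GB" as GiB are all assumptions made for illustration.

```python
import shutil

GIB = 1024 ** 3

# Thresholds from the requirement above. Treating "GB" as GiB is an assumption.
REQUIRED_FREE_GIB = {"cm_server": 27, "other": 20}


def has_enough_space(free_bytes: int, node_role: str) -> bool:
    """Return True if free_bytes meets the threshold for the given node role."""
    required = REQUIRED_FREE_GIB[node_role] * GIB
    return free_bytes >= required


def check_local_node(node_role: str, path: str = "/") -> bool:
    """Check free space on the filesystem containing `path`.

    The path to inspect is a hypothetical default; the documentation does not
    say which volume the upgrade validates.
    """
    free = shutil.disk_usage(path).free
    return has_enough_space(free, node_role)
```

For example, calling check_local_node("cm_server") on the CM server node returns False when less than 27 GiB is free on the inspected filesystem.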

Limitations

Note the following limitations for the Data Lake upgrade:

  • Data Lake upgrade does not include the upgrade of the FreeIPA software or the operating system on the instance(s) running FreeIPA. To upgrade FreeIPA, see Upgrade FreeIPA.

  • Data Lake resizing (for example, moving from a light duty to a medium duty Data Lake) during an upgrade is not supported.

  • Before you can upgrade to Cloudera Runtime 7.3.1, you need to upgrade the Data Lake and all Cloudera Data Hub databases to PostgreSQL 14.
  • When upgrading the Data Lake from Cloudera Runtime 7.2.17 to 7.2.18 or 7.3.1, do not run mixed versions (a 7.2.18 or 7.3.1 Data Lake with a 7.2.17 Cloudera Data Hub) if Iceberg metadata must be captured in Atlas. The Iceberg models differ between 7.2.17 and 7.2.18 or 7.3.1, and the mismatch can cause problems.
  • If a Data Lake has attached Data Hubs that are not eligible for upgrade, the Data Lake itself is not eligible for upgrade. You must delete any Data Hubs that are ineligible for upgrade before proceeding with the Data Lake upgrade. See Data Hub Upgrade for more information about which Data Hubs are eligible for upgrade.

  • Service pack upgrades for RAZ-enabled Data Lakes are available only for Runtime versions 7.2.7+.

  • Major/minor version upgrades for RAZ-enabled Data Lakes are available only for Runtime versions 7.2.12+.

  • A Data Lake must be using Runtime 7.2.17 to be eligible for CentOS to RHEL upgrade. If you do not see the option to upgrade from CentOS to RHEL, ensure that your Data Lake is using Runtime 7.2.17.

  • Runtime 7.2.18 and newer do not support the Medium Duty Data Lake shape. A Medium Duty Data Lake cannot be upgraded from 7.2.17 to 7.2.18 unless you perform a resize operation on the Data Lake before the upgrade.
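
The version-based limitations above can be expressed as a few simple comparisons. The sketch below is illustrative only: the function names, the upgrade kind labels ("service_pack", "major_minor"), and the shape label "medium_duty" are hypothetical, and it assumes that the stated minimum versions refer to the Runtime version the Data Lake is currently using.

```python
from typing import Tuple

# Minimum versions taken from the limitations listed above.
RAZ_MINIMUMS = {
    "service_pack": (7, 2, 7),
    "major_minor": (7, 2, 12),
}
OS_UPGRADE_RUNTIME = (7, 2, 17)       # CentOS to RHEL requires exactly this
MEDIUM_DUTY_UNSUPPORTED_FROM = (7, 2, 18)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '7.2.12' into (7, 2, 12) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def raz_upgrade_available(runtime: str, kind: str) -> bool:
    """RAZ-enabled Data Lake: is this kind of upgrade offered for the runtime?"""
    return parse_version(runtime) >= RAZ_MINIMUMS[kind]


def os_upgrade_eligible(runtime: str) -> bool:
    """CentOS to RHEL upgrade is only offered on Runtime 7.2.17."""
    return parse_version(runtime) == OS_UPGRADE_RUNTIME


def needs_resize_first(shape: str, target_runtime: str) -> bool:
    """A Medium Duty Data Lake must be resized before moving to 7.2.18 or newer."""
    return (
        shape == "medium_duty"
        and parse_version(target_runtime) >= MEDIUM_DUTY_UNSUPPORTED_FROM
    )
```

Parsing versions into integer tuples matters here: compared as plain strings, "7.2.7" would sort after "7.2.12", which would invert the RAZ checks.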