High-level upgrade steps

This section outlines the general upgrade path and links to the existing, more detailed upgrade documentation. It is not meant to provide complete upgrade steps.

  1. Upgrade Data Lake and all Cloudera Data Hub clusters to the latest service pack and OS
  2. Upgrade FreeIPA to RHEL 8
  3. Upgrade Data Lake and Cloudera Data Hub clusters to the latest Cloudera Runtime 7.2.17 and the latest RHEL 8 OS image
  4. Resize Medium Duty Data Lake to Enterprise Data Lake
  5. Upgrade Data Lake and Cloudera Data Hub clusters from 7.2.17 to 7.2.18

Step 1: Upgrade Data Lake and all Cloudera Data Hub clusters to the latest service pack and OS

Upgrade Data Lake and Cloudera Data Hub to the latest service pack of their Cloudera Runtime version.

Follow these steps first for your Data Lake and then for each attached Cloudera Data Hub separately. The examples and screenshots below assume that your cluster is using Cloudera Runtime 7.2.16. For an older Cloudera Runtime version, replace 7.2.16 with your actual version.

  1. Upgrade Data Lake and Cloudera Data Hub to the latest service pack of their Runtime version.

    To upgrade your Data Lake and all Cloudera Data Hub clusters to the latest service pack of their Runtime version, perform the following steps in the Upgrade UI of the Data Lake and of each Cloudera Data Hub:

    1. Select the upgrade target that matches your current Cloudera Runtime version. If you are upgrading a Data Lake, you will see 7.2.16 (Cloudera Runtime upgrade, OS: centos7); for Cloudera Data Hub clusters, you will see 7.2.16 (Cloudera Runtime upgrade). Select this line and then run Validate and Prepare.
    2. Once Validate and Prepare completes, return to this page, select the same line again, and run Upgrade.

      For Data Lakes, this will perform the OS image upgrade, so step 2 below is required for Cloudera Data Hub clusters only.
  2. Upgrade the OS image on all Cloudera Data Hub clusters.

    Additionally, you are required to upgrade the OS image on all Cloudera Data Hub clusters:

    1. Open the Cloudera Data Hub Upgrade UI again, select the OS upgrade target that matches your current Cloudera Runtime version, 7.2.16 (OS upgrade, OS: centos7), and run Validate and Prepare.

    2. Once Validate and Prepare completes, return to this page, select the same line again, and run Upgrade.
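
The upgrade targets named in this step follow a common label pattern: a Runtime version, then the kind of upgrade, then an optional OS. The following is a minimal sketch of splitting such a label into its parts, for example to check that you are about to select the intended line. The `UpgradeTarget` type and `parse_target` function are hypothetical names for illustration only, and the label shape is inferred from the examples in this document rather than from any published format.

```python
import re
from typing import NamedTuple, Optional


class UpgradeTarget(NamedTuple):
    """Hypothetical structured view of one drop-down label."""
    runtime: str         # e.g. "7.2.16"
    kind: str            # e.g. "Cloudera Runtime upgrade" or "OS upgrade"
    os: Optional[str]    # e.g. "centos7"; None when the label names no OS


# Assumed label shape: "<runtime> (<kind>[, OS: <os>])".
# A ';' after "OS" is tolerated as well as ':'.
_LABEL = re.compile(
    r"^\s*(?P<runtime>\d+(?:\.\d+)+)\s*"
    r"\(\s*(?P<kind>[^,()]+?)\s*"
    r"(?:,\s*OS\s*[:;]\s*(?P<os>[^()\s]+)\s*)?\)\s*$"
)


def parse_target(label: str) -> UpgradeTarget:
    """Parse a label such as '7.2.16 (OS upgrade, OS: centos7)'."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized upgrade target label: {label!r}")
    return UpgradeTarget(match["runtime"], match["kind"], match["os"])
```

Calling `parse_target("7.2.16 (Cloudera Runtime upgrade)")` yields a target whose `os` field is `None`, which matches the Cloudera Data Hub label shown in this step.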

Step 2: Upgrade FreeIPA to RHEL 8

Before your Data Lake or Cloudera Data Hub clusters can be upgraded to RHEL 8, you first need to upgrade your environment (FreeIPA) cluster.

Open the FreeIPA tab on your environment’s details page and select Upgrade. From the drop-down menu, select FreeIPA (Latest, OS: redhat8) and run the upgrade.

Step 3: Upgrade Data Lake and Cloudera Data Hub clusters to the latest Cloudera Runtime 7.2.17 and the latest RHEL 8 OS image

Upgrade your clusters to the latest 7.2.17 service pack and to RHEL 8.

Perform the following steps for the Data Lake and then each Cloudera Data Hub:

  1. Understand RHEL 8 requirements.

    Before changing the OS version from CentOS 7 to RHEL 8, make sure you understand the requirements described in Upgrading from CentOS to RHEL.

  2. Upgrade to the latest 7.2.17 service pack if needed.

    To determine whether you need to perform a Cloudera Runtime and OS upgrade, or only an OS upgrade, check whether the drop-down contains the target version labeled 7.2.17 (OS upgrade, OS: redhat8). If it does, you only need an OS upgrade and you can skip to step 3 below. If it does not, perform the following steps for a Cloudera Runtime upgrade:
    1. If you are upgrading a Data Lake, select 7.2.17 (Cloudera Runtime upgrade, OS: redhat8). For upgrading a Cloudera Data Hub, select 7.2.17 (Cloudera Runtime upgrade). Next, run Validate and Prepare.
    2. Once Validate and Prepare completes, return to this page, select the same line again, and run Upgrade.

      (Screenshots: Data Lake UI, Cloudera Data Hub UI)

      For Data Lakes, this will perform the RHEL 8 upgrade as well.
  3. Upgrade all Cloudera Data Hub clusters to RHEL 8.

    Open the Upgrade UI again. The 7.2.17 (OS upgrade, OS: redhat8) option is now available in the drop-down menu.

    1. Select 7.2.17 (OS upgrade, OS: redhat8) and run Validate and Prepare.

    2. Once Validate and Prepare completes, return to this page, select 7.2.17 (OS upgrade, OS: redhat8) again, and run Upgrade.
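
The decision described in this step can be sketched as a small function: given the labels visible in a cluster's Upgrade drop-down, it returns the targets to run, in order. This is only an illustration of the logic in this document. The function name `targets_to_run`, the constants, and the `"datalake"`/`"datahub"` strings are assumptions, and the list of visible labels has to be supplied by you from the Upgrade UI.

```python
from typing import List

# Labels exactly as they are written in this document.
OS_ONLY = "7.2.17 (OS upgrade, OS: redhat8)"
RUNTIME_DATA_LAKE = "7.2.17 (Cloudera Runtime upgrade, OS: redhat8)"
RUNTIME_DATA_HUB = "7.2.17 (Cloudera Runtime upgrade)"


def targets_to_run(drop_down: List[str], cluster_type: str) -> List[str]:
    """Return the targets to select, in order.

    Each returned target needs Validate and Prepare first, then Upgrade.
    cluster_type is "datalake" or "datahub" (hypothetical identifiers).
    """
    if cluster_type not in ("datalake", "datahub"):
        raise ValueError(f"unknown cluster type: {cluster_type!r}")
    if OS_ONLY in drop_down:
        # Already on the latest 7.2.17 service pack: OS upgrade only.
        return [OS_ONLY]
    if cluster_type == "datalake":
        # For Data Lakes the Runtime upgrade performs the RHEL 8 upgrade too.
        return [RUNTIME_DATA_LAKE]
    # Data Hubs need the Runtime upgrade first; the OS option appears afterwards.
    return [RUNTIME_DATA_HUB, OS_ONLY]
```

For a Cloudera Data Hub whose drop-down shows only the Runtime upgrade line, the function returns both targets, reflecting that the OS upgrade is a separate second pass for Data Hub clusters.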

Step 4: Resize Medium Duty Data Lake to Enterprise Data Lake

If you are using a Medium Duty Data Lake, you should resize it to an Enterprise Data Lake.

Before you begin, note the following:

  • Prior to attempting the resize, ensure that all Cloudera Data Hub clusters in your environment have been upgraded to Cloudera Runtime 7.2.17 and your data services have been upgraded to the latest version available.

  • During the resize operation, your cluster’s Cloudera Manager and Cloudera Runtime configuration will be updated to the most recent recommendations (performance tuning, etc). Any custom configurations that you previously made will need to be re-applied after you have performed the resize to Enterprise Data Lake.

  • If you observe high resource utilization after resizing the Data Lake to Enterprise and upgrading the operating system from CentOS to RHEL, you might need to vertically scale the Data Lake nodes to increase the available resources.

  • The resize operation will perform an SDX backup and restore in the background. This involves writing to and reading from cloud storage. If you have not yet used Data Lake backup, make sure your environment’s storage permissions are configured correctly for these operations.

  • The resizing operation requires downtime and should be performed during a maintenance window. No metadata changes may occur during the resizing, because the previously backed up metadata is restored and any later changes will no longer be present once the resizing operation completes. Suspend any operations that may result in an SDX metadata change during the resizing operation.

  • Cloudera Data Hub clusters should be stopped before the resizing operation begins. For any cluster that cannot be stopped, stop all of the services on the Cloudera Data Hub through the Cloudera Manager UI.

  • With Cloudera DataFlow 2.0 or lower, some flows must be re-created after a resizing operation.

  • Review Data Lake resizing: Prerequisites.
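
The prerequisites above can be read as a pre-flight checklist. The sketch below shows one way to encode them so that open items are listed before the resize starts. It is not a Cloudera tool: the `DataHub` type, the `resize_blockers` function, and all of its parameters are hypothetical, and every value has to be gathered manually (for example from the Management Console and Cloudera Manager).

```python
from dataclasses import dataclass
from typing import List, Tuple

REQUIRED_RUNTIME = (7, 2, 17)  # Data Hubs must be on 7.2.17 before the resize


@dataclass
class DataHub:
    """Hypothetical, manually collected facts about one Data Hub cluster."""
    name: str
    runtime: str                    # e.g. "7.2.17"
    stopped: bool = False           # the cluster is stopped
    services_stopped: bool = False  # or all services stopped in Cloudera Manager


def _version(runtime: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in runtime.split("."))


def resize_blockers(
    data_hubs: List[DataHub],
    data_services_latest: bool,
    backup_permissions_ok: bool,
    in_maintenance_window: bool,
) -> List[str]:
    """Return the open items; an empty list means the checklist is complete."""
    blockers = []
    for hub in data_hubs:
        if _version(hub.runtime) < REQUIRED_RUNTIME:
            blockers.append(f"{hub.name}: upgrade to Cloudera Runtime 7.2.17 first")
        if not (hub.stopped or hub.services_stopped):
            blockers.append(f"{hub.name}: stop the cluster or all of its services")
    if not data_services_latest:
        blockers.append("upgrade data services to the latest available version")
    if not backup_permissions_ok:
        blockers.append("verify cloud storage permissions for backup and restore")
    if not in_maintenance_window:
        blockers.append("schedule a maintenance window; the resize requires downtime")
    return blockers
```

The checklist deliberately leaves out the items that need human judgment, such as recording custom configurations to re-apply afterwards and reviewing flows in Cloudera DataFlow 2.0 or lower.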

Follow the steps for Data Lake resizing. During this operation, the metadata maintained in your current Data Lake is automatically backed up, a new Enterprise Data Lake is created within the environment, and the metadata is automatically restored to this new cluster. As mentioned above, any custom cluster configuration that you previously made will need to be reapplied after the resize completes.

The maintenance window required for this operation depends on the size of your SDX metadata. When you open the Resize cluster UI, it will show you the estimated duration of the operation.

Step 5: Upgrade Data Lake and Cloudera Data Hub clusters from 7.2.17 to 7.2.18

Perform the following steps for your Data Lake and each Cloudera Data Hub:

If you are upgrading a Data Lake, all Cloudera Data Hub clusters in your environment should be stopped and Data Services should not be running workloads for the duration of the operation.

  1. Once 7.2.18 (Cloudera Runtime upgrade, OS: redhat8) is available to select from the Upgrade UI, select it and run Validate and Prepare.

  2. Once Validate and Prepare completes, return to this page, select 7.2.18 (Cloudera Runtime upgrade, OS: redhat8) again and run Upgrade.
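
Every upgrade in this document follows the same two-phase pattern: run Validate and Prepare on a target, then select the same target again and run Upgrade. The sketch below models that order for one cluster so that a runbook script could refuse an out-of-order action. `UpgradeSession` and its methods are hypothetical; the guard clauses encode the order described here, not behavior that the Upgrade UI is known to enforce.

```python
from typing import List, Optional


class UpgradeSession:
    """Hypothetical record of the two-phase flow for a single cluster."""

    def __init__(self, cluster_name: str) -> None:
        self.cluster_name = cluster_name
        self.prepared_target: Optional[str] = None
        self.completed: List[str] = []

    def validate_and_prepare(self, target: str) -> None:
        # Phase 1: remember which target was validated and prepared.
        self.prepared_target = target

    def upgrade(self, target: str) -> None:
        # Phase 2: only allowed for the target that was just prepared.
        if self.prepared_target is None:
            raise RuntimeError("run Validate and Prepare before Upgrade")
        if target != self.prepared_target:
            raise ValueError("select the same target that was validated")
        self.completed.append(target)
        self.prepared_target = None


def upgrade_order(data_lake: str, data_hubs: List[str]) -> List[str]:
    """Data Lake first, then each Data Hub, as described in this document."""
    return [data_lake] + list(data_hubs)
```

A script built on this model would create one `UpgradeSession` per name returned by `upgrade_order` and call `validate_and_prepare` and then `upgrade` with the 7.2.18 target for each.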