High-level upgrade steps

This section outlines the general upgrade path and points to the more detailed upgrade documentation. It is not meant to provide complete upgrade steps.

  1. Upgrade Data Lake and all Data Hubs to the latest service pack and OS
  2. Upgrade FreeIPA to RHEL 8
  3. Upgrade Data Lake and Data Hubs to the latest Runtime 7.2.17 and the latest RHEL 8 OS image
  4. Resize Medium Duty Data Lake to Enterprise Data Lake
  5. Upgrade Data Lake and Data Hubs from 7.2.17 to 7.2.18
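
The ordering of these stages can be sketched as a small Python helper that reports what is still left to do. This is only an illustration: the function name `remaining_steps` and its inputs (`completed`, `medium_duty`) are hypothetical and are not part of any Cloudera tool. It assumes that step 4 applies only when the Data Lake is Medium Duty.

```python
# Ordered stages of the upgrade path, copied from the list above.
UPGRADE_PATH = [
    "Upgrade Data Lake and all Data Hubs to the latest service pack and OS",
    "Upgrade FreeIPA to RHEL 8",
    "Upgrade Data Lake and Data Hubs to the latest Runtime 7.2.17 "
    "and the latest RHEL 8 OS image",
    "Resize Medium Duty Data Lake to Enterprise Data Lake",
    "Upgrade Data Lake and Data Hubs from 7.2.17 to 7.2.18",
]


def remaining_steps(completed: int, medium_duty: bool = True) -> list[str]:
    """Return the stages still to do, in order.

    `completed` is how many stages are already done (hypothetical input).
    `medium_duty` says whether the Data Lake is Medium Duty; if it is not,
    the resize stage (step 4) is skipped.
    """
    if not 0 <= completed <= len(UPGRADE_PATH):
        raise ValueError("completed must be between 0 and 5")
    steps = []
    for number, title in enumerate(UPGRADE_PATH, start=1):
        if number <= completed:
            continue  # already done
        if number == 4 and not medium_duty:
            continue  # nothing to resize
        steps.append(f"Step {number}: {title}")
    return steps


if __name__ == "__main__":
    for line in remaining_steps(completed=2):
        print(line)
```

The stages are strictly sequential: FreeIPA has to be on RHEL 8 before any cluster moves to RHEL 8, and the resize expects all Data Hubs to be on Runtime 7.2.17.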

Step 1: Upgrade Data Lake and all Data Hubs to the latest service pack and OS

Upgrade Data Lake and Data Hub to the latest service pack of their Runtime version.

Follow these steps first for your Data Lake and then for each attached Data Hub separately. The examples and screenshots below assume that your cluster is using Runtime 7.2.16. For older Runtimes, just replace 7.2.16 with your actual Runtime version.

  1. Upgrade Data Lake and Data Hub to the latest service pack of their Runtime version.

    Perform the following from the Upgrade UI of your Data Lake and of each Data Hub:

    1. Select the upgrade target that matches your current Runtime version. If you are upgrading a Data Lake, you will see 7.2.16 (Runtime upgrade, OS: centos7); for Data Hubs you will see 7.2.16 (Runtime upgrade). Select this line and then run Validate and Prepare.
    2. Once Validate and Prepare completes, return to this page, select the same line again, and run Upgrade.

      For Data Lakes, this will perform the OS image upgrade, so step 2 below is required for Data Hubs only.
  2. Upgrade the OS image on all Data Hubs.

    Additionally, you are required to upgrade the OS image on all Data Hubs:

    1. Open the Data Hub Upgrade UI again and select the OS upgrade target that matches your current Runtime version, for example 7.2.16 (OS upgrade, OS: centos7). Then run Validate and Prepare.

    2. Once Validate and Prepare completes, return to this page, select the same line again, and run Upgrade.
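
As a rough illustration of how the target labels in Step 1 change with cluster type and Runtime version, the Python sketch below builds the labels to select, in order. The function name `step1_targets` and the `cluster_type` values are hypothetical. The label strings mirror the ones quoted above and may differ slightly from what the Upgrade UI actually shows.

```python
def step1_targets(cluster_type: str, runtime: str = "7.2.16") -> list[str]:
    """Return the upgrade targets to run, in order, for Step 1.

    Each returned target is run twice in the UI: first with
    Validate and Prepare, then with Upgrade.
    """
    if cluster_type == "datalake":
        # For a Data Lake, the Runtime upgrade also upgrades the OS image.
        return [f"{runtime} (Runtime upgrade, OS: centos7)"]
    if cluster_type == "datahub":
        # A Data Hub needs a separate OS image upgrade afterwards.
        return [
            f"{runtime} (Runtime upgrade)",
            f"{runtime} (OS upgrade, OS: centos7)",
        ]
    raise ValueError(f"unknown cluster type: {cluster_type!r}")


if __name__ == "__main__":
    # Example for a Data Hub on an older Runtime.
    for target in step1_targets("datahub", runtime="7.2.15"):
        print(target)
```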

Step 2: Upgrade FreeIPA to RHEL 8

Before your Data Lake or Data Hub clusters can be upgraded to RHEL 8, you first need to upgrade your environment (FreeIPA) cluster.

Open the FreeIPA tab on your environment’s details page and select Upgrade. From the drop-down menu, select FreeIPA (Latest, OS: redhat8) and run the upgrade.

Step 3: Upgrade Data Lake and Data Hubs to the latest Runtime 7.2.17 and the latest RHEL 8 OS image

Upgrade your clusters to the latest 7.2.17 and to RHEL 8.

Perform the following steps for the Data Lake and then each Data Hub:

  1. Understand RHEL 8 requirements.

    Before changing the OS version from CentOS 7 to RHEL 8, make sure you understand the requirements described in Upgrading from CentOS to RHEL.

  2. Upgrade to the latest 7.2.17 service pack if needed.

    To determine whether you need to perform a Runtime and OS upgrade, or only an OS upgrade, check whether the target 7.2.17 (OS upgrade, OS: redhat8) appears in the drop-down menu. If it does, you only need an OS upgrade and you can skip to step 3 below. If it does not, perform the following steps for a Runtime upgrade:
    1. If you are upgrading a Data Lake, select 7.2.17 (Runtime upgrade, OS: redhat8). For upgrading a Data Hub, select 7.2.17 (Runtime upgrade). Next, run Validate and Prepare.
    2. Once Validate and Prepare completes, return to this page, select the same line again, and run Upgrade.

      For Data Lakes this will perform the RHEL 8 upgrade as well.
  3. Upgrade all Data Hubs to RHEL 8.

    Open the Upgrade UI again. The 7.2.17 (OS upgrade, OS: redhat8) option is now available in the drop-down menu.

    1. Select 7.2.17 (OS upgrade, OS: redhat8) and run Validate and Prepare.

    2. Once Validate and Prepare completes, return to this page, select 7.2.17 (OS upgrade, OS: redhat8) again, and run Upgrade.
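
The decision described in Step 3 (Runtime and OS upgrade, or OS upgrade only) can be sketched in Python as follows. The function `step3_plan` and its `dropdown` argument, which stands for the list of targets currently offered in the Upgrade UI, are hypothetical. Applying the same rule to both cluster types is an assumption of this sketch.

```python
OS_ONLY = "7.2.17 (OS upgrade, OS: redhat8)"


def step3_plan(cluster_type: str, dropdown: list[str]) -> list[str]:
    """Return the targets to run, in order, for Step 3.

    `dropdown` is the list of targets shown in the Upgrade UI drop-down
    (a hypothetical input; copy the labels from the UI).
    """
    if cluster_type not in ("datalake", "datahub"):
        raise ValueError(f"unknown cluster type: {cluster_type!r}")
    if OS_ONLY in dropdown:
        # Already on the latest 7.2.17 service pack: OS upgrade only.
        return [OS_ONLY]
    if cluster_type == "datalake":
        # The Data Lake Runtime upgrade performs the RHEL 8 upgrade as well.
        return ["7.2.17 (Runtime upgrade, OS: redhat8)"]
    # Data Hub: Runtime upgrade first, then the separate OS upgrade.
    return ["7.2.17 (Runtime upgrade)", OS_ONLY]


if __name__ == "__main__":
    print(step3_plan("datahub", ["7.2.17 (Runtime upgrade)"]))
```

As in the other steps, each target in the returned list is run first with Validate and Prepare and then with Upgrade.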

Step 4: Resize Medium Duty Data Lake to Enterprise Data Lake

If using a Medium Duty Data Lake, you should resize it to Enterprise Data Lake.

Before you begin, note the following:

  • Prior to attempting the resize, ensure that all Data Hubs in your environment have been upgraded to Runtime 7.2.17 and your data services have been upgraded to the latest version available.

  • During the resize operation, your cluster’s Cloudera Manager and Cloudera Runtime configuration will be updated to the most recent recommendations (performance tuning, etc.). Any custom configurations that you previously made will need to be re-applied after the resize to Enterprise Data Lake completes.

  • The resize operation will perform an SDX backup and restore in the background. This involves writing to and reading from cloud storage. If you have not yet used Data Lake backup, make sure your environment’s storage permissions are configured correctly for these operations.

  • The resizing operation requires downtime and should be performed during a maintenance window. No metadata changes may occur during the resizing, because the previously backed-up metadata is restored and any later changes will no longer be present once the operation completes. Suspend any operations that may result in an SDX metadata change during the resizing operation.

  • Data Hub clusters should be stopped before the resizing operation begins. For any cluster that cannot be stopped, stop all of the services on the Data Hub through the Cloudera Manager UI.

  • With Cloudera Data Flow 2.0 or lower, some flows must be re-created after a resizing operation.

  • Review Data Lake resizing: Prerequisites.
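
The prerequisites above can be tracked as a simple pre-flight checklist. The Python sketch below is one possible way to do that. All class, field, and function names are hypothetical, and the values have to be filled in manually from what you observe in your environment. Nothing here queries Cloudera services.

```python
from dataclasses import dataclass


@dataclass
class DataHub:
    name: str
    runtime: str
    stopped: bool = False           # cluster is stopped
    services_stopped: bool = False  # or: all services stopped in Cloudera Manager


@dataclass
class ResizeReadiness:
    data_hubs: list[DataHub]
    data_services_up_to_date: bool
    storage_permissions_verified: bool
    maintenance_window_scheduled: bool
    metadata_changes_suspended: bool


def resize_blockers(env: ResizeReadiness) -> list[str]:
    """Return the prerequisites that are not yet met (empty list if ready)."""
    blockers = []
    for hub in env.data_hubs:
        if hub.runtime != "7.2.17":
            blockers.append(f"{hub.name}: upgrade to Runtime 7.2.17 first")
        if not (hub.stopped or hub.services_stopped):
            blockers.append(f"{hub.name}: stop the cluster or all of its services")
    if not env.data_services_up_to_date:
        blockers.append("upgrade data services to the latest available version")
    if not env.storage_permissions_verified:
        blockers.append("verify cloud storage permissions for backup and restore")
    if not env.maintenance_window_scheduled:
        blockers.append("schedule a maintenance window")
    if not env.metadata_changes_suspended:
        blockers.append("suspend operations that change SDX metadata")
    return blockers


if __name__ == "__main__":
    env = ResizeReadiness(
        data_hubs=[DataHub("etl-hub", "7.2.16")],
        data_services_up_to_date=True,
        storage_permissions_verified=True,
        maintenance_window_scheduled=True,
        metadata_changes_suspended=True,
    )
    for item in resize_blockers(env):
        print(item)
```

An empty result only means that every item on this list has been ticked off. It does not replace the review of Data Lake resizing: Prerequisites.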

Follow the steps for Data Lake resizing. During this operation, the metadata maintained in your current Data Lake is automatically backed up, a new Enterprise Data Lake is created within the environment, and the metadata is automatically restored to this new cluster. As mentioned above, any custom cluster configuration that you previously made will need to be reapplied after the resize completes.

The maintenance window required for this operation depends on the size of your SDX metadata. When you open the Resize cluster UI, it will show you the estimated duration of the operation.

Step 5: Upgrade Data Lake and Data Hubs from 7.2.17 to 7.2.18

Perform the following steps for your Data Lake and each Data Hub:

If you are upgrading a Data Lake, all Data Hubs in your environment should be stopped and Data Services should not be running workloads for the duration of the operation.

  1. Once Runtime 7.2.18 (Runtime upgrade, OS: redhat8) is available to select from the Upgrade UI, select it and run Validate and Prepare.

  2. Once Validate and Prepare completes, return to this page, select 7.2.18 (Runtime upgrade, OS: redhat8) again and run Upgrade.