Identify your upgrade path

Refer to the following diagram for a high-level overview of the upgrade steps. Your starting point will vary, depending on the Runtime version and image version that your environment, Data Lake, and Data Hubs are using.

The high-level upgrade steps are presented horizontally, from Step 1 to Step 5. Under each step you can see:

  • Operations: A list of the actual operations that are performed as part of this step
  • Result: The resulting state after these operations have been performed

For example, if your clusters are using Runtime 7.2.16 and Step 1 applies to them, in Step 1 you would perform a Data Lake upgrade and, for each of your Data Hubs, a service pack upgrade and an OS upgrade. After performing these operations, your Data Lake and all Data Hubs would be upgraded to the latest service pack of Runtime 7.2.16, and your cluster OS would be patched with up-to-date Python 3 binaries.

As presented in the diagram, you may be required to perform the following operations:

  • Runtime service pack upgrades for your Data Lake and each Data Hub
  • Major/Minor Runtime and OS upgrades for your Data Lake and each Data Hub
  • CentOS to RHEL 8 upgrade for your FreeIPA, Data Lake, and each Data Hub
  • Data Lake resize

For example, if your clusters are using Runtime 7.2.16 or earlier and are impacted by the Python 3.8 dependency described in TSB-664, you need to start at Step 1; if this does not apply to you, you may start at a later step. Furthermore, depending on the operating system that your clusters are using, the CentOS to RHEL 8 upgrade steps (which are part of Step 2) may not apply to you. If you are not using a Medium Duty Data Lake, you can skip the Enterprise Data Lake resize step (Step 4).
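
The three conditions described above can be sketched as a small checklist. This is only an illustration, not an official tool: the `ClusterState` class, its field names, and the string values such as "centos" and "medium_duty" are hypothetical and do not correspond to any CLI or API output. The sketch covers only the conditions stated in this section; refer to the diagram for the remaining steps.

```python
from dataclasses import dataclass


@dataclass
class ClusterState:
    """Hypothetical summary of an environment; all field names are illustrative."""
    runtime_version: str       # for example "7.2.16"
    affected_by_tsb_664: bool  # impacted by the Python 3.8 dependency (TSB-664)
    os_name: str               # assumed values: "centos" or "rhel8"
    data_lake_shape: str       # assumed values: "medium_duty", "light_duty", ...


def parse_version(version: str) -> tuple:
    """Turn "7.2.16" into (7, 2, 16); only the first three components are compared."""
    return tuple(int(part) for part in version.split(".")[:3])


def conditional_operations(state: ClusterState) -> dict:
    """Return which of the conditional operations named in this section apply."""
    return {
        # Step 1: Runtime 7.2.16 or earlier AND impacted by TSB-664
        "step_1": (
            parse_version(state.runtime_version) <= (7, 2, 16)
            and state.affected_by_tsb_664
        ),
        # Part of Step 2: only relevant when the clusters still run CentOS
        "step_2_centos_to_rhel8": state.os_name.lower() == "centos",
        # Step 4: only relevant for a Medium Duty Data Lake
        "step_4_data_lake_resize": state.data_lake_shape.lower() == "medium_duty",
    }


if __name__ == "__main__":
    example = ClusterState("7.2.16", True, "centos", "medium_duty")
    for operation, applies in conditional_operations(example).items():
        print(f"{operation}: {'applies' if applies else 'skip'}")
```

Writing the conditions down in this form makes it easy to record, per environment, which parts of the diagram you can skip before you start.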