Identify your upgrade path
Refer to the following diagram for a high-level overview of the upgrade steps. Your starting point will vary, depending on the Runtime version and image version that your environment, Data Lake, and Cloudera Data Hub clusters are using.
The high-level upgrade steps are presented horizontally, from Step 1 to Step 5. Under each step, you can see:
- Operations: A list of the actual operations that are performed as part of this step
- Result: The resulting state after these operations have been performed
For example, if your clusters are using Cloudera Runtime 7.2.16 and Step 1 applies to them, in Step 1 you would perform a Data Lake upgrade and, for each of your Cloudera Data Hub clusters, a service pack upgrade and an OS upgrade. After performing these operations, your Data Lake and all Cloudera Data Hub clusters would be upgraded to the latest service pack of Cloudera Runtime 7.2.16, and your cluster OS would be patched with up-to-date Python 3 binaries.
As presented in the diagram, you may be required to perform the following operations:
- Runtime service pack upgrades for your Data Lake and each Cloudera Data Hub cluster
- Major/Minor Runtime and OS upgrades for your Data Lake and each Cloudera Data Hub cluster
- CentOS to RHEL 8 upgrade for your FreeIPA, Data Lake, and each Cloudera Data Hub cluster
- Data Lake resize
For example, if your clusters are using Cloudera Runtime 7.2.16 or earlier and are impacted by the Python 3.8 dependency described in TSB-664, you need to start at Step 1. If this does not apply to you, you may start at a later step. Furthermore, depending on the operating system that your clusters are using, the CentOS to RHEL 8 upgrade steps (which are part of Step 2) may not apply to you, and if you are not using a Medium Duty Data Lake, you can skip the Enterprise Data Lake resize step (Step 4).
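The conditions above can be sketched as a small checklist helper. This is only an illustration of the decision logic described in this section, not a Cloudera tool or API: the `Environment` class, its field names, the OS and Data Lake shape strings, and the `applicable_operations` function are all hypothetical, and the sketch assumes that the CentOS to RHEL 8 upgrade applies only when the clusters currently run CentOS.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '7.2.16' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


@dataclass
class Environment:
    # All field names and values are hypothetical, chosen for this sketch.
    runtime_version: str        # e.g. "7.2.16"
    affected_by_tsb_664: bool   # impacted by the Python 3.8 dependency
    os_name: str                # assumed values: "centos" or "rhel8"
    data_lake_shape: str        # assumed values: "medium_duty", "enterprise", ...


def applicable_operations(env: Environment) -> Dict[str, bool]:
    """Report which conditional parts of the upgrade path apply."""
    at_or_below_7_2_16 = parse_version(env.runtime_version) <= (7, 2, 16)
    return {
        # Step 1 is the required starting point only when both conditions hold.
        "start_at_step_1": at_or_below_7_2_16 and env.affected_by_tsb_664,
        # Assumption: the CentOS to RHEL 8 upgrade (part of Step 2)
        # is relevant only for clusters that run CentOS.
        "centos_to_rhel8_upgrade": env.os_name == "centos",
        # The Data Lake resize (Step 4) applies to Medium Duty Data Lakes.
        "data_lake_resize": env.data_lake_shape == "medium_duty",
    }


if __name__ == "__main__":
    env = Environment("7.2.16", True, "centos", "medium_duty")
    for operation, applies in applicable_operations(env).items():
        print(f"{operation}: {'applies' if applies else 'does not apply'}")
```

Always confirm the actual starting point and operations against the diagram, because the sketch covers only the three conditions named in this section.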