OS upgrade for Data Lake hosts

When important software and operating system changes are available for Data Lake hosts, you can initiate an upgrade process that will replace existing hosts with upgraded versions and reattach them to the Data Lake storage.

Overview

When an upgrade is available for Data Lake hosts, a CDP administrator has the option to apply the upgrade. Note that during the upgrade process, the Data Lake services are not available to the attached workload clusters. Therefore, before triggering a Data Lake upgrade, consider stopping any jobs running on your workload clusters and restarting them after the Data Lake is restored. Audits and metadata will continue to be queued for collection through the restoration process.

When your CDP administrator triggers node upgrades, the upgrade process:

  1. Detaches all non-ephemeral disks from the Data Lake nodes.
  2. Removes the nodes.
  3. Provisions new nodes of the same type with the upgraded software versions.
  4. Reattaches the disks to the new nodes.
  5. Reconnects services to the external database.

The following limitations apply to upgrades:
  • Upgrades are allowed only on instances where data is stored in external databases and on persistent disks.
  • The upgrade must keep the same OS flavor, Cloudera Manager (CM) version, and Cloudera Runtime parcel version.
  • All attached Data Hub clusters must be stopped, because Data Lake services are unavailable during the upgrade.
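
Stopping the attached Data Hub clusters can also be done from the command line. The following is a minimal sketch, assuming the CDP CLI is installed and configured and that your CLI version provides cdp datahub stop-cluster --cluster-name. The cluster names, the CLUSTERS and DRY_RUN variables, and the helper functions are hypothetical and exist only for illustration. By default the script prints the commands it would run instead of running them.

```shell
#!/bin/sh
# Hypothetical names of the Data Hub clusters attached to the Data Lake.
CLUSTERS="${CLUSTERS:-example-hub-1 example-hub-2}"
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Stop each cluster passed as an argument; give up on the first failure.
stop_datahubs() {
  for cluster in "$@"; do
    run cdp datahub stop-cluster --cluster-name "$cluster" || return 1
  done
}

# Word splitting on $CLUSTERS is intentional: one argument per cluster name.
stop_datahubs $CLUSTERS
```

Set DRY_RUN=0 only after confirming that the printed commands name the clusters you intend to stop.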

You can perform manual upgrades from the CDP web UI or CLI.

Manual upgrade from web UI

Before you begin: If you wish to verify the upgrade after completion, go to Management Console > Data Lakes > Image Details > Image ID and note the current Image ID. After you upgrade, you can compare this to the new Image ID.

To perform a manual upgrade from the CDP web UI:
  1. Log in to the CDP web interface.
  2. Navigate to the desired Data Lake using Management Console > Data Lakes.
  3. In the Data Lake details page, click Actions > Check for Data Lake Upgrade.

  4. If an upgrade is available, click the Data lake upgrade available option.

    The upgrade process begins, and the Data Lake status will show Upgrade In Progress.

  5. To verify that the upgrade was successful, check Management Console > Data Lakes > Image Details > Image ID and compare the value with the Image ID you noted before the upgrade.

Manual upgrade from CLI

To perform a manual upgrade from the CLI, use the following commands:
  • cdp datalake check-datalake-upgrade-options --datalake-name $name-dl - Check if your Data Lake OS can be upgraded.
  • cdp datalake upgrade-datalake --datalake-name $name-dl - Perform Data Lake OS upgrade.
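
The two commands can be chained in a small script. The following is a minimal sketch, assuming the CDP CLI is installed and configured. The DATALAKE_NAME and DRY_RUN variables, the example-dl name, and the helper functions are hypothetical. The output format of the check command is not shown here, so the sketch does not parse it: it only starts the upgrade if the check command itself exits successfully, which does not by itself prove that an upgrade is available. By default the script prints the commands instead of running them.

```shell
#!/bin/sh
# Hypothetical Data Lake name; replace with your own.
DATALAKE_NAME="${DATALAKE_NAME:-example-dl}"
# DRY_RUN=1 (the default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Check whether the Data Lake OS can be upgraded.
check_upgrade() {
  run cdp datalake check-datalake-upgrade-options --datalake-name "$1"
}

# Start the Data Lake OS upgrade.
start_upgrade() {
  run cdp datalake upgrade-datalake --datalake-name "$1"
}

# Run the check first; continue only if the check command succeeded.
if check_upgrade "$DATALAKE_NAME"; then
  start_upgrade "$DATALAKE_NAME"
else
  echo "upgrade check failed for $DATALAKE_NAME" >&2
  exit 1
fi
```

Review the output of the check command before setting DRY_RUN=0, since the upgrade makes Data Lake services unavailable to attached workload clusters while it runs.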