Cloudera AI service with Data Lake upgrades

Cloudera Data Platform environments have two services that can be upgraded individually: the FreeIPA service and the Data Lake (DL) service. Cloudera AI Workbenches run in Cloudera Data Platform environments. The FreeIPA service provides identity management, and the Data Lake service provides SDX capabilities to Cloudera AI Workbenches.

This document answers frequently asked questions about the behavior of Cloudera AI Workbenches during a Data Lake upgrade. For information on FreeIPA upgrades, see Upgrade FreeIPA.

What kinds of Data Lake upgrades are possible?

The Data Lake service supports the following upgrades.

  • Hotfix upgrades
  • Version upgrades
  • OS version upgrades

All of these upgrades can be performed from the Cloudera Data Platform Data Lake service UI or with the Cloudera Data Platform CLI. DL upgrades require downtime, and they preserve the state of the Data Lake.

During a DL upgrade, the shape of the Data Lake cannot change. For example, you cannot change a LIGHT_DUTY shape to a MEDIUM_DUTY_HA shape as part of a DL upgrade.

What is Data Lake migration and is it supported?

To change the DL shape, perform a DL migration, also called DL scaling. For example, a DL migration can change a DL LIGHT_DUTY shape to a MEDIUM_DUTY_HA shape.

DL migration is currently available as a technical preview. Speak to your Cloudera account team to discuss whether it may be suitable for your situation.
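The difference between an upgrade and a migration can be expressed as a small check on the planned change. The following is only an illustrative sketch: `PlannedChange`, `required_operation`, and `KNOWN_SHAPES` are hypothetical names that are not part of any Cloudera API, and the shape list contains only the two shapes mentioned in this document.

```python
from dataclasses import dataclass

# Shape names taken from this document; a real deployment may offer others.
KNOWN_SHAPES = {"LIGHT_DUTY", "MEDIUM_DUTY_HA"}


@dataclass(frozen=True)
class PlannedChange:
    """Hypothetical description of a planned Data Lake change."""
    current_shape: str
    target_shape: str


def required_operation(change: PlannedChange) -> str:
    """Return "upgrade" when the shape is unchanged, otherwise "migration"."""
    for shape in (change.current_shape, change.target_shape):
        if shape not in KNOWN_SHAPES:
            raise ValueError(f"unknown shape: {shape}")
    if change.current_shape == change.target_shape:
        return "upgrade"
    return "migration"


same_shape = PlannedChange("LIGHT_DUTY", "LIGHT_DUTY")
resized = PlannedChange("LIGHT_DUTY", "MEDIUM_DUTY_HA")
print(required_operation(same_shape))  # upgrade
print(required_operation(resized))     # migration
```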

What happens if the Data Lake upgrade/migration fails?

There is no automated backup and restore process for the DL. Back up the DL before starting the upgrade process. If the DL upgrade fails, the recommended option is to delete the failed DL using the Cloudera Data Platform CLI (cdp datalake delete-datalake --datalake-name <dl name>) and then recreate the DL, also using the Cloudera Data Platform CLI. Once the DL is recreated, restore the DL state from the backup. For more information, refer to Backup and Restore for the Data Lake.

Do not delete the environment service during a failed DL upgrade process. Deleting an environment causes all Cloudera AI Workbenches running in that environment to become unusable.
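The recovery sequence described above can be written down as an ordered plan. The sketch below is an assumption-laden illustration, not a supported tool: `delete_datalake_command` and `recovery_plan` are hypothetical helpers. Only the delete command is taken from this document. The recreate and restore steps are left as plain descriptions because their exact commands are not given here.

```python
import shlex


def delete_datalake_command(datalake_name: str) -> list[str]:
    """Build the delete command quoted in this document as an argument list."""
    if not datalake_name:
        raise ValueError("datalake_name must not be empty")
    return ["cdp", "datalake", "delete-datalake", "--datalake-name", datalake_name]


def recovery_plan(datalake_name: str) -> list[str]:
    """Return the recovery steps in order. The environment itself is never deleted."""
    delete_cmd = shlex.join(delete_datalake_command(datalake_name))
    return [
        "Confirm that a DL backup taken before the upgrade exists",
        f"Delete the failed DL: {delete_cmd}",
        "Recreate the DL with the Cloudera Data Platform CLI",
        "Restore the DL state from the backup",
    ]


# "my-dl" is a placeholder Data Lake name.
for number, step in enumerate(recovery_plan("my-dl"), start=1):
    print(f"{number}. {step}")
```

Keeping the command as an argument list rather than a single string avoids quoting problems if it is later passed to `subprocess.run`.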

Can the environment service be deleted and recreated at any point if the Data Lake upgrade or migration process has an error?

No. An environment that has experiences running inside it cannot be deleted at any time. If you delete the environment, all experiences in it (such as a Cloudera AI Workbench) must be deleted as well.

Unless a Cloudera AI Workbench is backed up first and later restored, all of its state information is lost and you must start from a fresh workbench. For more information, see Backing up Cloudera AI Workbenches.

What are the Cloudera AI Workbench prerequisites for Data Lake upgrades/migrations?

Do the following before upgrading or migrating a Data Lake.
  • Upgrade Cloudera AI Workbenches to the latest version (if an upgrade is available).
  • Stop any jobs, sessions, experiments, or other workloads that need DL access before performing the DL upgrade.
  • Announce to the team that there will be planned downtime for Cloudera AI Workbenches during the DL upgrade process.
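
The prerequisites above can be tracked as a simple checklist before the upgrade starts. This is a minimal sketch: `UpgradeReadiness` and `unmet_prerequisites` are hypothetical names, and the boolean values are assumed to be filled in by an administrator rather than read from any Cloudera API.

```python
from dataclasses import dataclass


@dataclass
class UpgradeReadiness:
    """Hypothetical record of the three prerequisites listed above."""
    workbenches_on_latest_version: bool
    dl_dependent_workloads_stopped: bool
    downtime_announced: bool


def unmet_prerequisites(state: UpgradeReadiness) -> list[str]:
    """Return a reminder for every prerequisite that is not yet satisfied."""
    checks = [
        (state.workbenches_on_latest_version,
         "Upgrade Cloudera AI Workbenches to the latest version"),
        (state.dl_dependent_workloads_stopped,
         "Stop workloads that need DL access"),
        (state.downtime_announced,
         "Announce the planned downtime to the team"),
    ]
    return [reminder for satisfied, reminder in checks if not satisfied]


state = UpgradeReadiness(
    workbenches_on_latest_version=True,
    dl_dependent_workloads_stopped=False,
    downtime_announced=True,
)
for reminder in unmet_prerequisites(state):
    print("TODO:", reminder)
```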

Are the Cloudera AI Workbenches operational during Data Lake upgrades/migrations?

It is recommended NOT to use Cloudera AI Workbenches during DL upgrades.

However, the observed behavior of Cloudera AI Workbenches during DL upgrades is as follows.

  • Cloudera AI Workbenches remain accessible during DL upgrades. Users can log in to a Cloudera AI Workbench.
  • Users can launch sessions, run jobs, experiments, models, and so on which do not require DL access. For example, jobs that do not require IDBroker or SDX/HMS access will function normally.
  • Any compute instance that requires IDBroker or SDX/HMS access will fail.
  • Any scheduled jobs that require IDBroker or SDX access will fail.
  • With regular DL upgrades and migrations, you should suspend Cloudera AI Workbenches. This requires downtime.
  • With Zero Downtime Upgrade (ZDU) of DL, Cloudera AI Workbenches will be functional during upgrades. If the DL version is 7.2.17 or lower, you may encounter the CDPD-66549 known issue.
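
The observed behavior during a regular DL upgrade can be used to predict which workloads are at risk. The sketch below is illustrative only: `Workload`, `fails_during_upgrade`, and `split_by_outcome` are hypothetical names, the dependency flags are assumed to be supplied by whoever owns the workload, and the Zero Downtime Upgrade case is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a job, session, experiment, or model."""
    name: str
    needs_idbroker: bool = False
    needs_sdx_hms: bool = False


def fails_during_upgrade(workload: Workload) -> bool:
    """A workload is expected to fail if it needs IDBroker or SDX/HMS access."""
    return workload.needs_idbroker or workload.needs_sdx_hms


def split_by_outcome(workloads: list[Workload]) -> tuple[list[str], list[str]]:
    """Return (names expected to run normally, names expected to fail)."""
    unaffected = [w.name for w in workloads if not fails_during_upgrade(w)]
    failing = [w.name for w in workloads if fails_during_upgrade(w)]
    return unaffected, failing


# Example workload names are placeholders.
workloads = [
    Workload("local-training-session"),
    Workload("nightly-hive-report", needs_sdx_hms=True),
    Workload("cloud-storage-export", needs_idbroker=True),
]
unaffected, failing = split_by_outcome(workloads)
print("Expected to run normally:", unaffected)
print("Expected to fail:", failing)
```

A list like the second one can serve as the set of scheduled jobs to pause before the upgrade window.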

Are any changes to Cloudera AI Workbenches needed after a DL is upgraded or migrated successfully?

No further actions are required on a Cloudera AI Workbench after a successful DL upgrade. Cloudera AI Workbenches continue to function normally. Make sure to announce to the team that they can start using Cloudera AI Workbenches as usual.