Upgrading Cloudera AI Workbenches
This topic describes how to upgrade existing Cloudera AI Workbenches. Currently, only Cloudera users with both the MLAdmin role and the EnvironmentAdmin account role can create, upgrade, or remove workbenches.
Existing Cloudera AI Workbenches should be upgraded periodically. Upgrading the workbench upgrades the Cloudera AI software to the current version, and may also upgrade cluster software. If the underlying Kubernetes software must be upgraded, a warning banner is displayed, notifying you to upgrade the workbench promptly.
- During an upgrade, any running models and applications shut down, but they automatically restart after the upgrade is complete.
- To upgrade Kubernetes, use only the upgrade method provided in Cloudera AI. Do not upgrade Kubernetes directly in the cloud console or through the CLI. Follow the instructions here to upgrade Kubernetes. If an error occurs, repeat the instructions. This applies to both Microsoft Azure and AWS.
- You should back up your workbench before starting the upgrade. For more information, see Backing up Cloudera AI Workbenches.
Azure Upgrade Requirements: NFS v4.1 Migration
Pre-Upgrade Procedure
- In the Azure portal, navigate to your NetApp volumes and confirm the protocol is set to NFSv4.1.
- If your volume is currently using NFSv3, follow the Azure NetApp Files conversion instructions to migrate to v4.1. This requires registering the conversion feature using the Azure CloudShell:
  Register-AzProviderFeature -ProviderNamespace Microsoft.NetApp -FeatureName ANFProtocolTypeNFSConversion
- Manually update the projects-share PV in your cluster to include the correct mount options. Use kubectl edit pv projects-share to ensure the spec.mountOptions section matches the following:
  spec:
    mountOptions:
    - hard
    - nfsvers=4.1
    - nolock
- If the upgrade was already attempted and pods (such as hdfscli-server, s2i-client, shs, or ds-vfs) are stuck in a ContainerCreating or CrashLoopBackOff state due to mount errors, delete the pods to force a restart once the PV has been updated.
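The PV check and the stuck-pod check described above can be sketched in Python. This is a minimal illustration, not a Cloudera-provided tool: it assumes you have saved the output of kubectl get pv projects-share -o json and kubectl get pods -o json, and the function names and sample data are hypothetical. The field paths (spec.mountOptions and status.containerStatuses[].state.waiting.reason) follow the standard Kubernetes API objects.

```python
import json

# Mount options this section says the projects-share PV must carry.
REQUIRED_MOUNT_OPTIONS = ("hard", "nfsvers=4.1", "nolock")

# Waiting reasons this section names as symptoms of a mount failure.
STUCK_REASONS = {"ContainerCreating", "CrashLoopBackOff"}


def missing_mount_options(pv):
    """Return the required mount options absent from a PersistentVolume dict."""
    present = pv.get("spec", {}).get("mountOptions") or []
    return [opt for opt in REQUIRED_MOUNT_OPTIONS if opt not in present]


def stuck_pods(pod_list):
    """Return names of pods with a container waiting for one of STUCK_REASONS."""
    names = []
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses") or []
        reasons = {
            s.get("state", {}).get("waiting", {}).get("reason")
            for s in statuses
        }
        if reasons & STUCK_REASONS:
            names.append(pod["metadata"]["name"])
    return names


# Hypothetical sample, shaped like `kubectl get pv projects-share -o json`.
sample_pv = json.loads('{"spec": {"mountOptions": ["hard", "nfsvers=3"]}}')
print(missing_mount_options(sample_pv))  # ['nfsvers=4.1', 'nolock']
```

An empty list from missing_mount_options suggests the PV already carries the three options; any pod names returned by stuck_pods are candidates for deletion once the PV has been updated.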
When is an upgrade necessary?
Cloud service providers define their generally available versions of Kubernetes based on their Kubernetes version support policies. For AKS, refer to Supported Kubernetes versions in Azure Kubernetes Service (AKS). For EKS, refer to Amazon EKS Kubernetes release calendar.
Cloud service providers may have different deprecation policies for Kubernetes versions:
- For the AWS deprecation policy, refer to the FAQ section in Amazon EKS version support and FAQ.
- For Azure, refer to the Azure Kubernetes FAQ.
If any Kubernetes version used in your Cloudera AI Workbenches is deprecated by the cloud providers and Cloudera AI upgrades are enabled, the following warning banner is displayed:
ACTION REQUIRED: A new Cloudera AI version is available and it is highly recommended to upgrade to the latest version as soon as possible. To perform an upgrade, select Upgrade Workbench from the Actions menu.
To avoid unplanned service interruptions caused by automatic Kubernetes upgrades on EKS, and to continue receiving AKS support for your Cloudera AI Workbenches on Azure, make sure that your Cloudera AI Workbenches use supported Kubernetes versions. Upgrading a Cloudera AI Workbench automatically upgrades Kubernetes to a supported version. Cloudera recommends upgrading your Cloudera AI Workbenches promptly when the warning banner appears.
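As a rough way to check a cluster against a provider's supported list yourself, the following Python sketch compares a Kubernetes version string by major and minor number only. The function names are hypothetical, and the version numbers in the example are placeholders: take the actual supported list from the AKS or EKS pages referenced above.

```python
def parse_major_minor(version):
    """Turn '1.29', 'v1.29.4', or 'v1.29.4-eks-abc' into the tuple (1, 29)."""
    major, minor = version.lstrip("v").split(".")[:2]
    return int(major), int(minor)


def is_supported(cluster_version, supported_versions):
    """True if the cluster's major.minor appears in the supported list."""
    supported = {parse_major_minor(v) for v in supported_versions}
    return parse_major_minor(cluster_version) in supported


# Placeholder values for illustration only, not a real support matrix.
print(is_supported("v1.29.4", ["1.28", "1.29"]))  # True
print(is_supported("1.25", ["1.28", "1.29"]))     # False
```

A False result corresponds to the situation in which the warning banner appears and the workbench should be upgraded.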
What types of upgrades does Cloudera AI support?
In-place or side-by-side Cloudera AI upgrades
Cloudera AI automatically selects the optimal upgrade strategy (in-place or side-by-side) to ensure compatibility. If the system initiates a side-by-side upgrade, a temporary environment is created to migrate your data. This process is automated and requires no manual intervention.
- In the Cloudera console, click the Cloudera AI tile.
  The Cloudera AI Workbenches page displays.
- For a given workbench, open the Actions menu and select Upgrade Workbench.
- Click OK to confirm.
- If an upgrade fails, you can attempt to resolve the issue using the following methods:
  - Retry Upgrade (Primary): Available for both in-place and side-by-side upgrades. From the Actions menu, select Retry Upgrade Workbench. This resumes the process from the point of failure and is the recommended first step for all upgrade types.
  - Rollback Mechanism (Optional): Available only for side-by-side upgrades. If you prefer to revert to your original environment rather than retrying the upgrade, follow the instructions in the Rollback Mechanism section below.
- After the side-by-side upgrade is complete, download a fresh kubeconfig file to ensure you can correctly access and manage the upgraded cluster.
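The recovery options for a failed upgrade, as described in the steps above, can be summarized in a small Python sketch. The function name and the string labels are hypothetical and exist only to illustrate the rule: retry is always available and recommended first, while rollback applies only to side-by-side upgrades.

```python
def recovery_options(upgrade_type):
    """List recovery actions for a failed upgrade, recommended one first."""
    if upgrade_type not in ("in-place", "side-by-side"):
        raise ValueError(f"unknown upgrade type: {upgrade_type!r}")
    # Retry is offered for every upgrade type and is the first step to try.
    options = ["Retry Upgrade Workbench"]
    # Rollback exists only when a side-by-side upgrade was performed.
    if upgrade_type == "side-by-side":
        options.append("Rollback")
    return options


print(recovery_options("in-place"))      # ['Retry Upgrade Workbench']
print(recovery_options("side-by-side"))  # ['Retry Upgrade Workbench', 'Rollback']
```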
Reverting a failed upgrade: The Rollback Mechanism
For side-by-side upgrades, Cloudera AI includes a built-in Rollback mechanism. If an upgrade encounters a critical error or the workbench status changes to Upgrade Failed, you can revert the environment to its original, pre-upgrade state.
cdp ml request-workflow-rollback --workspace-crn <workspace-crn>
Upgrades through Cloudera AI backup & restore (Legacy)
If a Cloudera AI Workbench upgrade from a specific version cannot be validated because of Kubernetes version deprecations on cloud providers, or is deemed risky, in-place upgrades are disabled for that version.
In such cases, depending on the version of Cloudera AI, either the upgrade button is disabled or the in-place upgrade pre-flight check fails with the following message:
In-place upgrades from <existing_version> are not supported. Follow the documentation for the backup based upgrade steps.
Here, <existing_version> is the version number of your workbench.
In this case, use Backup/Restore to upgrade to the latest Cloudera AI version. This performs a workbench upgrade with all your previous data in place. Refer to Cloudera AI Upgrades using Backup/Restore for more information.
