Backing up ML workspaces

Cloudera Machine Learning makes it easy to create machine learning projects, jobs, experiments, ML models, and applications in workspaces. The data and metadata of these artifacts are stored in different types of storage systems in private clouds or in external NFS-backed workspaces outside of a private cloud.

You can backup an ML workspace, and restore it later. The backup preserves all files, models, applications and other assets in the workspace (files are note backed up by CML automatically for external NFS-based workspaces). All workspace backups can be viewed in the Workspace Backup Catalog UI.

The Backup and Restore feature gives you the ability to backup all of your data (except files in external NFS-backed workspaces) to protect your machine learning artifacts against disasters. If your Cloudera Machine Learning workspace is backed up, this feature lets you restore the saved data so that you can recover your ML artifacts as they were saved in the desired backup. The Backup and Restore feature gives the administrator the ability to take “on-demand” backups of CML workspaces. Core services running in the workspace are shut down during the backup process to ensure consistency in the backup data. It is recommended that backups are taken during off-peak hours to minimize user impacts.

The time required to complete backing up a workspace depends on the amount of data to copy. The backup process copies data from both block volumes and internal NFS. In general, the time taken to backup NFS is the dominant factor. You should regularly back up CML workspaces.

The time to backup NFS is highly dependent on the amount of data, and on the nature and number of files. Based on the amount of data, you can set a timeout value while taking backup. You can view the status of ongoing/old backups on CML workspace UI and backup catalog UI.

There is currently no restriction on the number of backups one can take, and the backup snapshots are retained indefinitely in the underlying private cloud cluster as long as the original workspace (from which this backup was taken from) is not deleted.. CML workspace backup details are stored in the Workspace Backup Catalog UI in the CML control plane, and these entries may be listed, viewed, deleted or restored as desired.

Restoring a backup overwrites the existing CML workspace (from which this backup was taken from) wherein the restored data is automatically imported. All the projects, jobs, applications, etc., that were in existence during the backup are automatically available in the new workspace. Restoring a CML backup overwrites the existing workspace with a new one and then launches restore jobs to create storage volumes from the backup snapshots. The restore process takes longer than a regular workspace provisioning operation due to the extra work in copying data from backup to the new storage volumes. Restores are always full-copy restores. The time to restore is dominated by NFS restoration, which takes at least as long as the time to backup the file system. The restored workspace is always created with the latest CML software version, which may be different from the CML version of the original workspace that was backed up.