Cloudera Data Science Workbench to CDP

How to migrate Cloudera Data Science Workbench (CDSW) data from CDH to CDP.

CDSW with CDH or HDP to CDSW with CDP Private Cloud Base 7.x

  1. Upgrade to the latest CDSW 1.7.x version.
  2. Follow the documented migration steps to move the CDSW artifacts.

Cloudera Data Science Workbench is supported on both CDH and CDP, so you can run your CDSW workloads on CDP without any additional data migration steps.

CDSW to CML on CDP Public Cloud – Option 1

Migrate individual projects:

  1. Enable a new CML workspace.
  2. Create new projects and migrate your code to them via Git.
  3. The standard engine images should allow you to use your code as-is. If you created custom engine images, you must rebuild them.
  4. To access data through CML, first migrate it to CDP Public Cloud using Replication Manager. Be sure to update data access in your code to use the new locations.
  5. Recreate jobs and redeploy models in the new CML workspace as needed.
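
Updating data access in step 4 usually means replacing old HDFS paths with the cloud storage locations the data was replicated to. The following is a minimal Python sketch of one way to centralize that change. The bucket name, the path prefixes, and the helper name to_new_location are hypothetical placeholders, not part of CDSW or CML; substitute the locations your own replication policies wrote to.

```python
from urllib.parse import urlparse

# Hypothetical mapping from old HDFS path prefixes to new cloud storage
# locations. Replace these with the targets of your replication policies.
PREFIX_MAP = {
    "/user/analytics": "s3a://example-cdp-bucket/data/analytics",
    "/warehouse": "s3a://example-cdp-bucket/warehouse",
}


def to_new_location(old_uri, prefix_map=PREFIX_MAP):
    """Rewrite an hdfs:// URI (or bare HDFS path) to its new location.

    Raises KeyError when no prefix matches, so unmapped paths fail loudly
    instead of silently pointing at the old cluster.
    """
    parsed = urlparse(old_uri)
    if parsed.scheme not in ("hdfs", ""):
        return old_uri  # already a non-HDFS URI; leave it untouched
    path = parsed.path
    # Try the longest prefix first so more specific mappings win.
    for prefix in sorted(prefix_map, key=len, reverse=True):
        base = prefix.rstrip("/")
        if path == base or path.startswith(base + "/"):
            return prefix_map[prefix].rstrip("/") + path[len(base):]
    raise KeyError(f"no new location configured for {old_uri}")


if __name__ == "__main__":
    print(to_new_location("hdfs://nameservice1/user/analytics/sales.parquet"))
```

Routing every read and write through a single helper like this keeps the old-to-new mapping in one place, which makes it easier to review before jobs and models are recreated in the new workspace.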

CDSW to CML on CDP Public Cloud – Option 2

Administrator-level cluster migration:

  1. Upgrade to the latest CDSW 1.7.x version.
  2. Create a CDSW backup.
  3. Create a new CML Workspace. Do not log in or create CML projects and sessions until the migration is complete.
  4. Import data from the backup for the DB, Project files, Livelog, S2I Registry, and the Git server.
  5. To access data through CML, first migrate it to CDP Public Cloud using Replication Manager. Be sure to update data access in your code to use the new locations.
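
Because this option imports project files unchanged, code that still refers to the old cluster's storage has to be located before it can be updated. The sketch below shows one possible way to list such references by scanning a project directory for hdfs:// URIs. The function name find_hdfs_references and the list of file suffixes are assumptions made for illustration; it is not a Cloudera tool, and it only catches fully qualified hdfs:// URIs, not bare paths.

```python
import re
from pathlib import Path

# Matches hdfs:// URIs up to the next quote, whitespace, or closing bracket.
HDFS_URI = re.compile(r"hdfs://[^\s'\")\]]+")


def find_hdfs_references(project_dir, suffixes=(".py", ".R", ".scala", ".sql")):
    """Return (file, line_number, uri) tuples for every hdfs:// URI found."""
    hits = []
    for path in sorted(Path(project_dir).rglob("*")):
        # Skip directories and file types that are not in the suffix list.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for match in HDFS_URI.finditer(line):
                hits.append((str(path), lineno, match.group(0)))
    return hits


if __name__ == "__main__":
    # Scan the current directory and report each reference that needs review.
    for file_name, lineno, uri in find_hdfs_references("."):
        print(f"{file_name}:{lineno}: {uri}")
```

Running a scan like this against each imported project gives a worklist of locations to review once the data has been replicated.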