Migration requirements - additional machine required

Project migration requires a third machine, such as a user laptop or a bastion host, with connectivity to both CDSW and Cloudera AI.

Consider the following:

  • The project is first downloaded from CDSW to the intermediate machine using the export command. It is then uploaded to the Cloudera AI Workbench using the import command. The utility uses the cdswctl client to log in and to create an SSH session and tunnel.
  • The utility uses the rsync command-line utility to migrate project files through the SSH tunnel.
  • Project artifacts, such as models, jobs, and applications, are migrated using APIs.
  • Authentication is carried out using the API key provided during migration, and only authorized users are allowed to migrate projects. Data in transit remains encrypted as long as the workbench uses HTTPS connections.

The additional machine must have the following configurations:

  • Unix-like system (macOS or Linux)
  • Connectivity to both the source and the target systems
  • Rsync must be installed
  • Python version must be 3.10 or higher
  • Sufficient disk space is required to hold the project content. For stronger security, the disk, the file system, or both must be encrypted.
  • Custom CA certificates, if required
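
Some of the requirements above can be verified before starting a migration. The following is a minimal sketch of such a preflight check and is not part of the migration utility. The function name check_machine and its parameters path and required_bytes are hypothetical, chosen only for illustration. It checks the operating system, the presence of rsync on the PATH, the Python version, and the free disk space. It does not check connectivity to CDSW or Cloudera AI, disk encryption, or CA certificates, because those depend on your environment.

```python
import platform
import shutil
import sys


def check_machine(path=".", required_bytes=0):
    """Return a dict mapping each checked requirement to True or False.

    path           -- directory that will hold the exported project (assumption)
    required_bytes -- estimated size of the project content (assumption)
    """
    return {
        # macOS reports "Darwin"; Linux reports "Linux".
        "unix_like_os": platform.system() in ("Darwin", "Linux"),
        # shutil.which returns None when the executable is not on the PATH.
        "rsync_installed": shutil.which("rsync") is not None,
        "python_3_10_or_higher": sys.version_info >= (3, 10),
        # Free space on the file system that contains `path`.
        "enough_disk_space": shutil.disk_usage(path).free >= required_bytes,
    }


if __name__ == "__main__":
    for name, ok in check_machine().items():
        print(f"{'OK  ' if ok else 'FAIL'} {name}")
```

To account for project size, pass an estimate in bytes, for example check_machine(path="/tmp", required_bytes=5 * 1024**3) for roughly 5 GiB.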