Known issues and limitations with the inbuilt CDSW migration tool

Review the known issues and limitations for the inbuilt CDSW migration tool.

Unsupported CDSW features

  • CDSW supports the Host Mount feature; however, Cloudera AI does not support it.

    Workaround:

    Use custom Runtime addons as an alternative.

  • Migration is supported only from a workbench configured with LDAP authentication.
  • The migration tool does not copy custom configurations from the CDSW environment, such as Host, Kubernetes, and DNS settings. Identify these configuration details during migration planning, document them before the migration, and reapply them manually after the migration.

  • Sessions created with a custom engine in CDSW do not work in the migrated Cloudera AI workbench because the engine architecture differs between CDSW and Cloudera AI. Move from the custom engine to a custom ML Runtime before the migration.
  • If the Allow containers to run as root option is selected in CDSW > Administration > Security, and this workbench is migrated to Cloudera AI, the selected option will cause Spark job errors.

    This setting has been removed from Cloudera AI. If the value is still set in the migrated workbench, it can lead to undefined behavior.

    Do not copy this setting when migrating to Cloudera AI.

    Workaround:

    You can manually configure the Cloudera AI master node and restart the relevant pods:

    1. In the Cloudera Management Console, open the Kubernetes dashboard and start a shell in the db container of the db-0 pod of the migrated workbench.
    2. Start the PostgreSQL interactive terminal (psql).
    3. Connect to the sense database:
      \c sense
    4. Disable the setting. In the following example, sense=# is the prompt and UPDATE 1 is the expected output:
      sense=# update site_config set pod_eval_allow_root=false;
      UPDATE 1
    5. Delete the two evaluator pods so that the pod-evaluator component is restarted.
    6. Restart the web pods.
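
If you prefer a command line over the Kubernetes dashboard, steps 1 to 4 of the workaround could be approximated by a single kubectl invocation. The following is a minimal sketch, not a documented procedure: the pod, container, database, and SQL statement come from the steps above, while the namespace value, the build_psql_command helper, and the assumption that psql in the db container connects without extra credentials are all hypothetical. The script only prints the command for review and does not execute it.

```python
import shlex
from typing import List

# Names taken from the workaround steps above.
POD = "db-0"
CONTAINER = "db"
DATABASE = "sense"
SQL = "update site_config set pod_eval_allow_root=false;"


def build_psql_command(namespace: str) -> List[str]:
    """Build a kubectl command that runs the UPDATE inside the db container.

    `namespace` is a placeholder for the Kubernetes namespace of the migrated
    workbench, which the steps above do not name (assumption).
    """
    if not namespace:
        raise ValueError("namespace must not be empty")
    return [
        "kubectl", "exec", POD,
        "-n", namespace,
        "-c", CONTAINER,
        "--",
        # Assumption: psql is on PATH in the container and needs no credentials.
        "psql", "-d", DATABASE, "-c", SQL,
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(build_psql_command("my-workbench-namespace")))
```

Steps 5 and 6 are left out on purpose, because the evaluator and web pod names depend on your environment.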

Migration from CDSW to Cloudera AI using SAML

Migration from CDSW to Cloudera AI using SAML is supported with the following limitations:

  • The NameID at the IdP level must be transformed to ensure it does not include special characters.

    Cloudera AI restricts the characters that a user identifier can contain. Meeting this restriction typically involves applying a transformation rule within your IdP that removes characters such as @, _, or ., or encodes them in a format that Cloudera AI can process correctly. For instance, a user's email address, such as john.doe@example.com, could be transformed into johndoeexamplecom or a similar format.

  • To configure group membership, you must map the attribute containing the user's groups to the specific object identifier (OID) urn:oid:1.3.6.1.4.1.5923.1.5.1.1, which is treated as the definitive attribute for group membership by both Cloudera AI and Cloudera.

    If you previously used a different OID, such as urn:oid:2.5.4.11 (intended for organizational unit name), to map groups in Cloudera Data Science Workbench (CDSW), you must update this mapping in your IdP's configuration to the new OID.
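
Before changing the IdP rules, it can help to dry-run both limitations against an export of your users and attribute mappings. The following is a minimal sketch under stated assumptions: it strips only the three characters named above (the full set that Cloudera AI rejects is not listed here), and the function names are hypothetical helpers, not part of any Cloudera or IdP API.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# OID required for group membership, as stated above.
GROUP_OID = "urn:oid:1.3.6.1.4.1.5923.1.5.1.1"


def transform_name_id(name_id: str, strip_chars: str = "@_.") -> str:
    """Remove the listed characters from a NameID (assumed rule: plain removal)."""
    return "".join(ch for ch in name_id if ch not in strip_chars)


def find_collisions(name_ids: Iterable[str]) -> Dict[str, List[str]]:
    """Return transformed IDs that more than one original NameID maps to."""
    seen: Dict[str, List[str]] = defaultdict(list)
    for name_id in name_ids:
        seen[transform_name_id(name_id)].append(name_id)
    return {key: values for key, values in seen.items() if len(values) > 1}


def group_attribute_ok(attribute_name: str) -> bool:
    """Check that the group attribute is mapped to the required OID."""
    return attribute_name == GROUP_OID


if __name__ == "__main__":
    print(transform_name_id("john.doe@example.com"))
    print(find_collisions(["john.doe@example.com", "johndoe@example.com"]))
    print(group_attribute_ok("urn:oid:2.5.4.11"))
```

The collision check matters because removing characters is lossy: john.doe@example.com and johndoe@example.com both become johndoeexamplecom, so two distinct users would share one identifier after the transformation.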

Migration strategy

The migration tool expects a running CDSW instance and an on premises cluster running side by side.

On premises platform

  • CDSW to Cloudera AI migration is supported only on Cloudera Embedded Container Service clusters.

  • CDSW to Cloudera AI migration is not supported on Cloudera Embedded Container Service clusters installed with the internal registry alias option.
  • Migration to Cloudera AI on premises in an OpenShift Container Platform (OCP) environment is not supported.

CDSW version

CDSW migration is supported only from CDSW version 1.10.0 or higher.
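
If you script a pre-migration check, compare the version numerically rather than as a string. The following is a minimal sketch, assuming the CDSW version is available as a dot-separated numeric string such as 1.10.0; the function names are hypothetical and not part of the migration tool.

```python
from typing import Tuple

# Minimum CDSW version supported by the migration, as stated above.
MIN_CDSW_VERSION = (1, 10, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Split a dot-separated version string into integers.

    Raises ValueError for non-numeric components (assumed not to occur).
    """
    return tuple(int(part) for part in version.strip().split("."))


def is_migratable(version: str) -> bool:
    """Return True if the CDSW version is 1.10.0 or higher."""
    parsed = parse_version(version)
    # Pad short versions such as "1.10" with zeros before comparing.
    padded = parsed + (0,) * (3 - len(parsed))
    return padded >= MIN_CDSW_VERSION


if __name__ == "__main__":
    for candidate in ("1.9.2", "1.10", "1.10.0", "1.10.4"):
        print(candidate, is_migratable(candidate))
```

Comparing integer tuples avoids the string-comparison trap in which "1.9.2" sorts after "1.10.0" because the character 9 is greater than 1.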

Engine support

Legacy engines are deprecated as of Cloudera Machine Learning on premises 1.5.1. Convert all your workloads to use ML Runtimes before the migration.

Custom configurations

Custom configurations, such as host or Kubernetes configurations, are not migrated. Record these configurations before the migration and apply them to your on premises cluster manually after the migration.

CDSW Projects

  • CDSW projects that access HBase might not work after migration.
  • CDSW projects that use engines with Spark might not work as expected after migration.

Folder path of migration

The CDSW to Cloudera AI migration tool host-mounts the root folder from the Cloudera Embedded Container Service hosts and expects the docker binary at /opt/cloudera/parcels/ECS/docker/docker. If the docker binary is in a different location in your environment, copy or symlink it to the expected path to unblock the migration.
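
The path check and the symlink can be scripted for each Cloudera Embedded Container Service host. The following is a minimal sketch: the expected path comes from the paragraph above, while the ensure_docker_link helper, the example source path, and the choice to create missing parent directories are assumptions for illustration. On a real host, writing under /opt typically requires root privileges.

```python
from pathlib import Path

# Path where the migration tool expects the docker binary, as stated above.
EXPECTED_DOCKER = Path("/opt/cloudera/parcels/ECS/docker/docker")


def ensure_docker_link(actual: Path, expected: Path = EXPECTED_DOCKER) -> bool:
    """Symlink `expected` to `actual` if `expected` is missing.

    Returns True if a symlink was created and False if `expected` already
    resolves to an existing file.
    """
    if expected.exists():
        return False
    if expected.is_symlink():
        # A symlink exists but its target is gone; do not overwrite it blindly.
        raise FileExistsError(f"{expected} is a dangling symlink; fix it manually")
    if not actual.is_file():
        raise FileNotFoundError(f"docker binary not found at {actual}")
    # Assumption: creating missing parent directories is acceptable.
    expected.parent.mkdir(parents=True, exist_ok=True)
    expected.symlink_to(actual)
    return True


if __name__ == "__main__":
    # Hypothetical location of a customized docker binary.
    custom = Path("/usr/local/bin/docker")
    try:
        print("symlink created:", ensure_docker_link(custom))
    except OSError as err:
        print("no change made:", err)
```

A symlink is used instead of a copy so that later updates to the actual binary are picked up automatically; copy the file instead if your environment does not allow symlinks in the parcel directory.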