Impact of the migration on workload

Learn how migrating to Cloudera AI can affect your existing workload running on CDSW.

The following aspects can affect your workload:

Spark on Kubernetes and Spark Pushdown

By default, Cloudera AI runs Spark on Kubernetes, whereas CDSW runs Spark workloads on YARN. Project owners or users can enable the Spark Pushdown feature at the project level in Cloudera AI, which allows Spark workloads to run on YARN in Cloudera Base on premises. However, enabling this feature is optional and cannot be enforced.

If Spark Pushdown is not enabled, users might unintentionally run their Spark workloads on Kubernetes and, if no quotas are set, consume most of the cluster resources.
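
One way to notice an unintended Spark on Kubernetes run is to inspect the master URL of the active Spark session. The following is a minimal sketch, not an official Cloudera utility. The helper name spark_backend is hypothetical, and the sketch assumes that a Spark Pushdown session reports the standard Spark master value "yarn", while a Kubernetes session reports a URL starting with "k8s://".

```python
def spark_backend(master_url):
    """Classify a Spark master URL as 'kubernetes', 'yarn', or 'other'.

    Hypothetical helper: relies only on the standard Spark master URL
    formats ("k8s://<host>:<port>" for Kubernetes, "yarn" for YARN).
    """
    if master_url.startswith("k8s://"):
        return "kubernetes"
    if master_url == "yarn":
        return "yarn"
    return "other"


# In a session where PySpark is available, the master URL can be read
# from the active SparkContext, for example:
#   backend = spark_backend(spark.sparkContext.master)
```

If the result is "kubernetes" for a project that is expected to run on YARN, Spark Pushdown is probably not enabled for that project.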

End of Life (EOL) Python versions

Python 2.x, 3.6, and 3.7 have reached EOL. The ML Runtimes for these Python versions are no longer supported, but they still work and can be used in Cloudera AI. However, no security fixes are available for vulnerabilities in those Python versions.
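
To find out whether a session runs on one of the EOL versions listed above, the interpreter version can be checked from inside the session. The following is a minimal sketch. The function name is_eol_python is hypothetical, and the EOL list is hardcoded from this section rather than read from any Cloudera API.

```python
import sys

# EOL versions named in this section: every Python 2.x release, plus 3.6 and 3.7.
EOL_PY3_MINORS = {6, 7}


def is_eol_python(major, minor):
    """Return True if the given version is listed as EOL in this section."""
    if major == 2:
        return True
    return major == 3 and minor in EOL_PY3_MINORS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    if is_eol_python(major, minor):
        status = "EOL, no security fixes available"
    else:
        status = "not in the EOL list of this section"
    # %-formatting keeps the script runnable on Python 2.x as well.
    print("Python %d.%d: %s" % (major, minor, status))
```

Running the script in each project before migration gives a quick list of projects that still depend on an EOL Python version.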

Validation and Testing

Validate and test your projects before formally migrating them to Cloudera AI. Pay special attention to the following issues:
  • Projects that switch to different images after migration, for example, from legacy engines to ML Runtimes
  • Projects that use Spark Pushdown, which requires additional settings in the spark-defaults.conf file

Legacy engines are deprecated

Although legacy engines can still be used, Cloudera recommends migrating to ML Runtimes as soon as possible.