Behavioral Changes

Learn about behavioral changes in Cloudera Machine Learning.

The minimum length of the environment name has changed
When creating new CML Workspaces, the minimum length of the environment name has changed from 4 to 5 characters. Workspace creation may fail with the following error: Invalid Argument: Environment name must only include lowercase letters, numbers, and hyphens and must be between 5 and 28 characters.
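
The naming rule quoted in the error message can be checked before a Workspace is created. The following is a minimal sketch, assuming the rule is exactly what the message states; the function name is hypothetical, and the service may enforce additional constraints that the message does not mention.

```python
import re

# Rule as quoted in the error message (assumption: no further constraints):
# lowercase letters, digits, and hyphens only; 5 to 28 characters long.
_ENV_NAME_PATTERN = re.compile(r"[a-z0-9-]{5,28}")


def is_valid_environment_name(name: str) -> bool:
    """Return True if `name` satisfies the rule quoted in the error message."""
    # fullmatch ensures the whole string, not just a prefix, fits the pattern.
    return _ENV_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    # Illustrative names only; none of them come from the documentation.
    for candidate in ["dev1", "dev-1", "Dev-01", "a" * 29]:
        print(candidate, is_valid_environment_name(candidate))
```

A 4-character name such as dev1 was accepted under the previous minimum and is rejected by this check.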

commitProtocolClass optimised for S3 blobstore

A CML Spark session generates a default spark-defaults.conf in /etc/spark/conf that sets the following commitProtocolClass. This class is optimised for cloud blobstores, but it does not support dynamicPartitionOverwrite mode:

spark.sql.sources.commitProtocolClass=org.apache.spark.internal.io.cloud.PathOutputCommitProtocol

  • To override this default (for example, when dynamicPartitionOverwrite mode is needed), create a file named spark-defaults.conf in the Project and add the following configurations:

    spark.sql.parquet.output.committer.class=org.apache.parquet.hadoop.ParquetOutputCommitter
    spark.sql.sources.commitProtocolClass=org.apache.spark.sql.execution.datasources.SQLHadoopMapReduceCommitProtocol
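
To confirm that a project-level file really replaces the cloud commit protocol, its contents can be inspected with a few lines of Python. This is a rough sketch rather than part of CML: the helper names are hypothetical, and the parser handles only simple key=value or whitespace-separated lines with # comments, which is a subset of what Spark accepts.

```python
from pathlib import Path
from typing import Dict

COMMIT_PROTOCOL_KEY = "spark.sql.sources.commitProtocolClass"
# Default set by CML in /etc/spark/conf/spark-defaults.conf, as described above.
CLOUD_COMMIT_PROTOCOL = "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol"


def parse_spark_conf(text: str) -> Dict[str, str]:
    """Parse simple 'key=value' or 'key value' lines; skip blanks and # comments."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if "=" in line:
            key, value = line.split("=", 1)
        else:
            parts = line.split(None, 1)
            key = parts[0]
            value = parts[1] if len(parts) > 1 else ""
        props[key.strip()] = value.strip()
    return props


def overrides_cloud_committer(project_conf_text: str) -> bool:
    """True if the text sets a commit protocol other than the cloud default."""
    value = parse_spark_conf(project_conf_text).get(COMMIT_PROTOCOL_KEY)
    return value is not None and value != CLOUD_COMMIT_PROTOCOL


if __name__ == "__main__":
    # Assumes the script runs from the Project root, next to the override file.
    conf_path = Path("spark-defaults.conf")
    if conf_path.exists():
        print(overrides_cloud_committer(conf_path.read_text()))
    else:
        print("No project-level spark-defaults.conf found")
```

The check only reads the file; it does not verify which commit protocol a running Spark session has actually loaded.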