Known Issues and Limitations in ML Runtimes version 2024.10.01

You might run into some known issues while using ML Runtimes 2024.10.01.

DSE-25143 Assembling plots in PBJ R Runtimes does not work

When plotting additional content on an already existing plot, PBJ R Runtimes throw an error. Plots can only be created with a single call to the plot function; they cannot be assembled incrementally.
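
As an illustrative sketch (the built-in cars dataset and the lines call are only one example of adding content to an existing plot):

# Creating a plot in a single call works:
plot(cars$speed, cars$dist, main = "Stopping distance vs. speed")

# Adding content to the already existing plot can fail in PBJ R Runtimes:
lines(lowess(cars$speed, cars$dist), col = "red")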

DSE-32839 Extra configuration needed when using Spark in a PBJ Workbench-based R Runtime

When using Spark in R workloads that are running on PBJ Workbench Runtimes, the environment variable R_LIBS_USER must be passed to the Spark executors with the value set to "/home/cdsw/.local/lib/R/<R_VERSION>/library".

For example, when using sparklyr with a PBJ Workbench R 4.3 Runtime, the correct way to set up a sparklyr connection is:
library(sparklyr)
config <- spark_config()
config$spark.executorEnv.R_LIBS_USER <- "/home/cdsw/.local/lib/R/4.3/library"
sc <- spark_connect(config = config)

Packages can fail to load in a session

When installing R or Python packages in a session, the kernel might not be able to load the package in that same session if a previous version of the package, or of its newly installed dependencies, has already been loaded there. Such issues are observed more often in PBJ R Runtimes, which automatically load basic R packages such as vctrs, lifecycle, rlang, and cli at session startup.

Workaround: Start a new session, then load and use the newly installed package there.
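
As a sketch (the package name is only an example), you can check whether a package is already loaded in the current session before reinstalling it:

# rlang is one of the packages PBJ R Runtimes load automatically at startup
"rlang" %in% loadedNamespaces()
#> [1] TRUE

# Installing a newer rlang in this session succeeds, but loading it here may
# fail or keep resolving to the already loaded version; start a new session
# and call library(rlang) there instead.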

The jupyter-client Python package version must be less than 8 for PBJ Runtimes

Upgrading the jupyter-client Python package to a version greater than 7.4.9 can temporarily break a Project. Workloads using PBJ Runtimes will not be able to start in that Project if the installed jupyter-client version is greater than 7.4.9.

Workaround: Launch a session with the same Python version on a non-PBJ Runtime (Workbench or JupyterLab). Open a Terminal window and uninstall the jupyter-client package from the Project by executing pip3 uninstall jupyter-client. Verify the change by running pip3 list and checking that the remaining jupyter-client version is less than 8.
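
For example, from the Terminal of the non-PBJ session (the grep filter is just a convenience):

# Uninstall the project-local jupyter-client upgrade
pip3 uninstall jupyter-client

# Confirm the remaining jupyter-client version is below 8
pip3 list | grep -i jupyter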

Adding new ML Runtimes when using a custom root certificate might generate error messages

When trying to add new ML Runtimes while a custom root certificate is in use, error messages such as "Could not fetch the image metadata" or "certificate signed by unknown authority" might appear in various places. This is caused by the runtime-puller pods not having access to the custom root certificate.

Workaround:

  1. Create a directory at any location on the master node:

    For example:

    mkdir -p /certs/

  2. Copy the full server certificate chain into this folder. It is usually easier to create a single file with all of your certificates (server, intermediate(s), root):
    # copy all certificates into a single file: 
    cat server-cert.pem intermediate.pem root.pem > /certs/cert-chain.crt
  3. (Optional) If you are using a custom docker registry that has its own certificate, you need to copy this certificate chain into this same file:
    cat docker-registry-cert.pem >> /certs/cert-chain.crt
  4. Copy the global CA certificates into this new file:
    cat /etc/ssl/certs/ca-bundle.crt >> /certs/cert-chain.crt
  5. Edit the runtime-manager deployment and add the new mount.

    Do not delete any existing objects.

    kubectl edit deployment runtime-manager

  6. Under volumeMounts, add the following lines.

    Note that the text is white-space sensitive: use spaces, not tabs.

    - mountPath: /etc/ssl/certs/ca-certificates.crt
      name: mycert
      subPath: cert-chain.crt # this must match the file name created in step 2

    Under volumes, add the following text in the same edit:

    - hostPath:
        path: /certs/ # this must match the directory created in step 1
        type: ""
      name: mycert
  7. Save your changes:

    :wq!

    Once saved, you will see the message "deployment.apps/runtime-manager edited" and the runtime-manager pod will be restarted with your new changes. A verification sketch follows after this list.

  8. To persist these changes across cluster restarts, use the following Knowledge Base article to create a Kubernetes patch file for the runtime-manager deployment: https://community.cloudera.com/t5/Customer/Patching-CDSW-Kubernetes-deployments/ta-p/90241
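
As a sketch, after completing the steps above you can confirm that the deployment rolled out successfully (assuming kubectl access on the cluster; these commands are illustrative, not part of the official procedure):

# Wait for the runtime-manager deployment to roll out the edited pod spec
kubectl rollout status deployment/runtime-manager

# Confirm that a new runtime-manager pod is running
kubectl get pods | grep runtime-manager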

Cloudera Bug: DSE-20530

Spark Runtime Add-on required for Spark 2 integration with Scala Runtimes

Scala Runtimes on Cloudera AI require the Spark Runtime Add-on to enable Spark 2 integration. Spark 3 is not supported with Scala Runtimes.