Troubleshooting Issues with Workloads
This section describes some potential issues data scientists might encounter once the ML workspace is running workloads.
Engines cannot be scheduled due to lack of CPU or memory
A symptom of this is the following error message in the Workbench: "Unschedulable: No node in the cluster currently has enough CPU or memory to run the engine."
To resolve this, either stop some running sessions or jobs, or provision more hosts for Cloudera Machine Learning.
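The error message refers to individual nodes, not to the cluster as a whole: an engine has to fit on a single node, so spare capacity spread across several hosts does not help. The following is a minimal sketch of that check, not the scheduler's actual implementation. The `Node` class, the `schedulable_nodes` function, and the capacity figures are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of the free capacity on one cluster host."""
    name: str
    free_cpu: float         # free vCPUs
    free_memory_gib: float  # free memory in GiB


def schedulable_nodes(nodes, cpu, memory_gib):
    """Return the names of nodes that can hold the whole engine on their own."""
    return [
        n.name
        for n in nodes
        if n.free_cpu >= cpu and n.free_memory_gib >= memory_gib
    ]


# Illustrative numbers only: 4 vCPUs and 7.5 GiB are free in total,
# but no single node has both 2 vCPUs and 4 GiB available.
nodes = [Node("node-a", 1.0, 6.0), Node("node-b", 3.0, 1.5)]

print(schedulable_nodes(nodes, cpu=2.0, memory_gib=4.0))  # []
print(schedulable_nodes(nodes, cpu=1.0, memory_gib=4.0))  # ['node-a']
```

In this sketch the first request is unschedulable even though the cluster has enough free resources in aggregate, which is why requesting a smaller engine profile can succeed where a larger one fails.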
Workbench prompt flashes red and does not take input
The Workbench prompt flashing red indicates that the session is not currently ready to take input.
Cloudera Machine Learning does not currently support non-REPL interaction. One workaround is to skip the prompt using appropriate command-line arguments. Otherwise, consider using the terminal to answer interactive prompts.
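The command-line workaround can be sketched as follows. This example assumes a tool that accepts a confirmation flag; the `--yes` flag and the small stand-in program in `CHILD_SOURCE` are hypothetical, so check the documentation of the actual command for its equivalent option. Closing standard input is an extra precaution, so that a prompt that was not skipped fails immediately instead of waiting for input that never arrives.

```python
import subprocess
import sys

# Hypothetical stand-in for a tool that asks for confirmation
# unless it is started with --yes.
CHILD_SOURCE = (
    "import sys\n"
    "if '--yes' not in sys.argv[1:]:\n"
    "    input('Proceed? [y/N] ')\n"
    "print('done')\n"
)


def run_tool(extra_args):
    """Run the stand-in tool with stdin closed so an unanswered prompt fails fast."""
    return subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE, *extra_args],
        stdin=subprocess.DEVNULL,  # nothing can be typed, so a prompt hits end-of-file
        capture_output=True,
        text=True,
    )


with_flag = run_tool(["--yes"])  # prompt skipped, tool completes
without_flag = run_tool([])      # prompt reached, tool exits with an error

print(with_flag.returncode, with_flag.stdout.strip())
print(without_flag.returncode)
```

When a command offers no such flag, the terminal remains the place to answer its prompts.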
PySpark jobs fail due to Python version mismatch
A symptom of this is the following error message: "Exception: Python in worker has different version 2.6 than that in driver 2.7, PySpark cannot run with different minor versions"
One solution is to install the matching Python 2.7 version on all cluster hosts. The recommended solution, however, is to install the Anaconda parcel on all CDH cluster hosts. Cloudera Data Science Workbench Python engines will then use the version of Python included in the Anaconda parcel, which ensures that the Python versions on the driver and workers always match. Any library paths in workloads sent from drivers to workers will also match, because Anaconda is present in the same location across all hosts. Once the parcel has been installed, set the corresponding environment variable in the Cloudera Machine Learning Admin dashboard.
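The error message above compares only the major and minor parts of the two Python versions, so 2.7.5 and 2.7.12 are compatible while 2.6 and 2.7 are not. The sketch below mirrors that comparison and prints the interpreter used by the current session, which can help when checking what the driver is running. The helper names `major_minor` and `check_versions` are hypothetical and are not part of PySpark.

```python
import sys


def major_minor(version):
    """Reduce a version string such as '2.7.5' to its 'major.minor' part."""
    return ".".join(version.split(".")[:2])


def check_versions(driver_version, worker_version):
    """Raise if driver and worker differ in their major.minor Python version."""
    driver = major_minor(driver_version)
    worker = major_minor(worker_version)
    if driver != worker:
        raise RuntimeError(
            f"Python in worker has different version {worker} "
            f"than that in driver {driver}"
        )


# Interpreter and version of the current (driver-side) session.
driver_version = "%d.%d.%d" % sys.version_info[:3]
print("driver interpreter:", sys.executable)
print("driver version:", driver_version)

# Patch releases may differ; only major.minor has to match.
check_versions("2.7.5", "2.7.12")
```

Calling `check_versions("2.7.5", "2.6.6")` raises an error analogous to the one shown above.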