Runtimes
Known ML Runtimes issues.
- Code completion does not work in the Workbench editor when using Runtimes. It works in the console on the right-hand side, and it works in both the editor and the console when using Legacy Engines.
- Limitations of Spark support for ML Runtimes:
  - ML Runtimes running a Python 3.8 kernel do not support running Spark.
  - ML Runtimes on CDH 7.x do not support running Spark.
Cloudera Bug: DSE-13916
- ML Runtimes do not support the command-line interface (the cdswctl command) in the current release.
Cloudera Bug: DSE-13699
- To use Spark with ML Runtimes on Cloudera Data Science Workbench, you must install py4j before using ML Runtimes for the first time. As part of the Session, run the following:
pip install py4j
- Jupyter Notebook sessions in legacy engines engine:8 through engine:13 do not exit after IDLE_MAXIMUM_MINUTES of inactivity. They run until SESSION_MAXIMUM_MINUTES (seven days by default).
Workaround: You can change the configuration of your cluster to apply the fix for this issue. Change the editor command for Jupyter Notebook in every engine that uses it to the following:
NOTEBOOK_TIMEOUT_SECONDS=$(python3 -c "print(${IDLE_MAXIMUM_MINUTES}*60)") /usr/local/bin/jupyter notebook --no-browser --ip=127.0.0.1 --port=${CDSW_APP_PORT} --NotebookApp.token= --NotebookApp.allow_remote_access=True --NotebookApp.quit_button=False --log-level=ERROR --NotebookApp.shutdown_no_activity_timeout=300 --MappingKernelManager.cull_idle_timeout=${NOTEBOOK_TIMEOUT_SECONDS} --TerminalManager.cull_inactive_timeout=${NOTEBOOK_TIMEOUT_SECONDS} --MappingKernelManager.cull_interval=60 --TerminalManager.cull_interval=60 --MappingKernelManager.cull_connected=True
This does the following:
  - Kills each running notebook after IDLE_MAXIMUM_MINUTES of inactivity.
  - Kills the CDSW/CML session in which Jupyter is running after 5 minutes with no notebooks.
Cloudera Bug: DSE-13741, DSE-6651
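To make the arithmetic in the Jupyter workaround command concrete, the following Python sketch reproduces how its timeout values are derived from IDLE_MAXIMUM_MINUTES. The helper name jupyter_cull_flags is hypothetical and is not part of CDSW or Jupyter; the flag names and constants are copied from the command above, and the value 60 is only an example, not a documented default.

```python
def jupyter_cull_flags(idle_maximum_minutes):
    """Return the timeout-related flags used by the workaround command.

    Hypothetical helper for illustration only; not part of CDSW or Jupyter.
    """
    # Same conversion as: python3 -c "print(${IDLE_MAXIMUM_MINUTES}*60)"
    timeout_seconds = int(idle_maximum_minutes) * 60
    return [
        # Constant from the command: 300 seconds (5 minutes).
        "--NotebookApp.shutdown_no_activity_timeout=300",
        # Idle kernels and inactive terminals share the same derived timeout.
        f"--MappingKernelManager.cull_idle_timeout={timeout_seconds}",
        f"--TerminalManager.cull_inactive_timeout={timeout_seconds}",
        # Constants from the command: check for idle kernels/terminals every 60 seconds.
        "--MappingKernelManager.cull_interval=60",
        "--TerminalManager.cull_interval=60",
        "--MappingKernelManager.cull_connected=True",
    ]


if __name__ == "__main__":
    # Example value only; substitute the IDLE_MAXIMUM_MINUTES of your deployment.
    for flag in jupyter_cull_flags(60):
        print(flag)
```

With an example value of 60 minutes, both cull timeouts come out as 3600 seconds, while the 300-second shutdown timeout and the 60-second cull intervals stay fixed regardless of IDLE_MAXIMUM_MINUTES.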