Debugging Issues with Experiments
This topic lists some common issues to watch out for during an experiment's build and execution process.
- Experiment spends too long in the Scheduling/Built stage
- If your experiments are spending too long in any particular stage, check the resource
consumption statistics for the cluster. When the cluster starts to run out of resources,
experiments (and other workloads such as jobs and models) will often wait in the queue
longer before they can be executed.
Resource consumption by experiments (and jobs, sessions) can be tracked by site administrators on the Admin > Activity page.
- Experiment fails in the Build stage
- During the build stage, Cloudera Data Science Workbench creates a new Docker image for
the experiment. You can track progress for this stage on each experiment's
Build page. The build logs on this page should help point you in
the right direction.
Common issues that might cause failures at this stage include the following (a sample build script follows this list):
- Lack of execute permissions on the build script itself.
- Inability to reach the Python package index or R mirror when installing packages.
- Typo in the name of the build script (cdsw-build.sh). Note that the build process will only run a script called cdsw-build.sh, not any other bash scripts from your project.
- Using pip3 to install packages in cdsw-build.sh, but selecting a Python 2 kernel when you actually launch the experiment, or vice versa.
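If you suspect one of these issues, it can help to compare your project against a known-good build script. Below is a minimal sketch of a cdsw-build.sh, assuming a Python 3 kernel and a requirements.txt file at the project root (both are assumptions; adjust to your project):

```bash
#!/bin/bash
# cdsw-build.sh -- the build process runs a script with exactly this
# name and ignores any other bash scripts in the project.
set -euxo pipefail

# Match the installer to the kernel you will launch the experiment
# with: pip3 for a Python 3 kernel, pip for a Python 2 kernel.
pip3 install --upgrade pip
pip3 install -r requirements.txt  # assumes requirements.txt exists
```

Before committing, make sure the script is executable (for example, chmod +x cdsw-build.sh); a missing execute bit is the first failure mode in the list above.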
- Experiment fails in the Execute stage
- Each experiment includes a Session page where you can track the output of the experiment as it executes. This is similar to the output you would see if you test the experiment in the workbench console. Any runtime errors will display on the Session page just as they would in an interactive session.
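Because execute-stage errors surface exactly as they would interactively, one way to catch them early is to run the entry script from a terminal in an interactive session before scheduling it as an experiment. A minimal sketch, using a hypothetical Python 3 entry script named analyze.py:

```bash
# Run the entry script manually, passing the same arguments you would
# give the experiment; any runtime error raised here is the same one
# the experiment's Session page would show during the Execute stage.
python3 analyze.py arg1 arg2 2>&1 | tee last-run.log
```

Here arg1, arg2, and last-run.log are placeholders; substitute your own arguments and log path.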