Experiments - Concepts and Terminology

This topic walks you through some basic concepts and terminology related to experiments.

The term experiment refers to a non interactive batch execution script that is versioned across input parameters, project files, and output. Batch experiments are associated with a specific project (much like sessions or jobs) and have no notion of scheduling; they run at creation time. To support versioning of the project files and retain run-level artifacts and metadata, each experiment is executed in an isolated container.

Lifecycle of an Experiment

The rest of this section describes the different stages in the lifecycle of an experiment - from launch to completion.
  1. Launch Experiment

    In this step you will select a script from your project that will be run as part of the experiment, and the resources (memory/GPU) needed to run the experiment. The engine kernel will be selected by default based on your script. For detailed instructions on how to launch an experiment, see Getting Started with Cloudera Machine Learning.

  2. Build
    When you launch the experiment, Cloudera Machine Learning first builds a new versioned engine image where the experiment will be executed in isolation. This new engine includes:
    • the base engine image used by the project (check Project > Settings)
    • a snapshot of the project filesystem
    • environmental variables inherited from the project.
    • packages explicitly specified in the project's build script (cdsw-build.sh)

      It is your responsibility to provide the complete list of dependencies required for the experiment via the cdsw-build.sh file. As part of the engine's build process, Cloudera Machine Learning will run the cdsw-build.sh script and install the packages or libraries requested there on the new image.

    For details about the build process and examples on how to specify dependencies, see Engines for Experiments and Models.

  3. Schedule

    Once the engine is built the experiment is scheduled for execution like any other job or session. Once the requested CPU/GPU and memory have been allocated to the experiment, it will move on to the execution stage.

    Note that if your deployment is running low on memory and CPU, your runs may spend some time in this stage.

  4. Execute

    This is the stage where the script you have selected will be run in the newly built engine environment. This is the same output you would see if you had executed the script in a session in the Workbench console.

    You can watch the execution in progress in the individual run's Session tab.

    You can also go to the project Overview > Experiments page to see a table of all the experiments launched within that project and their current status.

    Run ID: A numeric ID that tracks all experiments launched on a Cloudera Machine Learning deployment. It is not limited to the scope of a single user or project.