Creating a Project with ML Runtimes variants

Projects create an independent working environment to hold your code, configuration, and libraries for your analysis. This topic describes how to create a project with ML Runtimes variants in Cloudera Machine Learning.

  1. Go to Cloudera Machine Learning and on the left sidebar, click Projects.
  2. Click New Project.
  3. If you are a member of a team, from the drop-down menu, select the Account under which you want to create this project. If there is only one account on the deployment, you will not see this option.
  4. Enter a Project Name.
  5. Select Project Visibility from one of the following options.
    • Private - Only project collaborators can view or edit the project.
    • Team - If the project is created under a team account, all members of the team can view the project. Only explicitly-added collaborators can edit the project.
    • Public - All authenticated users of Cloudera Machine Learning will be able to view the project. Collaborators will be able to edit the project.
  6. Under Initial Setup, you can either create a blank project, or select one of the following sources for your project files.
    • Blank - The project will contain no information from a template, local file, or Git.

    • Templates - Template projects contain example code that can help you get started with Cloudera Machine Learning. They are available in R, Python, PySpark, and Scala. Using a template project is not required, but it helps you start using Cloudera Machine Learning right away.
    • Local - If you have an existing project on your local disk, use this option to upload compressed files or folders to Cloudera Machine Learning.

    • Git - If you already use Git for version control and collaboration, you can continue to do so with Cloudera Machine Learning. Specifying a Git URL will clone the project into Cloudera Machine Learning. To use a password-protected Git repository, see Creating a project from a password-protected Git repo.

  7. If you would like to configure which Runtimes are available for this particular project, complete the following:
    • Scroll down to the Runtimes section and enter the appropriate information:
      • Note that the following Runtimes are configured by default:

        Runtime 1.

        • Kernel: The highest version of the Python kernel
        • Editor: PBJ Workbench and JupyterLab
        • Edition: Standard & NVIDIA GPU
        • Version: The latest Runtime version

        Runtime 2.

        • Kernel: The highest version of the R kernel
        • Editor: PBJ Workbench
        • Edition: Standard
        • Version: The latest Runtime version
      • Use the Advanced view to add ML Runtimes based on more detailed Editor, Kernel, Edition, and Version criteria.
      • Runtimes with the Enabled status and the highest maintenance version are configured as the default settings for project creation, as follows:
        • On a new installation, the defaults are calculated from the variants recommended by Cloudera.
        • In all other cases, the defaults are the variants that are set as default in the Runtime Catalog.
  8. Click Create Project. After the project is created, you can see your project files and the list of jobs defined in your project.
    Note that, as part of the project filesystem, Cloudera Machine Learning also creates the following .gitignore file:
    R
    node_modules
    *.pyc
    .*
    !.gitignore
  9. Set or verify the ML Runtimes settings for the project.

    Within the selected project, you can modify the default engine configuration:

    1. In the left navigation bar, click Project Settings.
    2. Select the Runtime/Engine tab.
    3. Next to Default Engine, select ML Runtimes.
    4. Click Save Engine.

Once the project is switched to ML Runtimes, all modules (Sessions, Jobs, Models, and so on) for the project are configured using ML Runtimes instead of Legacy Engines. Setting Engine parameters for the project is no longer possible.

Existing and running instances (for example, Jobs, Models or Applications) previously configured with a particular Engine configuration will keep their configuration until you change the related settings. If you want to change the engine configuration for existing and running instances, you will need to update those based on the new, Runtime-based settings.

To create a project with ML Runtimes through the API, follow this example:
import cmlapi

# Build the request body; all fields are passed as keyword arguments.
project_body = cmlapi.CreateProjectRequest(
      name = "project_name",
      description = "project_description",
      default_project_engine_type = "ml_runtime",
      visibility = "public", # or "private" or "organization"
      template = "Python")

You also need to specify a runtime_identifier if this is used with an ml_runtime project. Obtain a list of the available runtimes with the following command:

client.list_runtimes()
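
Once the list is available, choosing a runtime_identifier amounts to filtering by kernel, editor, and edition and taking the newest version. The following is a minimal sketch of that selection on plain dictionaries, so that it runs without a cluster. The helper name pick_runtime_identifier, the field names (image_identifier, kernel, editor, edition, full_version), and the sample entries are assumptions made for illustration; check them against what client.list_runtimes() actually returns on your deployment.

```python
import re


def version_key(full_version):
    """Turn a version string such as '2024.02.1' into a tuple of ints for comparison."""
    return tuple(int(part) for part in re.findall(r"\d+", full_version))


def pick_runtime_identifier(runtimes, kernel, editor, edition):
    """Return the identifier of the newest runtime matching the criteria.

    `runtimes` is a list of dicts with assumed keys (see the sample below).
    Returns None when no runtime matches.
    """
    matches = [
        r for r in runtimes
        if r["kernel"] == kernel and r["editor"] == editor and r["edition"] == edition
    ]
    if not matches:
        return None
    newest = max(matches, key=lambda r: version_key(r["full_version"]))
    return newest["image_identifier"]


# Made-up sample entries for illustration only; not real Runtime Catalog data.
sample_runtimes = [
    {"image_identifier": "registry.example.com/python3.10-standard:2023.08.2",
     "kernel": "Python 3.10", "editor": "PBJ Workbench", "edition": "Standard",
     "full_version": "2023.08.2"},
    {"image_identifier": "registry.example.com/python3.10-standard:2024.02.1",
     "kernel": "Python 3.10", "editor": "PBJ Workbench", "edition": "Standard",
     "full_version": "2024.02.1"},
    {"image_identifier": "registry.example.com/r4.3-standard:2024.02.1",
     "kernel": "R 4.3", "editor": "PBJ Workbench", "edition": "Standard",
     "full_version": "2024.02.1"},
]

chosen = pick_runtime_identifier(
    sample_runtimes, "Python 3.10", "PBJ Workbench", "Standard")
print(chosen)
```

With real data, the dictionaries would be built from the entries returned by client.list_runtimes(), and the selected value would be a candidate for the runtime_identifier mentioned above.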

For more examples of commands related to projects, see Using the Projects API.