Cloudera Data Science Workbench Command Line Reference

This topic describes the commands available in cdsw, the command line utility included with a Cloudera Data Science Workbench deployment. Use this utility to manage your Cloudera Data Science Workbench cluster. Running cdsw without any arguments prints a brief description of each command.

In addition, the cdswctl CLI client offers separate functionality intended for data scientists who need to manage their sessions. For information about that client, see Cloudera Data Science Workbench Command Line Reference.

Start, Stop, Restart for CSD Deployments: The commands available for a CSD-based deployment are only a subset of those available for an RPM deployment. For example, the CLI for CSD deployments does not have commands such as cdsw start, stop, and restart available. Instead, these actions must be executed through the Cloudera Data Science Workbench service in Cloudera Manager. For instructions, see Starting, Stopping, and Restarting the Service.

All of the following commands can be used in an RPM-based deployment. Those available for CSD-based deployments have been marked in the table.
Command CSD Description and Usage
cdsw start

Initializes and bootstraps the master host. Use this command to start Cloudera Data Science Workbench.

cdsw stop

De-registers, resets, and stops a host.

On a worker host, this command will remove the worker from the cluster.

On the master host, this command will bring down the application and effectively tear down the Cloudera Data Science Workbench deployment.

cdsw restart

Run on the master host to restart application components.

To restart a worker host, use cdsw stop, followed by cdsw join. These commands have been explained further in this topic.
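The worker restart sequence described above can be sketched as a small shell function. This is a minimal sketch, assuming it is run on the worker host with sufficient privileges and that neither command prompts for input in your version. The function name restart_worker is hypothetical and is not part of the cdsw utility.

```shell
# Hypothetical helper (not part of cdsw): restart a worker host by
# removing it from the cluster and then re-registering it with the master.
restart_worker() {
  # Stop first; rejoin only if the stop succeeded.
  cdsw stop && cdsw join
}
```

Chaining with && keeps a failed stop from being followed by a join against a host in an unknown state.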

cdsw join

Initializes a worker host. After a worker host has been added, run this command on the worker host to add it to the Cloudera Data Science Workbench cluster.

This registers the worker host with the master and increases the available pool of resources for workloads.

cdsw status

Displays the current status of the application.

Starting with version 1.4, you can use cdsw status -v or cdsw status --verbose for more detailed output.

The cdsw status command is not supported on worker hosts.

cdsw validate

Performs diagnostic checks for common errors that might be preventing the application from running as expected.

Run this command as the first step in troubleshooting any problems with the application that cdsw status reports.
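As an illustration of the troubleshooting flow above, the following sketch captures the output of cdsw status and cdsw validate into files for later review. It is meant for the master host, since cdsw status is not supported on worker hosts. The function name check_cdsw and the output file names are hypothetical, and the -v flag assumes version 1.4 or later.

```shell
# Hypothetical helper (not part of cdsw): save status and validation
# output to files in the given directory (default: current directory).
check_cdsw() {
  out_dir="${1:-.}"
  # Detailed status; -v is available starting with version 1.4.
  cdsw status -v > "$out_dir/cdsw-status.txt" 2>&1
  # Diagnostic checks for common errors.
  cdsw validate > "$out_dir/cdsw-validate.txt" 2>&1
  echo "Wrote $out_dir/cdsw-status.txt and $out_dir/cdsw-validate.txt"
}
```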

cdsw logs

Creates a tarball with diagnostic information for your Cloudera Data Science Workbench installation.

If you file a case with Cloudera Support, run this command on each host and attach the resulting bundle to the case.
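Running cdsw logs on each host can be scripted. The following is a minimal sketch, assuming password-less SSH access as a user permitted to run cdsw on every host. The function name collect_cdsw_logs is hypothetical, and host names are supplied by the caller. The resulting bundle stays on each host, and its location is not covered in this reference.

```shell
# Hypothetical helper (not part of cdsw): run "cdsw logs" on each host
# given as an argument. Assumes password-less SSH to every host.
collect_cdsw_logs() {
  for host in "$@"; do
    echo "Collecting diagnostics on $host"
    # -n stops ssh from reading stdin; report failures and continue.
    ssh -n "$host" cdsw logs || echo "cdsw logs failed on $host" >&2
  done
}
```

For example, collect_cdsw_logs master.example.com worker1.example.com would run the command on two placeholder hosts in turn.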

cdsw version

Displays the version number and type of Cloudera Data Science Workbench deployment (RPM or CSD).

cdsw help

Displays the inline help options for the Cloudera Data Science Workbench CLI.

cdswctl login

Enables you to log into the cdswctl client.

cdswctl projects list

Lists the projects.

cdswctl models create

Creates a model with the specified parameters.

cdswctl models list

Lists all models.

You can refine the search by specifying the modelId.

cdswctl models listBuild

Lists the builds for a model.

You can monitor the status of the build by specifying the modelId and the projectId.

cdswctl models listDeployments

Lists the deployments for a model.

You can refine the search by specifying the modelId.

Use the statusSet parameter to check the status of the model being deployed.

cdswctl models deploy

Deploys a model with the specified parameters.

cdswctl models listReplicas

Enables you to view the list of model replicas.

You also need this information to obtain replica logs.

cdswctl models getReplicaLogs

Enables you to view the logs for a model replica.

cdswctl models restart

Restarts a model.

Usage:

cdswctl models restart --modelDeploymentId=<deployment_ID>

Note: Running this command does not change the resources if you previously ran the cdswctl models update command.

cdswctl models update

Changes the name, description, or visibility of the model.

To change a model’s resources, use the cdswctl models deploy command.

cdswctl models delete

Deletes a model.

Usage:

cdswctl models delete --id=<model_ID>
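The restart and delete usages shown in this topic can be wrapped in small functions that refuse to run without an ID. This is a minimal sketch, assuming you have already logged in with cdswctl login. The function names restart_model and delete_model are hypothetical, and only the --modelDeploymentId and --id flags documented above are used.

```shell
# Hypothetical wrappers (not part of cdswctl) around the documented usages.

# Restart a model deployment; requires a deployment ID.
restart_model() {
  if [ -z "${1:-}" ]; then
    echo "usage: restart_model <deployment_ID>" >&2
    return 2
  fi
  cdswctl models restart --modelDeploymentId="$1"
}

# Delete a model; requires a model ID.
delete_model() {
  if [ -z "${1:-}" ]; then
    echo "usage: delete_model <model_ID>" >&2
    return 2
  fi
  cdswctl models delete --id="$1"
}
```

Checking for an empty ID up front avoids sending cdswctl a flag with no value, which matters most for the delete command.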