ML Runtimes Environment Variables List

The following table lists Cloudera Data Science Workbench environment variables that you can use to customize your project environments. These can be set either as a site administrator or within the scope of a project or a job.

Environment Variable Description
MAX_TEXT_LENGTH

Maximum number of characters that can be displayed in a single text cell. By default, this value is set to 800,000 and any more characters will be truncated.

Default: 800,000

SESSION_MAXIMUM_MINUTES

Maximum number of minutes a session can run before it times out.

Default: 60*24*7 minutes (7 days)

Maximum Value: 35,000 minutes

JOB_MAXIMUM_MINUTES

Maximum number of minutes a job can run before it times out.

Default: 60*24*7 minutes (7 days)

Maximum Value: 35,000 minutes

IDLE_MAXIMUM_MINUTES

Maximum number of minutes a session can remain idle before it exits.

Default: 60 minutes

Maximum Value: 35,000 minutes

Per-Engine Environmental Variables: In addition to the previous table, there are some more built-in environmental variables that are set by the Cloudera Data Science Workbench application itself and do not need to be modified by users. These variables are set per-engine launched by Cloudera Data Science Workbench and only apply within the scope of each engine.

Environment Variable Description
CDSW_PROJECT

The project to which this engine belongs.

CDSW_ENGINE_ID

The ID of this engine. For sessions, this appears in your browser's URL bar.

CDSW_MASTER_ID

If this engine is a worker, this is the CDSW_ENGINE_ID of its master.

CDSW_MASTER_IP

If this engine is a worker, this is the IP address of its master.

CDSW_PUBLIC_PORT

A port on which you can expose HTTP services in the engine to browsers. HTTP services that bind CDSW_PUBLIC_PORT will be available in browsers at: http(s)://<$CDSW_ENGINE_ID>.<$CDSW_DOMAIN>. By default, CDSW_PUBLIC_PORT is set to 8080.

A direct link to these web services will be available from the grid icon in the upper right corner of the Cloudera Data Science Workbench web application, as long as the job or session is still running. For more details, see Accessing Web User Interfaces from Cloudera Data Science Workbench.

In Cloudera Data Science Workbench, setting CDSW_PUBLIC_PORT to a non-default port number is not supported.

CDSW_APP_PORT

A port on which you can expose HTTP services in the engine to browsers. HTTP services that bind CDSW_APP_PORT will be available in browsers at: http(s)://<$CDSW_ENGINE_ID>.<$CDSW_DOMAIN>. Use this port for applications that grant some control to the project, such as access to the session or terminal.

A direct link to these web services will be available from the grid icon in the upper right corner of the Cloudera Machine Learning web application as long as the job or session runs. Even if the web UI does not have authentication, only Contributors and those with more access to the project can access it. For more details, see Accessing Web User Interfaces from Cloudera Data Science Workbench.

Note that if the Site Administrator has enabled Allow only session creators to run commands on active sessions, then the UI is only available to the session creator. Other users will not be able to access it.

Use 127.0.0.1 as the IP.

CDSW_READONLY_PORT

A port on which you can expose HTTP services in the engine to browsers. HTTP services that bind CDSW_READONLY_PORT will be available in browsers at: http(s)://<$CDSW_ENGINE_ID>.<$CDSW_DOMAIN>. Use this port for applications that grant read-only access to project results.

A direct link to these web services will be available to users with from the grid icon in the upper right corner of the Cloudera Data Science Workbench web application as long as the job or session runs. Even if the web UI does not have authentication, Viewers and those with more access to the project can access it. For more details, see Accessing Web User Interfaces from Cloudera Data Science Workbench.

Use 127.0.0.1 as the IP.

CDSW_DOMAIN

The domain on which Cloudera Data Science Workbench is being served. This can be useful for iframing services, as demonstrated in Accessing Web User Interfaces from Cloudera Data Science Workbench.

CDSW_CPU_MILLICORES

The number of CPU cores allocated to this engine, expressed in thousandths of a core.

CDSW_MEMORY_MB

The number of megabytes of memory allocated to this engine.

CDSW_IP_ADDRESS

Other engines in the Cloudera Data Science Workbench cluster can contact this engine on this IP address.