Web Applications Embedded in Sessions
This topic describes how Cloudera Machine Learning allows you to embed web applications for frameworks such as Spark 2, TensorFlow, Shiny, and so on within sessions and jobs.
Many data science libraries and processing frameworks include user interfaces to help track progress of your jobs and break down workflows. These are instrumental in debugging and using the platforms themselves. For example, TensorFlow visualizations can be run on TensorBoard. Other web application frameworks such as Shiny and Flask are popular ways for data scientists to display additional interactive analysis in the languages they already know.
Cloudera Machine Learning allows you to access these web UIs directly from sessions and jobs. This feature is particularly helpful when you want to monitor and track progress for batch jobs. Even though jobs don't give you access to the interactive workbench console, you can still track long running jobs through the UI. However, note that the UI is only active so long as the job or session is active. If your session times out after 60 minutes (default timeout value), so will the UI.
CDSW_APP_PORT
and CDSW_READONLY_PORT
are environment
variables that point to general purpose public ports. Any HTTP services running in containers
that bind to CDSW_APP_PORT
or CDSW_READONLY_PORT
are
available in browsers at:
http://<$CDSW_ENGINE_ID>.<$CDSW_DOMAIN>
. Therefore, TensorBoard,
Shiny, Flask or any other web framework accompanying a project can be accessed directly from
within a session or job, as long as it is run on CDSW_APP_PORT
or
CDSW_READONLY_PORT
.
CDSW_APP_PORT
is meant for applications that grant some level of control to
the project, such as access to the active session or terminal.
CDSW_READONLY_PORT
must be used for applications that grant read-only
access to project results.
To access the UI while you are in an active session, click the grid icon in the upper right hand corner of the Cloudera Machine Learning web application, and select the UI from the dropdown. For a job, navigate to the job overview page and click the History tab. Click on a job run to open the session output for the job. You can now click the grid icon in the upper right hand corner of the Cloudera Machine Learning web application to access the UI for this session.
Limitations with port availability
- one on
CDSW_APP_PORT
, which can be used for applications that grant some level of control over the project to Contributors and Admins, - one on
CDSW_READONLY_PORT
, which can be used for applications that only need to give read-only access to project collaborators, - and, one on the now-deprecated
CDSW_PUBLIC_PORT
, which is accessible by all users.
However, by default the editors feature runs third-party browser-based editors on
CDSW_APP_PORT
. Therefore, for projects that are already using
browser-based third-party editors, you are left with only 2 other ports to run applications
on: CDSW_READONLY_PORT
and CDSW_PUBLIC_PORT
. Keep in mind
the level of access you want to grant users when you are selecting one of these ports for a
web application.