Data Science Workbench
1.9.2
Data Science Workbench
▶︎
Release Notes
▶︎
What's New
▶︎
Cloudera Data Science Workbench 1.9.2
New Features and Changes in Cloudera Data Science Workbench 1.9.2
Issues Fixed in Cloudera Data Science Workbench 1.9.2
▶︎
Cloudera Data Science Workbench 1.9.1
New Features and Changes in Cloudera Data Science Workbench 1.9.1
Issues Fixed in Cloudera Data Science Workbench 1.9.1
▶︎
Cloudera Data Science Workbench 1.9.0
New Features and Changes in Cloudera Data Science Workbench 1.9.0
Issues Fixed in Cloudera Data Science Workbench 1.9.0
▶︎
Older Releases
▶︎
Cloudera Data Science Workbench 1.8.1
New Features and Changes in Cloudera Data Science Workbench 1.8.1
Issues Fixed in Cloudera Data Science Workbench 1.8.1
▶︎
Cloudera Data Science Workbench 1.8.0
New Features and Changes in Cloudera Data Science Workbench 1.8.0
Issues Fixed in Cloudera Data Science Workbench 1.8.0
▶︎
Cloudera Data Science Workbench 1.7.2
New Features and Changes in Cloudera Data Science Workbench 1.7.2
Issues Fixed in Cloudera Data Science Workbench 1.7.2
▶︎
Cloudera Data Science Workbench 1.7.1
New Features and Changes in Cloudera Data Science Workbench 1.7.1
Engine Upgrade 1.7.1
Issues Fixed in Cloudera Data Science Workbench 1.7.1
▶︎
Cloudera Data Science Workbench 1.6.1
New Features and Changes in Cloudera Data Science Workbench 1.6.1
Engine Upgrade 1.6.1
Issues Fixed in Cloudera Data Science Workbench 1.6.1
▶︎
Cloudera Data Science Workbench 1.6.0
New Features and Changes in Cloudera Data Science Workbench 1.6.0
Engine Upgrade 1.6.0
Incompatible Changes in Cloudera Data Science Workbench 1.6.0
Issues Fixed in Cloudera Data Science Workbench 1.6.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.6.0
▶︎
Cloudera Data Science Workbench 1.5.0
New Features and Changes in Cloudera Data Science Workbench 1.5.0
Engine Upgrade 1.5.0
Incompatible Changes in Cloudera Data Science Workbench 1.5.0
Issues Fixed in Cloudera Data Science Workbench 1.5.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.5.0
▶︎
Cloudera Data Science Workbench 1.4.3
New Features and Changes in Cloudera Data Science Workbench 1.4.3
Issues Fixed in Cloudera Data Science Workbench 1.4.3
Known Issues and Limitations in Cloudera Data Science Workbench 1.4.3
▶︎
Cloudera Data Science Workbench 1.4.2
New Features and Changes in Cloudera Data Science Workbench 1.4.2
Engine Upgrade 1.4.2
Issues Fixed in Cloudera Data Science Workbench 1.4.2
Known Issues and Limitations in Cloudera Data Science Workbench 1.4.2
▶︎
Cloudera Data Science Workbench 1.4.0
New Features in Cloudera Data Science Workbench 1.4.0
Engine Upgrade 1.4.0
Incompatible Changes in Cloudera Data Science Workbench 1.4.0
Issues Fixed in Cloudera Data Science Workbench 1.4.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.4.0
▶︎
Cloudera Data Science Workbench 1.3.1
New Features in Cloudera Data Science Workbench 1.3.1
Issues Fixed in Cloudera Data Science Workbench 1.3.1
Known Issues and Limitations in Cloudera Data Science Workbench 1.3.1
▶︎
Cloudera Data Science Workbench 1.3.0
New Features and Changes in Cloudera Data Science Workbench 1.3.0
Issues Fixed in Cloudera Data Science Workbench 1.3.0
Incompatible Changes in Cloudera Data Science Workbench 1.3.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.3.0
▶︎
Cloudera Data Science Workbench 1.2.2
New Features and Changes in Cloudera Data Science Workbench 1.2.2
Engine Upgrade 1.2.2
Issues Fixed In Cloudera Data Science Workbench 1.2.2
Known Issues and Limitations in Cloudera Data Science Workbench 1.2.2
▶︎
Cloudera Data Science Workbench 1.2.1
Issues Fixed In Cloudera Data Science Workbench 1.2.1
Incompatible Changes in Cloudera Data Science Workbench 1.2.1
Known Issues and Limitations in Cloudera Data Science Workbench 1.2.1
▶︎
Cloudera Data Science Workbench 1.2.0
New Features and Changes in Cloudera Data Science Workbench 1.2.0
Engine Upgrade 1.2.0
Issues Fixed in Cloudera Data Science Workbench 1.2.0
Incompatible Changes in Cloudera Data Science Workbench 1.2.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.2.0
▶︎
Cloudera Data Science Workbench 1.1.1
New Features in Cloudera Data Science Workbench 1.1.1
Issues Fixed In Cloudera Data Science Workbench 1.1.1
Known Issues and Limitations in Cloudera Data Science Workbench 1.1.1
▶︎
Cloudera Data Science Workbench 1.1.0
New Features and Changes in Cloudera Data Science Workbench 1.1.0
Engine Upgrade 1.1.0
Issues Fixed in Cloudera Data Science Workbench 1.1.0
Incompatible Changes in Cloudera Data Science Workbench 1.1.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.1.0
▶︎
Cloudera Data Science Workbench 1.0.1
Issues Fixed in Cloudera Data Science Workbench 1.0.1
Known Issues and Limitations in Cloudera Data Science Workbench 1.0.x
Cloudera Data Science Workbench 1.0.0
▶︎
Known Issues and Limitations in Cloudera Data Science Workbench 1.9.2
Installation
Upgrades
CDH Integration
Cloudera Manager Integration
Apache Spark
Crashes and stops responding
Third-party Editors
Cloudera Data Science Workbench Engines
Runtimes
Custom Legacy Engine Images
Experiments
GPU Support
Jobs
Models
Applications
Platform
Networking
Security
Usability
General
▶︎
ML Runtimes Release Notes
▶︎
ML Runtimes What's New
ML Runtimes Version 2021.09.02
ML Runtimes Version 2021.09
ML Runtimes Version 2021.06
ML Runtimes Version 2021.04
ML Runtimes Version 2021.02
ML Runtimes Version 2020.11
ML Runtimes Known Issues and Limitations
▶︎
ML Runtimes Pre-installed Packages
▶︎
Pre-Installed Packages in ML Runtimes
▶︎
ML Runtimes 2022.04
Python 3.9 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Python 3.9.6 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
R 4.0 Libraries
R 4.1 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.12
Python 3.9 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Python 3.9.6 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
R 4.0 Libraries
R 4.1 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.09
Python 3.9 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.6 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Python 3.9.6 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.6 Libraries for JupyterLab
R 4.0 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.06
Python 3.8.6 Libraries for Workbench
Python 3.7.9 Libraries for Workbench
Python 3.6.12 Libraries for Workbench
Python 3.8.6 Libraries for JupyterLab
Python 3.7.9 Libraries for JupyterLab
Python 3.6.12 Libraries for JupyterLab
R 4.0 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.04
RAPIDS Runtime PIP Python 3.7.8 Libraries for Workbench
RAPIDS Runtime PIP Python 3.8.6 Libraries for Workbench
RAPIDS Runtime PIP Python 3.7.8 Libraries for JupyterLab
RAPIDS Runtime PIP Python 3.8.6 Libraries for JupyterLab
▶︎
ML Runtimes 2021.02
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.6 Libraries for Workbench
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.6 Libraries for JupyterLab
R 4.0 Libraries
R 3.6 Libraries
ML Runtimes 2020.11
▶︎
Product Overview
Cloudera Machine Learning Overview
Key Differences - CML vs. CDSW
▶︎
Planning
▶︎
Architecture Overview
▶︎
Architecture Overview
▶︎
Cloudera Manager
Master Host
Worker Hosts
▶︎
Cloudera Data Science Workbench Engines
Docker and Kubernetes
Cloudera Data Science Workbench Web Application
CDS 2.x Powered by Apache Spark
▶︎
Requirements and Supported Platforms
Cloudera Manager and CDH Requirements
Operating System Requirements
JDK Requirements
▶︎
Networking and Security Requirements
Ports Required by Cloudera Data Science Workbench
GPU Support
Recommended Hardware Configuration
Python Supported Versions
Docker and Kubernetes Support
Supported Browsers
Cloudera Altus Director Support (AWS and Azure Only)
Recommended Configuration on Amazon Web Services (AWS)
Recommended Configuration on Microsoft Azure
▶︎
Installation & Upgrade
▶︎
Installation
▶︎
Download and Install the Cloudera Data Science Workbench
CDSW 1.9.2 Download Information
▶︎
Installing Cloudera Data Science Workbench on CDP
Installing Cloudera Data Science Workbench 1.9.2
Multiple Cloudera Data Science Workbench Deployments
Airgapped Installations
▶︎
Required Pre-Installation Steps
Set Up a Wildcard DNS Subdomain
Disable Untrusted SSH Access
▶︎
Configure Block Devices
Docker Block Device
Application Block Device or Mount Point
▶︎
Installing Cloudera Data Science Workbench 1.9.2 Using Cloudera Manager
Prerequisites
▶︎
Configure Apache Spark 2
Configure Apache Spark 2 on CDH 5
Configure Apache Spark 2 on CDH 6 or CDP Data Center 7
Configure JAVA_HOME
Download and Install the Cloudera Data Science Workbench CSD
Install the Cloudera Data Science Workbench Parcel
Add the Cloudera Data Science Workbench Service
Create the Administrator Account
Next Steps
▶︎
Installing Cloudera Data Science Workbench 1.9.2 Using Packages
Prerequisites
Configure Gateway Hosts Using Cloudera Manager
Install Cloudera Data Science Workbench on the Master Host
(Optional) Install Cloudera Data Science Workbench on Worker Hosts
Create the Administrator Account
Next Steps
▶︎
CSD Installation on CDH
▶︎
Installing Cloudera Data Science Workbench 1.9.2 Using Cloudera Manager
Prerequisites
▶︎
Configure Apache Spark 2
Configure Apache Spark 2 on CDH 5
Configure Apache Spark 2 on CDH 6 or CDP Data Center 7
Configure JAVA_HOME
Download and Install the Cloudera Data Science Workbench CSD
Install the Cloudera Data Science Workbench Parcel
Add the Cloudera Data Science Workbench Service
Create the Administrator Account
Next Steps
▶︎
Deploy CDSW on HDP
Overview of Deploying CDSW 1.9.2 on HDP
CDSW-on-HDP Architecture Overview
▶︎
Supported Platforms and Requirements
Platform Requirements
Operating System Requirements
Java Requirements
Network and Security Requirements
Hardware Requirements
Python Supported Versions
Known Issues and Limitations
▶︎
Installing Cloudera Data Science Workbench 1.9.2 on HDP
Prerequisites
Add Gateway Hosts for Cloudera Data Science Workbench to Your HDP Cluster
Create HDFS User Directories
Install Cloudera Data Science Workbench on the Master Host
(Optional) Install Cloudera Data Science Workbench on Worker Hosts
Create the Site Administrator Account
Upgrading to Cloudera Data Science Workbench 1.9.2 on HDP
Getting Started with a New Project on Cloudera Data Science Workbench
Upgrading a CDSW 1.9.2 Deployment from HDP 2 to HDP 3
Frequently Asked Questions (FAQs)
▶︎
RPM Installation on CDH
▶︎
Installing Cloudera Data Science Workbench 1.9.2 Using Packages
Prerequisites
Configure Gateway Hosts Using Cloudera Manager
Install Cloudera Data Science Workbench on the Master Host
(Optional) Install Cloudera Data Science Workbench on Worker Hosts
Create the Administrator Account
Next Steps
▶︎
Upgrade
▶︎
Upgrading to the Latest Version of Cloudera Data Science Workbench on CDH
Upgrading Cloudera Data Science Workbench 1.7.2 or higher from CDH 6 to CDP Private Cloud Base 7.x
Upgrading Cloudera Data Science Workbench 1.9.2 Using Cloudera Manager
Upgrading CSD Deployments from CDH 5 to CDH 6
Upgrading RPM Deployments from CDH 5 to CDH 6
Migrating from an RPM-based Deployment to the Latest 1.9.2 CSD
Upgrading Cloudera Data Science Workbench Using Packages
▼
How To
▶︎
Quickstart
▶︎
Getting Started with Cloudera Data Science Workbench
Sign up
Create a Project from a Built-in Template
Launch a Session to Run the Project
Export Session List
Next Steps
▶︎
User Guide
▶︎
Managing Cloudera Data Science Workbench Users
User Contexts
User Roles
Managing your Personal Account
▶︎
Managing Team Accounts
Creating a Team Without an Associated LDAP Group
Modifying Team Account Settings
▶︎
Managing Users as a Site Administrator
Accessing the Site Administrator Dashboard
Adding New Users
Creating a Team With an Associated LDAP Group
Assigning the Site Administrator Role to an Existing User
Disabling User Accounts
Viewing Licensed Users
Monitoring Users
▼
Projects
▶︎
Projects
▶︎
Managing Projects in Cloudera Data Science Workbench
Creating a Project with Legacy Engine Variants
Creating a Project with ML Runtimes Variants
Adding Collaborators
Modifying Project Settings
▶︎
Managing Files
Disabling Project File Uploads and Downloads
Custom Template Projects
Deleting a Project
▼
Workbench
▼
Using the Workbench
Start a New Session
▶︎
Run Code
Code Autocomplete
Project Code Files
Access the Terminal
Stop a Session
▶︎
Jupyter Magic Commands
Python
Scala
▶︎
Visualize Report
▶︎
Data Visualization
Simple Plots
Saved Images
HTML Visualizations
IFrame Visualizations
Grid Displays
Documenting Your Analysis
Cloudera Data Visualization for ML
▶︎
Embedded Web Apps
▶︎
Web Applications Embedded in Cloudera Data Science Workbench
Spark 2 Web UIs (CDSW_SPARK_PORT)
▶︎
TensorBoard, Shiny, and others (CDSW_APP_PORT or CDSW_READONLY_PORT)
Limitations with Port Availability
Example: A Shiny Application
▶︎
Web UI
▶︎
Accessing Web User Interfaces from Cloudera Data Science Workbench
Cloudera Manager, Hue, and the Spark History Server
▶︎
Distributed ML
▶︎
Running Distributed ML Workloads on YARN
Example: H2O
▶︎
Parallel Computing
▶︎
Distributed Computing with Workers
▶︎
Workers API
Launch Workers
List Workers
Await Workers
Stop Workers
Example: Worker Network Communications
▶︎
Collaborate
▶︎
Collaborating on Projects with Cloudera Data Science Workbench
▶︎
Project Collaborators
Restricting Collaborator and Administrator Access to Active Sessions
Teams
Sharing Personal Projects
Forking Projects
Collaborating with Git
▶︎
Sharing Job and Session Console Outputs
Sharing Data Visualizations
▶︎
Using Git to Collaborate on Projects
Importing a Project From Git
Linking an Existing Project to a Git Remote
▶︎
Editors
▶︎
Editors
▶︎
Configure a Browser IDE as an Editor
Test a Browser IDE in a Session Before Installation
Configure a Browser IDE at the Project Level
Configure a Browser IDE at the Engine Level
Configure Jupyter Notebook in a Customized Engine Image
▶︎
Configure a SSH Gateway to Use Local IDEs
Configure and Use a Local IDE
▶︎
Configure PyCharm as a Local IDE
Download cdswctl and Add an SSH Key
Initialize an SSH Connection to Cloudera Data Science Workbench
Add Cloudera Data Science Workbench as an Interpreter for PyCharm
(Optional) Configure the Sync Between Cloudera Data Science Workbench and PyCharm
▶︎
Data Access
▶︎
Importing Data into Cloudera Data Science Workbench
Accessing Local Data from Your Computer
Accessing Data from HDFS
▶︎
Accessing Data from Apache HBase
Load Data into HBase Table
Query Data Using HappyBase
Accessing Data from Apache Hive
▶︎
Accessing Data from Apache Impala
Loading CSV Data into an Impala Table
Running Queries on Impala Tables
Accessing Data in Amazon S3 Buckets
▶︎
Accessing External SQL Databases
R
Python
▶︎
Experiments
▶︎
Experiments
Purpose
Concepts
Running an Experiment (Quick Start)
Tracking Metrics
Saving Files
Disabling the Experiments Feature
Limitations
Debugging Issues with Experiments
▶︎
Models
▶︎
Models
Introduction to Production Machine Learning
Concepts and Terminology
Creating and Deploying a Model (QuickStart)
Calling a Model
▶︎
Updating Active Models
Re-deploy an Existing Build
Deploy a New Build for a Model
Stop a Model
Restart a Model
▶︎
Securing Models using Model API Key
Enabling Authentication
Generating a Model API Key
Managing Model API Keys
Enabling Model Metrics
Tracking Model Metrics
▶︎
Usage Guidelines
Model Code
Model Artifacts
Resource Consumption and Scaling
Security Considerations
Deployment Considerations
▶︎
Model Training and Deployment - Iris Dataset
Create a Project
Train the Model
Deploy the Model
▶︎
Model Monitoring and Administration
Monitoring Individual Models
Monitoring All Active Models
Deleting a Model
Disabling the Models Feature
▶︎
Debugging Issues with Models
Building
Pushing
Deploying
Deployed
▶︎
Model Governance
Enabling Model Governance
Viewing lineage for a model deployment in Atlas
Registering training data lineage using a linking file
▶︎
Analytical Applications
▶︎
Analytical Applications
Testing Applications Before You Deploy
Application Limitations
▶︎
Jobs and Pipelines
▶︎
Managing Jobs and Pipelines in Cloudera Data Science Workbench
Creating a Job
Creating a Pipeline
Viewing Job History
▶︎
Cloudera Data Science Workbench Jobs API
API Key Authentication
▶︎
Starting a Job Run Using the API
Setting Environmental Variables
Sample Job Run
Starting a Job Run Using Python
▶︎
Runtimes
▶︎
Managing ML Runtimes
ML Runtimes versus the Legacy Engine
Managing Resource Profiles
▶︎
ML Runtimes Nvidia GPU Edition
Testing ML Runtime GPU Setup
ML Runtimes NVIDIA RAPIDS Edition
▶︎
Using Editors for ML Runtimes
▶︎
Using Jupyter with ML Runtimes
Installing a Jupyter extension
Installing a Jupyter kernel
Installing Additional ML Runtimes Packages
Upgrading R and Python Packages
▶︎
ML Runtimes Environment Variables
ML Runtimes Environment Variables List
Accessing Environmental Variables from Projects
▶︎
Pre-Installed Packages in ML Runtimes
▶︎
ML Runtimes 2022.04
Python 3.9 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Python 3.9.6 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
R 4.0 Libraries
R 4.1 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.12
Python 3.9 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Python 3.9.6 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
R 4.0 Libraries
R 4.1 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.09
Python 3.9 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.6 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Python 3.9.6 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.6 Libraries for JupyterLab
R 4.0 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.06
Python 3.8.6 Libraries for Workbench
Python 3.7.9 Libraries for Workbench
Python 3.6.12 Libraries for Workbench
Python 3.8.6 Libraries for JupyterLab
Python 3.7.9 Libraries for JupyterLab
Python 3.6.12 Libraries for JupyterLab
R 4.0 Libraries
R 3.6 Libraries
▶︎
ML Runtimes 2021.04
RAPIDS Runtime PIP Python 3.7.8 Libraries for Workbench
RAPIDS Runtime PIP Python 3.8.6 Libraries for Workbench
RAPIDS Runtime PIP Python 3.7.8 Libraries for JupyterLab
RAPIDS Runtime PIP Python 3.8.6 Libraries for JupyterLab
▶︎
ML Runtimes 2021.02
Python 3.8 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.6 Libraries for Workbench
Python 3.8 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.6 Libraries for JupyterLab
R 4.0 Libraries
R 3.6 Libraries
ML Runtimes 2020.11
▶︎
Engines
▶︎
Engines
▶︎
Cloudera Data Science Workbench Engines
Basic Concepts and Terminology
ML Runtimes versus Legacy Engines
▶︎
Project Environments
Environmental Variables
Dependencies
Configuring Engine Environments for Experiments and Models
▶︎
Models and Experiments
▶︎
Engines for Experiments and Models
Snapshot Code
Build Image
Run Experiment / Deploy Model
▶︎
Engine Configuration
▶︎
Configuring Cloudera Data Science Workbench Engines
Concepts and Terminology
▶︎
Managing Engines
Managing Resource Profiles
Managing Engine Images
Configuring the Engine Environment
▶︎
Engine Dependencies
▶︎
Managing Engine Dependencies
Installing Packages Directly Within Projects
Creating a Customized Engine with the Required Package(s)
Mounting Additional Dependencies from the Host
Managing Dependencies for Spark 2 Projects
▶︎
Environment Variables
▶︎
Engine Environment Variables
Environment Variables from Cloudera Manager
Accessing Environmental Variables from Projects
Engine Environment Variables
▶︎
Package Libraries
▶︎
Installing Additional Packages
(Python Only) Using a Requirements File
▶︎
Using Conda with Cloudera Data Science Workbench
Creating an Extensible Engine with Conda
▶︎
Extensible Engines
▶︎
Customized Engine Images
▶︎
Creating a Customized Engine Image
Create a Dockerfile for the New Custom Image
Configure a Browser IDE at the Engine Level
Add Docker Registry Credentials
Build the New Image
▶︎
Distribute the Image
Push the Image to a Public Registry such as DockerHub
Push the Image to Your Company's Docker Registry
Distribute the Image Manually
Whitelist the Image in Cloudera Data Science Workbench
End-to-End Example: MeCab
Engine Limitations
Related Resources
▶︎
Engines Packaging
▶︎
Cloudera Data Science Workbench Engine Versions and Packaging
▶︎
Base Engine 13
Python Libraries in Base Engine 13
R Libraries in Base Engine 13
▶︎
Base Engine 10
Python Libraries in Base Engine 10
R Libraries in Base Engine 10
Scala in Base Engine 10
▶︎
Base Engine 8
Python Libraries in Base Engine 8
R Libraries in Base Engine 8
Scala in Base Engine 8
▶︎
Base Engine 7
Python Libraries in Base Engine 7
R Libraries in Base Engine 7
Scala in Base Engine 7
▶︎
Base Engine 6
Python Libraries in Base Engine 6
R Libraries in Base Engine 6
Scala in Base Engine 6
▶︎
Base Engine 5
Python Libraries in Base Engine 5
R Libraries in Base Engine 5
Scala in Base Engine 5
▶︎
Base Engine 4
Python Libraries in Base Engine 4
R Libraries in Base Engine 4
Scala in Base Engine 4
▶︎
Base Engine 3
Python Libraries in Base Engine 3
R Libraries in Base Engine 3
Scala in Base Engine 3
▶︎
Base Engine 2
Python 2 Libraries in Base Engine 2
R Libraries in Base Engine 2
Scala in Base Engine 2
▶︎
GPUs
▶︎
Using NVIDIA GPUs for Cloudera Data Science Workbench Projects
Key Points to Note
▶︎
Enabling Cloudera Data Science Workbench to use GPUs
Set Up the Operating System and Kernel
Install the NVIDIA Driver on GPU Hosts
▶︎
Enable GPU Support in Cloudera Data Science Workbench
CSD Deployments
RPM Deployments
Test whether Cloudera Data Science Workbench can Detect GPUs
▶︎
Using GPUs with Legacy Engines-Technical Preview
Create a Custom CUDA-capable Engine Image
Site Admins: Add the Custom CUDA Engine to your Cloudera Data Science Workbench Deployment
Project Admins: Enable the CUDA Engine for your Project
Test the CUDA Engine
Testing ML Runtime GPU Setup
▶︎
Spark Configuration
Using CDS 2.x Powered by Apache Spark
▶︎
Configuring CDS 2.x Powered by Apache Spark 2
▶︎
Spark Configuration Files
Configuring Global Properties Using Cloudera Manager
Configuring Spark Environment Variables Using Cloudera Manager
Managing Memory Available for Spark Drivers
Managing Dependencies for Spark 2 Jobs
Spark Logging Configuration
Running Spark Jobs on an HDP Cluster
Setting Up an HTTP Proxy for Spark 2
▶︎
Using Spark 2 from Python
Setting Up a PySpark Project
Spark on ML Runtimes
Example: Monte Carlo Estimation
Example: Locating and Adding JARs to Spark 2 Configuration
Example: Distributing Dependencies on a PySpark Cluster
▶︎
Using Spark 2 from R
Installing sparklyr
Connecting to Spark 2
▶︎
Using Spark 2 from Scala
Accessing Spark 2 from the Scala Engine
Example: Read Files from the Cluster Local Filesystem
▶︎
Example: Using External Packages by Adding Jars or Dependencies
Adding Remote Packages
Adding Remote or Local JARs
▶︎
Site Administration
▶︎
Monitoring
▶︎
Monitoring Cloudera Data Science Workbench Usage
Related Resources
Monitoring User Events
Tracked User Events
▶︎
Quotas
Enabling Default Quotas for all CDSW Users
Enabling Custom Quotas for Specific Users
Modifying Default and Custom Quotas
▶︎
CDSW in Cloudera Manager
▶︎
Managing the Cloudera Data Science Workbench Service in Cloudera Manager
Adding the Cloudera Data Science Workbench Service
Roles Associated with the Cloudera Data Science Workbench Service
Accessing Cloudera Data Science Workbench from Cloudera Manager
Configuring Cloudera Data Science Workbench Properties
Starting, Stopping, and Restarting the Service
Checking the Status of the CDSW Service
Managing Cloudera Data Science Workbench Worker Hosts
Health Tests
▶︎
Tracking Disk Usage on the Application Block Device
Create a Chart to Track Disk Usage on the Application Block Device
Create a Trigger to Notify Cluster Administrators when Free Space Runs Low
Creating Diagnostic Bundles
▶︎
Data Collection
▶︎
Data Collection in Cloudera Data Science Workbench
▶︎
Usage Tracking
Disable Usage Tracking
▶︎
Diagnostic Bundles
Using Cloudera Manager
Disabling Usage Metrics
Using the Command Line
Information Collected in Diagnostic Bundles
▶︎
Managing CDSW
Cloudera Data Science Workbench Email Notifications
▶︎
Managing License Keys for Cloudera Data Science Workbench
Trial License
Cloudera Enterprise License
Uploading License Keys
User Access to Features
Adding CDSW Session Metadata Information
Web session timeouts
▶︎
Cluster Management
▶︎
Cluster Management
Cluster Management
▶︎
Cluster Monitoring with Grafana
Accessing the Grafana Dashboard
Available Grafana Dashboards
▶︎
Backup and Disaster Recovery for Cloudera Data Science Workbench
Creating a Backup for Disaster Recovery for Cloudera Data Science Workbench
Monitoring/Reducing CDSW disk space
Cloudera Data Science Workbench Scaling Guidelines
Ports Used By Cloudera Data Science Workbench
Rollback Cloudera Data Science Workbench to an Older Version
Uninstalling CSD Deployments
Uninstalling RPM Deployments
▶︎
Managing Hosts
▶︎
Managing Cloudera Data Science Workbench Hosts
▶︎
Customize Workload Scheduling
Labeling Auxiliary Hosts for CSD Deployments
Labeling Auxiliary Hosts for RPM Deployments
▶︎
Reserving the Master Host for Internal CDSW Components
Reserving the Master Host for CSD Deployments
Reserving the Master Host for RPM Deployments
▶︎
Adding and Removing Worker Hosts
Adding a Worker Host Using Cloudera Manager
Adding a Worker Host Using Packages
Removing a Worker Host Using Cloudera Manager
Removing a Worker Host Using Packages
Changing the Domain Name Using Cloudera Manager
Changing the Domain Name Using Packages
▶︎
Migrating a Deployment to New Hosts
▶︎
Migrating a Deployment to a New Set of Hosts
▶︎
Migrating a CSD Deployment
Add and Set Up the New Hosts
Copy the JDK to the new host
Copy the DNS Nameserver to the new host
Copy the Kerberos Configurations
Stop the CDSW Service
Backup Application Data
Delete CDSW Roles from Existing Hosts
Move Backup to the New Master
Update DNS Records for the New Master
Add Role Instances for the New Hosts
Run the Prepare Node command on the New Hosts
Start the CDSW Service
▶︎
Migrating an RPM Deployment
Add and Set Up the New Hosts
Copy the JDK to the new host
Copy the DNS Nameserver to the New Host
Copy the Kerberos Configurations
Stop Cloudera Data Science Workbench
Backup Application Data
Remove Cloudera Data Science Workbench from Existing Hosts
Move Backup to New Master
Update DNS Records for the New Master
Install Cloudera Data Science Workbench on New Master Host
▶︎
Security
▶︎
Security
▶︎
Cloudera Data Science Workbench Security Guide
▶︎
Security Model
Wildcard DNS Subdomain Requirement
Authentication
▶︎
Authorization
Cluster Authorization
User Role Authorization
Access Control for Teams and Projects
▶︎
Wire Encryption
External Communications
Internal Communications
Cloudera Data Science Workbench Gateway Host Security
Base Engine Image Security
▶︎
TLS and SSL Encryption
▶︎
Enabling TLS/SSL for Cloudera Data Science Workbench
Internal Termination
External Termination
▶︎
Private Key and Certificate Requirements
Creating a Certificate Signing Request (CSR) and Key/Certificate Pair
▶︎
Configuring Internal Termination
CSD Deployments
RPM Deployments
▶︎
Configuring External Termination
CSD Deployments
RPM Deployments
Configuring Custom Root CA Certificate
▶︎
Proxy Configuration
▶︎
Configuring Cloudera Data Science Workbench Deployments Behind a Proxy
Supporting a TLS-Enabled Proxy Server
Configuring Hostnames to be Skipped from the Proxy
▶︎
Hadoop Authentication - Kerberos
▶︎
Hadoop Authentication with Kerberos for Cloudera Data Science Workbench
UI Behavior for Non-Kerberized Clusters
Limitations
Configure FreeIPA
▶︎
External Authentication - LDAP and SAML
▶︎
Configuring External Authentication with LDAP and SAML
User Sign up Process
▶︎
Configuring LDAP/Active Directory Authentication
LDAP General Settings
LDAP Over SSL (LDAPS)
LDAP Group Settings
How Login Works with LDAP Group Settings Enabled
Test LDAP Configuration
▶︎
Configuring SAML Authentication
Configuration Options
How Login Works with SAML Group Settings Enabled
Debug Login URL
▶︎
HTTP Headers
▶︎
Configuring HTTP Headers for Cloudera Data Science Workbench
Enable HTTP Security Headers
Enable HTTP Strict Transport Security (HSTS)
Cross-Origin Resource Sharing (CORS)
▶︎
User-Controlled Kubernetes Pod
▶︎
Restricting User-Controlled Kubernetes Pods
Allow containers to run as root
Allow "privileged" pod containers
Allow pod containers to mount unsupported volume types
▶︎
SSH Keys
▶︎
SSH Keys
Personal Key
Team Key
Adding SSH Key to GitHub
Creating SSH Tunnels
▶︎
Troubleshooting
▶︎
Troubleshooting Cloudera Data Science Workbench
Understanding Installation Warnings
GPU Issues
Error Encountered Trying to Load Images when Initializing Cloudera Data Science Workbench
404 Not Found Error
Troubleshooting Kerberos Errors
Troubleshooting TLS/SSL Errors
Troubleshooting Issues with Workloads
Troubleshooting Issues with Models and Experiments
▶︎
CLI Reference
▶︎
Command Line Reference
Additional Usage Notes
cdswctl Command Line Interface Client
Download and Configure the cdswctl
(Optional) Generate an SSH Public/Private Key
Download cdswctl and Add an SSH Key
Initialize an SSH Connection to Cloudera Data Science Workbench
Log in to cdswctl
Prepare to manage models using the model CLI
Create a model using the CLI
View replica logs for a model using the CLI
▶︎
FAQs
▶︎
Cloudera Data Science Workbench FAQs
Where can I get a sample project to try out Cloudera Data Science Workbench?
What are the software and hardware requirements for Cloudera Data Science Workbench?
Can I run Cloudera Data Science Workbench on hosts shared with other Hadoop services?
How does Cloudera Data Science Workbench use Docker and Kubernetes?
Can I run Cloudera Data Science Workbench on my own Kubernetes cluster?
Does Cloudera Data Science Workbench support REST API access?
How do I contact Cloudera for issues regarding Cloudera Data Science Workbench?
▶︎
Glossary
Cloudera Data Science Workbench Glossary
▶︎
Videos
▶︎
Cloudera Data Science Workbench Videos
Models Demo
Experiments Demo
Quickstart Demo
Authentication
Authorization
Available Grafana Dashboards
Await Workers
Backup and Disaster Recovery for Cloudera Data Science Workbench
Backup Application Data
Backup Application Data
Base Engine 10
Base Engine 13
Base Engine 2
Base Engine 3
Base Engine 4
Base Engine 5
Base Engine 6
Base Engine 7
Base Engine 8
Base Engine Image Security
Basic Concepts and Terminology
Build Image
Build the New Image
Building
Calling a Model
Can I run Cloudera Data Science Workbench on hosts shared with other Hadoop services?
Can I run Cloudera Data Science Workbench on my own Kubernetes cluster?
CDH Integration
CDS 2.x Powered by Apache Spark
CDSW 1.9.2 Download Information
CDSW in Cloudera Manager
CDSW-on-HDP Architecture Overview
cdswctl Command Line Interface Client
Changing the Domain Name Using Cloudera Manager
Changing the Domain Name Using Packages
Checking the Status of the CDSW Service
CLI Reference
Cloudera Altus Director Support (AWS and Azure Only)
Cloudera Data Science Workbench 1.0.0
Cloudera Data Science Workbench 1.0.1
Cloudera Data Science Workbench 1.1.0
Cloudera Data Science Workbench 1.1.1
Cloudera Data Science Workbench 1.2.0
Cloudera Data Science Workbench 1.2.1
Cloudera Data Science Workbench 1.2.2
Cloudera Data Science Workbench 1.3.0
Cloudera Data Science Workbench 1.3.1
Cloudera Data Science Workbench 1.4.0
Cloudera Data Science Workbench 1.4.2
Cloudera Data Science Workbench 1.4.3
Cloudera Data Science Workbench 1.5.0
Cloudera Data Science Workbench 1.6.0
Cloudera Data Science Workbench 1.6.1
Cloudera Data Science Workbench 1.7.1
Cloudera Data Science Workbench 1.7.2
Cloudera Data Science Workbench 1.8.0
Cloudera Data Science Workbench 1.8.1
Cloudera Data Science Workbench 1.9.0
Cloudera Data Science Workbench 1.9.1
Cloudera Data Science Workbench 1.9.2
Cloudera Data Science Workbench Email Notifications
Cloudera Data Science Workbench Engine Versions and Packaging
Cloudera Data Science Workbench Engines
Cloudera Data Science Workbench Engines
Cloudera Data Science Workbench Engines
Cloudera Data Science Workbench FAQs
Cloudera Data Science Workbench Gateway Host Security
Cloudera Data Science Workbench Glossary
Cloudera Data Science Workbench Jobs API
Cloudera Data Science Workbench Scaling Guidelines
Cloudera Data Science Workbench Security Guide
Cloudera Data Science Workbench Videos
Cloudera Data Science Workbench Web Application
Cloudera Data Visualization for ML
Cloudera Enterprise License
Cloudera Machine Learning Overview
Cloudera Manager
Cloudera Manager and CDH Requirements
Cloudera Manager Integration
Cloudera Manager, Hue, and the Spark History Server
Cluster Authorization
Cluster Management
Cluster Management
Cluster Management
Cluster Monitoring with Grafana
Code Autocomplete
Collaborate
Collaborating on Projects with Cloudera Data Science Workbench
Collaborating with Git
Command Line Reference
Concepts
Concepts and Terminology
Concepts and Terminology
Configuration Options
Configure a Browser IDE as an Editor
Configure a Browser IDE at the Engine Level
Configure a Browser IDE at the Engine Level
Configure a Browser IDE at the Project Level
Configure a SSH Gateway to Use Local IDEs
Configure and Use a Local IDE
Configure Apache Spark 2
Configure Apache Spark 2
Configure Apache Spark 2 on CDH 5
Configure Apache Spark 2 on CDH 5
Configure Apache Spark 2 on CDH 6 or CDP Data Center 7
Configure Apache Spark 2 on CDH 6 or CDP Data Center 7
Configure Block Devices
Configure FreeIPA
Configure Gateway Hosts Using Cloudera Manager
Configure Gateway Hosts Using Cloudera Manager
Configure JAVA_HOME
Configure JAVA_HOME
Configure Jupyter Notebook in a Customized Engine Image
Configure PyCharm as a Local IDE
Configuring CDS 2.x Powered by Apache Spark 2
Configuring Cloudera Data Science Workbench Deployments Behind a Proxy
Configuring Cloudera Data Science Workbench Engines
Configuring Cloudera Data Science Workbench Properties
Configuring Custom Root CA Certificate
Configuring Engine Environments for Experiments and Models
Configuring External Authentication with LDAP and SAML
Configuring External Termination
Configuring Global Properties Using Cloudera Manager
Configuring Hostnames to be Skipped from the Proxy
Configuring HTTP Headers for Cloudera Data Science Workbench
Configuring Internal Termination
Configuring LDAP/Active Directory Authentication
Configuring SAML Authentication
Configuring Spark Environment Variables Using Cloudera Manager
Configuring the Engine Environment
Connecting to Spark 2
Copy the DNS Nameserver to the new host
Copy the DNS Nameserver to the New Host
Copy the JDK to the new host
Copy the JDK to the new host
Copy the Kerberos Configurations
Copy the Kerberos Configurations
Crashes and stops responding
Create a Chart to Track Disk Usage on the Application Block Device
Create a Custom CUDA-capable Engine Image
Create a Dockerfile for the New Custom Image
Create a model using the CLI
Create a Project
Create a Project from a Built-in Template
Create a Trigger to Notify Cluster Administrators when Free Space Runs Low
Create HDFS User Directories
Create the Administrator Account
Create the Administrator Account
Create the Administrator Account
Create the Administrator Account
Create the Site Administrator Account
Creating a Backup for Disaster Recovery for Cloudera Data Science Workbench
Creating a Certificate Signing Request (CSR) and Key/Certificate Pair
Creating a Customized Engine Image
Creating a Customized Engine with the Required Package(s)
Creating a Job
Creating a Pipeline
Creating a Project with Legacy Engine Variants
Creating a Project with ML Runtimes Variants
Creating a Team With an Associated LDAP Group
Creating a Team Without an Associated LDAP Group
Creating an Extensible Engine with Conda
Creating and Deploying a Model (QuickStart)
Creating Diagnostic Bundles
Creating SSH Tunnels
Cross-Origin Resource Sharing (CORS)
CSD Deployments
CSD Deployments
CSD Deployments
CSD Installation on CDH
Custom Legacy Engine Images
Custom Template Projects
Customize Workload Scheduling
Customized Engine Images
Data Access
Data Collection
Data Collection in Cloudera Data Science Workbench
Data Science Workbench
Data Visualization
Debug Login URL
Debugging Issues with Experiments
Debugging Issues with Models
Delete CDSW Roles from Existing Hosts
Deleting a Model
Deleting a Project
Dependencies
Deploy a New Build for a Model
Deploy CDSW on HDP
Deploy the Model
Deployed
Deploying
Deployment Considerations
Diagnostic Bundles
Disable Untrusted SSH Access
Disable Usage Tracking
Disabling Project File Uploads and Downloads
Disabling the Experiments Feature
Disabling the Models Feature
Disabling Usage Metrics
Disabling User Accounts
Distribute the Image
Distribute the Image Manually
Distributed Computing with Workers
Distributed ML
Docker and Kubernetes
Docker and Kubernetes Support
Docker Block Device
Documenting Your Analysis
Does Cloudera Data Science Workbench support REST API access?
Download and Configure the cdswctl
Download and Install the Cloudera Data Science Workbench
Download and Install the Cloudera Data Science Workbench CSD
Download and Install the Cloudera Data Science Workbench CSD
Download cdswctl and Add an SSH Key
Download cdswctl and Add an SSH Key
Editors
Editors
Embedded Web Apps
Enable GPU Support in Cloudera Data Science Workbench
Enable HTTP Security Headers
Enable HTTP Strict Transport Security (HSTS)
Enabling Authentication
Enabling Cloudera Data Science Workbench to use GPUs
Enabling Custom Quotas for Specific Users
Enabling Default Quotas for all CDSW Users
Enabling Model Governance
Enabling Model Metrics
Enabling TLS/SSL for Cloudera Data Science Workbench
End-to-End Example: MeCab
Engine Configuration
Engine Dependencies
Engine Environment Variables
Engine Environment Variables
Engine Limitations
Engine Upgrade 1.1.0
Engine Upgrade 1.2.0
Engine Upgrade 1.2.2
Engine Upgrade 1.4.0
Engine Upgrade 1.4.2
Engine Upgrade 1.5.0
Engine Upgrade 1.6.0
Engine Upgrade 1.6.1
Engine Upgrade 1.7.1
Engines
Engines
Engines for Experiments and Models
Engines Packaging
Environment Variables
Environment Variables from Cloudera Manager
Environmental Variables
Error Encountered Trying to Load Images when Initializing Cloudera Data Science Workbench
Example: A Shiny Application
Example: Distributing Dependencies on a PySpark Cluster
Example: H2O
Example: Locating and Adding JARs to Spark 2 Configuration
Example: Montecarlo Estimation
Example: Read Files from the Cluster Local Filesystem
Example: Using External Packages by Adding Jars or Dependencies
Example: Worker Network Communications
Experiments
Experiments
Experiments
Experiments Demo
Export Session List
Extensible Engines
External Authentication - LDAP and SAML
External Communications
External Termination
FAQs
Forking Projects
Frequently Asked Questions (FAQs)
General
Generating a Model API Key
Getting Started with a New Project on Cloudera Data Science Workbench
Getting Started with Cloudera Data Science Workbench
Glossary
GPU Issues
GPU Support
GPU Support
GPUs
Grid Displays
Hadoop Authentication - Kerberos
Hadoop Authentication with Kerberos for Cloudera Data Science Workbench
Hardware Requirements
Health Tests
How do I contact Cloudera for issues regarding Cloudera Data Science Workbench?
How does Cloudera Data Science Workbench use Docker and Kubernetes?
How Login Works with LDAP Group Settings Enabled
How Login Works with SAML Group Settings Enabled
HTML Visualizations
HTTP Headers
IFrame Visualizations
Importing a Project From Git
Importing Data into Cloudera Data Science Workbench
Incompatible Changes in Cloudera Data Science Workbench 1.1.0
Incompatible Changes in Cloudera Data Science Workbench 1.2.0
Incompatible Changes in Cloudera Data Science Workbench 1.2.1
Incompatible Changes in Cloudera Data Science Workbench 1.3.0
Incompatible Changes in Cloudera Data Science Workbench 1.4.0
Incompatible Changes in Cloudera Data Science Workbench 1.5.0
Incompatible Changes in Cloudera Data Science Workbench 1.6.0
Information Collected in Diagnostic Bundles
Initialize an SSH Connection to Cloudera Data Science Workbench
Initialize an SSH Connection to Cloudera Data Science Workbench
Install Cloudera Data Science Workbench on New Master Host
Install Cloudera Data Science Workbench on the Master Host
Install Cloudera Data Science Workbench on the Master Host
Install Cloudera Data Science Workbench on the Master Host
Install the Cloudera Data Science Workbench Parcel
Install the Cloudera Data Science Workbench Parcel
Install the NVIDIA Driver on GPU Hosts
Installation
Installation
Installing a Jupyter extension
Installing a Jupyter kernel
Installing Additional ML Runtimes Packages
Installing Additional Packages
Installing Cloudera Data Science Workbench 1.9.2
Installing Cloudera Data Science Workbench 1.9.2 on HDP
Installing Cloudera Data Science Workbench 1.9.2 Using Cloudera Manager
Installing Cloudera Data Science Workbench 1.9.2 Using Cloudera Manager
Installing Cloudera Data Science Workbench 1.9.2 Using Packages
Installing Cloudera Data Science Workbench 1.9.2 Using Packages
Installing Cloudera Data Science Workbench on CDP
Installing Packages Directly Within Projects
Installing sparklyr
Internal Communications
Internal Termination
Introduction to Production Machine Learning
Issues Fixed in Cloudera Data Science Workbench 1.0.1
Issues Fixed in Cloudera Data Science Workbench 1.1.0
Issues Fixed In Cloudera Data Science Workbench 1.1.1
Issues Fixed in Cloudera Data Science Workbench 1.2.0
Issues Fixed In Cloudera Data Science Workbench 1.2.1
Issues Fixed In Cloudera Data Science Workbench 1.2.2
Issues Fixed in Cloudera Data Science Workbench 1.3.0
Issues Fixed in Cloudera Data Science Workbench 1.3.1
Issues Fixed in Cloudera Data Science Workbench 1.4.0
Issues Fixed in Cloudera Data Science Workbench 1.4.2
Issues Fixed in Cloudera Data Science Workbench 1.4.3
Issues Fixed in Cloudera Data Science Workbench 1.5.0
Issues Fixed in Cloudera Data Science Workbench 1.6.0
Issues Fixed in Cloudera Data Science Workbench 1.6.1
Issues Fixed in Cloudera Data Science Workbench 1.7.1
Issues Fixed in Cloudera Data Science Workbench 1.7.2
Issues Fixed in Cloudera Data Science Workbench 1.8.0
Issues Fixed in Cloudera Data Science Workbench 1.8.1
Issues Fixed in Cloudera Data Science Workbench 1.9.0
Issues Fixed in Cloudera Data Science Workbench 1.9.1
Issues Fixed in Cloudera Data Science Workbench 1.9.2
Java Requirements
JDK Requirements
Jobs
Jobs and Pipelines
Jupyter Magic Commands
Key Differences - CML vs. CDSW
Key Points to Note
Known Issues and Limitations
Known Issues and Limitations in Cloudera Data Science Workbench 1.0.x
Known Issues and Limitations in Cloudera Data Science Workbench 1.1.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.1.1
Known Issues and Limitations in Cloudera Data Science Workbench 1.2.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.2.1
Known Issues and Limitations in Cloudera Data Science Workbench 1.2.2
Known Issues and Limitations in Cloudera Data Science Workbench 1.3.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.3.1
Known Issues and Limitations in Cloudera Data Science Workbench 1.4.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.4.2
Known Issues and Limitations in Cloudera Data Science Workbench 1.4.3
Known Issues and Limitations in Cloudera Data Science Workbench 1.5.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.6.0
Known Issues and Limitations in Cloudera Data Science Workbench 1.9.2
Labeling Auxiliary Hosts for CSD Deployments
Labeling Auxiliary Hosts for RPM Deployments
Launch a Session to Run the Project
Launch Workers
LDAP General Settings
LDAP Group Settings
LDAP Over SSL (LDAPS)
Limitations
Limitations
Limitations with Port Availability
Linking an Existing Project to a Git Remote
List Workers
Load Data into HBase Table
Loading CSV Data into an Impala Table
Log in to cdswctl
Managing CDSW
Managing Cloudera Data Science Workbench Hosts
Managing Cloudera Data Science Workbench Users
Managing Cloudera Data Science Workbench Worker Hosts
Managing Dependencies for Spark 2 Jobs
Managing Dependencies for Spark 2 Projects
Managing Engine Dependencies
Managing Engine Images
Managing Engines
Managing Files
Managing Hosts
Managing Jobs and Pipelines in Cloudera Data Science Workbench
Managing License Keys for Cloudera Data Science Workbench
Managing Memory Available for Spark Drivers
Managing ML Runtimes
Managing Model API Keys
Managing Projects in Cloudera Data Science Workbench
Managing Resource Profiles
Managing Resource Profiles
Managing Team Accounts
Managing the Cloudera Data Science Workbench Service in Cloudera Manager
Managing Users as a Site Administrator
Managing your Personal Account
Master Host
Migrating a CSD Deployment
Migrating a Deployment to a New Set of Hosts
Migrating a Deployment to New Hosts
Migrating an RPM Deployment
Migrating from an RPM-based Deployment to the Latest 1.9.2 CSD
ML Runtimes 2020.11
ML Runtimes 2020.11
ML Runtimes 2021.02
ML Runtimes 2021.02
ML Runtimes 2021.04
ML Runtimes 2021.04
ML Runtimes 2021.06
ML Runtimes 2021.06
ML Runtimes 2021.09
ML Runtimes 2021.09
ML Runtimes 2021.12
ML Runtimes 2021.12
ML Runtimes 2022.04
ML Runtimes 2022.04
ML Runtimes Environment Variables
ML Runtimes Environment Variables List
ML Runtimes Known Issues and Limitations
ML Runtimes Nvidia GPU Edition
ML Runtimes NVIDIA RAPIDS Edition
ML Runtimes Pre-installed Packages
ML Runtimes Release Notes
ML Runtimes Version 2020.04
ML Runtimes Version 2020.11
ML Runtimes Version 2021.02
ML Runtimes Version 2021.06
ML Runtimes Version 2021.09
ML Runtimes Version 2021.09.02
ML Runtimes versus Legacy Engines
ML Runtimes versus the Legacy Engine
ML Runtimes What's New
Model Artifacts
Model Code
Model Governance
Model Monitoring and Administration
Model Training and Deployment - Iris Dataset
Models
Models
Models
Models and Experiments
Models Demo
Modifying Default and Custom Quotas
Modifying Project Settings
Modifying Team Account Settings
Monitoring
Monitoring All Active Models
Monitoring Cloudera Data Science Workbench Usage
Monitoring Individual Models
Monitoring User Events
Monitoring Users
Monitoring/Reducing CDSW disk space
Mounting Additional Dependencies from the Host
Move Backup to New Master
Move Backup to the New Master
Multiple Cloudera Data Science Workbench Deployments
Network and Security Requirements
Networking
Networking and Security Requirements
New Features and Changes in Cloudera Data Science Workbench 1.1.0
New Features and Changes in Cloudera Data Science Workbench 1.2.0
New Features and Changes in Cloudera Data Science Workbench 1.2.2
New Features and Changes in Cloudera Data Science Workbench 1.3.0
New Features and Changes in Cloudera Data Science Workbench 1.4.2
New Features and Changes in Cloudera Data Science Workbench 1.4.3
New Features and Changes in Cloudera Data Science Workbench 1.5.0
New Features and Changes in Cloudera Data Science Workbench 1.6.0
New Features and Changes in Cloudera Data Science Workbench 1.6.1
New Features and Changes in Cloudera Data Science Workbench 1.7.1
New Features and Changes in Cloudera Data Science Workbench 1.7.2
New Features and Changes in Cloudera Data Science Workbench 1.8.0
New Features and Changes in Cloudera Data Science Workbench 1.8.1
New Features and Changes in Cloudera Data Science Workbench 1.9.0
New Features and Changes in Cloudera Data Science Workbench 1.9.1
New Features and Changes in Cloudera Data Science Workbench 1.9.2
New Features in Cloudera Data Science Workbench 1.1.1
New Features in Cloudera Data Science Workbench 1.3.1
New Features in Cloudera Data Science Workbench 1.4.0
Next Steps
Next Steps
Next Steps
Next Steps
Next Steps
Older Releases
Operating System Requirements
Operating System Requirements
Overview of Deploying CDSW1.9.2 on HDP Overview
Package Libraries
Parallel Computing
Personal Key
Platform
Platform Requirements
Ports Required by Cloudera Data Science Workbench
Ports Used By Cloudera Data Science Workbench
Pre-Installed Packages in ML Runtimes
Pre-Installed Packages in ML Runtimes
Prepare to manage models using the model CLI
Prerequisites
Prerequisites
Prerequisites
Prerequisites
Prerequisites
Private Key and Certificate Requirements
Product Overview
Project Admins: Enable the CUDA Engine for your Project
Project Code Files
Project Collaborators
Project Environments
Projects
Projects
Proxy Configuration
Purpose
Push the Image to a Public Registry such as DockerHub
Push the Image to Your Company's Docker Registry
Pushing
Python
Python
Python 2 Libraries in Base Engine 2
Python 3.6 Libraries for JupyterLab
Python 3.6 Libraries for JupyterLab
Python 3.6 Libraries for JupyterLab
Python 3.6 Libraries for JupyterLab
Python 3.6 Libraries for Workbench
Python 3.6 Libraries for Workbench
Python 3.6 Libraries for Workbench
Python 3.6 Libraries for Workbench
Python 3.6.12 Libraries for JupyterLab
Python 3.6.12 Libraries for JupyterLab
Python 3.6.12 Libraries for Workbench
Python 3.6.12 Libraries for Workbench
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for JupyterLab
Python 3.7 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.7 Libraries for Workbench
Python 3.7.9 Libraries for JupyterLab
Python 3.7.9 Libraries for JupyterLab
Python 3.7.9 Libraries for Workbench
Python 3.7.9 Libraries for Workbench
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for JupyterLab
Python 3.8 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.8 Libraries for Workbench
Python 3.8.6 Libraries for JupyterLab
Python 3.8.6 Libraries for JupyterLab
Python 3.8.6 Libraries for Workbench
Python 3.8.6 Libraries for Workbench
Python 3.9 Libraries for Workbench
Python 3.9 Libraries for Workbench
Python 3.9 Libraries for Workbench
Python 3.9 Libraries for Workbench
Python 3.9 Libraries for Workbench
Python 3.9 Libraries for Workbench
Python 3.9.6 Libraries for JupyterLab
Python 3.9.6 Libraries for JupyterLab
Python 3.9.6 Libraries for JupyterLab
Python 3.9.6 Libraries for JupyterLab
Python 3.9.6 Libraries for JupyterLab
Python 3.9.6 Libraries for JupyterLab
Python Libraries in Base Engine 10
Python Libraries in Base Engine 13
Python Libraries in Base Engine 3
Python Libraries in Base Engine 4
Python Libraries in Base Engine 5
Python Libraries in Base Engine 6
Python Libraries in Base Engine 7
Python Libraries in Base Engine 8
Python Supported Versions
Python Supported Versions
Query Data Using HappyBase
Quickstart
Quickstart Demo
Quotas
R
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 3.6 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.0 Libraries
R 4.1 Libraries
R 4.1 Libraries
R 4.1 Libraries
R 4.1 Libraries
R Libraries in Base Engine 10
R Libraries in Base Engine 13
R Libraries in Base Engine 2
R Libraries in Base Engine 3
R Libraries in Base Engine 4
R Libraries in Base Engine 5
R Libraries in Base Engine 6
R Libraries in Base Engine 7
R Libraries in Base Engine 8
RAPIDS Runtime PIP Python 3.7.8 Libraries for JupyterLab
RAPIDS Runtime PIP Python 3.7.8 Libraries for JupyterLab
RAPIDS Runtime PIP Python 3.7.8 Libraries for Workbench
RAPIDS Runtime PIP Python 3.7.8 Libraries for Workbench
RAPIDS Runtime PIP Python 3.8.6 Libraries for JupyterLab
RAPIDS Runtime PIP Python 3.8.6 Libraries for JupyterLab
RAPIDS Runtime PIP Python 3.8.6 Libraries for Workbench
RAPIDS Runtime PIP Python 3.8.6 Libraries for Workbench
Re-deploy an Existing Build
Recommended Configuration on Amazon Web Services (AWS)
Recommended Configuration on Microsoft Azure
Recommended Hardware Configuration
Registering training data lineage using a linking file
Related Resources
Related Resources
Release Notes
Remove Cloudera Data Science Workbench from Existing Hosts
Removing a Worker Host Using Cloudera Manager
Removing a Worker Host Using Packages
Required Pre-Installation Steps
Requirements and Supported Platforms
Reserving the Master Host for CSD Deployments
Reserving the Master Host for Internal CDSW Components
Reserving the Master Host for RPM Deployments
Resouce Consumption and Scaling
Restart a Model
Restricting Collaborator and Administrator Access to Active Sessions
Restricting User-Controlled Kubernetes Pods
Roles Associated with the Cloudera Data Science Workbench Service
Rollback Cloudera Data Science Workbench to an Older Version
RPM Deployments
RPM Deployments
RPM Deployments
RPM Installation on CDH
Run Code
Run Experiment / Deploy Model
Run the Prepare Node command on the New Hosts
Running an Experiment (Quick Start)
Running Distributed ML Workloads on YARN
Running Queries on Impala Tables
Running Spark Jobs on an HDP Cluster
Runtimes
Runtimes
Sample Job Run
Saved Images
Saving Files
Scala
Scala 2.11 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Scala 2.11 Libraries for Workbench
Scala in Base Engine 10
Scala in Base Engine 2
Scala in Base Engine 3
Scala in Base Engine 4
Scala in Base Engine 5
Scala in Base Engine 6
Scala in Base Engine 7
Scala in Base Engine 8
Securing Models using Model API Key
Security
Security
Security
Security Considerations
Security Model
Set Up a Wildcard DNS Subdomain
Set Up the Operating System and Kernel
Setting Environmental Variables
Setting Up a PySpark Project
Setting Up an HTTP Proxy for Spark 2
Sharing Data Visualizations
Sharing Job and Session Console Outputs
Sharing Personal Projects
Sign up
Simple Plots
Site Administration
Site Admins: Add the Custom CUDA Engine to your Cloudera Data Science Workbench Deployment
Snapshot Code
Spark 2 Web UIs (CDSW_SPARK_PORT)
Spark Configuration
Spark Configuration Files
Spark Logging Configuration
Spark on ML Runtimes
SSH Keys
SSH Keys
Start a New Session
Start the CDSW Service
Starting a Job Run Using Python
Starting a Job Run Using the API
Starting, Stopping, and Restarting the Service
Stop a Model
Stop a Session
Stop Cloudera Data Science Workbench
Stop the CDSW Service
Stop Workers
Supporing a TLS-Enabled Proxy Server
Supported Browsers
Supported Platforms and Requirements
Team Key
Teams
TensorBoard, Shiny, and others (CDSW_APP_PORT or CDSW_READONLY_PORT)
Test a Browser IDE in a Session Before Installation
Test LDAP Configuration
Test the CUDA Engine
Test whether Cloudera Data Science Workbench can Detect GPUs
Testing Applications Before You Deploy
Testing ML Runtime GPU Setup
Testing ML Runtime GPU Setup
Third-party Editors
TLS and SSL Encryption
Tracked User Events
Tracking Disk Usage on the Application Block Device
Tracking Metrics
Tracking Model Metrics
Train the Model
Trial License
Troubleshooting
Troubleshooting Cloudera Data Science Workbench
Troubleshooting Issues with Models and Experiments
Troubleshooting Issues with Workloads
Troubleshooting Kerberos Errors
Troubleshooting TLS/SSL Errors
UI Behavior for Non-Kerberized Clusters
Understanding Installation Warnings
Uninstalling CPM Deployments
Uninstalling RPM Deployments
Update DNS Records for the New Master
Update DNS Records for the New Master
Updating Active Models
Upgrade
Upgrades
Upgrading a CDSW 1.9.2 Deployment from HDP 2 to HDP 3
Upgrading Cloudera Data Science Workbench 1.7.2 or higher from CDH 6 to CSP Private Cloud Base 7.x
Upgrading Cloudera Data Science Workbench 1.9.2 Using Cloudera Manager
Upgrading Cloudera Data Science Workbench Using Packages
Upgrading CSD Deployments from CDH 5 to CDH 6
Upgrading R and Python Packages
Upgrading RPM Deployments from CDH 5 to CDH 6
Upgrading to Cloudera Data Science Workbench 1.9.2 on HDP
Upgrading to the Latest Version of Cloudera Data Science Workbench on CDH
Uploading License Keys
Usability
Usage Guidelines
Usage Tracking
User Access to Features
User Contexts
User Guide
User Role Authorization
User Roles
User Sign up Process
User-Controlled Kubernetes Pod
Using CDS 2.x Powered by Apache Spark
Using Cloudera Manager
Using Conda with Cloudera Data Science Workbench
Using Editors for ML Runtimes
Using Git to Collaborate on Projects
Using GPUs with Legacy Engines-Technical Preview
Using Jupyter with ML Runtimes
Using NVIDIA GPUs for Cloudera Data Science Workbench Projects
Using Spark 2 from Python
Using Spark 2 from R
Using Spark 2 from Scala
Using the Command Line
Using the Workbench
Videos
View replica logs for a model using the CLI
Viewing Job History
Viewing Licensed Users
Viewing lineage for a model deployment in Atlas
Visualize Report
Web Applications Embedded in Cloudera Data Science Workbench
Web session timeouts
Web UI
What are the software and hardware requirements for Cloudera Data Science Workbench?
What's New
Where can I get a sample project to try out Cloudera Data Science Workbench?
Whitelist the Image in Cloudera Data Science Workbench
Wildcard DNS Subdomain Requirement
Wire Encryption
Workbench
Worker Hosts
Workers API
«
Filter topics
Stop a Session
▼
Using the Workbench
Start a New Session
▶︎
Run Code
Code Autocomplete
Project Code Files
Access the Terminal
Stop a Session
▶︎
Jupyter Magic Commands
Python
Scala
»
Workbench
Stop a Session
When you are done with a session, you can stop it.
Click Stop in the menu bar above the console.
Alternatively, you can stop a session by typing the following command:
R: quit()
Python: exit
Scala: quit()
Sessions automatically stop after an hour of inactivity.
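The one-hour inactivity rule can be pictured with a minimal Python sketch. It is only an illustration of an idle-timeout check, not how Cloudera Data Science Workbench implements it. The names `IDLE_LIMIT` and `is_idle_expired` are hypothetical, and how the product measures activity is not described here.

```python
from datetime import datetime, timedelta

# Assumption: the one-hour limit is taken from the text above. The names in
# this sketch are hypothetical and are not part of any CDSW API.
IDLE_LIMIT = timedelta(hours=1)


def is_idle_expired(last_activity: datetime, now: datetime,
                    limit: timedelta = IDLE_LIMIT) -> bool:
    """Return True when at least `limit` has elapsed since the last activity."""
    return now - last_activity >= limit


if __name__ == "__main__":
    last = datetime(2024, 1, 1, 9, 0)
    # 59 minutes after the last activity: still within the limit.
    print(is_idle_expired(last, datetime(2024, 1, 1, 9, 59)))
    # Exactly one hour after the last activity: the limit is reached.
    print(is_idle_expired(last, datetime(2024, 1, 1, 10, 0)))
```

In practice this means unsaved work in a session left untouched for an hour may be lost when the session stops, so it can be safer to save files before stepping away.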