Cloudera AI 1.5.4
(Optional) Using VS Code with Git integration
VS Code provides substantial built-in Git integration. If you created your project from a Git repository or from a custom template, both your own changes and changes made to the repository outside of VS Code appear automatically in the editor.
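For example, the following is a minimal sketch of confirming that the project is linked to a remote and pulling in outside changes from a session attached to the project; the remote name origin and branch main are assumptions, so adjust them to match your repository.

```python
# Minimal sketch: check the project's Git remote and pull outside changes
# so they show up in VS Code's Source Control view.
# Assumptions: the remote is named "origin" and the branch is "main".
import subprocess

def git(*args):
    """Run a git command in the current project directory and return its output."""
    result = subprocess.run(["git", *args], capture_output=True, text=True, check=True)
    return result.stdout.strip()

print(git("remote", "-v"))           # confirm the project is linked to a Git remote
git("fetch", "origin")               # download outside changes without merging them
print(git("status"))                 # compare the local branch against the fetched remote
print(git("pull", "origin", "main")) # merge outside changes so they appear locally
```

You can run the same commands directly in the VS Code integrated terminal once it is connected to your Cloudera AI session.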
Parent topic: Configure VS Code as a local IDE