Provisioning Cloudera AI Workbenches

This topic describes how to provision Cloudera AI Workbenches.

The first user to access the Cloudera AI Workbench after it is created must have both the MLAdmin role and the EnvironmentAdmin account role assigned. See Configuring User Access to Cloudera AI and Understanding account roles and resource roles for information about this resource role.

Log in to the Cloudera AI web interface.

On on cloud, log in to https://console.cdp.cloudera.com using your corporate credentials or any other credentials that you received from your Cloudera administrator.
Click Cloudera AI Workbenches.
Click Provision Workbench.
Fill out the following fields.
- Workbench Name - Give the Cloudera AI Workbench a name. For example, user1_dev. Do not use capital letters in the workbench name.
- Select Environment - From the dropdown, select the environment where the Cloudera AI Workbenches must be provisioned. If you do not have any environments available to you in the dropdown, contact your Cloudera administrator to gain access.
  note
  You cannot choose an environment when the Environment or associated DataLake and FreeIPA is not in an available or running state.
- Existing NFS - (Azure only) Enter the mount path from the environment creation procedure.
- NFS Protocol version - (Azure only) Specify the protocol version to use when communicating with the existing NFS server.
Switch the toggle to display Advanced Settings.
1. CPU Settings - From the dropdown, select the following:
  - Instance Type: You must select an instance type that is supported by Cloudera AI, or the associated validation check will fail (See Other Settings, below).
  - Autoscale Range
  - Root Volume Size: If necessary, you can also change the default size of the root volume disk for the nodes in the group.
2. GPU Settings - Click the GPU Instances toggle to enable GPUs for the cluster, and set the following:
  - Instance Type: You must select an instance type that is supported by Cloudera AI, or the associated validation check will fail (See Other Settings, below).
  - Autoscale Range
  - Root Volume Size: If necessary, you can also change the default size of the root volume disk for the nodes in the group.
  note
  In addition to the CPU and GPU instances selected here, Cloudera AI also provisions two extra CPU instance groups to run infrastructure pods for the Cloudera AI Workbenches, as follows:
  AWS:
  
  Cloudera AI infrastructure node group: m5.2xlarge, with an autoscale range of 2 to 3.
  
  Platform infrastructure node group: m5.large, with an autoscale range of 2 to 4.
  
  Azure:
  
  Cloudera AI infrastructure node group: Standard_D3s_v2, with an autoscale range of 2 to 3.
  
  Platform infrastructure node group: Standard_D2s_v3, with an autoscale range of 2 to 4.
  
  These are not configurable by users.
3. Kubernetes Config - Upload or directly enter the Kubernetes config information.
4. Network Settings
  - Subnets for Worker Nodes: (AWS only) Optionally select one or more subnets to use for Kubernetes worker nodes.
  - Subnets for Load Balancer: Optionally select one or more subnets to use for the Load Balancer.
  - Load Balancer Source Ranges: (Azure only) Enter a CIDR range of IP addresses allowed to access the cluster.
    - If the Cloudera AI Workbench is provisioned with public access, enter the allowed public IP address range.
    - If the Cloudera AI Workbench is provisioned with private access, enter the allowed private IP address range.
    note
    When you change the Load Balancer Source Range setting, the changes are propagated to both the deployed Load Balancer in EKS and the underlying Security Group (SG).
  - Enable Fully Private Cluster: This Preview Feature provides a simple way to create a secure cluster. Only available in AWS environments in Cloudera.
  - Enable Public IP Address for Load Balancer
    (AWS only) You can create a load balancer with a public IP address for the private cluster. This is useful in cases where there is no VPN between the Cloudera AI VPC and the customer network. In this case, the connection is over the internet.
    note
    In this network configuration, to use kubectl commands in the private cluster, you need to execute the commands in a network that is peered with the cluster VPN. Enabling the public IP address for the load balancer is not sufficient to allow kubectl commands to work.
  - Restrict access to Kubernetes API server to authorized IP ranges
    You can specify a range of IP addresses in CIDR format that are allowed to access the Kubernetes API server. By default, the Kubernetes API services of Cloudera AI Workbenches are accessible to all public IP addresses (0.0.0.0/0) that have proper credentials.
    To specify an address to authorize, enter an address in CIDR format (for example, 1.0.0.0/0) in API Server Authorized IP Ranges, and click the plus (+) icon. In this case, the API server is accessible by the user-provided address as well as control-plane-exit-ips over the public internet.
    If the feature is enabled and no IP authorized addresses are specified, then the Kubernetes API server is only accessible by control-plane-exit-ips from the public internet.
    
    note
    Both the Amazon EKS and Azure AKS have a quota or upper limit for the maximum number of public endpoint access CIDR ranges per cluster. See the Amazon EKS service quotas or the Azure AKS documentation for more details. When the feature is enabled, the Cloudera Control Plane exit IP addresses will be automatically added to the authorized IP ranges for accessing the Kubernetes API server for Cloudera operations, which will use three CIDR blocks against the per-cluster limit.
  - Use hostname for a non-transparent proxy
    Enter a CIDR range allowed for non-transparent proxy server access to the cluster.
5. Production Cloudera AI
  - Enable Governance - Must be enabled to capture and view information about your Cloudera AI projects, models, and builds from Apache Atlas for a given environment. If you do not select this option, then integration with Atlas will not work.
  - Enable Model Metrics - When enabled, stores metrics in a scalable metrics store, enables you to track individual model predictions, and also track and analyze metrics using custom code.
6. Other Settings
  - Enable TLS - Select this checkbox if you want the workbench to use HTTPS for web communication.
  - Enable public Internet access - When enabled, the Cloudera AI Workbench will be available on the public Internet. When disabled, it is assumed that connectivity is achieved through a corporate VPC.
  - Enable Monitoring - Administrators (users with the MLAdmin role) can use a Grafana dashboard to monitor resource usage in the provisioned workbench.
  - Skip Validation - If selected, validation checks are not performed before a workbench is provisioned. Select this only if validation checks are failing incorrectly.
  - Tags - Tags added to cloud infrastructure, compute, and storage resources associated with this Cloudera AI Workbench.
    Note that these tags are propagated to your cloud service provider account. See Related information for links to AWS and Azure tagging strategies.
  - Cloudera AI Static Subdomain - This is a custom name for the workbench endpoint, and it is also used for the URLs of models, applications, and experiments. You can create or restore a workbench to this same endpoint name, so that external references to the workbench do not have to be changed. Only one workbench with the specific subdomain endpoint name can be running at a time.
    note
    The endpoint name can have a maximum of 15 characters, using alphanumerics and hyphen or underscore only, and must start and end with an alphanumeric character.
  - Maximize IOPS and throughput of the root volumes - (AWS only) If selected, the root volumes attached to the worker nodes in AWS will have its IOPS and throughput set to the maximum values, that is, 16000 IOPS and 1000 MIB/s throughput. This will incur additional charges from AWS.
Click Provision Workbench.

It can take up to an hour for an Cloudera AI Workbench to be provisioned and installed. Once the status changes to show that the workbench has been successfully provisioned, click on the workbench name to go to the web application.

Note that the domain name for the provisioned workbench is randomly generated and cannot be changed.

Grant users access to this Cloudera AI Workbench using the instructions at Configuring User access to Cloudera AI.

Provisioning Cloudera AI Workbenches

We want your opinion

How can we improve this page?