Deploying a Cloudera Machine Learning Workspace with support for TLS

You can provision a Cloudera Machine Learning Workspace with TLS enabled on both Cloudera Embedded Container Service and OpenShift Container Platform (OCP), so that it can be accessed over HTTPS.

You need to obtain a certificate from the Certificate Authority used by your organization. This may be an internal certificate authority. Additionally, you need a computer with CLI access to the cluster, and with kubectl installed.

The workspace subdomain is either the static subdomain that the user chooses or a workspace endpoint name that the deployment generates automatically. The app_domain value is defined during the Data Services deployment.

A workspace URL has the following format: https://[***WORKSPACE-SUBDOMAIN***].APPS.[***APP_DOMAIN***].com.

Workloads created in a Cloudera Machine Learning Workspace are containers provisioned in Kubernetes and must be addressable by the user. To do this, Cloudera Machine Learning creates a unique subdomain for each workload.

The URL for the workload is structured as: https://[***WORKLOAD-ENDPOINTS***].[***WORKSPACE-SUBDOMAIN***].APPS.[***APP_DOMAIN***].com.

Because the workload endpoints are randomly generated, the TLS certificate for a Cloudera Machine Learning Workspace needs a wildcard SAN entry in addition to a SAN entry for the workspace subdomain itself:

Wildcard SAN entry: SAN:*.[***WORKSPACE-SUBDOMAIN***].APPS.[***APP_DOMAIN***].com.

Workspace subdomain SAN: [***WORKSPACE-SUBDOMAIN***].APPS.[***APP_DOMAIN***].com.
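
The two SAN entries can be derived mechanically, and a short sketch also shows why both are needed: a wildcard entry stands for exactly one DNS label, so it covers the workload endpoints but not the workspace hostname itself. The helper names workspace_sans and san_covers and the workload label abc123 are hypothetical. The sketch assumes that app_domain already holds the full domain, as in the examples in this topic (for example, cdp-lb.mycompany.com).

```shell
#!/bin/sh
# workspace_sans and san_covers are hypothetical helpers for illustration.

# workspace_sans SUBDOMAIN APP_DOMAIN: print the required SAN entries.
workspace_sans() {
    base="$1.apps.$2"
    printf '%s\n' "$base"      # workspace subdomain SAN
    printf '%s\n' "*.$base"    # wildcard SAN for the workload endpoints
}

# san_covers SAN HOST: succeed if the SAN entry matches the hostname.
# A leading "*." stands for exactly one DNS label; the comparison is
# case-sensitive to keep the sketch short.
san_covers() {
    san=$1
    host=$2
    case "$san" in
        '*.'*)
            suffix=${san#\*}             # SAN without the leading "*"
            label=${host%"$suffix"}      # part the wildcard has to match
            [ "$label" != "$host" ] || return 1      # suffix did not match
            [ -n "$label" ] || return 1              # label must not be empty
            case "$label" in *.*) return 1 ;; esac   # more than one label
            return 0 ;;
        *)
            [ "$san" = "$host" ] ;;
    esac
}

ws="cmlstatic.apps.cdp-lb.mycompany.com"
workspace_sans cmlstatic cdp-lb.mycompany.com
san_covers "*.$ws" "abc123.$ws" && echo "wildcard covers a workload endpoint"
san_covers "*.$ws" "$ws" || echo "wildcard does not cover the workspace host itself"
```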

See the following example for creating a Cloudera Machine Learning Workspace with a static subdomain in a Cloudera Embedded Container Service environment:

  • the user's domain is: mycompany.com (user-provided)
  • a non-HA deployment's master's hostname is: ecsmst01 (inherits hostname)
  • the user's control plane deployment is: cdp-dev (user-provided)
  • the user's load-balanced endpoint for the control plane deployment is: cdp-lb (user-provided)
  • the apps subdomain is hard-coded as: apps (hardcoded)
  • the Cloudera Machine Learning Workspace ID is generated as: ml-1234abc-123 (auto-generated)
  • the Cloudera Machine Learning static subdomain is set as: cmlstatic (user-provided)

With the above details, consider the following examples:

Table 1. Cloudera Machine Learning Workspace environment examples

Network topology | Domain set | Example
Control Plane | High Availability (HA) | app_domain = cdp-lb.mycompany.com; applications: *.apps.cdp-lb.mycompany.com
Control Plane | Non-HA, with custom deployment domain set | app_domain = cdp-dev.mycompany.com; applications: *.apps.cdp-dev.mycompany.com
Control Plane | Non-HA, with no custom deployment domain set | app_domain = ecsmst01.mycompany.com; applications: *.apps.ecsmst01.mycompany.com
Cloudera Machine Learning Workspace without static subdomain | High Availability (HA) | [*.]ml-1234abc-123.apps.cdp-lb.mycompany.com
Cloudera Machine Learning Workspace without static subdomain | Non-HA with user's domain (custom Cloudera Embedded Container Service domain) | [*.]ml-1234abc-123.apps.cdp-dev.mycompany.com
Cloudera Machine Learning Workspace without static subdomain | Non-HA without user's domain (no custom Cloudera Embedded Container Service domain) | [*.]ml-1234abc-123.apps.ecsmst01.mycompany.com
Cloudera Machine Learning Workspace with static subdomain | High Availability (HA) | [*.]cmlstatic.apps.cdp-lb.mycompany.com
Cloudera Machine Learning Workspace with static subdomain | Non-HA with user's domain (custom Cloudera Embedded Container Service domain) | [*.]cmlstatic.apps.cdp-dev.mycompany.com
Cloudera Machine Learning Workspace with static subdomain | Non-HA without user's domain (no custom Cloudera Embedded Container Service domain) | [*.]cmlstatic.apps.ecsmst01.mycompany.com
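
The hostnames in Table 1 follow one pattern: the network topology selects the app_domain, and the workspace subdomain is placed in front of apps. The following sketch rebuilds the example hostnames from the sample values listed before the table. The function names and the topology keywords (ha, non-ha-custom, non-ha) are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Hypothetical helpers that rebuild the example hostnames from Table 1.
# The values below are the sample values listed before the table.
DOMAIN="mycompany.com"     # the user's domain
LB_ENDPOINT="cdp-lb"       # load-balanced endpoint (HA)
DEPLOYMENT="cdp-dev"       # custom control plane deployment name (non-HA)
MASTER_HOST="ecsmst01"     # master hostname (non-HA, no custom domain)

# app_domain TOPOLOGY: print the app_domain for ha | non-ha-custom | non-ha.
app_domain() {
    case "$1" in
        ha)            printf '%s.%s\n' "$LB_ENDPOINT" "$DOMAIN" ;;
        non-ha-custom) printf '%s.%s\n' "$DEPLOYMENT" "$DOMAIN" ;;
        non-ha)        printf '%s.%s\n' "$MASTER_HOST" "$DOMAIN" ;;
        *)             echo "unknown topology: $1" >&2; return 1 ;;
    esac
}

# workspace_host SUBDOMAIN TOPOLOGY: print the workspace hostname.
workspace_host() {
    domain=$(app_domain "$2") || return 1
    printf '%s.apps.%s\n' "$1" "$domain"
}

for topology in ha non-ha-custom non-ha; do
    workspace_host ml-1234abc-123 "$topology"   # generated workspace ID
    workspace_host cmlstatic "$topology"        # static subdomain
done
```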

By using unique subdomains, the Cloudera Machine Learning Workspace is able to securely serve each interactive workload with proper isolation and protect it from code injection attacks such as Cross Site Scripting.

  1. Provision the Cloudera Machine Learning Workspace.

    Follow the procedure in Provisioning a Cloudera Machine Learning Workspace.

  2. Obtain the .crt and .key files for the certificate from your Certificate Authority.
    The certificate hostname has the following format: [***WORKSPACE-SUBDOMAIN***].APPS.[***APP_DOMAIN***].com

    Example

    An example hostname for the certificate is: cml.apps.cdp.mycompany.com.

    Check that the certificate shows the corresponding Common Name (CN) and Subject Alternative Names (SAN) correctly:

    • CN: cml.apps.cdp.mycompany.com
    • SAN: *.cml.apps.cdp.mycompany.com
    • SAN: cml.apps.cdp.mycompany.com
  3. Create a Kubernetes secret inside the previously provisioned Cloudera Machine Learning Workspace namespace.
    To have the certificate uploaded automatically, log in to the Cloudera Embedded Container Service and complete these steps:
    1. cd /opt/cloudera/parcels/ECS/bin/
    2. ./cml_utils.sh -h

      Optional: This command displays help text that explains the next command.

    3. ./cml_utils.sh upload-cert -n [***NAMESPACE***] -c <path_to_cert> -k <path_to_key>

      For example: ./cml_utils.sh upload-cert -n bb-tls-1 -c /tmp/ws-cert.crt -k /tmp/ws-key.key

    Alternatively, to create the secret manually:
    1. Name the secret cml-tls-secret.
    2. Run this command on a machine with access to the .crt and .key files, and access to the cluster: kubectl create secret tls cml-tls-secret --cert=<pathtocrt.crt> --key=<pathtokey.key> -o yaml --dry-run=client | kubectl -n [***CML WORKSPACE-NAMESPACE***] create -f -

      You can replace or update certificates in the secret at any time.

  4. In Site Administration > Security > Root CA configuration, add the root CA certificate to the workspace.
    For example, for the workspace at https://cml.apps.cdp.mycompany.com.
The procedure creates routes to reflect the new state of ingress and secret, and enables TLS.
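
Before creating the secret in step 3, it can help to confirm that the certificate carries both SAN entries described in step 2. The following is a minimal sketch rather than a Cloudera-provided tool. The function names has_san and check_cert_sans are hypothetical, and the sketch assumes OpenSSL 1.1.1 or later for the -ext option.

```shell
#!/bin/sh
# has_san and check_cert_sans are hypothetical helpers, not Cloudera tools.

# has_san NAME: succeed if "DNS:NAME" is listed in the openssl
# subjectAltName text read from standard input.
has_san() {
    tr ', ' '\n\n' | grep -Fqx "DNS:$1"
}

# check_cert_sans CERT_FILE WORKSPACE_HOST: check both required SAN entries.
# Assumes OpenSSL 1.1.1 or later, which provides the -ext option.
check_cert_sans() {
    sans=$(openssl x509 -in "$1" -noout -ext subjectAltName) || return 1
    for name in "$2" "*.$2"; do
        if ! printf '%s\n' "$sans" | has_san "$name"; then
            echo "missing SAN: $name" >&2
            return 1
        fi
    done
    echo "certificate lists both SAN entries for $2"
}

# Example invocation (file path and hostname taken from the examples above):
#   check_cert_sans /tmp/ws-cert.crt cml.apps.cdp.mycompany.com
#
# After step 3, the certificate stored in the secret can be inspected in a
# similar way, assuming kubectl access to the workspace namespace:
#   kubectl -n NAMESPACE get secret cml-tls-secret \
#     -o jsonpath='{.data.tls\.crt}' | base64 -d | openssl x509 -noout -subject -enddate
```

The SAN text is split on commas and spaces and compared as whole lines, so an entry such as *.cml.apps.cdp.mycompany.com does not satisfy a check for cml.apps.cdp.mycompany.com.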