Enabling TLS/SSL for Cloudera Data Science Workbench

Cloudera Data Science Workbench uses HTTP and WebSockets (WS) to support interactive connections to the Cloudera Data Science Workbench web application. However, these connections are not secure by default. This topic describes how you can use TLS/SSL to secure connections between your browser and the Cloudera Data Science Workbench web application.

Transport Layer Security (TLS) is an industry standard set of cryptographic protocols for securing communications over a network. TLS evolved from Secure Sockets Layer (SSL, which remains part of the name for historical reasons). TLS/SSL provides privacy and data integrity between applications communicating over a network by encrypting the packets transmitted between endpoints.

You can use TLS/SSL to enforce secure encrypted connections, using HTTPS and WSS (WebSockets over TLS), to the Cloudera Data Science Workbench web application. Specifically, Cloudera Data Science Workbench can be configured to use a TLS termination proxy to handle incoming connection requests. The termination proxy server decrypts incoming connection requests and forwards them to the Cloudera Data Science Workbench web application.

A TLS termination proxy can be internal or external. An internal termination proxy is run by Cloudera Data Science Workbench's built-in load balancer, called the ingress controller, on the master node. The ingress controller is primarily responsible for routing traffic and load balancing for the Cloudera Data Science Workbench web service backend. Once configured, as shown in the following instructions, it also terminates HTTPS traffic. External termination can be done at an external load balancer such as the AWS Elastic Load Balancer.

Related topic: Troubleshooting TLS/SSL Errors

Private Key and Certificate Requirements

The TLS certificate issued by your CA must list both the Cloudera Data Science Workbench DOMAIN (set in cdsw.conf) and a wildcard for all first-level subdomains. For example, if DOMAIN is set to cdsw.company.com, then the TLS certificate must include both cdsw.company.com and *.cdsw.company.com.
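
The requirement above can be illustrated with a small shell sketch. It is a simplified model of wildcard matching, not the exact logic any browser uses: the function name covered_by_cert is hypothetical, comparison is case-sensitive, and the wildcard is assumed to cover exactly one additional label.

```shell
# covered_by_cert HOST DOMAIN (hypothetical helper, simplified model)
# Succeeds if HOST would be covered by a certificate listing DOMAIN and *.DOMAIN.
covered_by_cert() {
    host=$1
    domain=$2
    # Exact match against DOMAIN itself.
    if [ "$host" = "$domain" ]; then
        return 0
    fi
    case "$host" in
        *."$domain")
            # Strip the ".DOMAIN" suffix; what remains must be a single label.
            label=${host%."$domain"}
            case "$label" in
                ""|*.*) return 1 ;;   # empty, or more than one level deep
                *)      return 0 ;;   # exactly one first-level subdomain
            esac
            ;;
    esac
    return 1
}
```

Under this model, covered_by_cert session1.cdsw.company.com cdsw.company.com succeeds, while a.b.cdsw.company.com does not, because the wildcard covers only first-level subdomains.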

Creating a Certificate Signing Request (CSR)

Use the following steps to create a Certificate Signing Request (CSR) to submit to your CA. Make sure you use openssl, and not keytool, to perform these steps. Keytool does not support a wildcard Subject Alternative Name (SAN) and cannot create flat files.
  1. Create a cdsw.cnf file and populate it with the required configuration parameters including the SAN field values.
    vi cdsw.cnf
  2. Copy and paste the default openssl.cnf from: http://web.mit.edu/crypto/openssl.cnf.
  3. Modify the following sections and save the cdsw.cnf file:
    [ CA_default ]
    default_md = sha256
    
    [ req ]
    default_bits       = 2048
    distinguished_name = req_distinguished_name
    req_extensions     = req_ext
    
    [ req_distinguished_name ]
    countryName                 = Country Name (2 letter code)
    stateOrProvinceName         = State or Province Name (full name)
    localityName                = Locality Name (eg, city)
    organizationName            = Organization Name (eg, company)
    commonName                  = Common Name (e.g. server FQDN or YOUR name)
    
    [ req_ext ]
    subjectAltName = @alt_names
    
    [alt_names]
    DNS.1   = *.cdsw.company.com
    DNS.2   = cdsw.company.com
    Key points to note:
    • The domains set in the DNS.1 and DNS.2 entries above must match the DOMAIN set in cdsw.conf.
    • The default_md parameter must be set to sha256 at a minimum. Older hash functions such as SHA-1 are deprecated, and modern browsers reject certificates signed with them.
    • The commonName (CN) parameter will be ignored by browsers. You must use Subject Alternative Names.
  4. Run the following command to generate the CSR.
    openssl req -out cert.csr -newkey rsa:2048 -nodes -keyout private.key -config cdsw.cnf
    This command generates the private key and the CSR in one step. The -nodes switch disables encryption of the private key (which is not supported by Cloudera Data Science Workbench at this time).
  5. After the CA issues the certificate, run the following command to verify that it lists both the required domains, cdsw.company.com and *.cdsw.company.com, under X509v3 Subject Alternative Name.
    openssl x509 -in <your_tls_cert>.crt -noout -text
    You should also verify that a valid hash function is being used to create the certificate. For SHA-256, the value under Signature Algorithm will be sha256WithRSAEncryption.
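
Before submitting the CSR, it can help to confirm that the SAN entries and hash function actually made it into the request. The following is a minimal sketch, assuming openssl is on the PATH. It writes a trimmed-down, non-interactive configuration (prompt = no and default_md under [ req ] are choices made for this example, not part of the steps above) into a temporary directory, generates a throwaway key and CSR, and prints the SAN and signature algorithm lines.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so no real key material is touched.
workdir=$(mktemp -d)

# Trimmed-down config for illustration only; prompt = no makes it non-interactive.
cat > "$workdir/cdsw.cnf" <<'EOF'
[ req ]
default_bits       = 2048
default_md         = sha256
prompt             = no
distinguished_name = req_distinguished_name
req_extensions     = req_ext

[ req_distinguished_name ]
commonName = cdsw.company.com

[ req_ext ]
subjectAltName = @alt_names

[ alt_names ]
DNS.1 = *.cdsw.company.com
DNS.2 = cdsw.company.com
EOF

# Generate an unencrypted private key and the CSR in one step.
openssl req -new -newkey rsa:2048 -nodes \
    -keyout "$workdir/private.key" -out "$workdir/cert.csr" \
    -config "$workdir/cdsw.cnf" 2>/dev/null

# Decode the CSR and pull out the lines worth checking.
csr_text=$(openssl req -in "$workdir/cert.csr" -noout -text)
san_line=$(printf '%s\n' "$csr_text" | grep 'DNS:')
sig_line=$(printf '%s\n' "$csr_text" | grep 'Signature Algorithm' | head -n 1)

echo "SAN entries:$san_line"
echo "$sig_line"
```

The same openssl req -in cert.csr -noout -text inspection can be run against the CSR produced in step 4.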

Internal Termination

Internal TLS termination must be configured during the installation process and is governed by the following variables in cdsw.conf.
  • TLS_ENABLE - When set to true, this property enforces HTTPS and WSS connections. The server then redirects any HTTP request to HTTPS and generates URLs with the appropriate protocol.
  • TLS_KEY - Set to the path of the TLS private key.
  • TLS_CERT - Set to the path of the TLS certificate.

    Certificates and keys must be in PEM format.
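
Before pointing TLS_KEY and TLS_CERT at your files, you may want to sanity-check them. The sketch below is one possible approach, assuming openssl is installed; the function names are hypothetical and the encryption check is a text heuristic rather than a full parse. It checks for a PEM header, looks for the ENCRYPTED marker that encrypted PEM keys carry, and compares the public key derived from the private key with the one embedded in the certificate.

```shell
# is_pem FILE: succeeds if FILE contains a PEM header line.
is_pem() {
    grep -q -- '-----BEGIN ' "$1"
}

# key_is_unencrypted FILE: heuristic; encrypted PEM keys contain the word
# ENCRYPTED either in the header line or in a Proc-Type line.
key_is_unencrypted() {
    if grep -q 'ENCRYPTED' "$1"; then
        return 1
    fi
    return 0
}

# key_matches_cert KEY CERT: succeeds if both files carry the same public key.
# Call this only after key_is_unencrypted, otherwise openssl may prompt.
key_matches_cert() {
    key_pub=$(openssl pkey -in "$1" -pubout 2>/dev/null) || return 1
    cert_pub=$(openssl x509 -in "$2" -noout -pubkey 2>/dev/null) || return 1
    [ "$key_pub" = "$cert_pub" ]
}
```

A typical call sequence would be is_pem, then key_is_unencrypted, then key_matches_cert, each with the paths you intend to set in cdsw.conf.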

External Termination

External TLS termination must be configured during the installation process and is governed by the TLS_ENABLE variable in cdsw.conf.
  • TLS_ENABLE - When set to true, this property enforces HTTPS and WSS connections. The server then redirects any HTTP request to HTTPS and generates URLs with the appropriate protocol.

    The TLS_KEY and TLS_CERT properties must be left blank.

Many load balancers and proxies require a URL they can ping to validate the status of the web service backend. For instance, you can configure a load balancer to send an HTTP GET request to /internal/load-balancer/health-ping. If the response is 200 (OK), the backend is healthy. Note that, as with all communication from the load balancer to the web backend when TLS is terminated externally, this request should be sent over HTTP and not HTTPS.
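
To try the health check by hand before configuring the load balancer, something like the following sketch could be used. It assumes curl is available and that the backend is reachable from where you run it; the function names are hypothetical, and the host you pass in is whatever address your load balancer would use for the backend.

```shell
# health_url HOST: build the plain-HTTP health-check URL for HOST.
health_url() {
    printf 'http://%s/internal/load-balancer/health-ping' "$1"
}

# probe_status HOST: print the HTTP status code returned by the health check.
# Uses HTTP, not HTTPS, because TLS is terminated at the load balancer.
probe_status() {
    curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$(health_url "$1")"
}

# is_healthy CODE: succeeds only for a 200 (OK) response.
is_healthy() {
    [ "$1" = "200" ]
}
```

For example, is_healthy "$(probe_status cdsw.company.com)" succeeds only when the backend answers the ping with 200.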

Limitations

  • Communication within the Cloudera Data Science Workbench cluster is not encrypted.

  • Cloudera Data Science Workbench does not support encrypted private keys.

  • Troubleshooting can be difficult because browsers do not typically display helpful security errors with WebSockets. Often they will just silently fail to connect.

  • Self-signed certificates - In general, browsers do not support self-signed certificates for WSS. Your certificate must be signed by a Certificate Authority (CA) that your users’ browsers will trust. Cloudera Data Science Workbench will not function properly if browsers silently abort WebSockets connections.

    If you are using a TLS certificate that has been used to sign itself, and is not signed by a CA in the trust store, then the browser will display a dialog asking if you want to trust the certificate provided by Cloudera Data Science Workbench. This means you are using a self-signed certificate, which is not supported and will not work. In this case WSS connections will likely be aborted silently, regardless of your response (Ignore/Accept) to the dialog.

    As long as you have a TLS certificate signed by a CA certificate in the trust store, it will be supported and will work with Cloudera Data Science Workbench. For example, if you need to use a certificate signed by your organization's internal CA, make sure that all your users import your root CA certificate into their machine’s trust store. This can be done using the Keychain Access application on Macs or the Microsoft Management Console on Windows.
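
To tell in advance which of the two situations above applies to a given certificate, a rough check along these lines may help. It is a sketch, assuming openssl is installed; the function names are hypothetical. Comparing subject and issuer is a heuristic for spotting a self-signed certificate, and openssl verify checks the certificate against a CA file you supply, which is not the same as checking every user's browser trust store.

```shell
# cert_field CERT FLAG: print the subject or issuer with the "name=" prefix removed.
cert_field() {
    openssl x509 -in "$1" -noout "$2" | sed 's/^[^=]*=//'
}

# is_self_issued CERT: heuristic; succeeds when subject and issuer are identical,
# which is the case for a certificate that has been used to sign itself.
is_self_issued() {
    [ "$(cert_field "$1" -subject)" = "$(cert_field "$1" -issuer)" ]
}

# signed_by_ca CA_FILE CERT: succeeds if CERT verifies against the CA
# certificate(s) in CA_FILE, for example your organization's root CA.
signed_by_ca() {
    openssl verify -CAfile "$1" "$2" >/dev/null 2>&1
}
```

If is_self_issued succeeds for the certificate you plan to deploy, expect the silent WSS failures described above.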