Configuring Cloudera Data Science Workbench Deployments Behind a Proxy
Specify the proxy settings in the following format:
HTTP_PROXY="<http://proxy_host>:<proxy_port>"
HTTPS_PROXY="<http://proxy_host>:<proxy_port>"
For example:
HTTP_PROXY="http://localhost:3128"
HTTPS_PROXY="http://localhost:3128"
If the proxy server uses TLS encryption to handle connection requests, you will need to add the proxy's root CA certificate to your host's store of trusted certificates. This is because proxy servers typically sign their server certificate with their own root certificate. Therefore, any connection attempts will fail until the Cloudera Data Science Workbench host trusts the proxy's root CA certificate. If you do not have access to your proxy's root certificate, contact your Network / IT administrator.
- Copy the proxy's root certificate to the trusted CA certificate store (ca-trust) on the Cloudera Data Science Workbench host.
cp /tmp/<proxy-root-certificate>.crt /etc/pki/ca-trust/source/anchors/
- Use the following command to rebuild the trusted certificate store.
update-ca-trust extract
- If you will be using custom engine images that will be pulled from a Docker repository, add the proxy's root certificates to a directory under /etc/docker/certs.d. For example, if your Docker repository is at docker.repository.mycompany.com, create the following directory structure:
/etc/docker/certs.d
|-- docker.repository.mycompany.com      # Directory named after Docker repository
    |-- <proxy-root-certificate>.crt     # Docker-related root CA certificates
This step is not required with the standard engine images because they are included in the Cloudera Data Science Workbench RPM.
- Re-initialize Cloudera Data Science Workbench for this change to take effect.
cdsw init
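The copy steps above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name stage_proxy_ca, its argument order, and the file name proxy-root.crt are hypothetical and not part of Cloudera Data Science Workbench. Taking the target directories as arguments lets the helper be tried against a scratch directory before it is pointed at the real paths.

```shell
# Sketch only: stage a proxy root CA certificate for the host trust store and,
# optionally, for a private Docker registry. The function name and argument
# order are illustrative assumptions, not a CDSW-provided tool.
#
# Usage: stage_proxy_ca <cert-file> <anchors-dir> [<docker-certs-dir> <registry-host>]
#
# The body runs in a subshell "( ... )" so its variables stay local and
# "exit" only leaves the function, not the calling shell.
stage_proxy_ca() (
    cert="$1"
    anchors="$2"
    docker_certs="${3:-}"
    registry="${4:-}"

    # Stop early if the certificate file is missing.
    [ -f "$cert" ] || { echo "certificate not found: $cert" >&2; exit 1; }

    # Copy the certificate into the trust anchors directory.
    cp "$cert" "$anchors/" || exit 1

    # Custom engine images only: one directory per Docker registry host.
    if [ -n "$docker_certs" ] && [ -n "$registry" ]; then
        mkdir -p "$docker_certs/$registry" || exit 1
        cp "$cert" "$docker_certs/$registry/" || exit 1
    fi
)

# Example invocation on a real host (run as root), left commented out:
# stage_proxy_ca /tmp/proxy-root.crt /etc/pki/ca-trust/source/anchors \
#     /etc/docker/certs.d docker.repository.mycompany.com
# update-ca-trust extract   # rebuild the trusted certificate store
# cdsw init                 # re-initialize Cloudera Data Science Workbench
```

The helper leaves update-ca-trust extract and cdsw init out on purpose, so that those two commands are still run explicitly, in the order given in the steps.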
Configure hostnames that bypass the proxy
Use the Cloudera Manager CDSW service's No Proxy property to configure a comma-separated list of hostnames that should bypass the proxy. On an RPM deployment, configure the corresponding NO_PROXY field in cdsw.conf.
The value for this field typically includes 127.0.0.1, localhost, the master node IP address (configured as part of the installation process), and any private Docker registries and HTTP services inside the firewall that Cloudera Data Science Workbench users might want to access from the engines. This change must be made on the master node and on all the worker nodes.
127.0.0.1,localhost,<CDSW_MASTER_NODE_IP>,100.66.0.1,100.66.0.2,100.66.0.3,100.66.0.4,100.66.0.5,100.66.0.6,100.66.0.7,100.66.0.8,100.66.0.9,100.66.0.10,100.66.0.11,100.66.0.12,100.66.0.13,100.66.0.14,100.66.0.15,100.66.0.16,100.66.0.17,100.66.0.18,100.66.0.19,100.66.0.20,100.66.0.21,100.66.0.22,100.66.0.23,100.66.0.24,100.66.0.25,100.66.0.26,100.66.0.27,100.66.0.28,100.66.0.29,100.66.0.30,100.66.0.31,100.66.0.32,100.66.0.33,100.66.0.34,100.66.0.35,100.66.0.36,100.66.0.37,100.66.0.38,100.66.0.39,100.66.0.40,100.66.0.41,100.66.0.42,100.66.0.43,100.66.0.44,100.66.0.45,100.66.0.46,100.66.0.47,100.66.0.48,100.66.0.49,100.66.0.50,100.77.0.129,100.77.0.130
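Typing a value this long by hand invites mistakes, so it can be generated instead. The sketch below rebuilds the sample value shown above: 127.0.0.1, localhost, the master node IP address, 100.66.0.1 through 100.66.0.50, and the two 100.77.0.x addresses. The function name build_no_proxy and the address 10.0.0.5 in the usage comment are hypothetical placeholders. Any private Docker registries or internal HTTP services would still need to be appended by hand.

```shell
# Sketch only: generate the sample no-proxy value from the master node IP.
# build_no_proxy is an illustrative name, not a CDSW-provided command.
#
# Usage: build_no_proxy <master-node-ip>
#
# The body runs in a subshell "( ... )" so the loop variables stay local.
build_no_proxy() (
    master_ip="$1"
    list="127.0.0.1,localhost,${master_ip}"

    # Append 100.66.0.1 through 100.66.0.50, as in the sample value.
    i=1
    while [ "$i" -le 50 ]; do
        list="${list},100.66.0.${i}"
        i=$((i + 1))
    done

    # Append the two remaining addresses from the sample value.
    list="${list},100.77.0.129,100.77.0.130"

    # Emit a single comma-separated line with no spaces.
    printf '%s\n' "$list"
)

# Example (10.0.0.5 stands in for the real master node IP address):
# NO_PROXY="$(build_no_proxy 10.0.0.5)"
```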