Networking and Security Requirements
Cloudera Data Science Workbench has the following networking and security requirements.
- All Cloudera Data Science Workbench gateway hosts must be part of the same datacenter and use the same network. Hosts in different datacenters or networks can result in unreliable performance.
- A wildcard subdomain such as *.cdsw.company.com must be configured. Wildcard subdomains are used to provide isolation for user-generated content.
  The wildcard DNS hostname configured for Cloudera Data Science Workbench must be resolvable from both the CDSW cluster and your browser.
- Disable all pre-existing iptables rules. While Kubernetes makes extensive use of iptables, it is difficult to predict how pre-existing iptables rules will interact with the rules inserted by Kubernetes. Therefore, Cloudera recommends that you disable all pre-existing rules before you proceed with the installation.
  Before you disable the rules, save them and check that the changes have been written to the /etc/sysconfig/iptables file. If you disable the iptables rules without saving them, the settings can be erased on system reboot.
  - Save the iptables rules by running the following command:
    service iptables save
  - Verify that the changes have been written to the file by running the following command:
    ls -l /etc/sysconfig/iptables
  - Disable the iptables rules by running the following commands:
    sudo iptables -P INPUT ACCEPT
    sudo iptables -P FORWARD ACCEPT
    sudo iptables -P OUTPUT ACCEPT
    sudo iptables -t nat -F
    sudo iptables -t mangle -F
    sudo iptables -F
    sudo iptables -X
- Cloudera Data Science Workbench sets the following sysctl options in /etc/sysctl.d/k8s.conf:
  - net.bridge.bridge-nf-call-iptables=1
  - net.bridge.bridge-nf-call-ip6tables=1
  - net.ipv4.ip_forward=1
  - net.ipv4.conf.default.forwarding=1
  Make sure that these options are not overridden by higher-priority configuration such as /etc/sysctl.conf.
- SELinux must either be disabled or run in permissive mode.
- Multi-homed networks are supported with Cloudera Data Science Workbench 1.2.2 (and higher). However, you will need to explicitly configure the private IP address of the worker nodes in the kubelet start script as follows:
    # vi /opt/cloudera/parcels/CDSW/scripts/start-kubelet-worker-standalone-core.sh
    kubelet_opts+=(--v=2)
    kubelet_opts+=(--node-ip=172.x.x.x)
- Firewall restrictions must be disabled across Cloudera Data Science Workbench and CDH/HDP cluster hosts. For more details on cluster communication, see Ports Required by Cloudera Data Science Workbench.
- Untrusted (non-sudo) SSH access to Cloudera Data Science Workbench hosts must be disabled to ensure a secure deployment.
  Cloudera Data Science Workbench assumes that users only access the gateway hosts through the web application. Untrusted users with SSH access to a Cloudera Data Science Workbench host can gain full access to the cluster, including access to other users' workloads.
- localhost must resolve to 127.0.0.1.
- Forward and reverse DNS lookup must be enabled for the Cloudera Data Science Workbench domain name and IP address (CDSW master host).
- Cloudera Data Science Workbench does not support DNS servers running on 127.0.0.1:53. This IP address resolves to the container localhost within Cloudera Data Science Workbench containers. As a workaround, use either a non-loopback address or a remote DNS server.
- All third-party security software (such as McAfee, Tanium, or Symantec) must be disabled on CDSW hosts. Failure to do so can result in Cloudera Data Science Workbench failing randomly. After CDSW is started, you should be able to re-enable the security software.
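The sysctl requirement in the list above can be spot-checked with a short script. The following is a minimal sketch and not part of Cloudera's tooling: the function names conf_value and report_conf are hypothetical, and the script assumes the file uses the usual key = value syntax. It only inspects a configuration file and does not read live kernel values.

```shell
#!/bin/sh
# Options listed above for /etc/sysctl.d/k8s.conf; each is expected to be 1.
REQUIRED_KEYS="net.bridge.bridge-nf-call-iptables
net.bridge.bridge-nf-call-ip6tables
net.ipv4.ip_forward
net.ipv4.conf.default.forwarding"

# conf_value FILE KEY (hypothetical helper)
# Print the last value assigned to KEY in a sysctl-style file, ignoring
# comment lines. Prints nothing if FILE does not set KEY.
conf_value() {
  awk -F= -v key="$2" '
    /^[[:space:]]*[#;]/ { next }
    {
      k = $1; v = $2
      gsub(/[[:space:]]/, "", k); gsub(/[[:space:]]/, "", v)
      if (k == key) last = v
    }
    END { if (last != "") print last }
  ' "$1"
}

# report_conf FILE (hypothetical helper)
# Print one line per required key:
#   "ok KEY"          FILE sets KEY to 1
#   "unset KEY"       FILE does not mention KEY
#   "bad KEY=VALUE"   FILE sets KEY to something other than 1
# Returns non-zero if any key is set to a value other than 1.
report_conf() {
  status=0
  for key in $REQUIRED_KEYS; do
    value=$(conf_value "$1" "$key")
    case "$value" in
      1)  echo "ok $key" ;;
      "") echo "unset $key" ;;
      *)  echo "bad $key=$value"; status=1 ;;
    esac
  done
  return "$status"
}
```

For example, report_conf /etc/sysctl.d/k8s.conf is expected to print four ok lines on a configured host, and running it against /etc/sysctl.conf shows whether that file assigns any of these options a different value.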
Cloudera Data Science Workbench does not support hosts or clusters that do not conform to these restrictions.
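The DNS-related requirements above (localhost resolution, the wildcard subdomain, and no DNS server on a loopback address) lend themselves to a quick pre-installation check. The sketch below is illustrative only: the function names and the preflight-check label are hypothetical, cdsw.company.com is the example domain used earlier, and the resolver lookups assume a glibc system where `getent ahostsv4` is available.

```shell
#!/bin/sh
# loopback_nameservers FILE (hypothetical helper)
# Print every nameserver address in a resolv.conf-style FILE that lies in
# 127.0.0.0/8, which the requirements above rule out for DNS.
loopback_nameservers() {
  awk '$1 == "nameserver" && $2 ~ /^127\./ { print $2 }' "$1"
}

# resolves_to NAME ADDRESS (hypothetical helper)
# Succeed if one of the IPv4 addresses that NAME resolves to equals ADDRESS.
# Assumes glibc's `getent ahostsv4`; other libcs may not provide it.
resolves_to() {
  getent ahostsv4 "$1" |
    awk -v want="$2" '$1 == want { found = 1 } END { exit !found }'
}

# wildcard_resolves DOMAIN (hypothetical helper)
# Succeed if an arbitrary label under DOMAIN resolves, which is what a
# wildcard record should make happen. "preflight-check" is a made-up label.
wildcard_resolves() {
  getent ahostsv4 "preflight-check.$1" > /dev/null
}

# Example use on a gateway host (not run automatically):
#   loopback_nameservers /etc/resolv.conf
#   resolves_to localhost 127.0.0.1 || echo "localhost is not 127.0.0.1"
#   wildcard_resolves cdsw.company.com || echo "wildcard DNS not resolvable"
```

These checks cover only the host side. The wildcard hostname must also resolve from the browser, so the same lookup is worth repeating from a client machine.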