Troubleshooting Installation Issues
This section describes solutions to some warnings you might encounter during the installation process.
Review installation logs
When debugging installation and "first run" issues with Cloudera Data Science Workbench, it is important to review the "role" logs, which the Cloudera Manager Agent writes when the service tries to start. These logs show any issues that occur on the host itself, before the Kubernetes and Docker systems even start. These logs are located at:
/var/run/cloudera-scm-agent/process/[XXX-ROLE]/logs
The log directories are organized by role type (master, application, docker-daemon, worker) and are prefixed with an incremental ID so you can find the latest log. When viewing these logs, you can ignore any line that begins with a + symbol, so the best way to view them is, for example:
grep -v ^+ /var/run/cloudera-scm-agent/process/67-CDSW_DOCKER/logs/stderr.log
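Because the numeric ID changes on every start attempt, it can help to look up the newest directory for a role instead of typing the ID by hand. The following is a minimal sketch, not part of the product: the function names latest_role_dir and show_role_errors are hypothetical, and it assumes directories are named <ID>-<ROLE> as in the example above and that a higher ID means a more recent attempt.

```shell
# Print the newest process directory for a role.
# $1: base directory, e.g. /var/run/cloudera-scm-agent/process
# $2: role suffix, e.g. CDSW_DOCKER
# Assumption: directories are named "<numeric-id>-<role>".
latest_role_dir() {
    for d in "$1"/*-"$2"; do
        [ -d "$d" ] || continue
        name=${d##*/}
        # Emit "<id> <path>" so the list can be sorted numerically by ID.
        printf '%s %s\n' "${name%%-*}" "$d"
    done | sort -n | tail -n 1 | cut -d' ' -f2-
}

# Print that role's stderr.log without the trace lines that begin with "+".
show_role_errors() {
    dir=$(latest_role_dir "$1" "$2")
    if [ -z "$dir" ]; then
        echo "no process directory found for role $2" >&2
        return 1
    fi
    grep -v '^+' "$dir/logs/stderr.log"
}
```

For example, show_role_errors /var/run/cloudera-scm-agent/process CDSW_DOCKER prints the filtered stderr.log of the most recent Docker daemon start attempt.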
Stop any existing CDSW processes
If CDSW was not shut down properly through Cloudera Manager, stale CDSW Unix processes can be left running and can prevent the application from starting. To ensure this is not the case:
- Stop all CDSW roles from Cloudera Manager
- Run the following command on all CDSW nodes to kill all CDSW, Docker, and Kubernetes processes, including the stale processes:
for i in `ps -ef |grep -e cdsw -e docker -e kube|egrep -v grep|awk '{print $2}'`; do kill -9 $i; done
- Start the CDSW roles from Cloudera Manager again
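Since kill -9 cannot be undone, it may be worth listing the matching processes before terminating them. The sketch below is one way to do a dry run; stale_pids is a hypothetical helper name, and it assumes the standard ps -ef column layout in which the PID is the second field.

```shell
# Read `ps -ef` output on stdin and print the PID (column 2) of every
# process line that mentions cdsw, docker, or kube. The header line and
# the awk filter itself are skipped.
stale_pids() {
    awk 'NR > 1 && /cdsw|docker|kube/ && !/awk/ { print $2 }'
}

# Dry run: review what would be killed. Nothing is terminated here.
#   ps -ef | stale_pids
#
# Only after reviewing that list (xargs -r is a GNU extension that skips
# the command when the list is empty):
#   ps -ef | stale_pids | xargs -r kill -9
```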
Illegal IP address
Exception encountered: [illegal IP address string passed to inet_pton]
ERROR:: Unable validate IP address [b'10.10.xx.xx'].: 1
ERROR:: Unable to disable export for [b'10.17.xx.xx'].: 1
This can be resolved by setting the IP address of the Master node in Cloudera Manager > CDSW > Configuration > Master Node IPv4 Address.
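Before entering the address, you may want to confirm that the value is a well-formed IPv4 address, because inet_pton rejects anything else. The following check is a simple sketch with a hypothetical function name, is_ipv4; it only tests the dotted-decimal shape and the 0-255 range of each field.

```shell
# Succeed if $1 consists of four dot-separated decimal fields, each 0-255.
is_ipv4() {
    printf '%s\n' "$1" | awk -F. '
        NF != 4 { exit 1 }
        { for (i = 1; i <= 4; i++) if ($i !~ /^[0-9]+$/ || $i > 255) exit 1 }
    '
}

# Example (the address is a placeholder):
#   is_ipv4 "10.10.1.2" && echo "looks like a valid IPv4 address"
```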
Stop running security software
Preexisting iptables rules not supported
WARNING: Cloudera Data Science Workbench requires iptables, but does not support preexisting iptables rules.
Kubernetes makes extensive use of iptables. However, it’s hard to know how pre-existing iptables rules will interact with the rules inserted by Kubernetes. Therefore, Cloudera recommends you run the following commands to clear all pre-existing rules before you proceed with the installation.
sudo iptables -P INPUT ACCEPT
sudo iptables -P FORWARD ACCEPT
sudo iptables -P OUTPUT ACCEPT
sudo iptables -t nat -F
sudo iptables -t mangle -F
sudo iptables -F
sudo iptables -X
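To confirm that a table really is empty, you can count the rule lines that iptables -S prints. This is a minimal sketch; count_rules is a hypothetical helper, and it relies on iptables -S printing each rule as a line starting with "-A", while chain policies start with "-P".

```shell
# Count rule lines ("-A ...") in `iptables -S` style output read from stdin.
# Policy lines ("-P ...") and chain definitions ("-N ...") are not counted.
count_rules() {
    # grep -c prints 0 but exits non-zero when nothing matches, so the
    # exit status is normalized here.
    grep -c '^-A ' || true
}

# Usage on a host (each table is listed separately):
#   sudo iptables -S | count_rules
#   sudo iptables -t nat -S | count_rules
#   sudo iptables -t mangle -S | count_rules
```

A result of 0 for every table indicates that no rules are present.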
You can ignore the warning once you have cleared the pre-existing rules or have confirmed that there are no pre-existing iptables rules.
Remove the entry corresponding to /dev/xvdc from /etc/fstab
Cloudera Data Science Workbench installs a custom filesystem on its Application and Docker block devices. These filesystems will be used to store user project files and Docker engine images respectively. Therefore, Cloudera Data Science Workbench requires complete access to the block devices. To avoid losing any existing data, make sure the block devices allocated to Cloudera Data Science Workbench are reserved only for the workbench.
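To find the entry that needs to be removed, you can search /etc/fstab for the device. The sketch below uses a hypothetical helper, fstab_entries_for, and matches only on the device path in the first field; an entry that refers to the same device by UUID or LABEL would not be found this way.

```shell
# Print the fstab lines whose first field is the given device.
# Commented-out lines are not matched. Exit status is 0 only if at least
# one entry exists.
# $1: device path, e.g. /dev/xvdc
# $2: fstab file to read (defaults to /etc/fstab)
fstab_entries_for() {
    awk -v dev="$1" '$1 == dev { print; found = 1 } END { exit !found }' "${2:-/etc/fstab}"
}

# Example:
#   fstab_entries_for /dev/xvdc && echo "remove the entry above from /etc/fstab"
```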
Linux sysctl kernel configuration errors
Kubernetes and Docker require non-standard kernel configuration. Depending on the existing state of your kernel, this might result in sysctl errors such as:
sysctl net.bridge.bridge-nf-call-iptables must be set to 1
This is because the settings in /etc/sysctl.conf conflict with the settings required by Cloudera Data Science Workbench. Cloudera cannot make a blanket recommendation on how to resolve such errors because they are specific to your deployment. Cluster administrators may choose to either remove or modify the conflicting value directly in /etc/sysctl.conf, remove the value from the conflicting configuration file, or even delete the module that is causing the conflict.
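To start diagnosing the issue, run the following command to see which configuration files are overwriting values in /etc/sysctl.conf: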
SYSTEMD_LOG_LEVEL=debug /usr/lib/systemd/systemd-sysctl
You will see output similar to:
Parsing /usr/lib/sysctl.d/00-system.conf
Parsing /usr/lib/sysctl.d/50-default.conf
Parsing /etc/sysctl.d/99-sysctl.conf
Overwriting earlier assignment of net/bridge/bridge-nf-call-ip6tables in file '/etc/sysctl.d/99-sysctl.conf'.
Overwriting earlier assignment of net/bridge/bridge-nf-call-ip6tables in file '/etc/sysctl.d/99-sysctl.conf'.
Overwriting earlier assignment of net/bridge/bridge-nf-call-ip6tables in file '/etc/sysctl.d/99-sysctl.conf'.
Parsing /etc/sysctl.d/k8s.conf
Overwriting earlier assignment of net/bridge/bridge-nf-call-iptables in file '/etc/sysctl.d/k8s.conf'.
Parsing /etc/sysctl.conf
Overwriting earlier assignment of net/bridge/bridge-nf-call-ip6tables in file '/etc/sysctl.conf'.
Overwriting earlier assignment of net/bridge/bridge-nf-call-ip6tables in file '/etc/sysctl.conf'.
Setting 'net/ipv4/conf/all/promote_secondaries' to '1'
Setting 'net/ipv4/conf/default/promote_secondaries' to '1'
...
/etc/sysctl.d/k8s.conf is the configuration added by Cloudera Data Science Workbench. Administrators must make sure that no other file is overwriting values set by /etc/sysctl.d/k8s.conf.
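The debug output can be long, so it may help to extract just the files that overwrite one particular key. The following sketch assumes the "Overwriting earlier assignment of KEY in file 'FILE'." line format shown above; overwriting_files is a hypothetical helper name.

```shell
# Read systemd-sysctl debug output on stdin and print, in order of first
# appearance, each file that overwrites the given key.
# $1: key as printed in the output, e.g. net/bridge/bridge-nf-call-iptables
overwriting_files() {
    awk -v key="$1" -v q="'" '
        $1 == "Overwriting" && $5 == key {
            file = $NF
            # Strip the leading quote and the trailing quote plus period.
            gsub("^" q "|" q "\\.$", "", file)
            if (!seen[file]++) print file
        }'
}

# Usage on a host (2>&1 captures the output regardless of which stream
# it is written to):
#   SYSTEMD_LOG_LEVEL=debug /usr/lib/systemd/systemd-sysctl 2>&1 \
#       | overwriting_files net/bridge/bridge-nf-call-iptables
```

If any file other than /etc/sysctl.d/k8s.conf is listed for a key that k8s.conf sets, that file is a candidate for the conflict.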
CDH parcels not found at /opt/cloudera/parcels
- If you are using a custom parcel directory, you can ignore the warning and proceed with the installation. Once the Cloudera Data Science Workbench is running, set the path to the CDH parcel in the admin dashboard.
- This warning can be an indication that you have not added gateway roles to the Cloudera Data Science Workbench hosts. In this case, do not ignore the warning. Exit the installer and go to Cloudera Manager to add gateway roles to the cluster.
CDSW docker daemons fail to start
Error starting daemon: error initializing graphdriver: devmapper: Unable to take ownership of thin-pool (docker-thinpool) that already has used data blocks.
This issue occurs when the block devices you specified for the Docker Block Device field already have data on them. This is a safeguard to prevent block devices from being wiped inadvertently. Note that resolving this issue involves deleting data from the block devices.
- Verify that it is okay to delete the data on the block device.
- SSH to the Cloudera Data Science Workbench master host.
- Run the following script:
/opt/cloudera/parcels/CDSW/scripts/teardown-docker.sh
- In the Cloudera Manager Admin Console, select the Cloudera Data Science Workbench service.
- On the Instances tab, select the Docker Daemons.
- Click .
- Start the Cloudera Data Science Workbench service by clicking .
User Process Limit
{WARN} Cloudera Data Science Workbench recommends that all users have a max-user-processes limit of at least 65536.
ulimit -u 65536
Set this configuration on every Cloudera Data Science Workbench host. You can also edit /etc/security/limits.conf to configure the user process limit.
Open Files Limit
{WARN} Cloudera Data Science Workbench recommends that all users have a max-open-files limit set to 1048576.
This message appears if the open files limit is under 1048576. Note that on HDP clusters, the open file limit recommendation is 10000 at a minimum. Cloudera recommends a higher limit for clusters with Cloudera Data Science Workbench.
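Before changing anything, you can compare the current limits of a shell session with the two recommended values. This is a minimal sketch; limit_ok and report_limits are hypothetical helper names, and report_limits assumes a shell such as bash in which ulimit supports both -u and -n.

```shell
# Succeed if a limit value meets the required minimum.
# $1: current value as printed by ulimit (a number or "unlimited")
# $2: required minimum
limit_ok() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$2" ]
}

# Report which of the two recommended limits the current shell falls short of.
# Assumption: the shell's ulimit builtin accepts -u and -n (bash does).
report_limits() {
    procs=$(ulimit -u)
    files=$(ulimit -n)
    limit_ok "$procs" 65536   || echo "max-user-processes is $procs, recommended at least 65536"
    limit_ok "$files" 1048576 || echo "max-open-files is $files, recommended 1048576"
}
```

Running report_limits prints nothing when both limits already meet the recommendations.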
ulimit -n 1048576
Set this configuration on every Cloudera Data Science Workbench host. You can also edit /etc/security/limits.conf to configure the open files limit.
Disable SELinux
During installation, you may encounter the following message:
Please disable SELinux by setting SELINUX=disabled|permissive in /etc/selinux/config, then reboot or using setenforce 0 command
SELinux enforces additional control policies for what a user, process, or daemon can do. If SELinux is enabled and not in permissive mode (that is, it is in enforcing mode), Cloudera Data Science Workbench may not have the proper permissions to run.
To resolve this issue, you must change the SELinux mode on every host by doing one of the following:
- Edit the configuration file for SELinux and set it to disabled or permissive. Note that if you set SELinux to permissive mode, events such as access denials will be logged, but the denial will not be enforced. You can find the SELinux configuration file in the following location: /etc/selinux/config. A change to this file takes effect after a reboot.
- Run the following command:
setenforce 0
This command switches SELinux to permissive mode immediately, but only until the next reboot. It does not disable SELinux completely.
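To verify what the configuration file currently specifies, you can read the SELINUX= assignment directly. The sketch below uses hypothetical helper names, selinux_config_mode and selinux_mode_ok, and inspects only the persistent setting in the file; the mode in effect right now can be checked separately with getenforce.

```shell
# Print the mode set by the last uncommented SELINUX= line in the config file.
# $1: config file to read (defaults to /etc/selinux/config)
# SELINUXTYPE= lines are not matched because the pattern requires "SELINUX=".
selinux_config_mode() {
    sed -n 's/^[[:space:]]*SELINUX=\([A-Za-z]*\).*/\1/p' "${1:-/etc/selinux/config}" | tail -n 1
}

# Succeed if the configured mode is one the installer accepts.
selinux_mode_ok() {
    case $(selinux_config_mode "$1") in
        disabled|permissive) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   selinux_mode_ok || echo "SELinux is configured as: $(selinux_config_mode)"
```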
DNS is not configured properly
During installation, you might encounter messages such as:
DNS doesn't resolve <CDSW_domain> to <CDSW_Master_IP_address>; DNS is not configured properly
or
DNS doesn't resolve <CDSW_Master_IP_address> to <CDSW_domain>; DNS is not configured properly
This indicates that the CDSW domain name configured does not resolve to the IP address of the Master host. You must enable DNS forward and reverse lookup for the CDSW domain and IP address to proceed.
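Both directions can be checked from the Master host before you rerun the installer. The following is a minimal sketch that uses getent hosts, which consults the host's configured name resolution; resolve, forward_ok, and reverse_ok are hypothetical helper names, and the domain and address in the usage comment are placeholders.

```shell
# Thin wrapper around the lookup so a different tool can be substituted.
# `getent hosts` prints "<address> <name> [aliases...]" for a name or an address.
resolve() {
    getent hosts "$1"
}

# Forward check: the domain ($1) should resolve to the expected IP ($2).
forward_ok() {
    [ "$(resolve "$1" | awk 'NR == 1 { print $1 }')" = "$2" ]
}

# Reverse check: the IP ($2) should map back to the domain ($1).
reverse_ok() {
    resolve "$2" | awk -v name="$1" '
        { for (i = 2; i <= NF; i++) if ($i == name) ok = 1 }
        END { exit !ok }'
}

# Example with placeholder values:
#   forward_ok cdsw.example.com 10.0.0.5 || echo "forward lookup is not configured"
#   reverse_ok cdsw.example.com 10.0.0.5 || echo "reverse lookup is not configured"
```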