Issues Fixed in Cloudera Data Science Workbench 1.4.2
This release of Cloudera Data Science Workbench includes fixes for the following issues.
TSB-346: Risk of Data Loss During Cloudera Data Science Workbench (CDSW) Shutdown and Restart
Stopping Cloudera Data Science Workbench involves unmounting the NFS volumes that store CDSW project directories and then cleaning up a folder where the kubelet stores its temporary state. However, due to a race condition, this NFS unmount process can take too long or fail altogether. If this happens, CDSW projects that remain mounted will be deleted by the cleanup step.
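To illustrate the failure mode described above, the following is a minimal sketch of a cleanup step that refuses to delete a directory while anything is still mounted beneath it. It is not how Cloudera Data Science Workbench implements its shutdown; the function names `mounts_under` and `safe_cleanup` are hypothetical, and the optional second argument (a mounts table, defaulting to `/proc/mounts` on Linux) exists only so the logic can be tried against a sample file.

```shell
#!/bin/bash

# Hypothetical helper: print every mount point at or below a directory.
# Reads a mounts table in /proc/mounts format (mount point in column 2).
mounts_under() {
  local dir="${1%/}"
  local mounts_file="${2:-/proc/mounts}"
  awk -v d="$dir" '$2 == d || index($2, d "/") == 1 { print $2 }' "$mounts_file"
}

# Hypothetical guard: delete a directory only if nothing is mounted under it.
safe_cleanup() {
  local dir="$1"
  local mounts_file="${2:-/proc/mounts}"
  local remaining
  remaining="$(mounts_under "$dir" "$mounts_file")"
  if [ -n "$remaining" ]; then
    echo "Refusing to clean up $dir; still mounted:" >&2
    echo "$remaining" >&2
    return 1
  fi
  rm -rf -- "$dir"
}
```

With a guard of this kind, a slow or failed unmount would stop the cleanup instead of letting it descend into a project directory that is still mounted.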
Products affected: Cloudera Data Science Workbench
Releases affected: Cloudera Data Science Workbench 1.3.0, 1.3.1, 1.4.0, 1.4.1
Users affected: This potentially affects all CDSW users.
Detected by: Nehmé Tohmé (Cloudera)
Severity (Low/Medium/High): High
Impact: If the NFS unmount fails during shutdown, data loss can occur. All CDSW project files might be deleted.
Immediate action required: If you are running any of the affected Cloudera Data Science Workbench versions, you must run the following script on the CDSW master host every time before you stop or restart Cloudera Data Science Workbench. Failure to do so can result in data loss.
This script should also be run before initiating a Cloudera Data Science Workbench upgrade. As always, we recommend creating a full backup prior to beginning an upgrade. - Available for download at:
#!/bin/bash

set -e

cat << EXPLANATION

This script is a workaround for Cloudera TSB-346. It protects your
CDSW projects from a rare race condition that can result in data loss.
Run this script before stopping the CDSW service, irrespective of whether
the stop precedes a restart, upgrade, or any other task.

Run this script only on the master node of your CDSW cluster.

You will be asked to specify a target folder on the master node where the
script will save a backup of all your project files. Make sure the target
folder has enough free space to accommodate all of your project files. To
determine how much space is required, run
'du -hs /var/lib/cdsw/current/projects' on the CDSW master node.

This script will first back up your project files to the specified target
folder. It will then temporarily move your project files aside to protect
against the data loss condition. At that point, it is safe to stop the CDSW
service. After CDSW has stopped, the script will move the project files back
into place.

Note: This workaround is not required for CDSW 1.4.2 and higher.

EXPLANATION

read -p "Enter target folder for backups: " backup_target

echo "Backing up to $backup_target..."
rsync -azp /var/lib/cdsw/current/projects "$backup_target"

read -n 1 -p "Backup complete. Press enter when you are ready to stop CDSW: "

echo "Deleting all Kubernetes resources..."
kubectl delete configmaps,deployments,daemonsets,replicasets,services,ingress,secrets,persistentvolumes,persistentvolumeclaims,jobs --all
kubectl delete pods --all

echo "Temporarily saving project files to /var/lib/cdsw/current/projects_tmp..."
mkdir /var/lib/cdsw/current/projects_tmp
mv /var/lib/cdsw/current/projects/* /var/lib/cdsw/current/projects_tmp

echo -e "Please stop the CDSW service."

read -n 1 -p "Press enter when CDSW has stopped: "

echo "Moving projects back into place..."
mv /var/lib/cdsw/current/projects_tmp/* /var/lib/cdsw/current/projects
rm -rf /var/lib/cdsw/current/projects_tmp

echo -e "Done. You may now upgrade or start the CDSW service."
echo -e "When CDSW is running, if desired, you may delete the backup data at $backup_target"
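The script asks you to confirm manually that the backup target has enough free space. As a hedged sketch, that comparison could be automated before the backup starts. The function name `has_enough_space` is hypothetical and not part of the Cloudera script; it assumes the backup needs roughly as much space as the source occupies, since the `-z` option of `rsync` compresses data only in transit.

```shell
#!/bin/bash

# Hypothetical pre-flight check: succeed only if the file system holding
# the target directory has at least as many free KiB as the source
# directory currently occupies.
has_enough_space() {
  local source_dir="$1"
  local target_dir="$2"
  local needed_kb available_kb
  # du -sk prints the total size of the source in KiB in column 1.
  needed_kb="$(du -sk "$source_dir" | awk '{ print $1 }')"
  # df -Pk prints a header line, then one line with free KiB in column 4.
  available_kb="$(df -Pk "$target_dir" | awk 'NR == 2 { print $4 }')"
  [ "$available_kb" -ge "$needed_kb" ]
}
```

For example, `has_enough_space /var/lib/cdsw/current/projects "$backup_target" || exit 1` placed before the `rsync` call would stop the run early if the target is too small.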
Addressed in release/refresh/patch: This issue is fixed in Cloudera Data Science Workbench 1.4.2.
Note that you are required to run the workaround script above when you upgrade from an affected version to a release with the fix. This helps guard against data loss when the affected version needs to be shut down during the upgrade process.
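As a small illustration of the rule above, the following sketch checks a version string against the releases this bulletin lists as affected (1.3.0, 1.3.1, 1.4.0, 1.4.1). The function name `is_affected_by_tsb346` is hypothetical, and the sketch assumes you already know your installed version; how to look it up is not covered here.

```shell
#!/bin/bash

# Hypothetical helper: succeed if the given CDSW version is one of the
# releases listed in this bulletin as affected by TSB-346.
is_affected_by_tsb346() {
  case "$1" in
    1.3.0|1.3.1|1.4.0|1.4.1) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use, with the version supplied as the first script argument.
if [ -n "${1:-}" ] && is_affected_by_tsb346 "$1"; then
  echo "Version $1 is affected: run the workaround script before stopping CDSW."
fi
```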
For the latest update on this issue see the corresponding Knowledge article:
TSB 2018-346: Risk of Data Loss During Cloudera Data Science Workbench (CDSW) Shutdown and Restart
TSB-328: Unauthenticated User Enumeration in Cloudera Data Science Workbench
Unauthenticated users can obtain a list of Cloudera Data Science Workbench user accounts.
Products affected: Cloudera Data Science Workbench
Releases affected: Cloudera Data Science Workbench 1.4.0 (and lower)
Users affected: All users of Cloudera Data Science Workbench 1.4.0 (and lower)
Date/time of detection: June 11, 2018
Severity (Low/Medium/High): 5.3 (Medium) CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:N/A:N
Impact: Unauthenticated user enumeration in Cloudera Data Science Workbench.
CVE: CVE-2018-15665
Immediate action required: Upgrade to the latest version of Cloudera Data Science Workbench (1.4.2 or higher).
Note that Cloudera Data Science Workbench 1.4.1 is no longer publicly available due to TSB 2018-346: Risk of Data Loss During Cloudera Data Science Workbench (CDSW) Shutdown and Restart.
Addressed in release/refresh/patch: Cloudera Data Science Workbench 1.4.2 (and higher)
For the latest update on this issue see the corresponding Knowledge article:
TSB 2018-318: Unauthenticated User Enumeration in Cloudera Data Science Workbench
Other Notable Fixed Issues in Cloudera Data Science Workbench 1.4.2
Fixed an issue where attempting to fork a large project would result in unexpected 'out of memory' errors.
Cloudera Bug: DSE-4464
Fixed an issue in version 1.4.0 where Cloudera Data Science Workbench workloads would intermittently get stuck in the Scheduling state due to a Red Hat kernel slab leak.
Cloudera Bug: DSE-4098
Fixed an issue in version 1.4.0 where the Hadoop username on non-kerberized clusters defaulted to a value other than your Cloudera Data Science Workbench username. This was a known issue and has been fixed in version 1.4.2: the Hadoop username once again defaults to your Cloudera Data Science Workbench username.
Cloudera Bug: DSE-4240
Fixed an issue in version 1.4.0 where creating a project using Git via SSH did not work.
Cloudera Bug: DSE-4278
Fixed an issue in version 1.4.0 where environment variables set in the Admin panel were not being propagated to projects (experiments, sessions, jobs) as expected.
Cloudera Bug: DSE-4422
Fixed an issue in version 1.4.0 where Cloudera Data Science Workbench would not start when external TLS termination was enabled.
Cloudera Bug: DSE-4640
Fixed an issue in version 1.4.0 where HTTP/HTTPS proxy settings in Cloudera Manager were erroneously escaped when propagated to Cloudera Data Science Workbench engines.
Cloudera Bug: DSE-4421
Fixed an issue in version 1.4.0 where SSH tunnels did not work as expected.
Cloudera Bug: DSE-4741
Fixed an issue in version 1.4.0 where copying multiple files into a folder resulted in unexpected behavior such as overwritten files and incorrect UI messages.
Cloudera Bug: DSE-4831
Fixed an issue in version 1.4.0 where workers (in engines) and collection of usage metrics failed on TLS-enabled clusters.
Cloudera Bug: DSE-4293, DSE-4572
Fixed an issue in version 1.4.0 where a dialog box did not work.
Cloudera Bug: DSE-4807
Fixed an issue in version 1.4.0 where deleting an experiment did not work from certain dashboards. Consequently, deleting the parent project would also fail in such cases.
Cloudera Bug: DSE-4227