This section outlines how to upgrade your Cloudera Data Science Workbench cluster
from CDH 5 to CDH 6. Refer to the relevant Cloudera Manager and CDH upgrade
documentation for the detailed steps required for this procedure.
Upgrading CSD Deployments from CDH 5 to CDH 6
Starting with version 1.5, Cloudera Data Science Workbench publishes two separate CSD
files: one for CDH 5 and one for CDH 6. Check the CSD file name to ensure that you are
using the correct CSD file for your cluster. The example file names used in the steps
in this section carry a CDH5 or CDH6 tag.
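The file-name check can be sketched in a few lines of Python. This is a minimal illustration, not part of the product: the helper names `cdh_major_from_csd_name` and `csds_for_cdh` are hypothetical, and the sketch assumes the CDH major version appears in the file name as `CDH5` or `CDH6`, as it does in the example names used in this section (shown here exactly as they appear in this section).

```python
import re
from typing import List, Optional

# Assumption: the CDH major version is tagged in the CSD file name as "CDH<digits>".
_CDH_TAG = re.compile(r"CDH(\d+)")


def cdh_major_from_csd_name(file_name: str) -> Optional[int]:
    """Return the CDH major version tagged in a CSD file name, or None if untagged."""
    match = _CDH_TAG.search(file_name)
    return int(match.group(1)) if match else None


def csds_for_cdh(file_names: List[str], cdh_major: int) -> List[str]:
    """Keep only the CSD file names tagged for the given CDH major version."""
    return [name for name in file_names if cdh_major_from_csd_name(name) == cdh_major]


if __name__ == "__main__":
    # Example names copied from this section; adjust to the files you downloaded.
    downloaded = ["CDSW1.8-CDH6..jar", "CDSW1.8-CDH5..jar"]
    print(csds_for_cdh(downloaded, 6))
```

Matching on the tag rather than on a full file-name pattern keeps the sketch independent of the exact naming scheme, which may differ between releases.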
Follow these steps to move a CSD-based Cloudera Data Science Workbench deployment
from CDH 5 to CDH 6:
-
Upgrade to Cloudera Manager 6.1 (or higher).
-
Stop Cloudera Data Science Workbench.
-
Remove both Spark gateway roles from all CDSW hosts.
-
Delete the /etc/spark and /etc/spark2 directories.
-
Download both Cloudera Data Science Workbench CSD files for the latest version.
For example:
- CDSW1.8-CDH6..jar
- CDSW1.8-CDH5..jar
At this point, you should have three CSD files: one original file (for example,
CDSW1.5-CDH5..jar) and two new files (for example, CDSW1.8-CDH6..jar and
CDSW1.8-CDH5..jar).
-
Log on to the Cloudera Manager Server host, and place the new CDSW files under
/opt/cloudera/csd, which is the default location for CSDs.
-
Restart the Cloudera Manager Server.
-
Upgrade to Cloudera Data Science Workbench 1.5 (or higher). During the upgrade
process, as you install, distribute, and activate the new parcel, make sure that both
CDSW CSDs (for CDH 5 and CDH 6) are present on the Cloudera Manager Server host.
-
Use the Cloudera Manager Upgrade Wizard to upgrade
from CDH 5 to CDH 6.1 (or higher).
As part of the upgrade, the wizard will also remove the Spark 2 parcel from all
your cluster hosts. With CDH 6, Spark 2 ships as a part of CDH. The add-on parcel is
no longer required.
Cloudera Manager 6 can differentiate between the two active CSDs and will select
the right one based on the version of CDH running. Because you already have the CDH
6-compatible CSD installed, no further steps are needed.
-
(Optional) Remove any existing CDH 5 CSDs from the Cloudera Manager Server
host.
-
Add the Spark gateway role back to the CDSW hosts.
-
Redeploy the client configurations.
-
Restart Cloudera Data Science Workbench.
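Before running the Upgrade Wizard, it can help to confirm that the earlier preparation steps took effect. The following Python sketch is one way to do that under stated assumptions: `preflight_report` is a hypothetical helper, not a Cloudera tool; it assumes CSD jars are tagged `CDH5` or `CDH6` in their file names; and it checks only the local filesystem, so the CSD check applies on the Cloudera Manager Server host and the directory check on each CDSW host.

```python
import os
import re


def preflight_report(csd_dir="/opt/cloudera/csd",
                     spark_dirs=("/etc/spark", "/etc/spark2")):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # Collect the CDH tags found on jar files in the CSD directory.
    # Assumption: CSD jars carry a "CDH5" or "CDH6" tag in their file name.
    names = os.listdir(csd_dir) if os.path.isdir(csd_dir) else []
    tags = set()
    for name in names:
        match = re.search(r"CDH[56]", name)
        if name.endswith(".jar") and match:
            tags.add(match.group(0))

    # Both CSDs should be present while the cluster moves from CDH 5 to CDH 6.
    for tag in ("CDH5", "CDH6"):
        if tag not in tags:
            problems.append(f"no {tag} CSD found in {csd_dir}")

    # The old Spark configuration directories should already be deleted.
    for path in spark_dirs:
        if os.path.exists(path):
            problems.append(f"{path} still exists")

    return problems


if __name__ == "__main__":
    for problem in preflight_report():
        print("WARNING:", problem)
```

The directory check is only meaningful before the Spark gateway is added back and the client configurations are redeployed, because those later steps may recreate the configuration directories.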