Upgrading to the Latest Version of Cloudera Data Science Workbench 1.6.x on CDH
This topic walks you through the upgrade paths available for Cloudera Data Science Workbench 1.6.x. Depending on your existing deployment, choose one of the upgrade paths listed in the following table.
Upgrade Path | Link to Instructions |
---|---|
Upgrading from an existing CSD-based deployment to the latest 1.6.x CSD and parcel. | Upgrading Cloudera Data Science Workbench 1.6.x Using Cloudera Manager |
Migrating from an RPM-based deployment to the latest 1.6.x CSD and parcel-based deployment. | Migrating from an RPM-based Deployment to the Latest 1.6.x CSD |
Upgrading an existing RPM-based deployment to the latest 1.6.x RPM. | Upgrading Cloudera Data Science Workbench 1.6.x Using Packages. Note that you cannot use Cloudera Manager for this upgrade path. |
Upgrading Cloudera Data Science Workbench from CDH 5 to CDH 6
This section outlines how to upgrade your Cloudera Data Science Workbench cluster from CDH 5 to CDH 6. Refer to the linked Cloudera Manager and CDH upgrade documentation for the detailed steps required for this procedure.
Upgrading CSD Deployments from CDH 5 to CDH 6
- CDH 6 - CLOUDERA_DATA_SCIENCE_WORKBENCH_CDH6_1.x.y.jar
- CDH 5 - CLOUDERA_DATA_SCIENCE_WORKBENCH_CDH5_1.x.y.jar
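The two file names above differ only in the CDH major version they target. As a minimal sketch of that naming pattern, the hypothetical helper below (`csd_filename` is not part of any Cloudera tool) builds the expected jar name from a CDH major version and a CDSW version string. The pattern is taken from the names listed above; confirm the exact names against the files you actually download.

```python
def csd_filename(cdh_major: int, cdsw_version: str) -> str:
    """Build the expected CDSW CSD jar name (hypothetical helper).

    Assumes the CLOUDERA_DATA_SCIENCE_WORKBENCH_CDH<major>_<version>.jar
    pattern shown in the list above.
    """
    if cdh_major not in (5, 6):
        raise ValueError(f"unsupported CDH major version: {cdh_major}")
    return f"CLOUDERA_DATA_SCIENCE_WORKBENCH_CDH{cdh_major}_{cdsw_version}.jar"


# The version string is a placeholder, as in the list above.
print(csd_filename(6, "1.x.y"))
print(csd_filename(5, "1.x.y"))
```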
Use the following path to upgrade from running a CSD-based Cloudera Data Science Workbench deployment on CDH 5 to running on CDH 6:
- Stop Cloudera Data Science Workbench.
- Remove both Spark gateway roles from all CDSW hosts.
- Delete the /etc/spark and /etc/spark2 directories.
- Download both Cloudera Data Science Workbench CSD files for the latest version. For example:
- CDSW1.8-CDH6..jar
- CDSW1.8-CDH5..jar
At this point, you should have three CSD files: one original file (for example, CDSW1.5-CDH5..jar) and two new files (for example, CDSW1.8-CDH6..jar and CDSW1.8-CDH5..jar).
- Log on to the Cloudera Manager Server host, and place the new CDSW files under /opt/cloudera/csd, which is the default location for CSDs.
- Restart the Cloudera Manager Server.
- Upgrade to Cloudera Data Science Workbench 1.5 (or higher). As you install, distribute, and activate the new parcel during the upgrade, make sure that both CDSW CSDs (for CDH 5 and CDH 6) are present on the Cloudera Manager Server host.
- Use the Cloudera Manager Upgrade Wizard to upgrade from CDH 5 to CDH 6.1 (or higher). As part of the upgrade, the wizard will also remove the Spark 2 parcel from all your cluster hosts. With CDH 6, Spark 2 ships as a part of CDH. The add-on parcel is no longer required.
Cloudera Manager 6 can differentiate between the two active CSDs and will select the right one based on the version of CDH running. Because you already have the CDH 6-compatible CSD installed, no further steps are needed.
- (Optional) Remove any existing CDH 5 CSDs from the Cloudera Manager Server host.
- Add the Spark gateway back in.
- Redeploy the client configurations.
- Restart Cloudera Data Science Workbench.
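The steps above depend on both the CDH 5 and CDH 6 CSDs being present on the Cloudera Manager Server host before the CDH upgrade. The following is a minimal sketch of a pre-flight check for that condition. The helper name `missing_csd_targets` and the file-name matching rule are assumptions made for illustration, not part of Cloudera Manager; the directory is the default CSD location named in the steps.

```python
import os
import re
from typing import Iterable, Set

# Default CSD location named in the steps above.
CSD_DIR = "/opt/cloudera/csd"

# Assumption: CDSW CSD jars start with one of these prefixes,
# as in the example file names used in this topic.
CDSW_PREFIXES = ("CDSW", "CLOUDERA_DATA_SCIENCE_WORKBENCH")


def missing_csd_targets(filenames: Iterable[str]) -> Set[int]:
    """Return the CDH major versions (5, 6) with no CDSW CSD jar present."""
    found = set()
    for name in filenames:
        # Ignore anything that is not a CDSW CSD jar.
        if not (name.endswith(".jar") and name.startswith(CDSW_PREFIXES)):
            continue
        match = re.search(r"CDH([56])", name)
        if match:
            found.add(int(match.group(1)))
    return {5, 6} - found


if __name__ == "__main__":
    names = os.listdir(CSD_DIR) if os.path.isdir(CSD_DIR) else []
    missing = missing_csd_targets(names)
    if missing:
        print("Missing CDSW CSD for CDH:", sorted(missing))
    else:
        print("Both CDSW CSDs are present.")
```

Run on the Cloudera Manager Server host, the script reports which CSD, if any, still needs to be copied into place. It only inspects file names and does not verify that Cloudera Manager has loaded the CSDs.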
Upgrading RPM Deployments from CDH 5 to CDH 6
Cloudera Data Science Workbench ships a single RPM that can be used to install CDSW on both CDH 5 and CDH 6 clusters. The upgrade path for RPM deployments is:
- Use the Cloudera Manager Upgrade Wizard to upgrade from CDH 5 to CDH 6 (or higher). As part of the upgrade, the wizard will also remove the Spark 2 parcel from all your cluster hosts. This is because Spark 2 ships as a part of CDH 6, so the add-on parcel is no longer required.