Upgrading CSD Deployments from CDH 5 to CDH 6

This section outlines how to upgrade a Cloudera Data Science Workbench cluster from CDH 5 to CDH 6. Refer to the relevant linked Cloudera Manager and CDH upgrade documentation for the detailed steps required for this procedure.

Starting with version 1.5, Cloudera Data Science Workbench publishes two separate CSD files: one for CDH 5 and one for CDH 6. Check the CSD file name to ensure that you are using the correct CSD file for your cluster. For example:
  • CDH 6 - CLOUDERA_DATA_SCIENCE_WORKBENCH_CDH6_1.x.y.jar
  • CDH 5 - CLOUDERA_DATA_SCIENCE_WORKBENCH_CDH5_1.x.y.jar
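The file name check described above can be automated. The following is a minimal sketch in Python, assuming CSD file names follow the pattern shown in the two examples; the regular expression and the function name cdh_major_from_csd are illustrative and are not part of any Cloudera tooling.

```python
import re
from typing import Optional

# Assumed naming pattern, derived only from the example file names above.
# Real CSD file names may differ; adjust the expression if they do.
CSD_NAME = re.compile(
    r"^CLOUDERA_DATA_SCIENCE_WORKBENCH_CDH(?P<cdh>\d+)_(?P<cdsw>.+)\.jar$"
)


def cdh_major_from_csd(filename: str) -> Optional[int]:
    """Return the CDH major version encoded in a CSD file name.

    Returns None when the name does not match the assumed pattern.
    """
    match = CSD_NAME.match(filename)
    return int(match.group("cdh")) if match else None
```

A caller could compare the returned value with the CDH major version of the cluster before copying the file to the Cloudera Manager Server host.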

Use the following path to upgrade from running a CSD-based Cloudera Data Science Workbench deployment on CDH 5 to running on CDH 6:

  1. Upgrade Cloudera Manager to version 6.1 (or higher).
  2. Stop Cloudera Data Science Workbench.
  3. Remove both Spark gateway roles from all CDSW hosts.
  4. Delete the /etc/spark and /etc/spark2 directories.
  5. Download both Cloudera Data Science Workbench CSD files (one for CDH 5 and one for CDH 6) for the latest version. For example:
    • CDSW1.9-CDH6..jar
    • CDSW1.9-CDH5..jar
    At this point, you should have three CSD files: one original file (for example, CDSW1.5-CDH5..jar) and two new files (for example, CDSW1.9-CDH6..jar and CDSW1.9-CDH5..jar).
  6. Log on to the Cloudera Manager Server host and place the new CSD files under /opt/cloudera/csd, the default location for CSDs.
  7. Restart the Cloudera Manager Server.
  8. Upgrade to Cloudera Data Science Workbench 1.5 (or higher). As you install, distribute, and activate the new parcel during the upgrade, make sure that both CDSW CSDs (for CDH 5 and CDH 6) are present on the Cloudera Manager Server host.
  9. Use the Cloudera Manager Upgrade Wizard to upgrade from CDH 5 to CDH 6.1 (or higher).
    As part of the upgrade, the wizard also removes the Spark 2 parcel from all your cluster hosts. Because Spark 2 ships as part of CDH 6, the add-on parcel is no longer required.
    Cloudera Manager 6 can differentiate between the two active CSDs and will select the right one based on the version of CDH running. Because you already have the CDH 6-compatible CSD installed, no further steps are needed.
  10. (Optional) Remove any existing CDH 5 CSDs from the Cloudera Manager Server host.
  11. Add the Spark gateway role back to the CDSW hosts.
  12. Redeploy the client configurations.
  13. Restart Cloudera Data Science Workbench.
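
Steps 6 and 8 depend on both CDSW CSDs being present on the Cloudera Manager Server host. The following is a minimal sketch of a pre-upgrade check, assuming the default CSD directory /opt/cloudera/csd named in step 6 and assuming that each CSD file name contains either "CDSW" or "DATA_SCIENCE_WORKBENCH" together with a "CDH5" or "CDH6" marker, as in the examples in this section. The function names are hypothetical and the name matching is a heuristic, not a Cloudera-provided check.

```python
from pathlib import Path
from typing import Dict, List

# Substrings assumed to identify a CDSW CSD, based on the example names
# in this section. These are heuristics, not an official naming rule.
CDSW_MARKERS = ("CDSW", "DATA_SCIENCE_WORKBENCH")


def find_cdsw_csds(csd_dir: str = "/opt/cloudera/csd") -> Dict[str, List[str]]:
    """Group CDSW CSD .jar files in csd_dir by the CDH marker in their name."""
    found: Dict[str, List[str]] = {"CDH5": [], "CDH6": []}
    directory = Path(csd_dir)
    if not directory.is_dir():
        return found
    for jar in sorted(directory.glob("*.jar")):
        name = jar.name.upper()
        if not any(marker in name for marker in CDSW_MARKERS):
            continue  # skip CSDs that belong to other services
        for cdh in found:
            if cdh in name:
                found[cdh].append(jar.name)
    return found


def both_csds_present(csd_dir: str = "/opt/cloudera/csd") -> bool:
    """Return True only if at least one CDH 5 and one CDH 6 CDSW CSD exist."""
    return all(find_cdsw_csds(csd_dir).values())
```

Running both_csds_present() on the Cloudera Manager Server host before step 8 gives a quick indication of whether a CSD is missing; it does not verify that Cloudera Manager has loaded the files.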