Installing CDS 3.3 Powered by Apache Spark

CDS 3.3 Powered by Apache Spark (CDS 3.3) is distributed as a parcel. Currently, there are no client RPM or DEB packages for Spark 3. There are also no external Custom Service Descriptors (CSDs) for Spark 3 or Livy for Spark 3 with the CDS 3.3 parcel for CDP 7.1.9, because both are already part of Cloudera Manager 7.11.3.

Install CDS 3.3 Powered by Apache Spark

Follow these steps to install CDS 3.3:

  1. Check that all the software prerequisites are satisfied. If not, you might need to upgrade or install other software components first.
  2. In the Cloudera Manager Admin Console, add the CDS parcel repository to the Remote Parcel Repository URLs in Parcel Settings as described in Parcel Configuration Settings.
  3. Download the CDS 3.3 parcel, distribute the parcel to the hosts in your cluster, and activate the parcel. For instructions, see Managing Parcels.
  4. Add the Spark 3 service to your cluster.
    1. In step 1 of the wizard, select any optional dependencies, such as HBase and Hive, or select No Optional Dependencies.
    2. In step 2 of the wizard, when customizing the role assignments, add a gateway role to every host.
    3. On the Review Changes page, you can enable TLS for the Spark History Server.
    4. Note that the History Server port is 18089 instead of the usual 18088.
    5. Complete the remaining steps in the wizard.
  5. Return to the Home page by clicking the Cloudera Manager logo in the upper left corner.
  6. Click the stale configuration icon to launch the Stale Configuration wizard and restart the necessary services.
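
After the services restart, one way to confirm that the Spark 3 History Server is listening on port 18089 is to query its REST API. The following Python sketch is an illustration, not part of the Cloudera procedure. It assumes the standard Apache Spark monitoring endpoint `/api/v1/applications`; the function names are hypothetical, and you supply your own History Server host. If you enabled TLS, or if the cluster requires authentication, the scheme, port, and credentials may differ from what is shown here.

```python
import json
import urllib.request

SPARK3_HISTORY_PORT = 18089  # CDS 3 History Server port named in the steps above


def history_server_url(host, port=SPARK3_HISTORY_PORT, use_tls=False):
    """Build the URL of the Spark monitoring endpoint that lists applications."""
    scheme = "https" if use_tls else "http"
    return f"{scheme}://{host}:{port}/api/v1/applications"


def list_applications(host, port=SPARK3_HISTORY_PORT, use_tls=False, timeout=10):
    """Fetch the application list from the History Server as parsed JSON.

    Assumes an unauthenticated endpoint; a secured cluster needs credentials
    that this sketch does not handle.
    """
    url = history_server_url(host, port, use_tls)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `list_applications("historyserver.example.com")`, with the placeholder replaced by your own host name, should return a JSON list (possibly empty) if the server is reachable on that port.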

Install the Livy for Spark 3 Service

CDS 3 supports Apache Livy, but it cannot use the included Livy service, which is compatible only with Spark 2. To add and manage a Livy service that is compatible with Spark 3, you must install the Livy for Spark 3 service.

  1. In the Cloudera Manager Admin Console, add the CDS 3 parcel repository to the Remote Parcel Repository URLs in Parcel Settings as described in Parcel Configuration Settings.
  2. Download the CDS 3.3 parcel, distribute the parcel to the hosts in your cluster, and activate the parcel. For instructions, see Managing Parcels.
  3. Add the Livy for Spark 3 service to your cluster.
    1. Note that the Livy port is 28998 instead of the usual 8998.
    2. Complete the remaining steps in the wizard.
  4. Return to the Home page by clicking the Cloudera Manager logo in the upper left corner.
  5. Click the stale configuration icon to launch the Stale Configuration wizard and restart the necessary services.
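
Because Livy for Spark 3 listens on port 28998 rather than 8998, clients that talk to Livy over REST need to target the new port. The Python sketch below illustrates this, assuming the standard Apache Livy `/sessions` endpoint. The function names are hypothetical, and you supply your own Livy host. Depending on your configuration, Livy may also require TLS, authentication, or additional headers that this sketch omits.

```python
import json
import urllib.request

LIVY_SPARK3_PORT = 28998  # Livy for Spark 3 port named in the steps above


def livy_url(host, path="/sessions", port=LIVY_SPARK3_PORT, use_tls=False):
    """Build a Livy REST URL for the given host and path."""
    scheme = "https" if use_tls else "http"
    return f"{scheme}://{host}:{port}{path}"


def new_session_request(host, kind="pyspark"):
    """Build, without sending, a POST request asking Livy to start a session."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        livy_url(host),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def list_sessions(host, timeout=10):
    """Fetch the current Livy sessions as parsed JSON (unauthenticated)."""
    with urllib.request.urlopen(livy_url(host), timeout=timeout) as response:
        return json.load(response)
```

To send the request built by `new_session_request`, pass it to `urllib.request.urlopen`. Keeping construction separate from sending makes it easy to inspect the URL and payload first.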

To activate the CDS 3.3 with GPU Support feature, set up a YARN role group to enable GPU usage and, optionally, configure the NVIDIA RAPIDS Shuffle Manager.
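
Once GPU usage is enabled in YARN, an application still has to request GPUs when it is submitted. The Python sketch below shows roughly what that can look like. It assumes the `spark3-submit` command and the standard Apache Spark properties `spark.executor.resource.gpu.amount` and `spark.task.resource.gpu.amount`; the helper name, the application file, and the amounts are hypothetical. It does not cover other settings your cluster may need, such as a GPU discovery script or the RAPIDS plugin configuration.

```python
import shlex


def gpu_submit_command(app, executor_gpus=1, task_gpus=1):
    """Build a spark3-submit argument list that requests GPUs on YARN.

    The amounts are illustrative defaults, not recommended values.
    """
    conf = {
        "spark.executor.resource.gpu.amount": executor_gpus,
        "spark.task.resource.gpu.amount": task_gpus,
    }
    args = ["spark3-submit", "--master", "yarn"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


# Print the command line for a hypothetical application file.
print(shlex.join(gpu_submit_command("my_app.py")))
```

Returning the arguments as a list lets you pass them directly to `subprocess.run` without worrying about shell quoting.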