Installing CDS 3.3 Powered by Apache Spark

CDS 3.3 Powered by Apache Spark (CDS 3.3) is distributed as a parcel. The custom service descriptor (CSD) file for CDS 3.3 is available in Cloudera Manager for CDP 7.1.8. Both the parcel and the CSD file must be installed on the cluster.

Install CDS 3.3 Powered by Apache Spark

Follow these steps to install CDS 3.3:

  1. Check that all the software prerequisites are satisfied. If not, you might need to upgrade or install other software components first.
  2. In the Cloudera Manager Admin Console, add the CDS parcel repository to the Remote Parcel Repository URLs in Parcel Settings as described in Parcel Configuration Settings.
  3. Download the CDS 3.3 parcel, distribute the parcel to the hosts in your cluster, and activate the parcel. For instructions, see Managing Parcels.
  4. Add the Spark 3 service to your cluster.
    1. In step 1 of the Add Service wizard, select any optional dependencies, such as HBase and Hive, or select No Optional Dependencies.
    2. In step 2 of the wizard, when customizing the role assignments, add a gateway role to every host.
    3. On the Review Changes page, you can enable TLS for the Spark History Server.
    4. Note that the History Server port is 18089 instead of the usual 18088.
    5. Complete the remaining steps in the wizard.
  5. Return to the Home page by clicking the Cloudera Manager logo in the upper left corner.
  6. Click the stale configuration icon to launch the Stale Configuration wizard and restart the necessary services.
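
After the restart, one way to confirm that the Spark 3 History Server is answering on port 18089 is to query its REST API. The following is a minimal sketch, not part of the official procedure. It assumes the standard Spark History Server endpoint /api/v1/applications, an unauthenticated connection, and hypothetical helper names (history_server_url, list_applications). A Kerberos-secured cluster will likely need authentication that this sketch omits.

```python
import json
import urllib.request

# Port noted in the steps above: 18089 instead of the usual 18088.
SPARK3_HISTORY_PORT = 18089


def history_server_url(host, port=SPARK3_HISTORY_PORT, tls=False):
    """Build the URL of the History Server's application-list endpoint.

    Pass tls=True if TLS was enabled on the Review Changes page.
    """
    scheme = "https" if tls else "http"
    return f"{scheme}://{host}:{port}/api/v1/applications"


def list_applications(host, port=SPARK3_HISTORY_PORT, tls=False, timeout=10):
    """Fetch the application list from the History Server as parsed JSON.

    Raises urllib.error.URLError if the server cannot be reached.
    """
    url = history_server_url(host, port, tls)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, list_applications("history.example.com") (a placeholder host name) should return a JSON list, which is empty on a cluster that has not yet run any Spark 3 applications.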

Install the Livy for Spark 3 Service Descriptor

CDS 3 supports Apache Livy, but it cannot use the included Livy service, which is compatible only with Spark 2. To add and manage a Livy service that is compatible with Spark 3, you must install a service descriptor for the Livy for Spark 3 service.

  1. In the Cloudera Manager Admin Console, add the CDS 3 parcel repository to the Remote Parcel Repository URLs in Parcel Settings as described in Parcel Configuration Settings.
  2. Download the CDS 3.3 parcel, distribute the parcel to the hosts in your cluster, and activate the parcel. For instructions, see Managing Parcels.
  3. Add the Livy for Spark 3 service to your cluster.
    1. Note that the Livy port is 28998 instead of the usual 8998.
    2. Complete the remaining steps in the wizard.
  4. Return to the Home page by clicking the Cloudera Manager logo in the upper left corner.
  5. Click the stale configuration icon to launch the Stale Configuration wizard and restart the necessary services.
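
Once the services have restarted, clients must address the Livy for Spark 3 service on port 28998 rather than 8998. The sketch below shows how a client might prepare a request to create an interactive session through Livy's REST API (POST /sessions). The helper names are hypothetical, the host name is a placeholder, and the sketch assumes an unsecured endpoint; it only builds the request and does not send it.

```python
import json
import urllib.request

# Port noted in the steps above: 28998 instead of the usual 8998.
LIVY_SPARK3_PORT = 28998


def livy_sessions_url(host, port=LIVY_SPARK3_PORT, tls=False):
    """Build the URL of the Livy sessions endpoint."""
    scheme = "https" if tls else "http"
    return f"{scheme}://{host}:{port}/sessions"


def build_session_request(host, kind="pyspark", port=LIVY_SPARK3_PORT, tls=False):
    """Prepare, without sending, a POST /sessions request for a new session.

    `kind` selects the session type, for example "spark", "pyspark", or "sql".
    """
    body = json.dumps({"kind": kind}).encode("utf-8")
    return urllib.request.Request(
        livy_sessions_url(host, port, tls),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To submit the request, pass the returned object to urllib.request.urlopen. On a secured cluster, additional authentication headers or a Kerberos-aware HTTP client would probably be required.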

To activate the CDS 3.3 with GPU Support feature, set up a YARN role group to enable GPU usage and, optionally, configure the NVIDIA RAPIDS Shuffle Manager.