Installing CDS 3.0 Powered by Apache Spark
CDS 3.0 Powered by Apache Spark is distributed as two files: a custom service descriptor file and a parcel, both of which must be installed on the cluster.
Install CDS Powered by Apache Spark
Follow these steps to install CDS 3 Powered by Apache Spark:
- Check that all the software prerequisites are satisfied. If they are not, you might need to upgrade or install other software components first.
- Install the CDS Powered by Apache Spark service descriptor into Cloudera Manager.
- To download the CDS Powered by Apache Spark service descriptor, click the service descriptor link for the version you want to install.
- Log on to the Cloudera Manager Server host, and copy the CDS Powered by Apache Spark service descriptor to the location configured for service descriptor files.
- Set the file ownership of the service descriptor to cloudera-scm:cloudera-scm with permissions set to 644.
- Restart the Cloudera Manager Server with the following command:
systemctl restart cloudera-scm-server
- In the Cloudera Manager Admin Console, add the CDS parcel repository to the Remote Parcel Repository URLs in Parcel Settings as described in Parcel Configuration Settings.
- Download the CDS Powered by Apache Spark parcel, distribute the parcel to the hosts in your cluster, and activate the parcel. For instructions, see Managing Parcels.
- Add the Spark 3 service to your cluster.
- In step 1, select any optional dependencies, such as HBase and Hive, or select No Optional Dependencies.
- In step 2, when customizing the role assignments, add a gateway role to every host.
- On the Review Changes page, you can enable TLS for the Spark History Server.
- Note that the History Server port is 18089 instead of the usual 18088.
- Complete the remaining steps in the wizard.
- Return to the Home page by clicking the Cloudera Manager logo in the upper left corner.
- Click the stale configuration icon to launch the Stale Configuration wizard and restart the necessary services.
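The service descriptor steps above (copy the file, set its ownership and permissions, restart the server) can be sketched as a small shell helper. This is a minimal sketch, not an official procedure: the function name `csd_install_plan` is hypothetical, and `/opt/cloudera/csd` is assumed to be the location configured for service descriptor files, which may differ in your deployment. The helper only prints the commands so that they can be reviewed before being run as root on the Cloudera Manager Server host.

```shell
# Print the commands that install a service descriptor jar.
# Usage: csd_install_plan JAR_PATH [CSD_DIR]
# Assumption: CSD_DIR defaults to /opt/cloudera/csd; pass the location
# configured in your Cloudera Manager if it is different.
csd_install_plan() {
  plan_jar=$1
  plan_dir=${2:-/opt/cloudera/csd}
  plan_target=$plan_dir/$(basename "$plan_jar")

  # Copy the downloaded descriptor into the descriptor location.
  printf 'cp "%s" "%s"\n' "$plan_jar" "$plan_target"
  # Ownership and permissions as required by the steps above.
  printf 'chown cloudera-scm:cloudera-scm "%s"\n' "$plan_target"
  printf 'chmod 644 "%s"\n' "$plan_target"
  # Restart so that Cloudera Manager picks up the new descriptor.
  printf 'systemctl restart cloudera-scm-server\n'
}
```

For example, `csd_install_plan /tmp/example-descriptor.jar` (a placeholder file name) prints four commands. Printing rather than executing keeps the sketch safe to try on any host.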
Install the Livy for Spark 3 Service Descriptor
CDS 3 supports Apache Livy, but it cannot use the included Livy service, which is compatible only with Spark 2. To add and manage a Livy service compatible with Spark 3, you must install a service descriptor for the Livy for Spark 3 service.
- Install the Livy for Spark 3 service descriptor into Cloudera Manager.
- To download the service descriptor, click the service descriptor link for the version you want to install.
- Log on to the Cloudera Manager Server host, and copy the Livy service descriptor to the location configured for service descriptor files.
- Set the file ownership of the service descriptor to cloudera-scm:cloudera-scm with permissions set to 644.
- Restart the Cloudera Manager Server with the following command:
systemctl restart cloudera-scm-server
- In the Cloudera Manager Admin Console, add the CDS 3 parcel repository to the Remote Parcel Repository URLs in Parcel Settings as described in Parcel Configuration Settings.
- Download the CDS Powered by Apache Spark parcel, distribute the parcel to the hosts in your cluster, and activate the parcel. For instructions, see Managing Parcels.
- Add the Livy for Spark 3 service to your cluster.
- Note that the Livy port is 28998 instead of the usual 8998.
- Complete the remaining steps in the wizard.
- Return to the Home page by clicking the Cloudera Manager logo in the upper left corner.
- Click the stale configuration icon to launch the Stale Configuration wizard and restart the necessary services.
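Once the services have restarted, one way to confirm that the Livy for Spark 3 service is listening on the port noted above is to query its REST endpoint for sessions. The following is a minimal sketch under stated assumptions: the helper name `livy_sessions_url` and the host name are hypothetical, the scheme defaults to `http` (use `https` if TLS is enabled), and a secured cluster needs additional authentication options on the `curl` call.

```shell
# Build the URL of the Livy REST endpoint that lists sessions.
# Usage: livy_sessions_url HOST [PORT] [SCHEME]
livy_sessions_url() {
  livy_host=$1
  livy_port=${2:-28998}    # Livy for Spark 3 port from the steps above
  livy_scheme=${3:-http}   # assumption: plain HTTP unless TLS is enabled
  printf '%s://%s:%s/sessions\n' "$livy_scheme" "$livy_host" "$livy_port"
}

# Example with a hypothetical host name. On a Kerberos-enabled cluster,
# add options such as: --negotiate -u :
# curl -s "$(livy_sessions_url livy-host.example.com)"
```

Keeping the port as a parameter with 28998 as the default makes it easy to point the same check at a Spark 2 Livy service on 8998 for comparison.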