Managing Spark Using Cloudera Manager

Spark is available as two services: Spark and Spark (Standalone).

In Cloudera Manager 5.1 and lower, the Spark service runs a Spark Standalone cluster, which has Master and Worker roles.

In Cloudera Manager 5.2 and higher, the service that runs a Spark Standalone cluster has been renamed Spark (Standalone), and the Spark service runs Spark as a YARN application with only gateway roles. Both services have a Spark History Server role.
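The naming change can be summarized as a small lookup. The following is only an illustrative sketch: the function names `parse_version` and `spark_services` are hypothetical and are not part of Cloudera Manager or its API, and the role labels are taken from the description above. Whether the 5.1-and-lower Spark service also carries a History Server role is not stated here, so the sketch leaves it out.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '5.2.1'."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def spark_services(cm_version: str) -> Dict[str, List[str]]:
    """Map a Cloudera Manager version to Spark service names and role types.

    Hypothetical helper: it encodes the rules described in the text,
    it does not query a Cloudera Manager server.
    """
    if parse_version(cm_version) <= (5, 1):
        # 5.1 and lower: the Spark service is a Spark Standalone cluster.
        return {"Spark": ["Master", "Worker"]}
    # 5.2 and higher: Standalone is renamed, and Spark runs on YARN.
    return {
        "Spark (Standalone)": ["Master", "Worker", "History Server"],
        "Spark": ["Gateway", "History Server"],
    }


if __name__ == "__main__":
    for version in ("5.1.3", "5.2.0"):
        print(version, spark_services(version))
```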

You can install, add, and start Spark through the Cloudera Manager Installation wizard using parcels. For more information, see Installing Spark.

If you do not add the Spark service using the Installation wizard, you can use the Add Service wizard to create the service. The wizard automatically configures dependent services and the Spark service. For instructions, see Adding a Service.

When you upgrade from Cloudera Manager 5.1 or lower to Cloudera Manager 5.2 or higher, Cloudera Manager does not migrate an existing Spark service, which runs Spark Standalone, to a Spark on YARN service.

For information on Spark applications, see Spark Application Overview.

How Spark Configurations Are Propagated to Spark Clients

Because the Spark service does not have worker roles, another mechanism is needed to propagate client configurations to the other hosts in your cluster. In Cloudera Manager, gateway roles fulfill this function. Whether you add a Spark service at installation time or later, ensure that you assign gateway roles to hosts in the cluster. If you do not have gateway roles, client configurations are not deployed.
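One way to verify gateway coverage is to compare the cluster's host list with the hosts that carry a Spark gateway role. The sketch below assumes you have already exported the role assignments into plain dictionaries. The `host` and `type` keys, the `GATEWAY` type label, the host names, and the function name `hosts_missing_gateway` are all assumptions made for illustration, not a Cloudera Manager data format.

```python
from typing import Dict, Iterable, List


def hosts_missing_gateway(
    cluster_hosts: Iterable[str],
    role_assignments: Iterable[Dict[str, str]],
) -> List[str]:
    """Return the hosts that have no Spark gateway role, sorted by name.

    role_assignments is assumed to be a list of dictionaries such as
    {"host": "node1.example.com", "type": "GATEWAY"} (hypothetical shape).
    """
    with_gateway = {
        role["host"]
        for role in role_assignments
        if role["type"].upper() == "GATEWAY"
    }
    return sorted(host for host in cluster_hosts if host not in with_gateway)


if __name__ == "__main__":
    # Hypothetical cluster: only node1 has a gateway role assigned.
    hosts = ["node1.example.com", "node2.example.com", "node3.example.com"]
    roles = [
        {"host": "node1.example.com", "type": "GATEWAY"},
        {"host": "node2.example.com", "type": "SPARK_YARN_HISTORY_SERVER"},
    ]
    for host in hosts_missing_gateway(hosts, roles):
        print("No Spark gateway role, client configuration not deployed:", host)
```

Any host that this check reports would not receive the Spark client configuration until a gateway role is assigned to it.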