Example: Running SparkPi on YARN
These examples demonstrate how to use
spark-submit
to submit the SparkPi Spark example
application with various options. In the examples, the argument passed after
the JAR controls how close to pi the approximation should be.
In a CDP deployment, SPARK_HOME defaults to
/opt/cloudera/parcels/CDH/lib/spark
. The shells are
also available from /bin
.
Running SparkPi in YARN Cluster Mode
To run SparkPi
in cluster mode:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn \ --deploy-mode cluster /opt/cloudera/parcels/CDH/jars/spark-examples*.jar 10
The command prints status until the job finishes or you press
control-C. Terminating the
spark-submit
process in cluster mode does not
terminate the Spark application as it does in client mode. To monitor
the status of the running application, run yarn application
-list
.
Running SparkPi in YARN Client Mode
To run SparkPi
in client mode:
spark-submit --class org.apache.spark.examples.SparkPi --master yarn \ --deploy-mode client SPARK_HOME/lib/spark-examples.jar 10
Running Python SparkPi in YARN Cluster Mode
- Unpack the Python examples archive:
sudo su gunzip SPARK_HOME/lib/python.tar.gz sudo su tar xvf SPARK_HOME/lib/python.tar
- Run the
pi.py
file:spark-submit --master yarn --deploy-mode cluster SPARK_HOME/lib/pi.py 10