Run the Spark Pi example
The Pi program tests compute-intensive tasks by estimating pi with a Monte Carlo approximation. The program “throws darts” at a circle: it generates random points in the unit square ((0,0) to (1,1)) and counts how many fall within the unit circle. The fraction of points that land inside the circle approximates pi/4, so multiplying that fraction by 4 gives an estimate of pi.
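The dart-throwing idea can be sketched on a single machine in a few lines of Python. This is only an illustration of the sampling method, not the bundled SparkPi example itself; the function name `estimate_pi` and its `seed` parameter are hypothetical names chosen for this sketch.

```python
import random


def estimate_pi(num_samples, seed=None):
    """Estimate pi by sampling random points in the unit square.

    Hypothetical helper for illustration; not part of Spark.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        # A point is within the unit circle when x^2 + y^2 <= 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle covers pi/4 of the unit square, so scale by 4.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print("Pi is roughly", estimate_pi(100_000, seed=42))
```

More samples give a tighter estimate, which is why the workload is compute-bound and spreads well across executors.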
To run Spark Pi:
Log on as a user with HDFS access, for example, your spark user (if you defined one) or hdfs. Navigate to a node with a Spark client and access the spark-client directory:
su hdfs
cd /usr/hdp/current/spark-client
Submit the Spark Pi job:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 lib/spark-examples*.jar 10
The job should complete without errors. It should produce output similar to the following:
15/04/10 17:29:35 INFO Client:
    client token: N/A
    diagnostics: N/A
    ApplicationMaster host: N/A
    ApplicationMaster RPC port: 0
    queue: default
    start time: 1428686924325
    final status: SUCCEEDED
    tracking URL: http://blue1:8088/proxy/application_1428670545834_0009/
    user: hdfs
To view job status in a browser, copy the tracking URL from the job output and open it in the browser.
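For scripted runs, the final status and tracking URL could be pulled out of captured client output rather than copied by hand. The following is a minimal sketch, assuming the output has already been captured into a string and that the field labels match the sample report above; the function name `parse_client_report` is hypothetical, and other Spark versions may word the report differently.

```python
import re


def parse_client_report(text):
    """Return (final_status, tracking_url) from spark-submit client output.

    Hypothetical helper; either value is None if its label is not found.
    """
    status = re.search(r"final status:\s*(\S+)", text)
    url = re.search(r"tracking URL:\s*(\S+)", text)
    return (
        status.group(1) if status else None,
        url.group(1) if url else None,
    )


# Sample text taken from the job output shown in this section.
sample = (
    "15/04/10 17:29:35 INFO Client: client token: N/A diagnostics: N/A "
    "ApplicationMaster host: N/A ApplicationMaster RPC port: 0 "
    "queue: default start time: 1428686924325 final status: SUCCEEDED "
    "tracking URL: http://blue1:8088/proxy/application_1428670545834_0009/ "
    "user: hdfs"
)

final_status, tracking_url = parse_client_report(sample)
print(final_status)   # SUCCEEDED
print(tracking_url)   # http://blue1:8088/proxy/application_1428670545834_0009/
```

The patterns match on the field labels only, so they work whether the report arrives on one line or with one field per line.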
Job output should list the estimated value of pi. In the following example, output was directed to stdout:
Log Type: stdout
Log Upload Time: 22-Mar-2015 17:13:33
Log Length: 23
Pi is roughly 3.142532