Chapter 4. Running Spark
You can run Spark interactively or from a client program:
Submit interactive statements through the Scala, Python, or R shell, or through a high-level notebook such as Zeppelin.
Use APIs to create a Spark application that runs interactively or in batch mode, using Scala, Python, R, or Java.
To launch Spark applications on a cluster, you can use the spark-submit script in the Spark bin directory. You can also use the API interactively by launching an interactive shell for Scala (spark-shell), Python (pyspark), or SparkR. Note that each interactive shell automatically creates a SparkContext in a variable called sc. For more information about spark-submit, see the Apache Spark document Submitting Applications.
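As a rough illustration of a spark-submit invocation, the following Python sketch assembles the command line as an argument list without running it. The helper name build_spark_submit_command, the JAR path, the class name, and the input argument are hypothetical placeholders. The sketch also assumes that spark-submit is on the PATH; otherwise, use the full path to the script in the Spark bin directory.

```python
import shlex


def build_spark_submit_command(app_path, main_class=None, master="yarn",
                               deploy_mode="cluster", app_args=()):
    """Assemble an argument list for the spark-submit script (hypothetical helper)."""
    # Assumption: spark-submit is on the PATH.
    command = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if main_class is not None:
        # --class names the entry point of a Java or Scala application.
        command += ["--class", main_class]
    # The application file comes next, followed by the application's own arguments.
    command.append(app_path)
    command.extend(app_args)
    return command


# Hypothetical application JAR, main class, and argument.
command = build_spark_submit_command(
    "/path/to/example-app.jar",
    main_class="com.example.WordCount",
    app_args=["input.txt"],
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in command))
```

To actually launch the application, the list could be passed to subprocess.run(command, check=True) on a host where the Spark client is installed.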
Alternatively, you can use Livy to submit and manage Spark applications on a cluster. Livy is a Spark service that allows local and remote applications to interact with Apache Spark over an open-source REST interface. Livy offers additional multi-tenancy and security functionality. For more information about using Livy to run Spark applications, see Submitting Spark Applications through Livy.
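To give a sense of what a REST-based submission might look like, the following Python sketch builds a batch request for Livy's /batches endpoint but does not send it. The helper name build_livy_batch_request, the Livy host and port, the JAR path, and the class name are all hypothetical placeholders. Check your cluster's Livy configuration for the actual address and for any authentication or extra headers it requires.

```python
import json
import urllib.request


def build_livy_batch_request(livy_url, app_file, class_name=None, app_args=()):
    """Build (but do not send) a POST request for Livy's /batches endpoint."""
    # The JSON body names the application file and its arguments.
    payload = {"file": app_file, "args": list(app_args)}
    if class_name is not None:
        # className is the entry point for Java or Scala applications.
        payload["className"] = class_name
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical Livy address and application details.
request = build_livy_batch_request(
    "http://livy-host.example.com:8999",
    "/path/to/example-app.jar",
    class_name="com.example.WordCount",
    app_args=["input.txt"],
)

print(request.get_method(), request.full_url)
# Sending is left out on purpose; on a real cluster it could be done with
# urllib.request.urlopen(request).
```

Building the request separately from sending it makes it possible to inspect the URL and payload before anything reaches the cluster.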
This chapter describes how to specify the Spark version for a Spark application, and how to run Spark 1 and Spark 2 sample programs.