Introduction
You can run Spark interactively or from a client program:
- Submit interactive statements through the Scala, Python, or R shell, or through a high-level notebook such as Zeppelin.
- Use APIs to create a Spark application that runs interactively or in batch mode, using Scala, Python, R, or Java.
To launch Spark applications on a cluster, you can use the spark-submit script in the Spark bin directory. You can also use the API interactively by launching an interactive shell for Scala (spark-shell), Python (pyspark), or SparkR. Note that each interactive shell automatically creates a SparkContext in a variable called sc. For more information about spark-submit, see the Apache Spark document "Submitting Applications".
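As a rough illustration of the spark-submit invocation described above, the following Python sketch assembles a command line without running it. The JAR path, main class name, and input argument are hypothetical placeholders, and the master and deploy mode values are assumptions about the target cluster; --master, --deploy-mode, and --class are standard spark-submit options.

```python
import shlex


def build_spark_submit(app, main_class=None, master="yarn",
                       deploy_mode="cluster", app_args=()):
    """Assemble a spark-submit argument list. Nothing is executed here."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if main_class:
        # --class is only needed for JVM (Scala/Java) applications.
        cmd += ["--class", main_class]
    # The application file comes after all options, followed by its own arguments.
    cmd.append(app)
    cmd.extend(app_args)
    return cmd


# Hypothetical application details, for illustration only.
cmd = build_spark_submit(
    "/path/to/example-app.jar",
    main_class="com.example.WordCount",
    app_args=["input.txt"],
)
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so the list could be passed to subprocess.run on a host where the Spark bin directory is on the PATH.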
Alternatively, you can use Livy to submit and manage Spark applications on a cluster. Livy is a Spark service that allows local and remote applications to interact with Apache Spark over an open-source REST interface. Livy offers additional multi-tenancy and security functionality. For more information about using Livy to run Spark applications, see "Submitting Spark Applications through Livy" in this guide.
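To make the REST interaction more concrete, here is a minimal Python sketch that prepares a batch submission request for Livy without sending it. The host name, JAR path, and class name are hypothetical; the sketch assumes Livy is listening on its default port 8998 and that the cluster accepts unauthenticated requests, which a secured cluster typically will not.

```python
import json
import urllib.request

# Hypothetical Livy endpoint; 8998 is Livy's default port.
LIVY_URL = "http://livy-host.example.com:8998"


def build_batch_request(livy_url, file, class_name=None, args=()):
    """Prepare a POST to Livy's /batches endpoint. The request is not sent."""
    payload = {"file": file, "args": list(args)}
    if class_name:
        payload["className"] = class_name
    return urllib.request.Request(
        url=f"{livy_url}/batches",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical application details, for illustration only.
req = build_batch_request(
    LIVY_URL,
    "/path/to/example-app.jar",
    class_name="com.example.WordCount",
    args=["input.txt"],
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To submit for real (requires a reachable Livy server):
# urllib.request.urlopen(req)
```

The file path in the payload must be readable by the cluster (for example, a location in HDFS), since Livy launches the application on the cluster side, not on the client.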