Chapter 5. Submitting Spark Applications Through Livy
Livy is a Spark service that allows local and remote applications to interact with Apache Spark over an open source REST interface. You can use Livy to submit and manage Spark jobs on a cluster. Livy extends Spark capabilities, offering additional multi-tenancy and security features. Applications can run code inside Spark without needing to maintain a local Spark context.
Features include the following:
Jobs can be submitted from anywhere, using the REST API.
Livy supports user impersonation: the Livy server submits jobs on behalf of the user who sends the request, so multiple users can share the same server. This is important for multi-tenant environments, and it avoids unnecessary permission escalation.
Livy supports security features such as Kerberos authentication and wire encryption.
REST APIs are protected by SPNEGO authentication, which requires the requesting user to authenticate with Kerberos first.
RPCs between the Livy server and the remote SparkContext are encrypted with SASL.
The Livy server uses keytabs to authenticate itself to Kerberos.
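As an illustration of submitting work over REST with impersonation, the following minimal Python sketch builds the JSON body for a session-creation request (POST /sessions) and sets the proxyUser field so that Livy runs the session as that user. The server address, the user name "alice", and the helper names build_session_request and post_json are hypothetical and chosen only for this example. The sketch uses only the Python standard library and does not perform SPNEGO negotiation, so on a Kerberos-enabled cluster a SPNEGO-capable HTTP client would be needed instead.

```python
import json
import urllib.request

# Assumption: hypothetical Livy server address; substitute your cluster's host and port.
LIVY_URL = "http://livy-host.example.com:8998"


def build_session_request(kind, proxy_user=None):
    """Return the JSON body for POST /sessions.

    kind selects the interpreter (for example "spark", "pyspark", or "sparkr").
    proxy_user, if given, asks Livy to impersonate that user.
    """
    body = {"kind": kind}
    if proxy_user is not None:
        body["proxyUser"] = proxy_user
    return body


def post_json(url, body):
    """POST a JSON body to url and return the decoded JSON response.

    This helper sends no Kerberos/SPNEGO credentials; it is only suitable
    for an unsecured test server.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not send) a request for a PySpark session run as user "alice".
session_request = build_session_request("pyspark", proxy_user="alice")
print(json.dumps(session_request, sort_keys=True))
```

Against a reachable server, calling post_json(LIVY_URL + "/sessions", session_request) would return the new session's description, including its id and state. Impersonation succeeds only if the Livy server itself is configured to allow it.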
Livy 0.3.0 supports programmatic and interactive access to Spark 1 and Spark 2 with Scala 2.10 and Scala 2.11:
Use an interactive notebook to access Spark through Livy.
Develop a Scala, Java, or Python client that uses the Livy API. The Livy REST API supports full Spark 1 and Spark 2 functionality including SparkSession, and SparkSession with Hive enabled.
Run an interactive session, provided by spark-shell, PySpark, or SparkR REPLs.
Submit batch applications to Spark.
Code runs in a Spark context, either locally or in YARN; YARN cluster mode is recommended.
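The two programmatic access modes listed above map to different REST resources: code snippets are posted as statements to an existing interactive session, and complete applications are posted as batches. The following Python sketch shows the request bodies for both. The helper names, the HDFS path, and the example class and arguments are hypothetical placeholders; the sketch only constructs the payloads and does not contact a server.

```python
import json


def statements_path(session_id):
    """Return the REST path for submitting code to an interactive session."""
    return "/sessions/{}/statements".format(session_id)


def build_statement_request(code):
    """Return the JSON body for POST /sessions/{sessionId}/statements."""
    return {"code": code}


def build_batch_request(file, class_name=None, args=None, proxy_user=None):
    """Return the JSON body for POST /batches.

    file is the application JAR or Python file; in YARN cluster mode it
    must be readable from the cluster (for example, a path in HDFS).
    """
    body = {"file": file}
    if class_name is not None:
        body["className"] = class_name
    if args:
        body["args"] = list(args)
    if proxy_user is not None:
        body["proxyUser"] = proxy_user
    return body


# Interactive: a snippet to run in session 0 (the session ID is hypothetical).
statement = build_statement_request("sc.parallelize(range(100)).sum()")
print(statements_path(0), json.dumps(statement))

# Batch: a packaged application (path, class, and arguments are placeholders).
batch = build_batch_request(
    "hdfs:///user/alice/spark-examples.jar",
    class_name="org.apache.spark.examples.SparkPi",
    args=["10"],
)
print("/batches", json.dumps(batch, sort_keys=True))
```

Statements run in the session's existing Spark context, so state persists between requests. Each batch starts its own application and is tracked through the batch resource that Livy returns.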
To install Livy on an Ambari-managed cluster, see Installing Spark Using Ambari. To install Livy on a cluster not managed by Ambari, see the Spark sections of the Command Line Installation Guide. For additional configuration steps, see Configuring the Livy Server.
Note: Spark does not currently support encrypted REST APIs for Livy.