Apache Spark Component Guide

Chapter 5. Submitting Spark Applications Through Livy

Livy is a Spark service that allows local and remote applications to interact with Apache Spark over an open source REST interface. You can use Livy to submit and manage Spark jobs on a cluster. Livy extends Spark capabilities, offering additional multi-tenancy and security features. Applications can run code inside Spark without needing to maintain a local Spark context.

Features include the following:

  • Jobs can be submitted from anywhere, using the REST API.

  • Livy supports user impersonation: the Livy server submits each job on behalf of the user who sent the request, so multiple users can share the same server. This is important for multi-tenant environments, and it avoids unnecessary permission escalation.

  • Livy supports security features such as Kerberos authentication and wire encryption.

    • REST APIs are protected by SPNEGO authentication, which requires the requesting user to authenticate with Kerberos first.

    • RPCs between Livy Server and Remote SparkContext are encrypted with SASL.

    • The Livy server uses keytabs to authenticate itself to Kerberos.
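As an illustration of the impersonation feature described above, the following minimal Python sketch builds, but does not send, a session-creation request that asks Livy to run the session as another user through the `proxyUser` field of `POST /sessions`. The host name, port, and user name `alice` are assumptions for illustration only. The sketch also leaves out SPNEGO negotiation, which a Kerberos-enabled cluster would require before the request is accepted.

```python
import json
import urllib.request

# Assumed endpoint: substitute the host and port of your own Livy server.
LIVY_URL = "http://livy-host.example.com:8998"


def build_session_request(kind, proxy_user=None, base_url=LIVY_URL):
    """Build (without sending) a POST /sessions request.

    kind       -- session kind, for example "spark", "pyspark", or "sparkr"
    proxy_user -- user to impersonate; omitted from the payload when None
    """
    payload = {"kind": kind}
    if proxy_user is not None:
        payload["proxyUser"] = proxy_user
    return urllib.request.Request(
        url=f"{base_url}/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "alice" is a hypothetical user name.
req = build_session_request("pyspark", proxy_user="alice")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would submit it. Whether the impersonation is honored depends on how the Livy server and Hadoop proxy-user settings are configured on the cluster.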

Livy 0.3.0 supports programmatic and interactive access to Spark 1 and Spark 2, with Scala 2.10 and Scala 2.11:

  • Use an interactive notebook to access Spark through Livy.

  • Develop a Scala, Java, or Python client that uses the Livy API. The Livy REST API supports full Spark 1 and Spark 2 functionality, including SparkSession and SparkSession with Hive enabled.

  • Run an interactive session, provided by spark-shell, PySpark, or SparkR REPLs.

  • Submit batch applications to Spark.
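The interactive and batch access modes listed above map onto two REST endpoints: `POST /sessions/{id}/statements` to run a code snippet in an existing interactive session, and `POST /batches` to submit a packaged application. The sketch below only constructs the requests so that their shape is visible. The server address, session ID, JAR path, and class name are hypothetical placeholders, not values from this guide.

```python
import json
import urllib.request

# Assumed endpoint: substitute the host and port of your own Livy server.
LIVY_URL = "http://livy-host.example.com:8998"


def post_json(path, payload, base_url=LIVY_URL):
    """Build (without sending) a JSON POST request for the given Livy path."""
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def statement_request(session_id, code):
    """Request that runs a code snippet in an existing interactive session."""
    return post_json(f"/sessions/{session_id}/statements", {"code": code})


def batch_request(file, class_name=None, args=None):
    """Request that submits a batch application (JAR or Python file)."""
    payload = {"file": file}
    if class_name is not None:
        payload["className"] = class_name
    if args:
        payload["args"] = list(args)
    return post_json("/batches", payload)


# Hypothetical session ID, application path, and main class.
stmt = statement_request(0, "sc.parallelize(range(100)).sum()")
batch = batch_request(
    "hdfs:///apps/example/spark-app.jar",
    class_name="com.example.SparkApp",
    args=["--date", "2017-01-01"],
)
for r in (stmt, batch):
    print(r.get_method(), r.full_url, r.data.decode("utf-8"))
```

A statement is executed asynchronously, so a real client would typically poll `GET /sessions/{id}/statements/{statementId}` until the result is available. Batch state can be checked in a similar way with `GET /batches/{batchId}`.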

Code runs in a Spark context, either locally or in YARN; YARN cluster mode is recommended.

To install Livy on an Ambari-managed cluster, see Installing Spark Using Ambari. To install Livy on a cluster not managed by Ambari, see the Spark sections of the Command Line Installation Guide. For additional configuration steps, see Configuring the Livy Server.

Note: Spark does not currently support encrypted REST APIs for Livy.