Apache Zeppelin Component Guide

Configuring Livy on an Ambari-Managed Cluster

Livy is a proxy service for Apache Spark; it offers the following capabilities:

  • Zeppelin users can launch a Spark session on a cluster, submit code, and retrieve job results, all over a secure connection.

  • When Zeppelin runs with authentication enabled, Livy propagates user information when a session is created. Livy user impersonation offers an extended multi-tenant experience, allowing users to share RDDs and cluster resources. Multiple users can access their own private data and session, and collaborate on a notebook.

Note: Livy supports Kerberos, but does not require it.
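The session lifecycle described above (launch a session, submit code, retrieve results) maps onto Livy's REST endpoints. The following is a minimal sketch that only builds the HTTP requests and does not send them. It assumes an unsecured Livy server; the host name `livy-host.example.com`, the helper names, and the user `alice` are hypothetical and not part of this guide. On a Kerberos-enabled cluster the requests would additionally need authentication, which is not shown.

```python
import json
import urllib.request

# Hypothetical Livy endpoint; 8998 is the default Livy port.
LIVY_URL = "http://livy-host.example.com:8998"


def _post(path, payload):
    """Build (but do not send) a JSON POST request for the Livy server."""
    return urllib.request.Request(
        LIVY_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def create_session_request(proxy_user=None):
    """Request that launches a Spark session, optionally impersonating a user."""
    payload = {"kind": "spark"}
    if proxy_user:
        # proxyUser asks Livy to run the session as this user (impersonation).
        payload["proxyUser"] = proxy_user
    return _post("/sessions", payload)


def submit_code_request(session_id, code):
    """Request that submits a code snippet to an existing session."""
    return _post(f"/sessions/{session_id}/statements", {"code": code})


def result_request(session_id, statement_id):
    """Request that retrieves the state and output of a submitted statement."""
    url = f"{LIVY_URL}/sessions/{session_id}/statements/{statement_id}"
    return urllib.request.Request(url, method="GET")
```

Each request object can be passed to `urllib.request.urlopen` to send it. Zeppelin's Livy interpreter performs the equivalent calls on the user's behalf, so this sketch is mainly useful for understanding or debugging the exchange.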

The following graphic shows process communication among Zeppelin, Livy, and Spark:

On an Ambari-managed cluster, Livy is installed with Spark.

Here are several optional configuration steps:

  • (Kerberos-enabled clusters) Ensure that access is enabled only for groups and hosts where Livy runs.

  • Check the Livy host URL:

    1. Navigate to the Interpreter configuration page in the Zeppelin Web UI.

    2. In the livy interpreter section, make sure that the zeppelin.livy.url property contains the full Livy host name; replace localhost if necessary.

    3. Scroll down and click "Save".

    Note: on an Ambari-managed cluster you can find the Livy host from the Ambari dashboard: navigate to Spark > Summary > Livy Server.

  • Configure Livy impersonation:

    1. From the Ambari dashboard, navigate to Spark > Configs.

    2. Open the "Custom livy-conf" category.

    3. Ensure that livy.superusers is listed; if not, add the property.

    4. Set livy.superusers to the user account associated with Zeppelin, that is, the value of the zeppelin.livy.principal property.

      For example, if zeppelin.livy.principal is zeppelin-sr1@example.com, set livy.superusers to the same account, zeppelin-sr1@example.com.

  • Specify a timeout value for Livy sessions.

    By default, Livy preserves cluster resources by recycling sessions after one hour of session inactivity. When a Livy session times out, the Livy interpreter must be restarted.

    To specify a larger or smaller value using Ambari, navigate to the livy.server.session.timeout property in the "Advanced livy-conf" section of the Spark service. Specify the timeout in milliseconds; the default is 3600000 (one hour).
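Because livy.server.session.timeout is expressed in milliseconds, it is easy to be off by a factor of 1000. As a small illustration, the helper below (a hypothetical name, not part of Livy or Ambari) converts a human-readable duration into the millisecond value to enter in Ambari.

```python
from datetime import timedelta


def livy_timeout_ms(**duration):
    """Convert a duration (e.g. hours=2, minutes=30) to whole milliseconds."""
    # Dividing one timedelta by another with // yields an exact integer.
    return timedelta(**duration) // timedelta(milliseconds=1)


# The documented default of one hour:
print(livy_timeout_ms(hours=1))     # 3600000
# A shorter, 30-minute timeout:
print(livy_timeout_ms(minutes=30))  # 1800000
```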

If you change any Livy interpreter settings, restart the Livy interpreter.

Navigate to the Interpreter configuration page in the Zeppelin Web UI, locate the Livy interpreter, and click "restart".

To verify that the Livy server is running, access the Livy web UI in a browser window. The default port is 8998:

http://<livy-hostname>:8998/
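The same check can be scripted instead of using a browser. The sketch below is one possible approach, assuming an unsecured Livy server that answers plain HTTP on the default port; the function names and the example host name are hypothetical. On a Kerberos-enabled cluster an unauthenticated request is likely to be rejected, in which case this probe reports False even though the server is running.

```python
import urllib.error
import urllib.request


def livy_ui_url(hostname, port=8998):
    """Build the Livy web UI URL; 8998 is the default port."""
    return f"http://{hostname}:{port}/"


def livy_is_up(hostname, port=8998, timeout=5.0):
    """Return True if the Livy web UI answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(livy_ui_url(hostname, port), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

For example, `livy_is_up("livy-host.example.com")` probes the default port on that (hypothetical) host.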