Apache Zeppelin Component Guide

Configuring Livy on an Ambari-Managed Cluster

Livy is a proxy service for Apache Spark; it offers the following capabilities:

  • Zeppelin users can launch a Spark session on a cluster, submit code, and retrieve job results, all over a secure connection.

  • When Zeppelin runs with authentication enabled, Livy propagates user information when a session is created. Livy user impersonation offers an extended multi-tenant experience, allowing users to share RDDs and cluster resources. Each user can access their own private data and session, and multiple users can collaborate on a notebook.

Note: Livy supports Kerberos but does not require it. Zeppelin-to-Livy encryption is not supported in the current version of Zeppelin.

The following graphic shows process communication among Zeppelin, Livy, and Spark:

On an Ambari-managed cluster, Livy is installed with Spark.

Here are several optional configuration steps:

  • (Kerberos-enabled clusters) Ensure that access is enabled only for groups and hosts where Livy runs.

  • Check the Livy host URL:

    1. Navigate to the Interpreter configuration page in the Zeppelin Web UI.

    2. In the livy interpreter section, make sure that the zeppelin.livy.url property contains the full Livy host name; replace localhost if necessary.

    3. Scroll down to the "Save" button, and click "Save".

    Note: On an Ambari-managed cluster, you can find the Livy host from the Ambari dashboard: navigate to Spark > Summary > Livy Server.

  • Configure Livy impersonation:

    1. From the Ambari dashboard, navigate to Spark > Configs.

    2. Open the "Custom livy-conf" category.

    3. Ensure that livy.superusers is listed; if not, add the property.

    4. Set livy.superusers to the user account associated with Zeppelin, which is the value of the zeppelin.livy.principal property.

      For example, if zeppelin.livy.principal is zeppelin-sr1@example.com, set livy.superusers to the same account, zeppelin-sr1@example.com.

  • Specify a timeout value for Livy sessions.

    By default, Livy preserves cluster resources by recycling sessions after one hour of session inactivity. When a Livy session times out, the Livy interpreter must be restarted.

    To specify a larger or smaller value using Ambari, navigate to the livy.server.session.timeout property in the "Advanced livy-conf" section of the Spark service. Specify the timeout in milliseconds (the default is 3600000, which is one hour).
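Because the timeout is entered in milliseconds, a small conversion helper can reduce arithmetic slips. The following is a minimal Python sketch; the function name session_timeout_ms is hypothetical and is not part of Livy or Ambari. It only computes the number to type into the livy.server.session.timeout field.

```python
from datetime import timedelta


def session_timeout_ms(duration: timedelta) -> int:
    """Convert a duration to whole milliseconds.

    Hypothetical helper: the result is the value to enter for
    livy.server.session.timeout, which is specified in milliseconds.
    """
    # Dividing one timedelta by another with // yields an exact integer.
    return duration // timedelta(milliseconds=1)


print(session_timeout_ms(timedelta(hours=1)))     # 3600000 (the default)
print(session_timeout_ms(timedelta(minutes=30)))  # 1800000
print(session_timeout_ms(timedelta(hours=2)))     # 7200000
```
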

If you change any Livy interpreter settings, restart the Livy interpreter.

Navigate to the Interpreter configuration page in the Zeppelin Web UI. Locate the Livy interpreter and click "restart".

To verify that the Livy server is running, access the Livy web UI in a browser window. The default port is 8998:

http://<livy-hostname>:8998/
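The same check can be scripted instead of opening a browser. The following is a minimal Python sketch that sends a GET request to the Livy REST endpoint /sessions on the default port and reports whether the server answered with HTTP 200. The function names livy_sessions_url and livy_is_up are hypothetical, and the sketch assumes an unsecured HTTP endpoint; on a Kerberos-enabled cluster the unauthenticated request is likely to be rejected, in which case the function returns False even though the server is running.

```python
import urllib.error
import urllib.request


def livy_sessions_url(host: str, port: int = 8998) -> str:
    """Build the URL of the Livy REST /sessions endpoint (8998 is the default port)."""
    return f"http://{host}:{port}/sessions"


def livy_is_up(host: str, port: int = 8998, timeout: float = 5.0) -> bool:
    """Return True if the Livy server answers GET /sessions with HTTP 200.

    Any connection failure, timeout, or HTTP error status is reported as False.
    """
    try:
        url = livy_sessions_url(host, port)
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # URLError covers DNS failures, refused connections, and HTTP error codes.
        return False
```

To use it, call livy_is_up with the Livy host name shown in Ambari under Spark > Summary > Livy Server, and treat a False result as a prompt to check the host name, the port, and the server status.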