Apache Zeppelin Component Guide

Configuring Livy on an Ambari-Managed Cluster

Livy is a proxy service for Apache Spark. On HDP 2.5, Livy offers two main capabilities:

  • Zeppelin users can launch a Spark session on a cluster, submit code, and retrieve job results, all over a secure connection.

  • When Zeppelin runs with authentication enabled, Livy propagates user information when a session is created. Livy user impersonation offers an extended multi-tenant experience, allowing users to share RDDs and cluster resources. Multiple users can access their own private data and session, and collaborate on a notebook.

Note

Livy can be accessed through the %livy interpreter in Zeppelin notebooks, but direct use of Livy REST APIs is not supported.

Livy supports Kerberos, but does not require it.

The following graphic shows process communication among Zeppelin, Livy, and Spark:

On an Ambari-managed cluster, Livy is available as an option when you install Spark. After installing Livy, complete the following configuration steps.

  1. (Optional, for Kerberos-enabled clusters) Ensure that access is enabled only for groups and hosts where Livy runs.

  2. Specify the Livy host URL:

    1. Navigate to the Interpreter configuration page in the Zeppelin Web UI.

    2. In the livy interpreter section, make sure that the zeppelin.livy.url property contains the full Livy host name; replace localhost if necessary.

    3. Scroll down and click "Save".

    Note: On an Ambari-managed cluster, you can find the Livy host from the Ambari dashboard by navigating to Spark > Summary > Livy Server.

  3. (Optional) Configure Livy impersonation:

    1. From the Ambari dashboard, navigate to Spark > Configs.

    2. Open the "Custom livy-conf" category.

    3. Ensure that livy.superusers is listed; if not, add the property.

    4. Set livy.superusers to the user account associated with Zeppelin, which is the value of the zeppelin.livy.principal property.

      For example, if zeppelin.livy.principal is zeppelin-sr1@example.com, set livy.superusers to the same account, zeppelin-sr1@example.com.

  4. (Optional) Specify a timeout value for Livy sessions.

    By default, Livy preserves cluster resources by recycling sessions after one hour of session inactivity. When a Livy session times out, the Livy interpreter must be restarted.

    To specify a larger or smaller value using Ambari, navigate to the livy.server.session.timeout property in the "Advanced livy-conf" section of the Spark service. Specify the timeout in milliseconds. The default of one hour is 3600000.

  5. If you changed any Livy interpreter settings, restart the Livy interpreter.

    Navigate to the Interpreter configuration page in the Zeppelin Web UI. Locate the Livy interpreter and click "restart".
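The values set in steps 2 through 4 can be sanity-checked before you restart the interpreter. The following Python sketch is illustrative only: the helper names check_livy_settings and hours_to_ms, and the idea of copying property values by hand into a dictionary, are assumptions made for this example and are not part of Zeppelin, Livy, or Ambari. The sketch flags a zeppelin.livy.url value that still points to localhost, a livy.superusers value that differs from zeppelin.livy.principal, and a timeout that is not a positive number of milliseconds.

```python
from urllib.parse import urlparse

MS_PER_HOUR = 60 * 60 * 1000  # 3600000, the default session timeout


def hours_to_ms(hours):
    """Convert hours to the millisecond value used by the timeout property."""
    return int(hours * MS_PER_HOUR)


def check_livy_settings(props):
    """Return a list of warnings for a dict of property name -> string value.

    Hypothetical helper: the caller copies the values by hand from the
    Zeppelin Interpreter page and the Ambari Spark configs.
    An empty list means no issues were found.
    """
    warnings = []

    # Step 2: the URL should name the real Livy host, not localhost.
    host = urlparse(props.get("zeppelin.livy.url", "")).hostname
    if host in (None, "localhost", "127.0.0.1"):
        warnings.append("zeppelin.livy.url should contain the full Livy host name")

    # Step 3 (optional): only checked when a principal is configured.
    principal = props.get("zeppelin.livy.principal")
    if principal and props.get("livy.superusers") != principal:
        warnings.append("livy.superusers should match zeppelin.livy.principal")

    # Step 4 (optional): the timeout is expressed in milliseconds.
    timeout = props.get("livy.server.session.timeout", str(MS_PER_HOUR))
    if not timeout.isdigit() or int(timeout) <= 0:
        warnings.append(
            "livy.server.session.timeout must be a positive number of milliseconds"
        )

    return warnings


# Example values; the host name is a placeholder.
example = {
    "zeppelin.livy.url": "http://livy-host.example.com:8998",
    "zeppelin.livy.principal": "zeppelin-sr1@example.com",
    "livy.superusers": "zeppelin-sr1@example.com",
    "livy.server.session.timeout": str(hours_to_ms(2)),
}
print(check_livy_settings(example))
```

The check compares livy.superusers and zeppelin.livy.principal for exact equality because that is what step 3 describes. Relax the comparison if your deployment lists more than one superuser.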

To verify that the Livy server is running, access the Livy web UI in a browser window. The default port is 8998:

http://<livy-hostname>:8998/
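
The same check can be scripted. The following Python sketch is a minimal example that uses only the standard library; the function names livy_ui_url and livy_ui_is_up are hypothetical, and the host name in the usage note is a placeholder. It requests the web UI address shown above and reports whether the server answered successfully. On a secured cluster an unauthenticated request may be rejected, in which case the function returns False even though the server is running.

```python
import http.client
import urllib.error
import urllib.request


def livy_ui_url(hostname, port=8998):
    """Build the Livy web UI address; 8998 is the default port."""
    return f"http://{hostname}:{port}/"


def livy_ui_is_up(hostname, port=8998, timeout=5.0):
    """Return True if the Livy web UI answers with a success or redirect status.

    Connection failures, timeouts, and HTTP error responses all yield False.
    """
    url = livy_ui_url(hostname, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        return False
```

For example, calling livy_ui_is_up("livy-host.example.com") returns True when the web UI responds on port 8998 of that host.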