Configuring Livy on an Ambari-Managed Cluster
Livy is a proxy service for Apache Spark. On HDP 2.5, Livy offers two main capabilities:
Zeppelin users can launch a Spark session on a cluster, submit code, and retrieve job results, all over a secure connection.
When Zeppelin runs with authentication enabled, Livy propagates user information when a session is created. Livy user impersonation offers an extended multi-tenant experience, allowing users to share RDDs and cluster resources. Multiple users can access their own private data and session, and collaborate on a notebook.
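As a rough illustration of how user information can accompany session creation, the sketch below builds (but does not send) a request to Livy's REST endpoint POST /sessions, passing the authenticated user in the proxyUser field. The host name, the user name "alice", and the helper name build_session_request are placeholders chosen for this example. In practice the Zeppelin Livy interpreter issues this request on the user's behalf.

```python
import json
import urllib.request


def build_session_request(livy_url, proxy_user, kind="spark"):
    """Build (without sending) a request that asks Livy for a new session.

    livy_url and proxy_user are placeholder values for this sketch.
    """
    body = {"kind": kind, "proxyUser": proxy_user}
    return urllib.request.Request(
        url=livy_url.rstrip("/") + "/sessions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host and user, for illustration only.
request = build_session_request("http://livy-host.example.com:8998", "alice")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```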
Note: Livy supports Kerberos, but does not require it.
The following graphic shows process communication among Zeppelin, Livy, and Spark:
On an Ambari-managed cluster, Livy is available as an option when you install Spark. After installing Livy, complete the following configuration steps.
(Optional, for Kerberos-enabled clusters) Ensure that access is enabled only for groups and hosts where Livy runs.
Specify the Livy host URL:
Navigate to the Interpreter configuration page in the Zeppelin Web UI.
In the livy interpreter section, make sure that the zeppelin.livy.url property contains the full Livy host name; replace localhost if necessary.
Scroll down to the "Save" button, and click "Save".
Note: on an Ambari-managed cluster you can find the Livy host from the Ambari dashboard: navigate to Spark > Summary > Livy Server.
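The check described above can be sketched as a small helper that inspects a candidate zeppelin.livy.url value before you paste it into the interpreter settings. The function name check_livy_url, the list of local host names, and the example host are assumptions made for this illustration; they are not part of Zeppelin or Livy.

```python
from urllib.parse import urlparse

# Names that only resolve on the machine itself (illustrative, not exhaustive).
LOCAL_NAMES = {"localhost", "127.0.0.1", "::1"}


def check_livy_url(url):
    """Return a list of problems found in a zeppelin.livy.url value."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("URL should start with http:// or https://")
    host = parsed.hostname
    if not host:
        problems.append("URL has no host name")
    elif host in LOCAL_NAMES:
        problems.append("host is local; replace it with the full Livy host name")
    elif "." not in host:
        problems.append("host does not look fully qualified")
    return problems


# Hypothetical values, for illustration only.
for candidate in ("http://localhost:8998", "http://livy-host.example.com:8998"):
    print(candidate, check_livy_url(candidate) or "looks fine")
```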
(Optional) Configure Livy impersonation:
From the Ambari dashboard, navigate to Spark > Configs.
Open the "Custom livy-conf" category.
Ensure that livy.superusers is listed; if not, add the property.
Set livy.superusers to the user account associated with Zeppelin, zeppelin.livy.principal.
For example, if zeppelin.livy.principal is zeppelin-sr1@example.com, set livy.superusers to the same account, zeppelin-sr1@example.com.
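One way to double-check the impersonation settings is to compare the two property values directly. The sketch below assumes the properties have been copied into plain Python dictionaries and that livy.superusers may hold a comma-separated list of accounts. The helper name superusers_include and the dictionaries are hypothetical.

```python
def superusers_include(livy_conf, zeppelin_principal):
    """Return True if livy.superusers lists the given Zeppelin principal.

    Assumes livy.superusers is a comma-separated list of account names.
    """
    raw = livy_conf.get("livy.superusers", "")
    superusers = [name.strip() for name in raw.split(",") if name.strip()]
    return zeppelin_principal in superusers


# Values copied by hand from the Ambari and Zeppelin UIs (illustrative only).
livy_conf = {"livy.superusers": "zeppelin-sr1@example.com"}
zeppelin_settings = {"zeppelin.livy.principal": "zeppelin-sr1@example.com"}

ok = superusers_include(livy_conf, zeppelin_settings["zeppelin.livy.principal"])
print("impersonation settings match" if ok else "livy.superusers needs updating")
```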
(Optional) Specify a timeout value for Livy sessions.
By default, Livy preserves cluster resources by recycling sessions after one hour of session inactivity. When a Livy session times out, the Livy interpreter must be restarted.
To specify a larger or smaller value using Ambari, navigate to the livy.server.session.timeout property in the "Advanced livy-conf" section of the Spark service. Specify the timeout in milliseconds (the default, one hour, is 3600000).
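Because the property takes milliseconds, it can help to compute the value rather than count zeros by hand. The following is a minimal sketch; the helper name session_timeout_ms is made up for this example.

```python
from datetime import timedelta


def session_timeout_ms(duration):
    """Convert a timedelta into the millisecond value the property expects."""
    # Dividing one timedelta by another with // yields an exact integer.
    return duration // timedelta(milliseconds=1)


print(session_timeout_ms(timedelta(hours=1)))     # the default, one hour
print(session_timeout_ms(timedelta(minutes=30)))  # a shorter timeout
```

Enter the resulting number as the value of livy.server.session.timeout.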
If you changed any Livy interpreter settings, restart the Livy interpreter.
Navigate to the Interpreter configuration page in the Zeppelin Web UI. Locate the Livy interpreter and click "restart".
To verify that the Livy server is running, access the Livy web UI in a browser window. The default port is 8998:
http://<livy-hostname>:8998/
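The same check can be scripted. The sketch below sends a GET request to the /sessions endpoint of Livy's REST API and reports whether the server answered with HTTP 200. It assumes an unsecured endpoint; on a Kerberos-enabled cluster the server may reject an unauthenticated request even though it is running. The function names and the example host are placeholders.

```python
import http.client
import urllib.request


def livy_sessions_url(host, port=8998):
    """Build the URL of Livy's session listing endpoint."""
    return f"http://{host}:{port}/sessions"


def livy_is_running(host, port=8998, timeout=5.0):
    """Return True if the Livy server answers GET /sessions with HTTP 200."""
    try:
        with urllib.request.urlopen(livy_sessions_url(host, port), timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, DNS failures, and HTTP errors.
        return False


if __name__ == "__main__":
    # Replace the placeholder with your Livy host name.
    print(livy_is_running("livy-host.example.com"))
```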