Configuring Livy on an Ambari-Managed Cluster
Livy is a proxy service for Apache Spark; it offers the following capabilities:
Zeppelin users can launch a Spark session on a cluster, submit code, and retrieve job results, all over a secure connection.
When Zeppelin runs with authentication enabled, Livy propagates user information when a session is created. Livy user impersonation offers an extended multi-tenant experience, allowing users to share RDDs and cluster resources. Multiple users can access their own private data and session, and collaborate on a notebook.
Note: Livy supports Kerberos, but does not require it.
The following graphic shows process communication among Zeppelin, Livy, and Spark:
On an Ambari-managed cluster, Livy is installed with Spark.
Here are several optional configuration steps:
(Kerberos-enabled clusters) Ensure that access is enabled only for groups and hosts where Livy runs.
Check the Livy host URL:
Navigate to the Interpreter configuration page in the Zeppelin Web UI.
In the livy interpreter section, make sure that the zeppelin.livy.url property contains the full Livy host name; replace localhost if necessary.
Scroll down to the "Save" button, and click "Save".
Note: on an Ambari-managed cluster you can find the Livy host from the Ambari dashboard: navigate to Spark > Summary > Livy Server.
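The host-name check above can be sketched in a few lines of Python. This is only an illustration, not part of Zeppelin or Livy: the helper name uses_full_host_name and the sample host livy-host.example.com are hypothetical, and the function simply tests whether a candidate zeppelin.livy.url value still points at the local machine.

```python
from urllib.parse import urlparse

# Host names that only resolve on the local machine.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def uses_full_host_name(livy_url: str) -> bool:
    """Return True if the URL names a host other than the loopback address.

    Hypothetical helper for illustration; a value without a scheme
    (for example "localhost:8998") has no parsable host and returns False.
    """
    host = urlparse(livy_url).hostname
    return host is not None and host not in LOOPBACK_HOSTS


print(uses_full_host_name("http://localhost:8998"))              # False
print(uses_full_host_name("http://livy-host.example.com:8998"))  # True
```

A value that fails this check is one where localhost should be replaced with the Livy host shown in Ambari.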
Configure Livy impersonation:
From the Ambari dashboard, navigate to Spark > Configs.
Open the "Custom livy-conf" category.
Ensure that livy.superusers is listed; if not, add the property.
Set livy.superusers to the user account associated with Zeppelin, zeppelin.livy.principal.
For example, if zeppelin.livy.principal is zeppelin-sr1@example.com, set livy.superusers to the same account, zeppelin-sr1@example.com.
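As a rough illustration of the rule above, the following Python sketch compares the two settings. It assumes that the properties have been copied by hand into a plain dictionary and that livy.superusers may hold a comma-separated list of accounts; the function name zeppelin_is_superuser is hypothetical and is not an Ambari or Livy API.

```python
def zeppelin_is_superuser(livy_conf: dict, zeppelin_principal: str) -> bool:
    """Return True if the Zeppelin account appears in livy.superusers.

    Assumption: livy_conf maps property names to their string values, and
    livy.superusers is a comma-separated list of accounts.
    """
    raw = livy_conf.get("livy.superusers", "")
    superusers = [name.strip() for name in raw.split(",") if name.strip()]
    return zeppelin_principal in superusers


# Values taken from the example in the text.
conf = {"livy.superusers": "zeppelin-sr1@example.com"}
print(zeppelin_is_superuser(conf, "zeppelin-sr1@example.com"))  # True
print(zeppelin_is_superuser({}, "zeppelin-sr1@example.com"))    # False
```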
Specify a timeout value for Livy sessions.
By default, Livy preserves cluster resources by recycling sessions after one hour of session inactivity. When a Livy session times out, the Livy interpreter must be restarted.
To specify a larger or smaller value using Ambari, navigate to the livy.server.session.timeout property in the "Advanced livy-conf" section of the Spark service. Specify the timeout in milliseconds (the default is 3600000, one hour).
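Because the value is given in milliseconds, it is easy to drop or add a zero. The short Python sketch below shows the arithmetic; the helper name timeout_millis is hypothetical and is used only to compute the number to enter for livy.server.session.timeout.

```python
from datetime import timedelta


def timeout_millis(duration: timedelta) -> int:
    """Convert a duration to whole milliseconds."""
    return duration // timedelta(milliseconds=1)


print(timeout_millis(timedelta(hours=1)))     # 3600000, the default
print(timeout_millis(timedelta(minutes=30)))  # 1800000
print(timeout_millis(timedelta(hours=4)))     # 14400000
```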
If you change any Livy interpreter settings, restart the Livy interpreter.
Navigate to the Interpreter configuration page in the Zeppelin Web UI. Locate the Livy interpreter and click "restart".
To verify that the Livy server is running, access the Livy web UI in a browser window. The default port is 8998:
http://<livy-hostname>:8998/
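The same check can be scripted. The following Python sketch is one possible approach, not an official tool: it requests the /sessions endpoint of the Livy REST API on the default port and treats a JSON response with HTTP status 200 as "running". The function names and the host livy-host.example.com are hypothetical. The sketch sends no credentials, so on a Kerberos-enabled cluster the request is expected to be rejected and the function returns False even when the server is up.

```python
import json
import urllib.request


def livy_sessions_url(host: str, port: int = 8998) -> str:
    """Build the URL of the Livy REST endpoint that lists sessions."""
    return f"http://{host}:{port}/sessions"


def livy_is_running(host: str, port: int = 8998, timeout: float = 5.0) -> bool:
    """Return True if the Livy server answers /sessions with valid JSON.

    Connection errors, HTTP errors, and timeouts are all OSError subclasses;
    a body that is not JSON raises ValueError. Either case returns False.
    """
    try:
        with urllib.request.urlopen(livy_sessions_url(host, port), timeout=timeout) as resp:
            json.load(resp)
            return resp.status == 200
    except (OSError, ValueError):
        return False


# Example call (replace the hypothetical host with your Livy host):
# print(livy_is_running("livy-host.example.com"))
```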