Configuring Livy
Perform the following steps to configure Livy:
Log in as root, or use root privilege, to create user livy. Optionally, if you plan to use Livy with Zeppelin, create user zeppelin:
useradd livy -g hadoop
useradd zeppelin -g hadoop
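To confirm that the accounts were created and belong to the hadoop group, you can check them with the standard id command:
id livy
id zeppelin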
Create a log directory for Livy:
mkdir /var/log/livy2
Change the owner of /var/log/livy2 to livy:hadoop:
chown livy:hadoop /var/log/livy2
The file /etc/livy2/livy.conf contains server configuration settings. Create the file /etc/livy2/livy.conf and add the following to it:
livy.environment production
livy.impersonation.enabled true
livy.server.csrf_protection.enabled true
livy.server.port 8998
livy.server.session.timeout 3600000
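Because livy.server.csrf_protection.enabled is set to true above, REST calls that modify state (such as creating a session) must include an X-Requested-By header. As a minimal sketch, assuming the server has been started and is listening on the configured port 8998 on the local host:
# Create an interactive Spark session; the header value can be any non-empty string.
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: admin" \
  -d '{"kind": "spark"}' \
  http://localhost:8998/sessions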
To enable Livy recovery, add the following three settings to the /etc/livy2/livy.conf file (an example configuration follows this list):
- livy.server.recovery.mode
  Specifies the Livy recovery mode. Possible values for this setting are:
  - off
    Default. Turns off recovery. Every time Livy shuts down, it forgets previous sessions.
  - recovery
    Livy persists session information to the state store. When Livy restarts, it recovers previous sessions from the state store.
- livy.server.recovery.state-store
  Specifies where Livy stores state for the recovery process. Possible values for this setting are:
  - <empty>
    Disables the state store. This is the default setting.
  - filesystem
    Stores state in a file system.
  - zookeeper
    Stores state in a ZooKeeper instance.
- livy.server.recovery.state-store.url
  When a file system is used for the state store, specifies the path of the state store directory. You can specify any Hadoop-compatible file system that supports atomic rename.
  When ZooKeeper is used for the state store, specifies the addresses of the ZooKeeper servers, for example, host1:port1 and host2:port2.
  Important: Do not use a file system that does not support atomic rename (for example, S3). Examples of state store URLs on file systems that do support atomic rename are file:///tmp/livy and hdfs:///.
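For example, a filesystem-backed recovery configuration in /etc/livy2/livy.conf might look like the following; the HDFS directory shown is only an illustration, so substitute a path that exists in your cluster:
# Illustrative recovery settings; the state store directory below is an assumed example path.
livy.server.recovery.mode recovery
livy.server.recovery.state-store filesystem
livy.server.recovery.state-store.url hdfs:///livy2/recovery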
The file /etc/livy2/spark-blacklist.conf defines a list of properties that users are not allowed to override when a Spark 2 session is started. Create the file /etc/livy2/spark-blacklist.conf and add the following to it:
# Disallow overriding the master and the deploy mode.
spark.master
spark.submit.deployMode

# Disallow overriding the location of Spark cached jars.
spark.yarn.jar
spark.yarn.jars
spark.yarn.archive

# Don't allow users to override the RSC timeout.
livy.rsc.server.idle_timeout
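With this blacklist in place, a session request that tries to override one of these properties should be rejected once Livy is running. A sketch of such a request, reusing the X-Requested-By header required by CSRF protection (the exact error response may differ):
# Attempt to override spark.master, which is blacklisted above; Livy should refuse the request.
curl -X POST \
  -H "Content-Type: application/json" \
  -H "X-Requested-By: admin" \
  -d '{"kind": "spark", "conf": {"spark.master": "local"}}' \
  http://localhost:8998/sessions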
Create the file /etc/livy2/livy-env.sh to define environment variables, and add the following to it:
livy.spark.master=yarn-cluster
export SPARK_HOME=/usr/hdp/current/spark2-client
export LIVY_LOG_DIR=/var/log/livy2
export LIVY_PID_DIR=/var/run/livy2
export LIVY_SERVER_JAVA_OPTS="-Xmx2g"
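Note that livy-env.sh points LIVY_PID_DIR at /var/run/livy2. If that directory does not already exist on your node, you may need to create it and give it the same ownership as the log directory, for example:
mkdir /var/run/livy2
chown livy:hadoop /var/run/livy2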
If you are not using Kerberos, skip these steps.
If you are using Kerberos, create Livy principals and keytabs. Optionally, if you plan to use Livy with Zeppelin, create Zeppelin principals and keytabs:
kadmin.local -q "addprinc -randkey livy@EXAMPLE.COM"
kadmin.local -q "xst -k /etc/security/keytabs/livy.headless.keytab livy@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey zeppelin@EXAMPLE.COM"
kadmin.local -q "xst -k /etc/security/keytabs/zeppelin.headless.keytab zeppelin@EXAMPLE.COM"
If you are using Kerberos, move the Livy and Zeppelin keytabs to the node on which Livy and Zeppelin will run, and set their ownership:
chown livy:hadoop /etc/security/keytabs/livy.headless.keytab
chown zeppelin:hadoop /etc/security/keytabs/zeppelin.headless.keytab
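To confirm that each keytab contains the expected principal, you can list its entries with klist from the standard Kerberos client tools:
klist -kt /etc/security/keytabs/livy.headless.keytab
klist -kt /etc/security/keytabs/zeppelin.headless.keytab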
If you are using Kerberos, add the following to livy.conf:
livy.server.auth.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
livy.server.auth.kerberos.principal HTTP/_HOST@EXAMPLE.COM
livy.server.auth.type kerberos
livy.server.launch.kerberos.keytab /etc/security/keytabs/livy.headless.keytab
livy.server.launch.kerberos.principal livy/_HOST@EXAMPLE.COM
Ensure that the Livy user can read the contents of the /etc/livy2/conf directory.
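One way to check and, if necessary, fix these permissions is sketched below; it assumes the configuration files created above are the ones Livy reads, so adjust the path to match your layout:
ls -l /etc/livy2
chown -R livy:hadoop /etc/livy2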