Configuring Livy

Log in as root, or use root privileges, and create the user livy. Optionally, if you plan to use Livy with Zeppelin, also create the user zeppelin.

useradd livy -g hadoop
useradd zeppelin -g hadoop

Create a log directory for Livy:

mkdir /var/log/livy

Change the owner of /var/log/livy to livy:hadoop:

chown livy:hadoop /var/log/livy

/etc/livy/livy.conf contains information regarding server configuration.

Create file /etc/livy/livy.conf and add the following to the file:

livy.spark.master yarn-cluster
livy.environment production
livy.impersonation.enabled true
livy.server.csrf_protection.enabled true
livy.server.port 8998
livy.server.session.timeout 3600000
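
To confirm what a finished livy.conf actually contains, a small lookup helper can be handy. The following is a minimal sketch, assuming the file is read as a Java-style properties file in which a key and its value may be separated by whitespace, `=` or `:`. The function name `get_livy_prop` is hypothetical and is not part of Livy; the demo runs against a scratch file rather than /etc/livy/livy.conf.

```shell
# Hypothetical helper (not part of Livy): print the value of one key from a
# properties-style file. Assumes key and value are separated by whitespace,
# "=" or ":", and that lines starting with "#" or "!" are comments.
get_livy_prop() {
  local file="$1" key="$2"
  awk -v key="$key" '
    /^[ \t]*(#|!|$)/ { next }              # skip comments and blank lines
    {
      line = $0
      sub(/^[ \t]+/, "", line)             # drop leading whitespace
      if (match(line, /[ \t]*[=:][ \t]*|[ \t]+/)) {
        k = substr(line, 1, RSTART - 1)    # text before the first separator
        v = substr(line, RSTART + RLENGTH) # text after it
      } else {
        k = line; v = ""
      }
      sub(/[ \t]+$/, "", v)                # drop trailing whitespace
      if (k == key) value = v              # the last occurrence wins
    }
    END { if (value != "") print value }
  ' "$file"
}

# Demo against a scratch file so nothing under /etc is touched.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
livy.spark.master=yarn-cluster
livy.server.port 8998
livy.server.session.timeout 3600000
EOF
get_livy_prop "$conf" livy.spark.master   # prints: yarn-cluster
get_livy_prop "$conf" livy.server.port    # prints: 8998
```

On a real host, the same function could be pointed at /etc/livy/livy.conf after the file has been created.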

To enable Livy recovery, add the following three settings to the /etc/livy/livy.conf file:

livy.server.recovery.mode

Specifies Livy recovery mode.

Possible values for this setting are:

off: Default. Turn off recovery. Every time Livy shuts down, it forgets previous sessions.
recovery: Livy persists session information to the state store. When Livy restarts, it recovers previous sessions from the state store.

livy.server.recovery.state-store

Specifies where Livy stores state for the recovery process.

Possible values for this setting are:

<empty>: Disables state store. This is the default setting.
filesystem: Stores state in a file system.
zookeeper: Stores state in a ZooKeeper instance.

livy.server.recovery.state-store.url

When a filesystem is used for the state store, specifies the path of the state store directory. You can specify any Hadoop-compatible file system that supports atomic rename.

When ZooKeeper is used for the state store, specifies the addresses of the ZooKeeper servers as a comma-separated list, for example, host1:port1,host2:port2.

	Important
	Do not use a filesystem that does not support atomic rename (for example, S3). Examples of state store URLs on filesystems that do support atomic rename are `file:///tmp/livy` and `hdfs:///`.
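
The three recovery settings depend on one another: recovery mode needs a state store, and the state store needs a URL. The following is a minimal sketch of a consistency check, assuming one `key value` or `key=value` pair per line. The function name `check_livy_recovery` and the path `hdfs:///livy-recovery` are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper (not a Livy command): check that the three recovery
# settings in a livy.conf-style file are mutually consistent.
# Assumes one "key value" or "key=value" pair per line.
check_livy_recovery() {
  local file="$1" prefix='^livy\.server\.recovery\.' mode store url
  mode="$(sed -n "s/${prefix}mode[ =]\{1,\}//p" "$file" | tail -n 1)"
  store="$(sed -n "s/${prefix}state-store[ =]\{1,\}//p" "$file" | tail -n 1)"
  url="$(sed -n "s/${prefix}state-store\.url[ =]\{1,\}//p" "$file" | tail -n 1)"

  case "$mode" in
    ''|off) echo "recovery is off"; return 0 ;;
    recovery) ;;
    *) echo "unknown recovery mode: $mode" >&2; return 1 ;;
  esac
  case "$store" in
    filesystem|zookeeper) ;;
    *) echo "state-store must be filesystem or zookeeper" >&2; return 1 ;;
  esac
  if [ -z "$url" ]; then
    echo "state-store.url is not set" >&2
    return 1
  fi
  # The text above warns against stores without atomic rename, naming S3.
  case "$store:$url" in
    filesystem:s3*) echo "S3 does not support atomic rename: $url" >&2; return 1 ;;
  esac
  echo "recovery enabled: $store at $url"
}

# Demo against a scratch file; hdfs:///livy-recovery is an illustrative path.
conf="$(mktemp)"
cat > "$conf" <<'EOF'
livy.server.recovery.mode recovery
livy.server.recovery.state-store filesystem
livy.server.recovery.state-store.url hdfs:///livy-recovery
EOF
check_livy_recovery "$conf"   # prints: recovery enabled: filesystem at hdfs:///livy-recovery
```

The check only inspects the text of the file; it does not verify that the HDFS path or the ZooKeeper ensemble is reachable.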

/etc/livy/spark-blacklist.conf defines a list of properties that users are not allowed to override when a Spark session is started.

Create a file called /etc/livy/spark-blacklist.conf and add the following to the file:

# Disallow overriding the master and the deploy mode.
spark.master
spark.submit.deployMode

# Disallow overriding the location of Spark cached jars.
spark.yarn.jar
spark.yarn.jars
spark.yarn.archive

# Don't allow users to override the RSC timeout.
livy.rsc.server.idle_timeout
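
To see whether a particular property is covered by the blacklist, the file can be searched for an exact, whole-line match while ignoring comments and blank lines. This is a minimal sketch; `is_blacklisted` is a hypothetical helper name, and the demo uses a scratch file rather than /etc/livy/spark-blacklist.conf.

```shell
# Hypothetical helper (not part of Livy): succeed if the given property name
# appears as its own line in a blacklist file.
is_blacklisted() {
  local file="$1" key="$2"
  # Drop comments and blank lines, then require an exact whole-line match.
  grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$file" | grep -Fxq -- "$key"
}

# Demo against a scratch copy of part of the blacklist.
blacklist="$(mktemp)"
cat > "$blacklist" <<'EOF'
# Disallow overriding the master and the deploy mode.
spark.master
spark.submit.deployMode
EOF
if is_blacklisted "$blacklist" spark.master; then
  echo "spark.master cannot be overridden"
fi
if ! is_blacklisted "$blacklist" spark.executor.memory; then
  echo "spark.executor.memory can be overridden"
fi
```

Because the match is exact, a line with trailing whitespace in the blacklist file would not be found by this sketch.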

Create the file /etc/livy/livy-env.sh to define the environment variables. Add the following to the file:

export SPARK_HOME=/usr/hdp/current/spark-client
export LIVY_LOG_DIR=/var/log/livy
export LIVY_PID_DIR=/var/run/livy
export LIVY_SERVER_JAVA_OPTS="-Xmx2g"
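
The earlier steps create /var/log/livy, but nothing here creates the directory named by LIVY_PID_DIR. Whether Livy creates it on startup is not stated in this section, so a quick check may be worthwhile. The following is a minimal sketch; `missing_livy_dirs` is a hypothetical helper name, and the demo uses scratch directories in place of /var/log/livy and /var/run/livy.

```shell
# Hypothetical helper (not part of Livy): source an env file and print any of
# the directories named by LIVY_LOG_DIR and LIVY_PID_DIR that do not exist.
missing_livy_dirs() {
  local env_file="$1"
  (
    # Source in a subshell so the exports do not leak into the caller.
    . "$env_file"
    for dir in "$LIVY_LOG_DIR" "$LIVY_PID_DIR"; do
      if [ -n "$dir" ] && [ ! -d "$dir" ]; then
        echo "$dir"
      fi
    done
  )
}

# Demo with scratch paths: the log directory exists, the PID directory does not.
scratch="$(mktemp -d)"
cat > "$scratch/livy-env.sh" <<EOF
export LIVY_LOG_DIR=$scratch/log
export LIVY_PID_DIR=$scratch/run
EOF
mkdir "$scratch/log"
missing_livy_dirs "$scratch/livy-env.sh"   # prints only the scratch "run" path
```

Any directory the check reports on a real host could then be created and given to livy:hadoop in the same way as the log directory.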

If you are not using Kerberos, skip the Kerberos-specific steps that follow.

If you are using Kerberos, create Livy principals and keytabs. Optionally, if you plan to use Livy with Zeppelin, create Zeppelin principals and keytabs.

kadmin.local -q "addprinc -randkey livy@EXAMPLE.COM"
kadmin.local -q "xst -k /etc/security/keytabs/livy.headless.keytab livy@EXAMPLE.COM"
kadmin.local -q "addprinc -randkey zeppelin@EXAMPLE.COM"
kadmin.local -q "xst -k /etc/security/keytabs/zeppelin.headless.keytab zeppelin@EXAMPLE.COM"

If you are using Kerberos, move the Livy and Zeppelin keytabs to the node on which Livy and Zeppelin will run, then set their ownership:
```
chown livy:hadoop /etc/security/keytabs/livy.headless.keytab
chown zeppelin:hadoop /etc/security/keytabs/zeppelin.headless.keytab
```
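
The commands above set ownership only. Keytabs hold long-term keys, so restricting them to owner read-only is a common extra precaution. The following is a minimal sketch of that step; mode 400 is an assumption rather than something this procedure specifies, and `lock_down_keytab` is a hypothetical helper name. The demo uses a scratch file instead of a real keytab.

```shell
# Hypothetical helper (not part of Livy or Kerberos): make a keytab readable
# by its owner only and print the resulting permission string.
lock_down_keytab() {
  local keytab="$1"
  if [ ! -f "$keytab" ]; then
    echo "no such keytab: $keytab" >&2
    return 1
  fi
  chmod 400 "$keytab"              # assumption: owner read-only is sufficient
  ls -l "$keytab" | cut -c1-10     # first ten characters are the mode field
}

# Demo on a scratch file standing in for a keytab.
fake_keytab="$(mktemp)"
lock_down_keytab "$fake_keytab"    # prints: -r--------
```

On a Kerberized node, running `klist -kt` against the real keytab afterwards lists the principals it contains, which is a quick way to confirm the right file was moved.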

If you are using Kerberos, add the following to livy.conf:

livy.server.auth.kerberos.keytab /etc/security/keytabs/spnego.service.keytab
livy.server.auth.kerberos.principal HTTP/_HOST@EXAMPLE.COM
livy.server.auth.type kerberos
livy.server.launch.kerberos.keytab /etc/security/keytabs/livy.headless.keytab
livy.server.launch.kerberos.principal livy@EXAMPLE.COM

Ensure that the Livy user can read the contents of the /etc/livy/conf directory.
