Chapter 7. Enabling HDFS Storage for Zeppelin Notebooks and Configuration in HDP-2.6.3+
Overview
HDP-2.6.3 introduced support for HDFS storage for Apache Zeppelin notebooks and configuration files. In previous versions, notebooks and configuration files were stored on the local disk of the Zeppelin server.
When upgrading to HDP-2.6.3 and higher versions, there are two options for configuring Zeppelin notebook and configuration file storage:
Use HDFS storage (recommended) – Zeppelin notebooks and configuration files must be copied to the new HDFS storage location before upgrading. Additional upgrade and post-upgrade steps must also be performed, as described in the following section.
Use local storage – Perform upgrade and post-upgrade steps to enable local storage.
Enabling HDFS storage makes future upgrades much easier, and also sets up the first step toward enabling Zeppelin High Availability. Therefore it is recommended that you enable HDFS for Zeppelin notebooks and configuration files when upgrading to HDP 2.6.3+ from earlier versions of HDP.
Note Currently HDFS and local storage are the only supported notebook storage mechanisms in HDP-2.6.3+. Currently VFSNotebookRepo is the only supported local storage option.
Enable HDFS Storage when Upgrading to HDP-2.6.3+
Perform the following steps to enable HDFS storage when upgrading to HDP 2.6.3+ from earlier versions of HDP.
Before upgrading Zeppelin, perform the following steps as the Zeppelin service user.
Create the /user/zeppelin/conf and /user/zeppelin/notebook directories in HDFS.
hdfs dfs -ls /user/zeppelin
drwxr-xr-x - zeppelin hdfs 0 2018-01-20 04:17 /user/zeppelin/conf
drwxr-xr-x - zeppelin hdfs 0 2018-01-20 03:40 /user/zeppelin/notebook
Copy all notebooks from the local Zeppelin server (for example, /usr/hdp/2.5.3.0-37/zeppelin/notebook/) to the /user/zeppelin/notebook directory in HDFS.
hdfs dfs -ls /user/zeppelin/notebook
drwxr-xr-x - zeppelin hdfs 0 2018-01-19 01:40 /user/zeppelin/notebook/2A94M5J1Z
drwxr-xr-x - zeppelin hdfs 0 2018-01-19 01:40 /user/zeppelin/notebook/2BWJFTXKJ
Copy the interpreter.json and notebook-authorization.json files from the local Zeppelin service configuration directory (/etc/zeppelin/conf) to the /user/zeppelin/conf directory in HDFS.
hdfs dfs -ls /user/zeppelin/conf
-rw-r--r-- 3 zeppelin hdfs 284091 2018-01-22 23:28 /user/zeppelin/conf/interpreter.json
-rw-r--r-- 3 zeppelin hdfs 123849 2018-01-22 23:29 /user/zeppelin/conf/notebook-authorization.json
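The three preparatory steps above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not part of the documented procedure: the function name stage_zeppelin_to_hdfs and its two arguments are hypothetical, the example paths are the ones used in this section, and the script is assumed to run as the Zeppelin service user with the hdfs client on the PATH. Note that hdfs dfs -put does not overwrite existing files unless -f is added.

```shell
# Sketch: stage Zeppelin notebooks and configuration in HDFS before the upgrade.
# stage_zeppelin_to_hdfs is a hypothetical helper, not an HDP or Zeppelin tool.
stage_zeppelin_to_hdfs() {
  src_notebook_dir="$1"   # local notebook directory, e.g. /usr/hdp/2.5.3.0-37/zeppelin/notebook
  src_conf_dir="$2"       # local configuration directory, e.g. /etc/zeppelin/conf

  # Create the target directories (-p also creates missing parents).
  hdfs dfs -mkdir -p /user/zeppelin/conf /user/zeppelin/notebook || return 1

  # Copy every notebook directory into HDFS.
  hdfs dfs -put "$src_notebook_dir"/* /user/zeppelin/notebook/ || return 1

  # Copy the two configuration files named in this section.
  hdfs dfs -put "$src_conf_dir/interpreter.json" \
                "$src_conf_dir/notebook-authorization.json" \
                /user/zeppelin/conf/ || return 1
}
```

A call such as stage_zeppelin_to_hdfs /usr/hdp/2.5.3.0-37/zeppelin/notebook /etc/zeppelin/conf would then be followed by the hdfs dfs -ls checks shown above to confirm the result.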
Upgrade Ambari.
Upgrade HDP and Zeppelin. During the upgrade, verify that the following configuration settings are present in Ambari for Zeppelin.
zeppelin.notebook.storage = org.apache.zeppelin.notebook.repo.FileSystemNotebookRepo
zeppelin.config.fs.dir = conf
If necessary, add or update these configuration settings as shown above.
After the upgrade is complete:
Log on to the Zeppelin server and verify that the following properties exist in the /etc/zeppelin/conf/zeppelin-site.xml file. The actual value for the keytab file and principal name may be different for your cluster.
<property>
  <name>zeppelin.server.kerberos.keytab</name>
  <value>/etc/security/keytabs/zeppelin.server.kerberos.keytab</value>
</property>
<property>
  <name>zeppelin.server.kerberos.principal</name>
  <value>zeppelin@EXAMPLE.COM</value>
</property>
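One way to run this check from the command line is sketched below. The helper check_zeppelin_props is a hypothetical name introduced for illustration; it only looks for the two property names and does not validate the keytab path or principal value, which still need to be reviewed by hand.

```shell
# Sketch: report whether each expected Kerberos property name appears in a
# zeppelin-site.xml file. check_zeppelin_props is a hypothetical helper.
check_zeppelin_props() {
  site_file="$1"   # e.g. /etc/zeppelin/conf/zeppelin-site.xml
  missing=0
  for prop in zeppelin.server.kerberos.keytab zeppelin.server.kerberos.principal; do
    # -F matches the <name> element literally; -q suppresses grep's own output.
    if grep -qF "<name>${prop}</name>" "$site_file"; then
      echo "found: ${prop}"
    else
      echo "missing: ${prop}"
      missing=1
    fi
  done
  # Exit status is 0 only when both properties were found.
  return "$missing"
}
```

For example, check_zeppelin_props /etc/zeppelin/conf/zeppelin-site.xml prints one line per property and returns a non-zero status if either one is absent.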
Check the Zeppelin Interpreter page to see if any interpreter (e.g. the Livy interpreter) is duplicated. This may happen in some cases. If duplicate interpreter entries are found, perform the following steps:
Back up and delete the interpreter.json file from HDFS (/user/zeppelin/conf/interpreter.json) and from the local Zeppelin server.
Restart the Zeppelin service.
Verify that the duplicate entries no longer exist.
If any custom interpreter settings were present before the upgrade, add them again via the Zeppelin interpreter UI page.
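The backup-and-delete step for duplicate interpreters might look like the sketch below. The function reset_interpreter_json, its backup directory argument, and the backup file names are hypothetical; the local copy is assumed to live in /etc/zeppelin/conf, the configuration directory named earlier in this section. Restarting the Zeppelin service and re-adding custom interpreter settings remain manual steps.

```shell
# Sketch: back up interpreter.json from HDFS and from the local configuration
# directory, then remove both copies. reset_interpreter_json is a hypothetical helper.
reset_interpreter_json() {
  backup_dir="$1"                          # local directory that receives the backups
  local_conf_dir="${2:-/etc/zeppelin/conf}" # assumed location of the local copy

  mkdir -p "$backup_dir" || return 1

  # Download the HDFS copy first, and delete it only if the download succeeded.
  hdfs dfs -get /user/zeppelin/conf/interpreter.json \
      "$backup_dir/interpreter.json.hdfs" || return 1
  hdfs dfs -rm /user/zeppelin/conf/interpreter.json || return 1

  # Moving the local file backs it up and removes it from the conf directory in one step.
  mv "$local_conf_dir/interpreter.json" "$backup_dir/interpreter.json.local" || return 1
}
```

Keeping the two backups under separate names makes it possible to compare them later when restoring custom interpreter settings through the Zeppelin interpreter UI page.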
Verify that your existing notebooks are available on Zeppelin.
Note When an existing notebook is opened for the first time after the upgrade, it may ask you to save the interpreters associated with the notebook.
Use Local Storage when Upgrading to HDP-2.6.3+
Perform the following steps to use local notebook storage when upgrading to HDP 2.6.3+ from earlier versions of HDP.
Upgrade Ambari.
Upgrade HDP and Zeppelin. During the upgrade, verify that the following configuration settings are present in Ambari for Zeppelin.
zeppelin.notebook.storage = org.apache.zeppelin.notebook.repo.VFSNotebookRepo
zeppelin.config.fs.dir = file:///etc/zeppelin/conf
If necessary, add or update these configuration settings as shown above.
After the upgrade is complete:
Copy your notebooks and the notebook-authorization.json file from the previous Zeppelin installation directory to the new installation directory on the Zeppelin server machine.
Verify that your existing notebooks are available on Zeppelin.
Note When an existing notebook is opened for the first time after the upgrade, it may ask you to save the interpreters associated with the notebook.
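For the local-storage copy step in this procedure, a minimal sketch follows. This section does not name the exact old and new directories, so the function copy_local_zeppelin_data and all four of its arguments are hypothetical placeholders; substitute the directories that apply to your HDP versions.

```shell
# Sketch: copy notebooks and notebook-authorization.json from the previous
# Zeppelin installation to the new one on the same server.
# copy_local_zeppelin_data and its arguments are hypothetical.
copy_local_zeppelin_data() {
  old_notebook_dir="$1"   # notebook directory of the previous installation
  new_notebook_dir="$2"   # notebook directory of the new installation
  old_conf_dir="$3"       # directory holding the old notebook-authorization.json
  new_conf_dir="$4"       # directory where the new installation expects it

  mkdir -p "$new_notebook_dir" "$new_conf_dir" || return 1

  # -R copies each note directory recursively; -p preserves mode and timestamps.
  # The trailing /. copies the directory contents rather than the directory itself.
  cp -Rp "$old_notebook_dir"/. "$new_notebook_dir"/ || return 1

  cp -p "$old_conf_dir/notebook-authorization.json" "$new_conf_dir"/ || return 1
}
```

Run the copy as a user that can read the old installation and write to the new one, then confirm in the Zeppelin UI that the notebooks are listed.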