Using an External Database for Hue Using the Command Line
Continue reading:
Prerequisites
Before using an external database with Hue, install all of the support libraries required by your operating system. See Development Preferences in the Hue documentation for the full list.
Embedded Database
By default, Hue is configured to use the embedded database SQLite for this purpose, and should require no configuration or management by the administrator.
Inspecting the Embedded Hue Database
The default SQLite database used by Hue is located in /var/lib/hue/desktop.db. You can inspect this database from the command line using the sqlite3 program. For example:
# sqlite3 /var/lib/hue/desktop.db SQLite version 3.6.22 Enter ".help" for instructions Enter SQL statements terminated with a ";" sqlite> select username from auth_user; admin test sample sqlite>
External Database
Cloudera strongly recommends an external database for clusters with multiple Hue users, especially clusters in a production environment. The default database, SQLite, cannot support large data migrations. Hue supports MySQL, PostgreSQL, and Oracle. See Supported Databases for the supported versions.
In the instructions that follow, dumping the database and editing the JSON objects is only necessary if you have data in SQLite that you need to migrate. If you do not need to migrate data from SQLite, you can skip those steps.
Configuring the Hue Server to Store Data in MySQL
- Shut down the Hue server if it is running.
- Dump the existing database data to a text file. Note that using the .json extension is required.
$ sudo -u hue <HUE_HOME>/build/env/bin/hue dumpdata > <some-temporary-file>.json
- Open <some-temporary-file>.json and remove all JSON objects with useradmin.userprofile in the model field. Here are some examples of JSON objects that should be deleted.
{ "pk": 1, "model": "useradmin.userprofile", "fields": { "creation_method": "HUE", "user": 1, "home_directory": "/user/alice" } }, { "pk": 2, "model": "useradmin.userprofile", "fields": { "creation_method": "HUE", "user": 1100714, "home_directory": "/user/bob" } }, .....
- Start the Hue server.
- Install the MySQL client developer package.
OS Command RHEL
$ sudo yum install mysql-devel
SLES
$ sudo zypper install mysql-devel
Ubuntu or Debian
$ sudo apt-get install libmysqlclient-dev
- Install the MySQL connector.
OS Command RHEL
$ sudo yum install mysql-connector-java
SLES
$ sudo zypper install mysql-connector-java
Ubuntu or Debian
$ sudo apt-get install libmysql-java
- Install and start MySQL.
OS Command RHEL
$ sudo yum install mysql-server
SLES
$ sudo zypper install mysql $ sudo zypper install libmysqlclient_r15
Ubuntu or Debian
$ sudo apt-get install mysql-server
- Change the /etc/my.cnf file as follows:
[mysqld] datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock bind-address=<ip-address> default-storage-engine=InnoDB sql_mode=STRICT_ALL_TABLES
- Start the mysql daemon.
OS Command RHEL
$ sudo service mysqld start
SLES and Ubuntu or Debian
$ sudo service mysql start
- Configure MySQL to use a strong password. In the following procedure, your current root password is blank. Press the Enter key when you're prompted for the root password.
$ sudo /usr/bin/mysql_secure_installation [...] Enter current password for root (enter for none): OK, successfully used password, moving on... [...] Set root password? [Y/n] y New password: Re-enter new password: Remove anonymous users? [Y/n] Y [...] Disallow root login remotely? [Y/n] N [...] Remove test database and access to it [Y/n] Y [...] Reload privilege tables now? [Y/n] Y All done!
- Configure MySQL to start at boot.
OS Command RHEL
$ sudo /sbin/chkconfig mysqld on $ sudo /sbin/chkconfig --list mysqld mysqld 0:off 1:off 2:on 3:on 4:on 5:on 6:off
SLES
$ sudo chkconfig --add mysql
Ubuntu or Debian
$ sudo chkconfig mysql on
- Create the Hue database and grant privileges to a hue user to manage the database.
mysql> create database hue; Query OK, 1 row affected (0.01 sec) mysql> grant all on hue.* to 'hue'@'localhost' identified by '<secretpassword>'; Query OK, 0 rows affected (0.00 sec)
- Open the Hue configuration file in a text editor.
- Edit the Hue configuration file hue.ini. Directly below the [[database]] section under the [desktop] line, add the following options (and modify accordingly for your setup):
host=localhost port=3306 engine=mysql user=hue password=<secretpassword> name=hue
- As the hue user, load the existing data and create the necessary database tables using syncdb and migrate commands. When running these commands, Hue will try to access a logs directory, located at /opt/cloudera/parcels/CDH/lib/hue/logs, which might be missing. If that is the case, first create the logs directory and give the hue user and group ownership of the directory.
$ sudo mkdir /opt/cloudera/parcels/CDH/lib/hue/logs $ sudo chown hue:hue /opt/cloudera/parcels/CDH/lib/hue/logs $ sudo -u hue <HUE_HOME>/build/env/bin/hue syncdb --noinput $ sudo -u hue <HUE_HOME>/build/env/bin/hue migrate $ mysql -u hue -p <secretpassword> mysql > SHOW CREATE TABLE auth_permission;
- (InnoDB only) Drop the foreign key.
mysql > ALTER TABLE auth_permission DROP FOREIGN KEY content_type_id_refs_id_XXXXXX;
- Delete the rows in the django_content_type table.
mysql > DELETE FROM hue.django_content_type;
- Load the data.
$ <HUE_HOME>/build/env/bin/hue loaddata <some-temporary-file>.json
- (InnoDB only) Add the foreign key.
$ mysql -u hue -p <secretpassword> mysql > ALTER TABLE auth_permission ADD FOREIGN KEY (`content_type_id`) REFERENCES `django_content_type` (`id`);
Configuring the Hue Server to Store Data in PostgreSQL
- Shut down the Hue server if it is running.
- Dump the existing database data to a text file. Note that using the .json extension is required.
$ sudo -u hue <HUE_HOME>/build/env/bin/hue dumpdata > <some-temporary-file>.json
- Open <some-temporary-file>.json and remove all JSON objects with useradmin.userprofile in the model field. Here are some examples of JSON objects that should be deleted.
{ "pk": 1, "model": "useradmin.userprofile", "fields": { "creation_method": "HUE", "user": 1, "home_directory": "/user/alice" } }, { "pk": 2, "model": "useradmin.userprofile", "fields": { "creation_method": "HUE", "user": 1100714, "home_directory": "/user/bob" } }, .....
- Install required packages.
OS Command RHEL
$ sudo yum install postgresql-devel gcc python-devel
SLES
$ sudo zypper install postgresql-devel gcc python-devel
Ubuntu or Debian
$ sudo apt-get install postgresql-devel gcc python-devel
- Install the module that provides the connector to PostgreSQL.
sudo -u hue <HUE_HOME>/build/env/bin/pip install setuptools sudo -u hue <HUE_HOME>/build/env/bin/pip install psycopg2
- Install the PostgreSQL server.
OS Command RHEL
$ sudo yum install postgresql-server
SLES
$ sudo zypper install postgresql-server
Ubuntu or Debian
$ sudo apt-get install postgresql
- Initialize the data directories:
$ service postgresql initdb
- Configure client authentication.
- Edit /var/lib/pgsql/data/pg_hba.conf.
- Set the authentication methods for local to trust and for host to password and add the following line at the end.
host hue hue 0.0.0.0/0 md5
- Start the PostgreSQL server.
$ su - postgres # /usr/bin/postgres -D /var/lib/pgsql/data > logfile 2>&1 &
- Configure PostgreSQL to listen on all network interfaces.
Edit /var/lib/pgsql/data/postgresql.conf and set list_addresses:
listen_addresses = '0.0.0.0' # Listen on all addresses
- Create the hue database and grant privileges to a hue user to manage the database.
# psql -U postgres postgres=# create database hue; postgres=# \c hue; You are now connected to database 'hue'. postgres=# create user hue with password '<secretpassword>'; postgres=# grant all privileges on database hue to hue; postgres=# \q
- Restart the PostgreSQL server.
$ sudo service postgresql restart
- Verify connectivity.
psql -h localhost -U hue -d hue Password for user hue: <secretpassword>
- Configure the PostgreSQL server to start at boot.
OS Command RHEL
$ sudo /sbin/chkconfig postgresql on $ sudo /sbin/chkconfig --list postgresql postgresql 0:off 1:off 2:on 3:on 4:on 5:on 6:off
SLES
$ sudo chkconfig --add postgresql
Ubuntu or Debian
$ sudo chkconfig postgresql on
- Open the Hue configuration file in a text editor.
- Edit the Hue configuration file hue.ini. Directly below the [[database]] section under the [desktop] line, add the following options (and modify accordingly for your setup):
host=localhost port=5432 engine=postgresql_psycopg2 user=hue password=<secretpassword> name=hue
- As the hue user, configure Hue to load the existing data and create the necessary database tables. You will need to run both the migrate and syncdb commands. When running these commands, Hue will try to access a logs directory, located at
/opt/cloudera/parcels/CDH/lib/hue/logs, which might be missing. If that is the case, first create the logs directory and give the
hue user and group ownership of the directory.
$ sudo mkdir /opt/cloudera/parcels/CDH/lib/hue/logs $ sudo chown hue:hue /opt/cloudera/parcels/CDH/lib/hue/logs $ sudo -u hue <HUE_HOME>/build/env/bin/hue syncdb --noinput $ sudo -u hue <HUE_HOME>/build/env/bin/hue migrate
- Determine the foreign key ID.
bash# su - postgres $ psql -h localhost -U hue -d hue postgres=# \d auth_permission;
- Drop the foreign key that you retrieved in the previous step.
postgres=# ALTER TABLE auth_permission DROP CONSTRAINT content_type_id_refs_id_<XXXXXX>;
- Delete the rows in the django_content_type table.
postgres=# TRUNCATE django_content_type CASCADE;
- Load the data.
$ sudo -u hue <HUE_HOME>/build/env/bin/hue loaddata <some-temporary-file>.json
- Add back the foreign key you dropped.
bash# su - postgres $ psql -h localhost -U hue -d hue postgres=# ALTER TABLE auth_permission ADD CONSTRAINT content_type_id_refs_id_<XXXXXX> FOREIGN KEY (content_type_id) REFERENCES django_content_type(id) DEFERRABLE INITIALLY DEFERRED;
Configuring the Hue Server to Store Data in Oracle
- Ensure Python 2.6 or newer is installed on the server Hue is running on.
- Download the Oracle client libraries at Instant Client for Linux x86-64 Version 11.1.0.7.0, Basic and SDK (with headers) zip files to the same directory.
- Unzip the zip files.
- Set environment variables to reference the libraries.
$ export ORACLE_HOME=<download directory> $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME
- Create a symbolic link for the shared object:
$ cd $ORACLE_HOME $ ln -sf libclntsh.so.11.1 libclntsh.so
- Get a data dump by executing:
$ <HUE_HOME>/build/env/bin/hue dumpdata > <some-temporary-file>.json --indent 2
- Edit the Hue configuration file hue.ini. Directly below the [[database]] section under the [desktop] line, add the following options (and modify accordingly for your setup):
host=localhost port=1521 engine=oracle user=hue password=<secretpassword> name=<SID of the Oracle database, for example, 'XE'>
To use the Oracle service name instead of the SID, use the following configuration instead:port=0 engine=oracle user=hue password=password name=oracle.example.com:1521/orcl.example.com
The directive port=0 allows Hue to use a service name. The name string is the connect string, including hostname, port, and service name.
To add support for a multithreaded environment, set the threaded option to true under the [desktop]>[[database]] section.
options={'threaded':true}
- Grant required permissions to the Hue user in Oracle:
grant alter any index to hue; grant alter any table to hue; grant alter database link to hue; grant create any index to hue; grant create any sequence to hue; grant create database link to hue; grant create session to hue; grant create table to hue; grant drop any sequence to hue; grant select any dictionary to hue; grant drop any table to hue; grant create procedure to hue; grant create trigger to hue;
- As the hue user, configure Hue to load the existing data and create the necessary database tables. You will need to run both the syncdb and migrate commands. When running these commands, Hue will try to access a logs directory, located at
/opt/cloudera/parcels/CDH/lib/hue/logs, which might be missing. If that is the case, first create the logs directory and give the
hue user and group ownership of the directory.
$ sudo mkdir /opt/cloudera/parcels/CDH/lib/hue/logs $ sudo chown hue:hue /opt/cloudera/parcels/CDH/lib/hue/logs $ sudo -u hue <HUE_HOME>/build/env/bin/hue syncdb --noinput $ sudo -u hue <HUE_HOME>/build/env/bin/hue migrate
- Ensure you are connected to Oracle as the hue user, then run the following command to delete all data from Oracle tables:
SELECT 'DELETE FROM ' || '.' || table_name || ';' FROM user_tables;
- Run the statements generated in the preceding step.
- Commit your changes.
commit;
- Load the data.
$ sudo -u hue <HUE_HOME>/build/env/bin/hue loaddata <some-temporary-file>.json