This is the documentation for CDH 5.1.x. Documentation for other versions is available at Cloudera Documentation.

Hue Configuration

This section describes configuration you perform in the Hue configuration file hue.ini. The location of the Hue configuration file varies depending on how Hue is installed. The location of the Hue configuration folder is displayed when you view the Hue configuration.

  Note: Only the root user can edit the Hue configuration file.

You can configure the Hue apps using the properties described in the following sections.

Viewing the Hue Configuration

  Note: You must be a Hue superuser to view the Hue configuration.

When you log in to Hue, the start-up page displays information about any misconfiguration detected.

To view the Hue configuration, do one of the following:

  • Visit http://myserver:port and click the Configuration tab.
  • Visit http://myserver:port/dump_config.

Hue Server Configuration

This section describes Hue Server settings.

Specifying the Hue Server HTTP Address

These configuration properties are under the [desktop] section in the Hue configuration file.

Hue uses the CherryPy web server. You can use the following options to change the IP address and port that the web server listens on. The default setting is port 8888 on all configured IP addresses.
# Webserver listens on this address and port
http_host=0.0.0.0
http_port=8888
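
After changing http_host or http_port and restarting Hue, a quick TCP connection test can confirm that the web server is listening where you expect. The following is a minimal Python sketch and is not part of Hue. The helper name is_listening and the 127.0.0.1 target are illustrative assumptions; 8888 is the default port described above.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


if __name__ == "__main__":
    # 8888 is the default http_port; adjust to match your hue.ini.
    print(is_listening("127.0.0.1", 8888))
```

A True result only shows that something accepts connections on that port; it does not prove that the listener is Hue.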

Specifying the Secret Key

For security, you should specify the secret key that is used for secure hashing in the session store:

  1. Open the Hue configuration file.
  2. In the [desktop] section, set the secret_key property to a long series of random characters (30 to 60 characters is recommended). For example:
    secret_key=qpbdxoewsqlkhztybvfidtvwekftusgdlofbcfghaswuicmqp
      Note: If you don't specify a secret key, your session cookies will not be secure. Hue will run but it will also display error messages telling you to set the secret key.
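
One way to produce a suitable random value is a short script such as the sketch below. It is not a Hue tool. It assumes a separate Python 3.6 or later interpreter (for the standard secrets module), and the function name generate_secret_key and the default length of 50 are illustrative choices within the recommended 30 to 60 characters.

```python
import secrets
import string


def generate_secret_key(length: int = 50) -> str:
    """Return a random string of letters and digits for secret_key."""
    if not 30 <= length <= 60:
        raise ValueError("length should be between 30 and 60 characters")
    alphabet = string.ascii_letters + string.digits
    # secrets.choice draws from a cryptographically strong random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    # Paste the printed line into the [desktop] section of hue.ini.
    print("secret_key=" + generate_secret_key())
```

The alphabet is limited to letters and digits so that the value needs no quoting in the configuration file.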

Authentication

By default, the first user who logs in to Hue can choose any username and password and automatically becomes an administrator. This user can create other user and administrator accounts. Hue users should correspond to the Linux users who will use Hue; make sure you use the same name as the Linux username.

By default, user information is stored in the Hue database. However, the authentication system is pluggable. You can configure authentication to use an LDAP directory (Active Directory or OpenLDAP) to perform the authentication, or you can import users and groups from an LDAP directory. See Configuring an LDAP Server for User Admin.

For more information, see the Hue SDK Documentation.

Configuring the Hue Server for SSL

You can optionally configure Hue to serve over HTTPS. As of CDH 5, pyOpenSSL is part of the Hue build and does not need to be installed manually. To configure SSL, perform the following steps from the root of your Hue installation path:

  1. Configure Hue to use your private key by adding the following options to the Hue configuration file:
    ssl_certificate=/path/to/certificate
    ssl_private_key=/path/to/key
      Note: Hue can only support a private key without a passphrase.
  2. On a production system, you should have an appropriate key signed by a well-known Certificate Authority. If you're just testing, you can create a self-signed certificate using the openssl command, if it is installed on your system:
    # Create a 2048-bit private key (no passphrase)
    $ openssl genrsa 2048 > host.key
    # Create a self-signed certificate signed with SHA-256
    $ openssl req -new -x509 -nodes -sha256 -key host.key > host.cert
      Note: Uploading files using the Hue File Browser over HTTPS requires using a proper SSL Certificate. Self-signed certificates don't work.
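
Because Hue supports only a private key without a passphrase, it can help to check the certificate and key before pointing ssl_certificate and ssl_private_key at them. The sketch below uses Python's standard ssl module (Python 3.6 or later) and is not part of Hue. The function name key_pair_loads is hypothetical, and host.cert and host.key are the file names from the example above.

```python
import ssl


def key_pair_loads(cert_path: str, key_path: str) -> bool:
    """Return True if the certificate and key load together
    without needing a passphrase."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        # The password callback is only invoked for an encrypted key.
        # Returning an empty string avoids an interactive prompt, so an
        # encrypted key is expected to fail to load instead.
        context.load_cert_chain(cert_path, key_path, password=lambda: "")
    except (ssl.SSLError, OSError):
        # SSLError: mismatched pair or unreadable key; OSError: missing file.
        return False
    return True


if __name__ == "__main__":
    print(key_pair_loads("host.cert", "host.key"))
```

This check says nothing about whether clients trust the certificate; it only confirms that the two files belong together and can be read without a passphrase.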

Authentication Backend Options for Hue

The following table lists the authentication backends that Hue can be configured with, including the SAML backend that enables single sign-on authentication. The backend configuration property is in the [[auth]] section under [desktop].

backend

django.contrib.auth.backends.ModelBackend

This is the default authentication backend used by Django.

desktop.auth.backend.AllowAllBackend

This backend does not require a password for users to log in. All users are automatically authenticated and the username is set to what is provided.

desktop.auth.backend.AllowFirstUserDjangoBackend

This is the default Hue backend. It creates the first user that logs in as the super user. After this, it relies on Django and the user manager to authenticate users.

desktop.auth.backend.LdapBackend

Authenticates users against an LDAP service.

desktop.auth.backend.PamBackend

Authenticates users with PAM (pluggable authentication module). The authentication mode depends on the PAM module used.

desktop.auth.backend.SpnegoDjangoBackend

SPNEGO is an authentication mechanism negotiation protocol. Authentication can be delegated to an authentication server, such as a Kerberos KDC, depending on the mechanism negotiated.

desktop.auth.backend.RemoteUserDjangoBackend

Authenticates remote users with the Django backend. See the Django documentation for more details.

desktop.auth.backend.OAuthBackend

Delegates authentication to a third-party OAuth server.

libsaml.backend.SAML2Backend

Security Assertion Markup Language (SAML) single sign-on (SSO) backend. Delegates authentication to the configured Identity Provider. See Configuring Hue for SAML for more details.

  Note: All backends that delegate authentication to a third-party authentication server eventually import users into the Hue database. While the metadata is stored in the database, user authentication will still take place outside Hue.

Beeswax Configuration

In the [beeswax] section of the configuration file, you can optionally specify the following:

hive_server_host

The fully-qualified domain name or IP address of the host running HiveServer2.

hive_server_port

The port of the HiveServer2 Thrift server.

Default: 10000.

hive_conf_dir

The directory containing hive-site.xml, the HiveServer2 configuration file.

Cloudera Impala Query UI Configuration

In the [impala] section of the configuration file, you can optionally specify the following:

server_host

The hostname or IP address of the Impala Server.

Default: localhost.

server_port

The port of the Impalad Server.

Default: 21050

impersonation_enabled

Turns the impersonation mechanism on or off when communicating with Impala.

Default: False

DB Query Configuration

The DB Query app can have any number of databases configured in the [[databases]] section under [librdbms]. A database is known by its section name (sqlite, mysql, postgresql, and oracle as in the list below).

Database Type Configuration Properties

SQLite: [[[sqlite]]]

# Name to show in the UI.
## nice_name=SQLite

# For SQLite, name defines the path to the database.
## name=/tmp/sqlite.db

# Database backend to use.
## engine=sqlite

MySQL, Oracle or PostgreSQL:

[[[mysql]]]

  Note: Replace with oracle or postgresql as required.
# Name to show in the UI.
## nice_name="My SQL DB"

# For MySQL and PostgreSQL, name is the name of the database.
# For Oracle, name is the instance of the Oracle server. For Express Edition
# this is 'xe' by default.
## name=mysqldb

# Database backend to use. This can be:
# 1. mysql
# 2. postgresql
# 3. oracle
## engine=mysql

# IP or hostname of the database to connect to.
## host=localhost

# Port the database server is listening to. Defaults are:
# 1. MySQL: 3306
# 2. PostgreSQL: 5432
# 3. Oracle Express Edition: 1521
## port=3306

# Username to authenticate with when connecting to the database.
## user=example

# Password matching the username to authenticate with when
# connecting to the database.
## password=example
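
If you maintain several database entries, generating the blocks from a small script keeps the engine names and default ports consistent. The following Python sketch is an illustration only. The function render_database_section and the database name huedb are hypothetical; the property names and default ports come from the listing above.

```python
# Default ports listed in the hue.ini comments above.
DEFAULT_PORTS = {"mysql": 3306, "postgresql": 5432, "oracle": 1521}


def render_database_section(engine, name, host="localhost",
                            user="example", password="example", port=None):
    """Return the text of one [[[engine]]] block for [[databases]]."""
    if engine not in DEFAULT_PORTS:
        raise ValueError("unsupported engine: %r" % engine)
    if port is None:
        port = DEFAULT_PORTS[engine]  # fall back to the engine's default
    lines = [
        "[[[%s]]]" % engine,
        "name=%s" % name,
        "engine=%s" % engine,
        "host=%s" % host,
        "port=%d" % port,
        "user=%s" % user,
        "password=%s" % password,
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # "huedb" is a placeholder database name.
    print(render_database_section("postgresql", "huedb"))
```

The output is meant to be pasted under [librdbms] > [[databases]]. SQLite is left out because its name property is a file path and it takes no host or port.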

Pig Editor Configuration

In the [pig] section of the configuration file, you can optionally specify the following:

remote_data_dir

Location on HDFS where the Pig examples are stored.

Sqoop Configuration

In the [sqoop] section of the configuration file, you can optionally specify the following:

server_url

The URL of the sqoop2 server.

Job Browser Configuration

By default, any user can see submitted job information for all users. You can restrict viewing of submitted job information by optionally setting the following property under the [jobbrowser] section in the Hue configuration file:

share_jobs

Indicates whether jobs are shared with all users. If set to false, jobs are visible only to their owner and to administrators.

Job Designer

In the [jobsub] section of the configuration file, you can optionally specify the following:

remote_data_dir

Location in HDFS where the Job Designer examples and templates are stored.

Oozie Editor/Dashboard Configuration

By default, any user can see all workflows, coordinators, and bundles. You can restrict viewing of workflows, coordinators, and bundles by optionally specifying the following property under the [oozie] section of the Hue configuration file:

share_jobs

Indicates whether workflows, coordinators, and bundles are shared with all users. If set to false, they are visible only to their owner and to administrators.

oozie_jobs_count

Maximum number of Oozie workflows, coordinators, or bundles to retrieve in one API call.

remote_data_dir

The location in HDFS where Oozie workflows are stored.

See also Liboozie Configuration.

Search Configuration

In the [search] section of the configuration file, you can optionally specify the following:

security_enabled

Indicates whether Solr requires clients to perform Kerberos authentication.

empty_query

Query sent when no term is entered.

Default: *:*.

solr_url

URL of the Solr server.

HBase Configuration

In the [hbase] section of the configuration file, you can optionally specify the following:

truncate_limit

Hard limit of rows or columns per row fetched before truncating.

Default: 500

hbase_clusters

Comma-separated list of HBase Thrift servers for clusters in the format of "(name|host:port)".

Default: (Cluster|localhost:9090)
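
The "(name|host:port)" format is easy to mistype, so a quick check before restarting Hue can save a round trip. The parser below is a hedged Python sketch of the format as described here. It is not Hue's own parsing code, and the function name parse_hbase_clusters is an assumption.

```python
import re

# One "(name|host:port)" entry: name, then host, then a numeric port.
CLUSTER_RE = re.compile(r"\(([^|()]+)\|([^:()]+):(\d+)\)")


def parse_hbase_clusters(value: str):
    """Return a list of (name, host, port) tuples from hbase_clusters."""
    clusters = [
        (name.strip(), host.strip(), int(port))
        for name, host, port in CLUSTER_RE.findall(value)
    ]
    if not clusters:
        raise ValueError("no (name|host:port) entries found in %r" % value)
    return clusters


if __name__ == "__main__":
    print(parse_hbase_clusters("(Cluster|localhost:9090)"))
```

The sketch only extracts well-formed entries; it does not report malformed entries that sit next to valid ones.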

User Admin Configuration

In the [useradmin] section of the configuration file, you can optionally specify the following:

default_user_group

The name of the group to which a manually created user is automatically assigned.

Default: default.

Configuring an LDAP Server for User Admin

User Admin can interact with an LDAP server, such as Active Directory, in one of two ways:

  • You can import user and group information from your current Active Directory infrastructure using the LDAP Import feature in the User Admin application. User authentication is then performed by User Admin based on the imported user and password information. You can then manage the imported users, along with any users you create directly in User Admin. See Enabling Import of Users and Groups from an LDAP Directory.
  • You can configure User Admin to use an LDAP server as the authentication back end, which means users logging in to Hue will authenticate to the LDAP server, rather than against a username and password kept in User Admin. In this scenario, your users must all reside in the LDAP directory. See Enabling the LDAP Server for User Authentication for further information.

Enabling Import of Users and Groups from an LDAP Directory

User Admin can import users and groups from an Active Directory via the Lightweight Directory Access Protocol (LDAP). To use this feature, you must configure User Admin with a set of LDAP settings in the Hue configuration file.

  Note: If you import users from LDAP, you must set passwords for them manually; password information is not imported.

To enable LDAP import of users and groups:

  1. In the Hue configuration file, configure the following properties in the [[ldap]] section:

    base_dn

    The search base for finding users and groups.

    Example: base_dn="DC=mycompany,DC=com"

    nt_domain

    The NT domain to connect to (only for use with Active Directory).

    Example: nt_domain=mycompany.com

    ldap_url

    URL of the LDAP server.

    Example: ldap_url=ldap://auth.mycompany.com

    ldap_cert

    Path to certificate for authentication over TLS (optional).

    Example: ldap_cert=/mycertsdir/myTLScert

    bind_dn

    Distinguished name of the user to bind as. Not necessary if the LDAP server supports anonymous searches.

    Example: bind_dn="CN=ServiceAccount,DC=mycompany,DC=com"

    bind_password

    Password of the bind user. Not necessary if the LDAP server supports anonymous searches.

    Example: bind_password=P@ssw0rd

  2. Configure the following properties in the [[[users]]] section:

    user_filter

    Base filter for searching for users.

    Example: user_filter="objectclass=*"

    user_name_attr

    The username attribute in the LDAP schema.

    Example: user_name_attr=sAMAccountName

  3. Configure the following properties in the [[[groups]]] section:

    group_filter

    Base filter for searching for groups.

    Example: group_filter="objectclass=*"

    group_name_attr

    The group name attribute in the LDAP schema.

    Example: group_name_attr=cn

  Note: If you provide a TLS certificate, it must be signed by a Certificate Authority that is trusted by the LDAP server.

Enabling the LDAP Server for User Authentication

You can configure User Admin to use an LDAP server as the authentication back end, which means users logging in to Hue will authenticate to the LDAP server, rather than against usernames and passwords managed by User Admin.

  Important: When you enable the LDAP back end for user authentication, user authentication by User Admin is disabled. This means there will be no superuser accounts with which to log in to Hue unless you take one of the following actions:
  • Import one or more superuser accounts from Active Directory and assign them superuser permission.
  • If you have already enabled the LDAP authentication back end, log in to Hue using the LDAP back end, which creates an LDAP user. Then disable the LDAP authentication back end and use User Admin to give superuser permission to the new LDAP user.
  After assigning the superuser permission, enable the LDAP authentication back end.

To enable the LDAP server for user authentication:

  1. In the Hue configuration file, configure the following properties in the [[ldap]] section:

    ldap_url

    URL of the LDAP server, prefixed by ldap:// or ldaps://.

    Example: ldap_url=ldap://auth.mycompany.com

    search_bind_authentication

    Search bind authentication is now the default instead of direct bind. To revert to direct bind, set this property to false. When using search bind semantics, Hue ignores the nt_domain and ldap_username_pattern properties that follow.

    Example: search_bind_authentication=false

    nt_domain

    The NT domain over which the user connects (not strictly necessary if using ldap_username_pattern).

    Example: nt_domain=mycompany.com

    ldap_username_pattern

    Pattern for searching for usernames. Use <username> as the placeholder for the username. Used when LdapBackend is the Hue authentication backend.

    Example: ldap_username_pattern="uid=<username>,ou=People,dc=mycompany,dc=com"
  2. If you are using TLS or secure ports, add the following property to specify the path to a TLS certificate file:

    ldap_cert

    Path to certificate for authentication over TLS.

    Example: ldap_cert=/mycertsdir/myTLScert

      Note: If you provide a TLS certificate, it must be signed by a Certificate Authority that is trusted by the LDAP server.
  3. In the [[auth]] sub-section inside [desktop], change the following:

    backend

    Change the setting of backend from
    backend=desktop.auth.backend.AllowFirstUserDjangoBackend
    to
    backend=desktop.auth.backend.LdapBackend
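
To see what the <username> placeholder in ldap_username_pattern stands for, the sketch below expands a pattern for one login name. It is a simplified Python illustration under the assumption that the placeholder is replaced by the name the user types at login. It is not Hue's implementation, and the function name expand_username_pattern and the user alice are hypothetical.

```python
PLACEHOLDER = "<username>"


def expand_username_pattern(pattern: str, username: str) -> str:
    """Substitute the login name into an ldap_username_pattern value."""
    if PLACEHOLDER not in pattern:
        raise ValueError("pattern must contain %s" % PLACEHOLDER)
    # Plain substitution only: characters that are special in a
    # distinguished name (such as commas) are not escaped here.
    return pattern.replace(PLACEHOLDER, username)


if __name__ == "__main__":
    pattern = "uid=<username>,ou=People,dc=mycompany,dc=com"
    print(expand_username_pattern(pattern, "alice"))
```

Expanding the pattern by hand for a known account and looking up the resulting distinguished name in your directory is a simple way to confirm the pattern before switching the backend.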

Hadoop Configuration

The following configuration variables are under the [hadoop] section in the Hue configuration file.

HDFS Cluster Configuration

Hue currently supports only one HDFS cluster, which you define under the [[hdfs_clusters]] sub-section. The following properties are supported:

[[[default]]]

The section containing the default settings.

fs_defaultfs

The equivalent of fs.defaultFS (also referred to as fs.default.name) in a Hadoop configuration.

webhdfs_url

The HttpFS URL. The default value is the HTTP port on the NameNode.

YARN (MRv2) and MapReduce (MRv1) Cluster Configuration

Job Browser can display both MRv1 and MRv2 jobs, but must be configured to display one type at a time by specifying either [[yarn_clusters]] or [[mapred_clusters]] sections in the Hue configuration file.

The following YARN cluster properties are defined under the [[yarn_clusters]] sub-section:

[[[default]]]

The section containing the default settings.

resourcemanager_host

The fully-qualified domain name of the host running the ResourceManager.

resourcemanager_port

The port for the ResourceManager IPC service.

submit_to

Indicates whether Hue should submit jobs to this YARN cluster. Set this to true if your Oozie is configured to use a YARN cluster.

proxy_api_url

URL of the ProxyServer API.

Default: http://localhost:8088

history_server_api_url

URL of the HistoryServer API.

Default: http://localhost:19888

The following MapReduce cluster properties are defined under the [[mapred_clusters]] sub-section:

[[[default]]]

The section containing the default settings.

jobtracker_host

The fully-qualified domain name of the host running the JobTracker.

jobtracker_port

The port for the JobTracker IPC service.

submit_to

Indicates whether Hue should submit jobs to this MapReduce cluster. Set this to true if your Oozie is configured to use a 0.20 MapReduce service.

  Note: High Availability (MRv1):
Add High Availability (HA) support for your MRv1 cluster by specifying a failover JobTracker. You can do this by configuring the following property under the [[[ha]]] sub-section for MRv1.
# Enter the host on which you are running the failover JobTracker
# jobtracker_host=<localhost-ha>

High Availability (YARN):

Add the following [[[ha]]] section under the [hadoop] > [[yarn_clusters]] sub-section in hue.ini with configuration properties for a second ResourceManager. As long as you have the logical_name property specified as below, jobs submitted to Oozie will work. The Job Browser, however, will not work with HA in this case.
[[[ha]]]
resourcemanager_host=<second_resource_manager_host_FQDN>
resourcemanager_api_url=http://<second_resource_manager_host_URL>
proxy_api_url=<second_resource_manager_proxy_URL>
history_server_api_url=<history_server_API_URL>
resourcemanager_port=<port_for_RM_IPC>
security_enabled=false
submit_to=true
logical_name=XXXX

Liboozie Configuration

In the [liboozie] section of the configuration file, you can optionally specify the following:

security_enabled

Indicates whether Oozie requires clients to perform Kerberos authentication.

remote_deployement_dir

The location in HDFS where the workflows and coordinators are deployed when submitted by a non-owner.

oozie_url

The URL of the Oozie server.

ZooKeeper Configuration

In the [zookeeper] section of the configuration file, you can optionally specify the following:

host_ports

Comma-separated list of ZooKeeper servers in the format "host:port".

Example: localhost:2181,localhost:2182,localhost:2183

rest_url

The URL of the REST Contrib service (required for znode browsing).

Default: http://localhost:9998
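
A malformed host_ports value is a common source of connection errors. The following Python sketch validates the comma-separated "host:port" format described above. It is an illustration rather than Hue's own parser, and the function name parse_host_ports is an assumption.

```python
def parse_host_ports(value: str):
    """Return a list of (host, port) tuples from a host_ports value."""
    servers = []
    for item in value.split(","):
        # Split on the last colon so the port is always the final field.
        host, sep, port = item.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError("expected host:port, got %r" % item)
        servers.append((host, int(port)))
    return servers


if __name__ == "__main__":
    print(parse_host_ports("localhost:2181,localhost:2182,localhost:2183"))
```

The check is purely syntactic; it does not contact the ZooKeeper servers.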

Setting up REST Service for ZooKeeper

ZooKeeper Browser requires the ZooKeeper REST service to be running. Follow the instructions below to set this up.

Step 1: Clone and build the ZooKeeper repository

git clone https://github.com/apache/zookeeper
cd zookeeper
ant
Buildfile: /home/hue/Development/zookeeper/build.xml

init:
[mkdir] Created dir: /home/hue/Development/zookeeper/build/classes
[mkdir] Created dir: /home/hue/Development/zookeeper/build/lib
[mkdir] Created dir: /home/hue/Development/zookeeper/build/package/lib
[mkdir] Created dir: /home/hue/Development/zookeeper/build/test/lib
…

Step 2: Start the REST service

cd src/contrib/rest
nohup ant run &

Step 3: Update ZooKeeper configuration properties (if required)

If ZooKeeper and the REST service are not on the same machine as Hue, update the Hue configuration file and specify the correct hostnames and ports as shown in the sample configuration below:

[zookeeper]
	...
	[[clusters]]
		...
		[[[default]]]
          # Zookeeper ensemble. Comma separated list of Host/Port.
          # e.g. localhost:2181,localhost:2182,localhost:2183
          ## host_ports=localhost:2181

          # The URL of the REST contrib service
          ## rest_url=http://localhost:9998

You should now be able to successfully run the ZooKeeper Browser app.

Page generated September 3, 2015.