Configure Hadoop
These configuration variables are in the [hadoop] section of the /etc/hue/conf/hue.ini configuration file:
Configure an HDFS cluster.
Hue supports only one HDFS cluster. Ensure that you define the HDFS cluster under the [hadoop] [[hdfs_clusters]] [[[default]]] subsection of the /etc/hue/conf/hue.ini configuration file.
Use the following variables to configure the HDFS cluster:
Variable | Description | Default/Example Value
fs_defaultfs | This is equivalent to fs.defaultFS (fs.default.name) in the Hadoop configuration. | hdfs://fqdn.namenode.host:8020
webhdfs_url | WebHDFS URL. | The default value is the HTTP port on the NameNode. Example: http://fqdn.namenode.host:50070/webhdfs/v1
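As an illustration, the two variables above might be laid out in hue.ini as in the following sketch. The host name fqdn.namenode.host is the placeholder used in the table, not a real host, and the ports are the example values shown there; substitute the values for your own NameNode.

```ini
[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      # Placeholder host; equivalent to fs.defaultFS in the Hadoop configuration.
      fs_defaultfs=hdfs://fqdn.namenode.host:8020
      # WebHDFS endpoint on the NameNode HTTP port (example value from the table).
      webhdfs_url=http://fqdn.namenode.host:50070/webhdfs/v1
```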
Configure a YARN (MR2) Cluster.
Hue supports only one YARN cluster.
Ensure that you define the YARN cluster under the [hadoop] [[yarn_clusters]] [[[default]]] subsection of the /etc/hue/conf/hue.ini configuration file.
For more information about configuring Hue with a ResourceManager HA cluster, see Deploy Hue with a ResourceManager HA Cluster in the High Availability for Hadoop Guide.
Use the following variables to configure a YARN cluster:
Variable | Description | Default/Example Value
submit_to | Set this property to true. Hue will submit jobs to this YARN cluster. Note that JobBrowser will not be able to show MR2 jobs. | true
resourcemanager_api_url | The URL of the ResourceManager API. | http://fqdn.resourcemanager.host:8088
proxy_api_url | The URL of the ProxyServer API. | http://fqdn.resourcemanager.host:8088
history_server_api_url | The URL of the HistoryServer API. | http://fqdn.historyserver.host:19888
node_manager_api_url | The URL of the NodeManager API. | http://fqdn.resourcemanager.host:8042
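A minimal sketch of how the YARN variables above could appear together in hue.ini. All host names are the placeholders from the table and the ports are its example values; they are not guaranteed to match your cluster.

```ini
[hadoop]
  [[yarn_clusters]]
    [[[default]]]
      # Let Hue submit jobs to this YARN cluster.
      submit_to=true
      # Placeholder hosts and example ports taken from the table above.
      resourcemanager_api_url=http://fqdn.resourcemanager.host:8088
      proxy_api_url=http://fqdn.resourcemanager.host:8088
      history_server_api_url=http://fqdn.historyserver.host:19888
      node_manager_api_url=http://fqdn.resourcemanager.host:8042
```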
Configure Beeswax
In the [beeswax] section of the /etc/hue/conf/hue.ini configuration file, you can specify the following values:
Variable | Description | Default/Example Value
hive_server_host | Host where the HiveServer2 Thrift daemon is running. If Kerberos security is enabled, use the fully qualified domain name (FQDN). | (none given)
hive_server_port | Port on which the HiveServer2 Thrift server runs. | 10000
hive_conf_dir | Hive configuration directory where hive-site.xml is located. | /etc/hive/conf
server_conn_timeout | Timeout in seconds for Thrift calls to HiveServer2. | 120
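The values above might be combined in the [beeswax] section as in the following sketch. The host name fqdn.hiveserver.host is a hypothetical placeholder introduced here for illustration (the table gives no example for it); the other values are the defaults listed above.

```ini
[beeswax]
  # Hypothetical placeholder; use the FQDN if Kerberos security is enabled.
  hive_server_host=fqdn.hiveserver.host
  # Port of the HiveServer2 Thrift server.
  hive_server_port=10000
  # Directory that contains hive-site.xml.
  hive_conf_dir=/etc/hive/conf
  # Timeout, in seconds, for Thrift calls to HiveServer2.
  server_conn_timeout=120
```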
Important: Depending on your environment and the Hive queries you run, queries might fail with an "internal error processing query" message.
Look for the error message java.lang.OutOfMemoryError: GC overhead limit exceeded in the beeswax_server.out log file. To increase the heap size and avoid this out-of-memory error, modify the hadoop-env.sh file and change the value of HADOOP_CLIENT_OPTS.
Configure HiveServer2 over SSL (Optional)
Make the following changes to the /etc/hue/conf/hue.ini configuration file to configure Hue to communicate with HiveServer2 over SSL:
[[ssl]]
# SSL communication enabled for this server.
enabled=false
# Path to Certificate Authority certificates.
cacerts=/etc/hue/cacerts.pem
# Path to the public certificate file.
cert=/etc/hue/cert.pem
# Choose whether Hue should validate certificates received from the server.
validate=true
Configure JobDesigner and Oozie
In the [liboozie] section of the /etc/hue/conf/hue.ini configuration file, specify the oozie_url, the URL of the Oozie service as specified by the OOZIE_URL environment variable for Oozie.
Configure WebHCat
In the [hcatalog] section of the /etc/hue/conf/hue.ini configuration file, set templeton_url to the URL of the WebHCat server, using its hostname or IP address. For example: http://hostname:50111/templeton/v1/
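The Oozie and WebHCat settings described above could look like the following sketch. The Oozie host name fqdn.oozie.host and the port 11000 are assumptions made for illustration; use whatever value your OOZIE_URL environment variable holds. The templeton_url value is the example from the text, with hostname left as a placeholder.

```ini
[liboozie]
  # Assumed host and port; this should match the OOZIE_URL environment variable.
  oozie_url=http://fqdn.oozie.host:11000/oozie

[hcatalog]
  # Replace hostname with the hostname or IP address of the WebHCat server.
  templeton_url=http://hostname:50111/templeton/v1/
```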