Note | |
---|---|
If you are using an Ambari-managed cluster, use Ambari to update core-site.xml, mapred-site.xml, and oozie-site.xml according to the instructions below. Do not make changes to the files directly, because Ambari will overwrite them. |
Use the following instructions to manually set up the configuration files:
On the NameNode, Secondary NameNode, and all DataNodes, modify the configuration files as instructed below:
Modify the $HADOOP_CONF_DIR/core-site.xml file:

    <property>
      <name>hadoop.proxyuser.hue.hosts</name>
      <value>*</value>
    </property>

    <property>
      <name>hadoop.proxyuser.hue.groups</name>
      <value>*</value>
    </property>
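A quick way to confirm the proxy-user entries are present on each node is to grep the file after editing. This is a minimal sketch that assumes $HADOOP_CONF_DIR is set in your shell (for example, /etc/hadoop/conf):

    # Print the hadoop.proxyuser.hue.* property names and the value that follows each one
    grep -A 1 "hadoop.proxyuser.hue" $HADOOP_CONF_DIR/core-site.xml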
Modify the $HADOOP_CONF_DIR/hdfs-site.xml file:

    <property>
      <name>dfs.support.broken.append</name>
      <value>true</value>
      <final>true</final>
    </property>
Use WebHDFS/HttpFS to access HDFS data:
Option I: Configure WebHDFS (recommended)
Modify the $HADOOP_CONF_DIR/hdfs-site.xml file on the NameNode and all DataNodes:

    <property>
      <name>dfs.webhdfs.enabled</name>
      <value>true</value>
    </property>
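To sanity-check WebHDFS after enabling it, you can issue a REST call against the NameNode's HTTP port. The hostname and port below (namenode.example.com:50070) and the impersonated user (alice) are placeholders for your cluster, and the second call assumes an unsecured cluster where the doas parameter exercises the hadoop.proxyuser.hue settings made earlier:

    # Confirm WebHDFS is responding (expect HTTP 200 with a FileStatus JSON body)
    curl -i "http://namenode.example.com:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hue"

    # Confirm the hue user can impersonate another user via the proxyuser settings
    curl -i "http://namenode.example.com:50070/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hue&doas=alice"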
Option II: Configure HttpFS (remote access)
If you are using a remote Hue Server, you can run an HttpFS server to provide Hue access to HDFS.
Add the following properties to the /etc/hadoop-httpfs/conf/httpfs-site.xml file:

    <property>
      <name>httpfs.proxyuser.hue.hosts</name>
      <value>*</value>
    </property>

    <property>
      <name>httpfs.proxyuser.hue.groups</name>
      <value>*</value>
    </property>
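If you use HttpFS, a similar REST check works against the HttpFS server, which listens on port 14000 by default; the hostname below is a placeholder for your HttpFS host:

    # HttpFS exposes the same WebHDFS REST API, so the same status call applies
    curl -i "http://httpfs-host.example.com:14000/webhdfs/v1/tmp?op=GETFILESTATUS&user.name=hue"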
Modify the webhcat-site.xml file. On the WebHCat Server host, add the following properties to $WEBHCAT_CONF_DIR/webhcat-site.xml, where $WEBHCAT_CONF_DIR is the directory for storing the WebHCat configuration files (for example, /etc/webhcat/conf):

    vi $WEBHCAT_CONF_DIR/webhcat-site.xml

    <property>
      <name>webhcat.proxyuser.hue.hosts</name>
      <value>*</value>
    </property>

    <property>
      <name>webhcat.proxyuser.hue.groups</name>
      <value>*</value>
    </property>
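As a quick check that WebHCat is reachable once the proxy-user settings are in place, you can call its status endpoint. The hostname is a placeholder and 50111 is the default WebHCat (Templeton) port:

    # Expect a small JSON response such as {"status":"ok","version":"v1"}
    curl -i "http://webhcat-host.example.com:50111/templeton/v1/status"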
[Optional] - If you are setting $HADOOP_CLASSPATH in your $HADOOP_CONF_DIR/hadoop-env.sh file, verify that your settings preserve the user-specified options.

For example, the following sample illustrates correct settings for $HADOOP_CLASSPATH:

    # HADOOP_CLASSPATH=<your_additions>:$HADOOP_CLASSPATH

This setting lets certain Hue components add to the Hadoop CLASSPATH using the environment variable.
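For contrast, here is a minimal hadoop-env.sh sketch showing a setting that preserves the existing value versus one that clobbers it; the /usr/lib/mylibs/extra.jar path is an illustrative placeholder, not a file shipped with Hue or Hadoop:

    # Correct: appends your additions while preserving anything already on the CLASSPATH
    export HADOOP_CLASSPATH=/usr/lib/mylibs/extra.jar:$HADOOP_CLASSPATH

    # Incorrect: overwrites the CLASSPATH and discards entries added by Hue components
    # export HADOOP_CLASSPATH=/usr/lib/mylibs/extra.jar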
[Optional] - Enable job submission using both Hue and the command line interface (CLI).

The hadoop.tmp.dir directory is used to unpack the JAR files in /usr/lib/hadoop/lib. If you use both Hue and the CLI for job submission, the two contend for the hadoop.tmp.dir directory. By default, hadoop.tmp.dir is located at /tmp/hadoop-$USER_NAME.

To enable job submission using both Hue and the CLI, update the following property in the $HADOOP_CONF_DIR/core-site.xml file:

    <property>
      <name>hadoop.tmp.dir</name>
      <value>/tmp/hadoop-$USER_NAME$HUE_SUFFIX</value>
    </property>
where $HADOOP_CONF_DIR is the directory for storing the Hadoop configuration files, for example, /etc/hadoop/conf.
Install Hue-plugins
Verify that all the services are stopped. See the instructions provided here.
Install Hue-plugins. On the JobTracker host machine, execute the following command:
For RHEL/CentOS:

    yum install hue-plugins

For SLES:

    zypper install hue-plugins
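To confirm the package actually installed (both RHEL/CentOS and SLES are RPM-based), a simple query such as the following can be run before checking for the JAR in the next step:

    # Should print the installed hue-plugins package name and version
    rpm -q hue-plugins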
Verify that the Hue-plugins JAR file is available in the Hadoop lib directory (located at /usr/lib/hadoop/lib).

Add the following properties to $HADOOP_CONF_DIR/mapred-site.xml on the JobTracker host machine:

    <property>
      <name>jobtracker.thrift.address</name>
      <value>0.0.0.0:9290</value>
    </property>

    <property>
      <name>mapreduce.jobtracker.plugins</name>
      <value>org.apache.hadoop.thriftfs.ThriftJobTrackerPlugin</value>
      <description>Comma-separated list of jobtracker plugins to be activated.</description>
    </property>
where $HADOOP_CONF_DIR is the directory for storing the Hadoop configuration files, for example, /etc/hadoop/conf.
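Once the JobTracker has been restarted with these properties (see the restart step at the end of this section), a quick way to confirm the Thrift plugin loaded is to check that something is listening on port 9290 on the JobTracker host:

    # Expect a LISTEN entry for 0.0.0.0:9290 if the ThriftJobTrackerPlugin is active
    netstat -tln | grep 9290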
Configure Oozie.
On the Oozie server host machine, modify OOZIE_CONF_DIR/oozie-site.xml as shown below:

    <property>
      <name>oozie.service.ProxyUserService.proxyuser.hue.hosts</name>
      <value>*</value>
    </property>

    <property>
      <name>oozie.service.ProxyUserService.proxyuser.hue.groups</name>
      <value>*</value>
    </property>
where OOZIE_CONF_DIR is the directory for storing the Oozie configuration files, for example, /etc/oozie/conf.

Restart all the services in your cluster. For more information, use the instructions provided here.
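After the restart, you can verify that the Oozie server came back up with the new proxy-user configuration by querying its status with the Oozie CLI; the URL below assumes the default Oozie port of 11000 and uses a placeholder hostname:

    # Expect "System mode: NORMAL" if the Oozie server is healthy
    oozie admin -oozie http://oozie-host.example.com:11000/oozie -status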