Set Directories and Permissions
Create directories and configure ownership and permissions on the appropriate hosts as described below. If any of these directories already exist, Hortonworks recommends that you delete them and recreate them.
Hortonworks provides a set of configuration files that represent a working WebHCat configuration (see Download Companion Files). You can use these files as a reference point; however, you will need to modify them to match your own cluster environment.
If you choose to use the provided configuration files, complete the following steps to set up your WebHCat environment:
Execute the following commands on your WebHCat server machine to create log and PID directories.
mkdir -p $WEBHCAT_LOG_DIR
chown -R $WEBHCAT_USER:$HADOOP_GROUP $WEBHCAT_LOG_DIR
chmod -R 755 $WEBHCAT_LOG_DIR
mkdir -p $WEBHCAT_PID_DIR
chown -R $WEBHCAT_USER:$HADOOP_GROUP $WEBHCAT_PID_DIR
chmod -R 755 $WEBHCAT_PID_DIR
where:
$WEBHCAT_LOG_DIR is the directory to store the WebHCat logs. For example, /var/log/webhcat.
$WEBHCAT_PID_DIR is the directory to store the WebHCat process ID. For example, /var/run/webhcat.
$WEBHCAT_USER is the user owning the WebHCat services. For example, hcat.
$HADOOP_GROUP is a common group shared by services. For example, hadoop.
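The log and PID directory commands follow the same three-step pattern (create, change owner, set mode), so they can be wrapped in a small helper. The following is a minimal sketch, not part of the Hortonworks companion files: setup_service_dir is a hypothetical function name, and the example values are the ones suggested in the text above.

```shell
#!/bin/sh
# Sketch: create a directory, hand it to the service account, and set mode 755.
# setup_service_dir is a hypothetical helper name, not a WebHCat or HDP tool.
setup_service_dir() {
    dir=$1
    owner=$2
    group=$3
    # Refuse empty arguments so an unset variable never reaches chown or chmod.
    if [ -z "$dir" ] || [ -z "$owner" ] || [ -z "$group" ]; then
        echo "usage: setup_service_dir DIR OWNER GROUP" >&2
        return 1
    fi
    mkdir -p "$dir" &&
    chown -R "$owner:$group" "$dir" &&
    chmod -R 755 "$dir"
}

# Example invocation (run as root on the WebHCat server machine; the values
# are the examples given in the text, adjust them to your cluster):
#   setup_service_dir /var/log/webhcat hcat hadoop
#   setup_service_dir /var/run/webhcat hcat hadoop
```

Quoting the variables and chaining the commands with && means that a failed mkdir or chown stops the sequence instead of applying permissions to an unexpected path.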
Set permissions for the WebHCat server to impersonate users on the Hadoop cluster:
Create a UNIX user to run the WebHCat server.
Modify the Hadoop core-site.xml file and set the following properties:
Table 11.1. Hadoop core-site.xml File Properties
Variable | Value
hadoop.proxyuser.USER.groups | A comma-separated list of the UNIX groups whose users will be impersonated.
hadoop.proxyuser.USER.hosts | A comma-separated list of the hosts that will run the HCatalog and JobTracker servers.
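As an illustration of how the two properties in Table 11.1 might look inside the configuration element of core-site.xml, the sketch below prints them in the standard Hadoop property format. It assumes that USER stands for the UNIX user created in the previous step; print_proxyuser_properties is a hypothetical helper, and hcat, users, and the host names are placeholder values only.

```shell
#!/bin/sh
# Sketch: print the two proxyuser properties for pasting inside the
# <configuration> element of core-site.xml.
# print_proxyuser_properties is a hypothetical helper, not a Hadoop tool.
print_proxyuser_properties() {
    proxy_user=$1    # assumed: the UNIX user that runs the WebHCat server (USER in Table 11.1)
    proxy_groups=$2  # comma-separated UNIX groups whose users will be impersonated
    proxy_hosts=$3   # comma-separated hosts, as described in Table 11.1
    cat <<EOF
<property>
  <name>hadoop.proxyuser.${proxy_user}.groups</name>
  <value>${proxy_groups}</value>
</property>
<property>
  <name>hadoop.proxyuser.${proxy_user}.hosts</name>
  <value>${proxy_hosts}</value>
</property>
EOF
}

# Example with placeholder values (all three arguments are illustrative).
print_proxyuser_properties hcat users "host1.example.com,host2.example.com"
```

Generating the fragment from arguments keeps the user name consistent between the two property names, which is an easy place to introduce a mismatch when editing the file by hand.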
If you are running WebHCat on a secure cluster, create a Kerberos principal for the WebHCat server with the name USER/host@realm, and set the WebHCat configuration variables templeton.kerberos.principal and templeton.kerberos.keytab.
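For the secure-cluster case, the sketch below assembles a principal name in the USER/host@realm form and prints the two WebHCat variables as property elements. It assumes the WebHCat configuration file uses the same property format as the Hadoop configuration files; print_webhcat_kerberos_properties is a hypothetical helper, and the user, host, realm, and keytab path shown are placeholders, not values from this guide.

```shell
#!/bin/sh
# Sketch: build a USER/host@realm principal name and print the two WebHCat
# Kerberos properties named in the text.
# print_webhcat_kerberos_properties is a hypothetical helper, not a WebHCat tool.
print_webhcat_kerberos_properties() {
    svc_user=$1   # user that runs the WebHCat server
    svc_host=$2   # fully qualified host name of the WebHCat server
    realm=$3      # Kerberos realm
    keytab=$4     # path to the keytab file for the principal
    principal="${svc_user}/${svc_host}@${realm}"
    cat <<EOF
<property>
  <name>templeton.kerberos.principal</name>
  <value>${principal}</value>
</property>
<property>
  <name>templeton.kerberos.keytab</name>
  <value>${keytab}</value>
</property>
EOF
}

# Example with placeholder values (host, realm, and keytab path are assumptions).
print_webhcat_kerberos_properties hcat webhcat01.example.com EXAMPLE.COM /path/to/webhcat.keytab
```

Creating the principal and exporting its keytab are done with your Kerberos administration tools and are not covered by this sketch.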