Command Line Installation

Set Directories and Permissions

Create directories and configure ownership and permissions on the appropriate hosts as described below. If any of these directories already exist, Hortonworks recommends that you delete them and recreate them.

Hortonworks provides a set of configuration files that represent a working WebHCat configuration. (See Download Companion Files.) You can use these files as a reference point; however, you need to modify them to match your own cluster environment.

If you choose to use the provided configuration files to set up your WebHCat environment, complete the following steps:

  1. Execute the following commands on your WebHCat server machine to create log and PID directories.

    mkdir -p $WEBHCAT_LOG_DIR
    chown -R $WEBHCAT_USER:$HADOOP_GROUP $WEBHCAT_LOG_DIR
    chmod -R 755 $WEBHCAT_LOG_DIR
    mkdir -p $WEBHCAT_PID_DIR
    chown -R $WEBHCAT_USER:$HADOOP_GROUP $WEBHCAT_PID_DIR
    chmod -R 755 $WEBHCAT_PID_DIR

    where:

    • $WEBHCAT_LOG_DIR is the directory to store the WebHCat logs. For example, /var/log/webhcat.

    • $WEBHCAT_PID_DIR is the directory to store the WebHCat process ID. For example, /var/run/webhcat.

    • $WEBHCAT_USER is the user owning the WebHCat services. For example, hcat.

    • $HADOOP_GROUP is a common group shared by services. For example, hadoop.
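
After running the commands above, it can help to confirm that each directory ended up with the intended mode and ownership. The following is a minimal sketch, not part of the Hortonworks procedure: check_dir is a hypothetical helper, the default values are the examples given in the list above, and the stat -c option assumes GNU coreutils (as found on most Linux distributions).

```shell
#!/bin/sh
# Example values from the descriptions above; override them for your cluster.
WEBHCAT_LOG_DIR="${WEBHCAT_LOG_DIR:-/var/log/webhcat}"
WEBHCAT_PID_DIR="${WEBHCAT_PID_DIR:-/var/run/webhcat}"
WEBHCAT_USER="${WEBHCAT_USER:-hcat}"
HADOOP_GROUP="${HADOOP_GROUP:-hadoop}"

# check_dir DIR OWNER GROUP
# Hypothetical helper: succeeds only if DIR exists, has mode 755,
# and is owned by OWNER:GROUP. Relies on GNU stat (-c format option).
check_dir() {
    dir=$1
    expected="755 $2 $3"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    actual=$(stat -c '%a %U %G' "$dir")
    if [ "$actual" = "$expected" ]; then
        echo "ok: $dir"
    else
        echo "mismatch: $dir is '$actual', expected '$expected'"
        return 1
    fi
}
```

For example, check_dir "$WEBHCAT_LOG_DIR" "$WEBHCAT_USER" "$HADOOP_GROUP" prints "ok: /var/log/webhcat" when the log directory matches, and the same call with $WEBHCAT_PID_DIR checks the PID directory.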

  2. Set permissions for the WebHCat server to impersonate users on the Hadoop cluster:

    1. Create a UNIX user to run the WebHCat server.

    2. Modify the Hadoop core-site.xml file and set the following properties:

      Table 12.1. Hadoop core-site.xml File Properties

      Variable                        Value
      ----------------------------    --------------------------------------------------
      hadoop.proxyuser.USER.groups    A comma-separated list of the UNIX groups whose users are impersonated.
      hadoop.proxyuser.USER.hosts     A comma-separated list of the hosts that run the HCatalog and JobTracker servers.
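
To make the two properties in Table 12.1 concrete, the sketch below prints the corresponding core-site.xml property elements for a given user. It assumes that USER in the property names stands for the UNIX user created to run the WebHCat server; proxyuser_properties is a hypothetical helper, and the user, group, and host values in the usage note are placeholders only. Values are inserted as given, with no XML escaping.

```shell
#!/bin/sh
# proxyuser_properties USER GROUPS HOSTS
# Hypothetical helper: prints the two proxy-user <property> elements
# for USER, ready to paste inside <configuration> in core-site.xml.
proxyuser_properties() {
    proxy_user=$1
    proxy_groups=$2   # comma-separated UNIX groups whose users are impersonated
    proxy_hosts=$3    # comma-separated hosts running HCatalog and JobTracker
    cat <<EOF
<property>
  <name>hadoop.proxyuser.${proxy_user}.groups</name>
  <value>${proxy_groups}</value>
</property>
<property>
  <name>hadoop.proxyuser.${proxy_user}.hosts</name>
  <value>${proxy_hosts}</value>
</property>
EOF
}
```

For example, proxyuser_properties hcat users host1.example.com,host2.example.com emits entries named hadoop.proxyuser.hcat.groups and hadoop.proxyuser.hcat.hosts. Review the output before merging it into core-site.xml by hand.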


  3. If you are running WebHCat on a secure cluster, create a Kerberos principal for the WebHCat server with the name USER/host@realm, and set the WebHCat configuration variables templeton.kerberos.principal and templeton.kerberos.keytab.
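
As an illustration of the secure-cluster step, the sketch below assembles a principal in the USER/host@realm form and prints property elements for the two WebHCat configuration variables. It assumes the variables are set in webhcat-site.xml; webhcat_kerberos_properties is a hypothetical helper, and the host name, realm, and keytab path in the usage note are placeholders. Creating the principal and keytab in your KDC is a separate step not shown here.

```shell
#!/bin/sh
# webhcat_kerberos_properties USER HOST REALM KEYTAB
# Hypothetical helper: builds the principal USER/HOST@REALM and prints
# the templeton.kerberos.* <property> elements (assumed to belong in
# webhcat-site.xml). Values are inserted as given, without XML escaping.
webhcat_kerberos_properties() {
    principal="$1/$2@$3"
    keytab=$4
    cat <<EOF
<property>
  <name>templeton.kerberos.principal</name>
  <value>${principal}</value>
</property>
<property>
  <name>templeton.kerberos.keytab</name>
  <value>${keytab}</value>
</property>
EOF
}
```

For example, webhcat_kerberos_properties hcat webhcat01.example.com EXAMPLE.COM /path/to/webhcat.keytab yields a principal value of hcat/webhcat01.example.com@EXAMPLE.COM.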