16. Configure and Start Apache WebHCat (Templeton)

RHEL/CentOS/Oracle Linux

  1. Copy the appropriate configurations from /etc/hcatalog/conf to /etc/hive-webhcat/conf/.

  2. Copy the new Pig, Hive, and Hadoop-streaming jars to HDFS using the path you specified in /etc/hive-webhcat/conf/, and change ownership to the hcat user with 755 permissions. For example:

    hdfs dfs -copyFromLocal /usr/share/HDP-webhcat/hive.tar.gz /usr/share/HDP-webhcat/pig.tar.gz /usr/hdp/version/hadoop-mapreduce/hadoop-streaming.jar hdfs:///apps/webhcat/.

    hdfs dfs -chmod -R 755 hdfs:///apps/webhcat/*

    hdfs dfs -chown -R hcat hdfs:///apps/webhcat/*
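
    Before moving on, it can be worth confirming the archives landed with the expected owner and mode. The `check_entry` helper below is illustrative, not part of HDP; it only inspects one line of `hdfs dfs -ls` output:

    ```shell
    # check_entry: verify one line of `hdfs dfs -ls` output shows mode 755
    # (-rwxr-xr-x) and owner hcat. Helper name and logic are illustrative only.
    check_entry() {
        perms=$(echo "$1" | awk '{print $1}')
        owner=$(echo "$1" | awk '{print $3}')
        [ "$perms" = "-rwxr-xr-x" ] && [ "$owner" = "hcat" ]
    }

    # On a live cluster you might drive it like this (requires HDFS access):
    #   hdfs dfs -ls /apps/webhcat | tail -n +2 | while read -r line; do
    #       check_entry "$line" || echo "unexpected: $line"
    #   done
    ```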

  3. Replace your WebHCat configuration after upgrading. Copy your modified /etc/webhcat/conf from the template to the configuration directory on all your WebHCat hosts.

  4. Start WebHCat:

    sudo su -l $WEBHCAT_USER -c "/usr/lib/hive-hcatalog/sbin/webhcat_server.sh start"

  5. Smoke test WebHCat.

    On the WebHCat host machine, run the following command:

    curl http://$WEBHCAT_HOST_MACHINE:50111/templeton/v1/status

    If you are using a secure cluster, run the following command:

    curl --negotiate -u : http://cluster.$PRINCIPAL.$REALM:50111/templeton/v1/status

    The response should look like:

    {"status":"ok","version":"v1"}
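
    If you want the smoke test to be scriptable rather than eyeballed, a helper can assert on the response text. The `webhcat_ok` name is ours, not HDP tooling; it inspects only the JSON text, so it works the same on secure and insecure clusters:

    ```shell
    # webhcat_ok: succeed only if the response text contains "status":"ok".
    # The curl invocation that produces the response is shown in the comments.
    webhcat_ok() {
        case "$1" in
            *'"status":"ok"'*) return 0 ;;
            *) return 1 ;;
        esac
    }

    # Insecure cluster:
    #   webhcat_ok "$(curl -s http://$WEBHCAT_HOST_MACHINE:50111/templeton/v1/status)" \
    #       && echo "WebHCat is up"
    ```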

  6. Remove shared libraries from old Templeton installation.

    On the WebHCat host machine, run the following command:

    sudo su -l $HDFS_USER -c "hdfs dfs -rmr -skipTrash /apps/templeton"
    rm -rf /usr/share/HDP-templeton

    where

    • $WEBHCAT_USER is the WebHCat Service user. For example, hcat.

    • $HDFS_USER is the HDFS Service user. For example, hdfs.

SLES

  1. Copy the appropriate configurations from /etc/hcatalog/conf to /etc/hive-webhcat/conf/.

  2. Copy the new Pig, Hive, and Hadoop-streaming jars to HDFS using the path you specified in /etc/hive-webhcat/conf/, and change ownership to the hcat user with 755 permissions. For example:

    hdfs dfs -copyFromLocal /usr/share/HDP-webhcat/hive.tar.gz /usr/share/HDP-webhcat/pig.tar.gz /usr/hdp/version/hadoop-mapreduce/hadoop-streaming.jar hdfs:///apps/webhcat/.

    hdfs dfs -chmod -R 755 hdfs:///apps/webhcat/*

    hdfs dfs -chown -R hcat hdfs:///apps/webhcat/*

  3. Replace your WebHCat configuration after upgrading. Copy your modified /etc/webhcat/conf from the template to the configuration directory in all your WebHCat hosts.

  4. Modify the WebHCat configuration files.

    • Upload Pig, Hive, and Sqoop tarballs to HDFS as the $HDFS_USER. In this example, hdfs:

      hdfs dfs -mkdir -p /hdp/apps/2.2.0.0-<$version>/pig/
      hdfs dfs -mkdir -p /hdp/apps/2.2.0.0-<$version>/hive/
      hdfs dfs -mkdir -p /hdp/apps/2.2.0.0-<$version>/sqoop/
      hdfs dfs -put /usr/hdp/2.2.0.0-<$version>/pig/pig.tar.gz /hdp/apps/2.2.0.0-<$version>/pig/
      hdfs dfs -put /usr/hdp/2.2.0.0-<$version>/hive/hive.tar.gz /hdp/apps/2.2.0.0-<$version>/hive/
      hdfs dfs -put /usr/hdp/2.2.0.0-<$version>/sqoop/sqoop.tar.gz /hdp/apps/2.2.0.0-<$version>/sqoop/
      hdfs dfs -chmod -R 555 /hdp/apps/2.2.0.0-<$version>/pig
      hdfs dfs -chmod -R 444 /hdp/apps/2.2.0.0-<$version>/pig/pig.tar.gz
      hdfs dfs -chmod -R 555 /hdp/apps/2.2.0.0-<$version>/hive
      hdfs dfs -chmod -R 444 /hdp/apps/2.2.0.0-<$version>/hive/hive.tar.gz
      hdfs dfs -chmod -R 555 /hdp/apps/2.2.0.0-<$version>/sqoop
      hdfs dfs -chmod -R 444 /hdp/apps/2.2.0.0-<$version>/sqoop/sqoop.tar.gz
      hdfs dfs -chown -R hdfs:hadoop /hdp
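      Because the same mkdir/put/chmod pattern repeats for each component, a small generator can keep the version string in one place. `stage_tarball` and the build number below are illustrative, not HDP tooling; on a real node the installed version can be read with `hdp-select versions`:

      ```shell
      # Emit the staging commands for one component so the version string is
      # typed once. HDP_VER below is an example build number; substitute your own.
      HDP_VER="2.2.0.0-2041"

      stage_tarball() {
          comp="$1"                                  # pig, hive, or sqoop
          dest="/hdp/apps/${HDP_VER}/${comp}"
          echo "hdfs dfs -mkdir -p ${dest}/"
          echo "hdfs dfs -put /usr/hdp/${HDP_VER}/${comp}/${comp}.tar.gz ${dest}/"
          echo "hdfs dfs -chmod -R 555 ${dest}"
          echo "hdfs dfs -chmod -R 444 ${dest}/${comp}.tar.gz"
      }

      # Review the output, then run the printed commands as the HDFS user:
      for c in pig hive sqoop; do stage_tarball "$c"; done
      ```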
    • Update the following properties in the webhcat-site.xml configuration file, as their values have changed:

      <property>
       <name>templeton.pig.archive</name>
       <value>hdfs:///hdp/apps/${hdp.version}/pig/pig.tar.gz</value>
      </property>
       
      <property>
       <name>templeton.hive.archive</name>
       <value>hdfs:///hdp/apps/${hdp.version}/hive/hive.tar.gz</value>
      </property>
       
      <property>
       <name>templeton.streaming.jar</name>
       <value>hdfs:///hdp/apps/${hdp.version}/mapreduce/hadoop-streaming.jar</value>
       <description>The hdfs path to the Hadoop streaming jar file.</description>
      </property>
       
      <property>
       <name>templeton.sqoop.archive</name>
       <value>hdfs:///hdp/apps/${hdp.version}/sqoop/sqoop.tar.gz</value>
       <description>The path to the Sqoop archive.</description>
      </property>
       
      <property>
       <name>templeton.sqoop.path</name>
       <value>sqoop.tar.gz/sqoop/bin/sqoop</value>
       <description>The path to the Sqoop executable.</description>
      </property>
       
      <property>
       <name>templeton.sqoop.home</name>
       <value>sqoop.tar.gz/sqoop</value>
       <description>The path to the Sqoop home in the exploded archive.</description>
      </property>
      Note: You do not need to modify ${hdp.version}.

    • Remove the following obsolete properties from webhcat-site.xml:

      <property>
       <name>templeton.controller.map.mem</name>
       <value>1600</value>
       <description>Total virtual memory available to map tasks.</description>
      </property>
      
      <property>
       <name>hive.metastore.warehouse.dir</name>
       <value>/path/to/warehouse/dir</value>
      </property>
    • Add new proxy users, if needed. In core-site.xml, make sure the following properties are also set to allow WebHCat to impersonate your additional HDP 2.2 groups and hosts:

      <property>
       <name>hadoop.proxyuser.hcat.groups</name>
       <value>*</value>
      </property>
       
      <property>
       <name>hadoop.proxyuser.hcat.hosts</name>
       <value>*</value>
      </property>

      Where:

      hadoop.proxyuser.hcat.groups

      A comma-separated list of the Unix groups whose users may be impersonated by 'hcat'.

      hadoop.proxyuser.hcat.hosts

      A comma-separated list of the hosts that are allowed to submit requests by 'hcat'.
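
      The `*` wildcard permits impersonation from any group and any host. If your security policy calls for something narrower, the same properties accept comma-separated lists; the group and host names below are placeholders, not defaults:

      ```xml
      <property>
       <name>hadoop.proxyuser.hcat.groups</name>
       <value>hadoop,users</value>
      </property>
      <property>
       <name>hadoop.proxyuser.hcat.hosts</name>
       <value>webhcat-host1.example.com,webhcat-host2.example.com</value>
      </property>
      ```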

  5. Start WebHCat:

    su -l hcat -c "/usr/hdp/current/hive-webhcat/sbin/webhcat_server.sh start"

  6. Smoke test WebHCat.

    On the WebHCat host machine, run the following command:

    curl http://$WEBHCAT_HOST_MACHINE:50111/templeton/v1/status

    If you are using a secure cluster, run the following command:

    curl --negotiate -u : http://cluster.$PRINCIPAL.$REALM:50111/templeton/v1/status

    The response should look like:

    {"status":"ok","version":"v1"}

  7. Remove shared libraries from old Templeton installation.

    On the WebHCat host machine, run the following command:

    sudo su -l $HDFS_USER -c "hdfs dfs -rmr -skipTrash /apps/templeton"
    rm -rf /usr/share/HDP-templeton

    where

    • $WEBHCAT_USER is the WebHCat Service user. For example, hcat.

    • $HDFS_USER is the HDFS Service user. For example, hdfs.

Ubuntu/Debian

  1. Copy the appropriate configurations from /etc/hcatalog/conf to /etc/hive-webhcat/conf/.

  2. Copy the new Pig, Hive, and Hadoop-streaming jars to HDFS using the path you specified in /etc/hive-webhcat/conf/, and change ownership to the hcat user with 755 permissions. For example:

    hdfs dfs -copyFromLocal /usr/share/HDP-webhcat/hive.tar.gz /usr/share/HDP-webhcat/pig.tar.gz /usr/hdp/version/hadoop-mapreduce/hadoop-streaming.jar hdfs:///apps/webhcat/.

    hdfs dfs -chmod -R 755 hdfs:///apps/webhcat/*

    hdfs dfs -chown -R hcat hdfs:///apps/webhcat/*

  3. Replace your WebHCat configuration after upgrading. Copy your modified /etc/webhcat/conf from the template to the configuration directory on all your WebHCat hosts.

  4. Start WebHCat:

    sudo su -l $WEBHCAT_USER -c "/usr/lib/hive-hcatalog/sbin/webhcat_server.sh start"

  5. Smoke test WebHCat.

    On the WebHCat host machine, run the following command:

    curl http://$WEBHCAT_HOST_MACHINE:50111/templeton/v1/status

    If you are using a secure cluster, run the following command:

    curl --negotiate -u : http://cluster.$PRINCIPAL.$REALM:50111/templeton/v1/status

    The response should look like:

    {"status":"ok","version":"v1"}

  6. Remove shared libraries from old Templeton installation.

    On the WebHCat host machine, run the following command:

    sudo su -l $HDFS_USER -c "hdfs dfs -rmr -skipTrash /apps/templeton"
    rm -rf /usr/share/HDP-templeton

    where

    • $WEBHCAT_USER is the WebHCat Service user. For example, hcat.

    • $HDFS_USER is the HDFS Service user. For example, hdfs.
