Configure and Start Apache WebHCat
Before you can upgrade Apache WebHCat, you must have first upgraded your HDP components to the latest version (in this case, 2.4.2). This section assumes that you have already upgraded your components for HDP 2.4.2. If you have not already completed these steps, return to Getting Ready to Upgrade and Upgrade 2.2 Components for instructions on how to upgrade your HDP components to 2.4.2.
Note: You must replace your configuration after upgrading. Copy /etc/webhcat/conf from the template to the conf directory on your WebHCat hosts, and then modify the WebHCat configuration files.
Upload the Pig, Hive, and Sqoop tarballs to HDFS as the $HDFS_User (in this example, hdfs):
su - hdfs -c "hdfs dfs -mkdir -p /hdp/apps/2.4.2.0-258/pig/"
su - hdfs -c "hdfs dfs -mkdir -p /hdp/apps/2.4.2.0-258/hive/"
su - hdfs -c "hdfs dfs -mkdir -p /hdp/apps/2.4.2.0-258/sqoop/"
su - hdfs -c "hdfs dfs -put /usr/hdp/2.4.2.0-258/pig/pig.tar.gz /hdp/apps/2.4.2.0-258/pig/"
su - hdfs -c "hdfs dfs -put /usr/hdp/2.4.2.0-258/hive/hive.tar.gz /hdp/apps/2.4.2.0-258/hive/"
su - hdfs -c "hdfs dfs -put /usr/hdp/2.4.2.0-258/sqoop/sqoop.tar.gz /hdp/apps/2.4.2.0-258/sqoop/"
su - hdfs -c "hdfs dfs -chmod -R 555 /hdp/apps/2.4.2.0-258/pig"
su - hdfs -c "hdfs dfs -chmod -R 444 /hdp/apps/2.4.2.0-258/pig/pig.tar.gz"
su - hdfs -c "hdfs dfs -chmod -R 555 /hdp/apps/2.4.2.0-258/hive"
su - hdfs -c "hdfs dfs -chmod -R 444 /hdp/apps/2.4.2.0-258/hive/hive.tar.gz"
su - hdfs -c "hdfs dfs -chmod -R 555 /hdp/apps/2.4.2.0-258/sqoop"
su - hdfs -c "hdfs dfs -chmod -R 444 /hdp/apps/2.4.2.0-258/sqoop/sqoop.tar.gz"
su - hdfs -c "hdfs dfs -chown -R hdfs:hadoop /hdp"
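Because the per-component steps are identical, they can also be expressed as a loop. This is only a convenience sketch of the commands above; RUN and HDP_VER are illustrative names, not part of the official procedure, and RUN defaults to a dry run that prints each command for review.

```shell
# Sketch of the upload steps above as a loop over components.
RUN=${RUN:-echo}          # dry run by default; set RUN='su - hdfs -c' to execute
HDP_VER=2.4.2.0-258       # must match your installed HDP version
for comp in pig hive sqoop; do
  $RUN "hdfs dfs -mkdir -p /hdp/apps/$HDP_VER/$comp/"
  $RUN "hdfs dfs -put /usr/hdp/$HDP_VER/$comp/$comp.tar.gz /hdp/apps/$HDP_VER/$comp/"
  $RUN "hdfs dfs -chmod -R 555 /hdp/apps/$HDP_VER/$comp"
  $RUN "hdfs dfs -chmod -R 444 /hdp/apps/$HDP_VER/$comp/$comp.tar.gz"
done
$RUN "hdfs dfs -chown -R hdfs:hadoop /hdp"
```

With the default dry run, compare the printed commands against the list above before executing them for real as the hdfs user.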
Update the following properties in the webhcat-site.xml configuration file, as their values have changed:
<property>
  <name>templeton.pig.archive</name>
  <value>hdfs:///hdp/apps/${hdp.version}/pig/pig.tar.gz</value>
</property>
<property>
  <name>templeton.hive.archive</name>
  <value>hdfs:///hdp/apps/${hdp.version}/hive/hive.tar.gz</value>
</property>
<property>
  <name>templeton.streaming.jar</name>
  <value>hdfs:///hdp/apps/${hdp.version}/mapreduce/hadoop-streaming.jar</value>
  <description>The hdfs path to the Hadoop streaming jar file.</description>
</property>
<property>
  <name>templeton.sqoop.archive</name>
  <value>hdfs:///hdp/apps/${hdp.version}/sqoop/sqoop.tar.gz</value>
  <description>The path to the Sqoop archive.</description>
</property>
<property>
  <name>templeton.sqoop.path</name>
  <value>sqoop.tar.gz/sqoop/bin/sqoop</value>
  <description>The path to the Sqoop executable.</description>
</property>
<property>
  <name>templeton.sqoop.home</name>
  <value>sqoop.tar.gz/sqoop</value>
  <description>The path to the Sqoop home in the exploded archive.</description>
</property>
Note: You do not need to modify ${hdp.version}.
Add the following property if it is not present in webhcat-site.xml:
<property>
  <name>templeton.libjars</name>
  <value>/usr/hdp/current/zookeeper-client/zookeeper.jar,/usr/hdp/current/hive-client/lib/hive-common.jar</value>
  <description>Jars to add to the classpath.</description>
</property>
Remove the following obsolete properties from webhcat-site.xml:
<property>
  <name>templeton.controller.map.mem</name>
  <value>1600</value>
  <description>Total virtual memory available to map tasks.</description>
</property>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/path/to/warehouse/dir</value>
</property>
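After editing, a quick grep can confirm that the new properties are present and the obsolete ones are gone. This is a sanity-check sketch only; the CONF path assumes the default configuration location, so adjust it for your installation.

```shell
# Sketch: verify the webhcat-site.xml edits described above.
CONF=${CONF:-/etc/webhcat/conf/webhcat-site.xml}   # assumed default location
for prop in templeton.pig.archive templeton.hive.archive templeton.streaming.jar \
            templeton.sqoop.archive templeton.sqoop.path templeton.sqoop.home \
            templeton.libjars; do
  grep -q "<name>$prop</name>" "$CONF" 2>/dev/null \
    && echo "ok: $prop present" || echo "MISSING: $prop"
done
for prop in templeton.controller.map.mem hive.metastore.warehouse.dir; do
  grep -q "<name>$prop</name>" "$CONF" 2>/dev/null \
    && echo "STILL PRESENT (remove): $prop" || echo "ok: $prop removed"
done
```

Any "MISSING" or "STILL PRESENT" line indicates an edit that was not applied.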
Add new proxy users, if needed. In core-site.xml, make sure the following properties are also set to allow WebHCat to impersonate your additional HDP 2.4.2 groups and hosts:
<property>
  <name>hadoop.proxyuser.hcat.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hcat.hosts</name>
  <value>*</value>
</property>
Where:
hadoop.proxyuser.hcat.groups
A comma-separated list of the Unix groups whose users may be impersonated by 'hcat'.
hadoop.proxyuser.hcat.hosts
A comma-separated list of the hosts from which 'hcat' is allowed to submit impersonated requests.
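The * wildcard permits impersonation from any group and any host. In production you may want to narrow both values; for example (the group and host names below are hypothetical placeholders, not values from this procedure):

```xml
<property>
  <name>hadoop.proxyuser.hcat.groups</name>
  <!-- hypothetical example: only these Unix groups may be impersonated -->
  <value>hadoop,users</value>
</property>
<property>
  <name>hadoop.proxyuser.hcat.hosts</name>
  <!-- hypothetical example: only the WebHCat host may submit requests as hcat -->
  <value>webhcat-host.example.com</value>
</property>
```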
Start WebHCat:
sudo su -c "/usr/hdp/current/hive-webhcat/sbin/webhcat_server.sh start" hcat
Smoke test WebHCat.
If you have a non-secure cluster, run the following command on the WebHCat host machine to check the status of the WebHCat server:
curl http://$WEBHCAT_HOST_MACHINE:50111/templeton/v1/status
You should see the following return status:
"status":"ok","version":"v1"
If you are using a Kerberos-secured cluster, run the following command:
curl --negotiate -u: http://$WEBHCAT_HOST_MACHINE:50111/templeton/v1/status
You should see the following return status:
{"status":"ok","version":"v1"}