Configuring Pig Scripts to Use HCatalog in Oozie Workflows
To access HCatalog with a Pig action in an Oozie workflow, you need to modify configuration information to point to the Hive metastore URIs.
There are two methods for providing this configuration information. Which method you use depends on how many of your Pig actions access HCatalog.
Configuring Individual Pig Actions to Access HCatalog
If only a few individual Pig actions access HCatalog, do the following:
Identify the URI (host and port) for the Thrift metastore server.
In Ambari, click Hive > Configs > Advanced.
Make note of the URI in the hive.metastore.uris field in the General section.
This information is also stored in the hive.default.xml file.
Add the following two properties to the <configuration> elements in each Pig action.
Note: Replace [host:port(default:9083)] in the example below with the host and port for the Thrift metastore server.
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://[host:port(default:9083)]</value>
    <description>A comma separated list of metastore uris the client can use to contact the metastore server.</description>
  </property>
  <property>
    <name>oozie.action.sharelib.for.pig</name>
    <value>pig,hive,hcatalog</value>
    <description>A comma separated list of libraries to be used by the Pig action.</description>
  </property>
</configuration>
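To show where the <configuration> element sits, the following is a minimal sketch of a complete workflow with one Pig action, not a definitive workflow definition. The workflow name, the action name, the script.pig file name, and the ${jobTracker} and ${nameNode} parameters are assumptions made for illustration; only the two properties come from the steps above.

```xml
<!-- Sketch of a workflow.xml. The names "pig-hcat-workflow", "pig-hcat-node",
     and "script.pig" are hypothetical. -->
<workflow-app xmlns="uri:oozie:workflow:0.5" name="pig-hcat-workflow">
  <start to="pig-hcat-node"/>
  <action name="pig-hcat-node">
    <pig>
      <!-- Assumed to be defined in job.properties. -->
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <configuration>
        <property>
          <name>hive.metastore.uris</name>
          <!-- Placeholder: substitute the metastore host and port. -->
          <value>thrift://[host:port(default:9083)]</value>
        </property>
        <property>
          <name>oozie.action.sharelib.for.pig</name>
          <value>pig,hive,hcatalog</value>
        </property>
      </configuration>
      <script>script.pig</script>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Pig action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

In a Pig action, the <configuration> element goes after <name-node> and before <script>.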
Configuring All Pig Actions to Access HCatalog
If all of your Pig actions access HCatalog, do the following:
Add the following line to the job.properties files, located in your working directory:
# A comma-separated list of libraries to be used by the Pig action.
oozie.action.sharelib.for.pig=pig,hive,hcatalog
Identify the URI (host and port) for the Thrift metastore server.
In Ambari, click Hive > Configs > Advanced.
Make note of the URI in the hive.metastore.uris field in the General section.
This information is also stored in the hive.default.xml file.
Add the following property to the <configuration> elements in each Pig action.
Note: Replace [host:port(default:9083)] in the example below with the host and port for the Thrift metastore server.
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://[host:port(default:9083)]</value>
    <description>A comma separated list of metastore uris the client can use to contact the metastore server.</description>
  </property>
</configuration>
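Because every Pig action repeats the same metastore URI in this method, one option is to define the URI once in job.properties and reference it from each action. The fragment below is a sketch under that assumption: metastoreUri is a hypothetical property name, not something Oozie or Hive defines, and script.pig is a placeholder script name.

```xml
<!-- Fragment of one Pig action. "metastoreUri" is a hypothetical job.properties
     entry, for example: metastoreUri=thrift://[host:port(default:9083)] -->
<pig>
  <job-tracker>${jobTracker}</job-tracker>
  <name-node>${nameNode}</name-node>
  <configuration>
    <property>
      <name>hive.metastore.uris</name>
      <!-- Resolved from job.properties when the workflow is submitted. -->
      <value>${metastoreUri}</value>
    </property>
  </configuration>
  <script>script.pig</script>
</pig>
```

The oozie.action.sharelib.for.pig property is omitted from the action here because, in this method, it is set once in job.properties.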