HDP-2.3.4.7 Release Notes

Configuring Pig Scripts to Use HCatalog in Oozie Workflows

To access HCatalog from a Pig action in an Oozie workflow, you must configure the action with the URI of the Hive metastore and make the Hive and HCatalog share libraries available to it.

There are two methods for providing this configuration information. Which method you use depends on how many of your Pig actions access HCatalog.

Configuring Individual Pig Actions to Access HCatalog

If only a few individual Pig actions access HCatalog, do the following:

  1. Identify the URI (host and port) for the Thrift metastore server.

    1. In Ambari, click Hive > Configs > Advanced.

    2. Make note of the URI in the hive.metastore.uris field in the General section.

      This information is also stored in the hive-site.xml file.

  2. Add the following two properties to the <configuration> element of each Pig action that accesses HCatalog.

    Note

    Replace [host:port(default:9083)] in the example below with the host and port of the Thrift metastore server. The default port is 9083.

    <configuration>
        <property>
            <name>hive.metastore.uris</name>
            <value>thrift://[host:port(default:9083)]</value>
            <description>A comma-separated list of metastore URIs that the client can use to contact the
            metastore server.</description>
        </property>
        <property>
            <name>oozie.action.sharelib.for.pig</name>
            <value>pig,hive,hcatalog</value>
            <description>A comma-separated list of libraries to be used by the Pig action.</description>
        </property>
    </configuration>
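
For context, the fragment above belongs inside the <pig> element of an action in workflow.xml. The following is a minimal sketch of such an action, not a definitive configuration. The action name, script name, transition targets, and the metastore host metastore.example.com are hypothetical, and ${jobTracker} and ${nameNode} are assumed to be defined in job.properties.

```xml
<!-- Sketch of one Pig action in workflow.xml; names and host are hypothetical. -->
<action name="pig-hcat-example">
    <pig>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Hypothetical metastore host; 9083 is the default port. -->
            <property>
                <name>hive.metastore.uris</name>
                <value>thrift://metastore.example.com:9083</value>
            </property>
            <!-- Adds the Hive and HCatalog share libraries to this action only. -->
            <property>
                <name>oozie.action.sharelib.for.pig</name>
                <value>pig,hive,hcatalog</value>
            </property>
        </configuration>
        <!-- Hypothetical script that reads or writes HCatalog tables. -->
        <script>example.pig</script>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
</action>
```

In the Oozie workflow schema, <configuration> must appear before <script> inside <pig>, so keep the element order shown here.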
    

Configuring All Pig Actions to Access HCatalog

If all of your Pig actions access HCatalog, do the following:

  1. Add the following line to the job.properties file, located in your working directory:

    # A comma-separated list of libraries to be used by the Pig action.
    oozie.action.sharelib.for.pig=pig,hive,hcatalog
    
  2. Identify the URI (host and port) for the Thrift metastore server.

    1. In Ambari, click Hive > Configs > Advanced.

    2. Make note of the URI in the hive.metastore.uris field in the General section.

      This information is also stored in the hive-site.xml file.

  3. Add the following property to the <configuration> element of each Pig action.

    Note

    Replace [host:port(default:9083)] in the example below with the host and port of the Thrift metastore server. The default port is 9083.

    <configuration>
        <property>
            <name>hive.metastore.uris</name>
            <value>thrift://[host:port(default:9083)]</value>
            <description>A comma-separated list of metastore URIs that the client can use to contact the
            metastore server.</description>
        </property>
    </configuration>
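
With this method, the share library setting comes from job.properties, so each Pig action carries only the metastore URI. The following is a minimal sketch of a complete workflow.xml with two such actions, offered as an illustration rather than a definitive layout. The workflow name, action names, script names, and the metastore host metastore.example.com are hypothetical, and ${jobTracker} and ${nameNode} are assumed to be defined in job.properties.

```xml
<!-- Sketch of a workflow whose Pig actions all access HCatalog.
     oozie.action.sharelib.for.pig=pig,hive,hcatalog is assumed to be set in job.properties. -->
<workflow-app name="pig-hcat-workflow" xmlns="uri:oozie:workflow:0.4">
    <start to="load-step"/>

    <action name="load-step">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- Hypothetical metastore host; 9083 is the default port. -->
                <property>
                    <name>hive.metastore.uris</name>
                    <value>thrift://metastore.example.com:9083</value>
                </property>
            </configuration>
            <script>load.pig</script>
        </pig>
        <ok to="report-step"/>
        <error to="fail"/>
    </action>

    <action name="report-step">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <!-- Same metastore URI, repeated in every Pig action. -->
                <property>
                    <name>hive.metastore.uris</name>
                    <value>thrift://metastore.example.com:9083</value>
                </property>
            </configuration>
            <script>report.pig</script>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Pig action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```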