Using Apache Hive

Set up the development environment

You can create a Hive UDF in a development environment such as IntelliJ and build it against the Hive and Hadoop JARs that you download from your Cloudera cluster.

  1. On your cluster, locate the hadoop-common-<version>.jar and hive-exec-<version>.jar.
    For example:
    ls /opt/cloudera/parcels/CDH-7.0.0-* |grep -v test
    /opt/cloudera/parcels/CDH-7. . ..jar 
  2. Download the JARs to your development computer to add to your IntelliJ project later.
  3. Open IntelliJ and create a new Maven-based project. Click Create New Project. Select Maven and the supported Java version as the Project SDK. Click Next.
  4. Add archetype information.
    For example:
    • GroupId: com.mycompany.hiveudf
    • ArtifactId: hiveudf
  5. Click Next and Finish.
    The generated pom.xml appears in sample-hiveudf.
  6. In the pom.xml, add properties that hold the Hadoop and Hive versions to facilitate versioning.
    For example:
    <properties>
       <hadoop.version>TBD</hadoop.version>
       <hive.version>TBD</hive.version>
    </properties>
  7. In the pom.xml, define the repositories.
    Use internal repositories if you do not have internet access.
  8. Define dependencies.
    For example:
    <dependencies>
      <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-exec</artifactId>
        <version>${hive.version}</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
      </dependency>
    </dependencies>
  9. Select File > Project Structure. Click Modules. On the Dependencies tab, click + and select JARs or Directories. Browse to and select the JARs you downloaded in step 2.
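
As a rough sketch of steps 1 and 2, the following shell function lists the hadoop-common and hive-exec JARs beneath one or more parcel directories and skips any JAR whose file name contains "test". The function name find_udf_jars is hypothetical, and the sketch assumes the JARs exist as regular files somewhere under the directories you pass in; adjust the search if your cluster only exposes them as symbolic links.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of any Cloudera tooling): print the
# hadoop-common and hive-exec JARs found under the given directories.
find_udf_jars() {
  # "$@" lets the caller pass one directory or a shell glob that
  # expands to several, e.g. /opt/cloudera/parcels/CDH-7.0.0-*
  find "$@" -type f \
    \( -name 'hadoop-common-*.jar' -o -name 'hive-exec-*.jar' \) \
    ! -name '*test*' | sort
}

# Example call on a cluster node (path taken from step 1):
#   find_udf_jars /opt/cloudera/parcels/CDH-7.0.0-*
```

Each path the function prints could then be copied to the development computer with scp or a similar tool before you add the JARs to the IntelliJ project in step 9.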
