Using Apache HivePDF version

Build the project and upload the JAR

You compile the UDF code into a JAR and add the JAR to the classpath on the cluster. You choose one of several methods of configuring the cluster so Hive can find the JAR.

CDP Private Cloud Base
Use one of these methods to configure the cluster to find the JAR:
  • Direct reference

    Straight-forward, but recommended for development only.

  • Hive aux library directory

    Prevents accidental overwriting of files or functions. Recommended for tested, stable UDFs to prevent accidental overwriting of files or functions.

  • Reloadable aux JAR

    Avoids HiveServer restarts. Recommended if you anticipate making frequent changes to the UDF logic.

CDP Public Cloud
Uset the Direct reference method only.
  1. Build the IntelliJ project.
    ...
    [INFO] Building jar: /Users/max/IdeaProjects/hiveudf/target/TypeOf-1.0-SNAPSHOT.jar
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 14.820 s
    [INFO] Finished at: 2019-04-03T16:53:04-07:00
    [INFO] Final Memory: 26M/397M
    [INFO] ------------------------------------------------------------------------
                        
    Process finished with exit code 0
  2. In IntelliJ, navigate to the JAR in the /target directory of the project.
  3. Configure the cluster so Hive can find the JAR using one of the following methods.
    • Direct JAR reference
      1. Upload the JAR to HDFS (CDP Private Cloud Base) or S3 (CDP Public Cloud).
      2. Move the JAR into the Hive warehouse. For example, in CDP Private Cloud Base:
        $ hdfs dfs -put TypeOf-1.0-SNAPSHOT.jar /warehouse/tablespace/managed/hiveudf-1.0-SNAPSHOT.jar
    • Hive aux JARs path (CDP Private Cloud Base only)
      1. In CDP Private Cloud Base, click Cloudera Manager > Clusters and select the Hive service, for example, HIVE. Click Configuration and search for Hive Auxiliary JARs Directory.
      2. Specify a directory value for the Hive aux JARs property if necessary, or make a note of the path.
      3. Upload the JAR to the specified directory on all HiveServer instances (and all Metastore instances, if separate).
    • Reloadable aux JAR (CDP Private Cloud Base only)
      1. Upload the JAR to the /hadoop/hive-udf-dyn directory on all HiveServer instances (and all Metastore instances, if separate). An HDFS location is not supported.
      2. In hive-site.xml, set the following property: hive.reloadable.aux.jars.path=/hadoop/hive-udf-dyn.
  4. In IntelliJ, click Save.
  5. Click Actions > Deploy Client Configuration.
  6. Restart the Hive service. For example, restart HIVE.