Using Apache Hive

Build the project and upload the JAR

You compile the UDF code into a JAR and place the JAR on the cluster. You then configure the cluster so Hive can find the JAR, using one of the following methods:
  • Direct JAR reference

    Straightforward, but recommended for development only.

  • Hive aux JARs path

    Recommended for tested, stable UDFs because it prevents accidental overwriting of files or functions.

  • Reloadable aux JAR

    Avoids HiveServer restarts. Recommended if you anticipate making frequent changes to the UDF logic.

  1. Build the IntelliJ project.
    ...
    [INFO] Building jar: /Users/max/IdeaProjects/hiveudf/target/TypeOf-1.0-SNAPSHOT.jar
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 14.820 s
    [INFO] Finished at: 2019-04-03T16:53:04-07:00
    [INFO] Final Memory: 26M/397M
    [INFO] ------------------------------------------------------------------------
                        
    Process finished with exit code 0
  2. Navigate to the JAR in the /target directory of the project.
  3. Configure the cluster so Hive can find the JAR using one of the following methods.
    • Direct JAR reference
      1. Upload the JAR to the cluster.
      2. Copy the JAR into the Hive warehouse on HDFS. For example:
      $ sudo su - hdfs
      
      $ hdfs dfs -put TypeOf-1.0-SNAPSHOT.jar /warehouse/tablespace/managed/hiveudf-1.0-SNAPSHOT.jar
    • Reloadable aux JAR
      1. Upload the JAR to the /hadoop/hive-udf-dyn directory on all HiveServer instances (and all Metastore instances, if separate). An HDFS location is not supported.
      2. In hive-site.xml, set the following property: hive.reloadable.aux.jars.path=/hadoop/hive-udf-dyn.
    • Hive aux JARs path
      1. Create an external (outside HDFS) directory on the cluster, /local-apps/hive-udf-aux for example.
      2. Create a symbolic link to the external directory. For example: ln -s /local-apps/hive-udf-aux /usr/hdp/3.1.0.0-78/hive/auxlib

        Hive automatically picks up JARs from ${HIVE_HOME}/auxlib, which does not exist by default. Because ${HIVE_HOME} is version dependent, do not create the auxlib directory under the binary location. Instead, create a symbolic link that survives version upgrades.
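The Hive aux JARs path steps can be sketched as a small shell function. This is a minimal sketch, not a supported procedure: the function name stage_aux_jar and its three arguments are hypothetical, and it assumes that the JAR must also be copied into the external directory, a step the task does not spell out. Pass an absolute path for the external directory so that the symbolic link resolves from any location.

```shell
# Sketch: stage a UDF JAR in a directory outside the Hive binary tree,
# then expose that directory to Hive through an auxlib symbolic link.
# stage_aux_jar is a hypothetical helper; the caller supplies all paths.
stage_aux_jar() {
    jar="$1"          # path to the built JAR
    aux_dir="$2"      # external local directory (absolute path, not HDFS)
    auxlib_link="$3"  # location of the link, that is ${HIVE_HOME}/auxlib

    # Create the external directory and copy the JAR into it (assumed step).
    mkdir -p "$aux_dir" || return 1
    cp "$jar" "$aux_dir/" || return 1

    # Refuse to touch an existing auxlib path instead of overwriting it.
    if [ -e "$auxlib_link" ] || [ -L "$auxlib_link" ]; then
        echo "auxlib path already exists: $auxlib_link" >&2
        return 1
    fi

    # Link the version-dependent auxlib path to the external directory.
    ln -s "$aux_dir" "$auxlib_link"
}
```

With the example paths from this task, the call would be stage_aux_jar target/TypeOf-1.0-SNAPSHOT.jar /local-apps/hive-udf-aux /usr/hdp/3.1.0.0-78/hive/auxlib, run with sufficient privileges on the host. Because the function refuses to replace an existing auxlib path, a second run after an upgrade only succeeds once the new ${HIVE_HOME} has no auxlib entry yet.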