Building the project and upload the JAR

You compile the UDF code into a JAR and add the JAR to the classpath on the cluster. You choose one of several methods of configuring the cluster so Hive can find the JAR.

Cloudera Private Cloud Base
Use one of these methods to configure the cluster to find the JAR:
  • Direct reference

    Straight-forward, but recommended for development only.

  • Hive aux library directory

    Prevents accidental overwriting of files or functions. Recommended for tested, stable UDFs to prevent accidental overwriting of files or functions.

  • Reloadable aux JAR

    Avoids HiveServer restarts. Recommended if you anticipate making frequent changes to the UDF logic.

CDP Public Cloud
Use the Direct reference method only.
  1. Build the IntelliJ project.
    ...
    [INFO] Building jar: /Users/max/IdeaProjects/hiveudf/target/TypeOf-1.0-SNAPSHOT.jar
    [INFO] ------------------------------------------------------------------------
    [INFO] BUILD SUCCESS
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 14.820 s
    [INFO] Finished at: 2019-04-03T16:53:04-07:00
    [INFO] Final Memory: 26M/397M
    [INFO] ------------------------------------------------------------------------
                        
    Process finished with exit code 0
  2. In IntelliJ, navigate to the JAR in the /target directory of the project.
  3. Configure the cluster so Hive can find the JAR using one of the following methods.
    • Direct JAR reference
      1. Upload the JAR to HDFS (Cloudera Private Cloud Base) or S3 (CDP Public Cloud).
      2. Move the JAR into the Hive warehouse. For example, in Cloudera Private Cloud Base:
        $ hdfs dfs -put TypeOf-1.0-SNAPSHOT.jar /warehouse/tablespace/managed/hiveudf-1.0-SNAPSHOT.jar
    • Hive aux JARs path (Cloudera Private Cloud Base only)
      1. In Cloudera Private Cloud Base, click Cloudera Manager > Clusters and select the HIVE. Click Configuration and search for Hive Auxiliary JARs Directory.
      2. Specify a directory value for the Hive aux JARs property if necessary, or make a note of the path.
      3. Upload the JAR to the specified directory on all Hive metastore instances.
      4. Click Cloudera Manager > Clusters and select the HIVE-ON-TEZ. Click Configuration and search for Hive Auxiliary JARs Directory.
      5. Upload the JAR to the specified directory on all HiveServer instances.
    • Reloadable aux JAR (Cloudera Private Cloud Base only)
      1. Upload the JAR to the /hadoop/hive-udf-dyn directory on all HiveServer instances (and all Metastore instances, if separate). An HDFS location is not supported.
      2. In hive-site.xml, set the following property: hive.reloadable.aux.jars.path=/hadoop/hive-udf-dyn.
  4. In IntelliJ, click Save.
  5. Click Actions > Deploy Client Configuration.
  6. Restart the Hive service. For example, restart HIVE.