Building the project and uploading the JAR

Compile the UDF code into a JAR and add the JAR to the classpath on the cluster. For Hive to find the JAR, either reference it directly or configure the cluster to locate it.

  • CDP Private Cloud Base: Use one of these methods to configure the cluster to find the JAR:
    • Direct reference

      Straightforward, but recommended for development only.

    • Hive aux library directory method

      Recommended for tested, stable UDFs because it prevents accidental overwriting of files or functions.

  • CDP Public Cloud: Use the following method:
    • Direct reference only
  1. Build the IntelliJ project. The build output resembles the following:
    [INFO] Building jar: /Users/max/IdeaProjects/hiveudf/target/TypeOf-1.0-SNAPSHOT.jar
    [INFO] ------------------------------------------------------------------------
    [INFO] ------------------------------------------------------------------------
    [INFO] Total time: 14.820 s
    [INFO] Finished at: 2019-04-03T16:53:04-07:00
    [INFO] Final Memory: 26M/397M
    [INFO] ------------------------------------------------------------------------
    Process finished with exit code 0
  2. In IntelliJ, navigate to the JAR in the /target directory of the project.
  3. Configure the cluster so Hive can find the JAR, using one of the following methods. In CDP Public Cloud, only the direct JAR reference is available.
    • Direct JAR reference
      1. Upload the JAR to HDFS (CDP Private Cloud Base) or S3 (CDP Public Cloud).
      2. Move the JAR into the Hive warehouse. For example, in CDP Private Cloud Base:
        $ hdfs dfs -put TypeOf-1.0-SNAPSHOT.jar /warehouse/tablespace/managed/hiveudf-1.0-SNAPSHOT.jar
    • Hive aux JARs path
      1. In CDP Private Cloud Base, click Cloudera Manager > Clusters and select the Hive service, for example, HIVE. Click Configuration and search for Hive Auxiliary JARs Directory.
      2. Specify a directory value for the Hive Auxiliary JARs Directory property if necessary, or make a note of the existing path.
      3. Upload the JAR to the specified directory on all HiveServer instances (and all Metastore instances, if separate).
  4. In IntelliJ, click Save.
  5. Click Actions > Deploy Client Configuration.
  6. Restart the Hive service. For example, restart HIVE.
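
The two ways of placing the JAR described in step 3 can be sketched as a small shell script. This is a minimal illustration, not a supported tool. The JAR name and warehouse path come from the example above. The aux directory `/opt/hive-aux-jars`, the host names, and the helper functions `run`, `upload_direct`, and `upload_aux` are hypothetical. Copying with `scp` is only one possible way to get the file onto each host. The script defaults to a dry run that prints each command instead of executing it.

```shell
#!/bin/sh
# Sketch of the two JAR placement methods described in step 3.
# DRY_RUN=1 (the default here) only prints each command; set DRY_RUN=0 to run it.
DRY_RUN="${DRY_RUN:-1}"

# Values taken from the example in this section.
JAR="TypeOf-1.0-SNAPSHOT.jar"
HDFS_DEST="/warehouse/tablespace/managed/hiveudf-1.0-SNAPSHOT.jar"

# Hypothetical values: replace with your Hive Auxiliary JARs Directory and
# the hosts that run HiveServer (and Metastore, if separate).
AUX_DIR="/opt/hive-aux-jars"
HIVE_HOSTS="hs2-1.example.com hs2-2.example.com"

# run: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Direct reference: copy the local JAR into HDFS.
upload_direct() {
  run hdfs dfs -put "$1" "$2"
}

# Aux directory method: copy the JAR to the same directory on every host.
upload_aux() {
  jar="$1"; dir="$2"; shift 2
  for host in "$@"; do
    run scp "$jar" "${host}:${dir}/" || return 1
  done
}

upload_direct "$JAR" "$HDFS_DEST"
# HIVE_HOSTS is intentionally unquoted so it splits into one argument per host.
upload_aux "$JAR" "$AUX_DIR" $HIVE_HOSTS
```

Run the script from the directory that contains the JAR. Review the printed commands first, then rerun with `DRY_RUN=0` once the paths and hosts match your cluster. Only one of the two functions is needed, depending on the method you chose.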