You compile the UDF code into a JAR and add the JAR to the classpath on the cluster.
You need to use a direct reference or configure the cluster for Hive to find the
JAR.
- CDP Private Cloud Base: Use one of these methods to configure the cluster to
find the JAR:
- Direct reference
Straight-forward, but recommended for development
only.
- Hive aux library directory method
Prevents accidental overwriting of
files or functions. Recommended for tested, stable UDFs to prevent
accidental overwriting of files or functions.
- CDP Public Cloud: Use the following method:
-
Build the IntelliJ project.
...
[INFO] Building jar: /Users/max/IdeaProjects/hiveudf/target/TypeOf-1.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 14.820 s
[INFO] Finished at: 2019-04-03T16:53:04-07:00
[INFO] Final Memory: 26M/397M
[INFO] ------------------------------------------------------------------------
Process finished with exit code 0
-
In IntelliJ, navigate to the JAR in the /target directory of the project.
-
In CDP Private Cloud Base, configure the cluster so Hive can find the JAR using one of
the following methods.
- Direct JAR reference
- Upload the JAR to HDFS (CDP Private Cloud Base) or S3 (CDP Public Cloud).
- Move the JAR into the Hive warehouse. For example, in CDP Data
Center:
$ hdfs dfs -put TypeOf-1.0-SNAPSHOT.jar /warehouse/tablespace/managed/hiveudf-1.0-SNAPSHOT.jar
- Hive aux JARs path
- In CDP Private Cloud Base, click and select the Hive service, for example, HIVE. Click
Configuration and search for Hive
Auxiliary JARs Directory.
- Specify a directory value for the Hive Aux JARs property if
necessary, or make a note of the path.
- Upload the JAR to the specified directory on all HiveServer
instances (and all Metastore instances, if separate).
-
In IntelliJ, click Save.
-
Click .
-
Restart the Hive service. For example restart HIVE.