You compile the UDF code into a JAR and add the JAR to the classpath on the cluster.
You choose one of several methods of configuring the cluster so Hive can find the
JAR.
- Cloudera Private Cloud Base
  - Use one of these methods to configure the cluster to find the JAR:
    - Direct reference: Straightforward, but recommended for development only.
    - Hive aux library directory: Recommended for tested, stable UDFs to prevent accidental overwriting of files or functions.
    - Reloadable aux JAR: Avoids HiveServer restarts. Recommended if you anticipate making frequent changes to the UDF logic.
- CDP Public Cloud
  - Use the Direct reference method only.
- Build the IntelliJ project.
...
[INFO] Building jar: /Users/max/IdeaProjects/hiveudf/target/TypeOf-1.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 14.820 s
[INFO] Finished at: 2019-04-03T16:53:04-07:00
[INFO] Final Memory: 26M/397M
[INFO] ------------------------------------------------------------------------
Process finished with exit code 0
- In IntelliJ, navigate to the JAR in the /target directory of the project.
- Configure the cluster so Hive can find the JAR using one of the following methods.
  - Direct JAR reference
    - Upload the JAR to HDFS (Cloudera Private Cloud Base) or S3 (CDP Public Cloud).
    - Move the JAR into the Hive warehouse. For example, in Cloudera Private Cloud Base:
      $ hdfs dfs -put TypeOf-1.0-SNAPSHOT.jar /warehouse/tablespace/managed/hiveudf-1.0-SNAPSHOT.jar
  - Hive aux JARs path (Cloudera Private Cloud Base only)
    - In Cloudera Private Cloud Base, select the Hive service, for example, HIVE. Click Configuration and search for Hive Auxiliary JARs Directory.
    - Specify a directory value for the Hive aux JARs property if necessary, or make a note of the path.
    - Upload the JAR to the specified directory on all HiveServer instances (and all Metastore instances, if separate).
  - Reloadable aux JAR (Cloudera Private Cloud Base only)
    - Upload the JAR to the /hadoop/hive-udf-dyn directory on all HiveServer instances (and all Metastore instances, if separate). An HDFS location is not supported.
    - In hive-site.xml, set the following property:
      hive.reloadable.aux.jars.path=/hadoop/hive-udf-dyn
- In IntelliJ, click Save.
- Click .
- Restart the Hive service. For example, restart HIVE.
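
The upload steps for the direct reference and aux JAR methods can be scripted. The following is a minimal sketch, not a Cloudera-provided tool: the helper names (run, upload_to_hdfs, copy_to_hosts), the DRY_RUN switch, and the use of scp to reach each host are assumptions made for illustration. Only the hdfs dfs -put command and the paths come from the steps above.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, DRY_RUN, and the scp transport are illustrative
# assumptions, not part of Hive or Cloudera tooling.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Direct reference: put the local JAR into HDFS at the given destination path.
upload_to_hdfs() {
  local jar="$1" dest="$2"
  run hdfs dfs -put "$jar" "$dest"
}

# Aux JAR methods: copy the JAR into the same local directory on every
# listed host (all HiveServer instances, and Metastore instances if separate).
copy_to_hosts() {
  local jar="$1" dir="$2"
  shift 2
  local host
  for host in "$@"; do
    run scp "$jar" "$host:$dir/"
  done
}
```

For example, running `DRY_RUN=1 copy_to_hosts TypeOf-1.0-SNAPSHOT.jar /hadoop/hive-udf-dyn hs2-a.example.com hs2-b.example.com` prints the two scp commands without executing them. The host names are placeholders. Reviewing the dry-run output before a real run helps confirm that every HiveServer host is covered.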