Using User-Defined Functions (UDFs) with HiveServer2
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
To use custom user-defined functions (UDFs) with Hive, perform one of the following procedures, depending on whether Sentry is enabled.
Without Sentry Enabled
With Sentry Enabled
- Copy the UDF JAR file to HDFS.
- Copy the JAR file to the host on which HiveServer2 is running. Save the JAR file to any directory you choose, and make a note of the path.
- In the Cloudera Manager Admin Console, go to the Hive service.
- Click the Configuration tab.
- Expand the categories
- Configure the Hive Auxiliary JARs Directory property with the HiveServer2 host path from Step 2.
- Click Save Changes. The JARs are added to the HIVE_AUX_JARS_PATH environment variable.
- Redeploy the Hive client configuration.
  - In the Cloudera Manager Admin Console, go to the Hive service.
  - From the Actions menu at the top right of the service page, select Deploy Client Configuration.
  - Click Deploy Client Configuration.
- Restart the Hive service. If the Hive Auxiliary JARs Directory property is configured but the directory does not exist, HiveServer2 will not start.
- Grant privileges on the JAR files to the roles that require access. You can use the Hive SQL GRANT statement to do so. For example, to grant privileges on the add.jar file:
GRANT ALL ON URI 'hdfs:///tmp/add.jar' TO ROLE EXAMPLE_ROLE;
- In Hive, run the CREATE FUNCTION command and point it to the JAR file. For example:
CREATE FUNCTION addfunc AS 'com.example.hiveserver2.udf.add' USING JAR 'hdfs:///tmp/add.jar';
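
The file-handling parts of the procedure above (copying the JAR to HDFS, and making sure the auxiliary JARs directory exists before restarting Hive) could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not part of the Cloudera procedure: the function names check_aux_dir and stage_udf_jar and the DRY_RUN variable are hypothetical helpers introduced here for illustration. The only external command used is the standard hdfs dfs -put. The check for at least one .jar file is an extra sanity check; the procedure itself only requires that the directory exists.

```shell
#!/bin/sh
# Hypothetical helpers; names, paths, and DRY_RUN are illustrative assumptions.

# Return 0 if the auxiliary JARs directory exists and holds at least one .jar file.
# Return 1 if the directory is missing (HiveServer2 would not start in that case).
# Return 2 if the directory exists but contains no .jar files (extra sanity check).
check_aux_dir() {
    aux_dir=$1
    if [ ! -d "$aux_dir" ]; then
        echo "missing directory: $aux_dir" >&2
        return 1
    fi
    for jar in "$aux_dir"/*.jar; do
        # If the glob matched nothing, $jar is the literal pattern and -f fails.
        if [ -f "$jar" ]; then
            return 0
        fi
    done
    echo "no JAR files in: $aux_dir" >&2
    return 2
}

# Copy a local JAR to HDFS, or only print the command when DRY_RUN=1.
stage_udf_jar() {
    local_jar=$1
    hdfs_path=$2
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "hdfs dfs -put $local_jar $hdfs_path"
    else
        hdfs dfs -put "$local_jar" "$hdfs_path"
    fi
}
```

For example, assuming the JAR was saved under a hypothetical /opt/hive/aux_jars directory on the HiveServer2 host, you might run stage_udf_jar /opt/hive/aux_jars/add.jar /tmp/add.jar once, and then check_aux_dir /opt/hive/aux_jars on the HiveServer2 host before restarting the Hive service.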