Uploading additional JARs to CDW

You add additional Java Archive (JAR) files to the Cloudera Data Warehouse (CDW) Hive classpath that might be required to support dependency JARs, third-party Serde, or any Hive extensions.

  • The JARs are added to the end of the Hive classpath and do not override the Hive JARs.
  • Cloudera recommends that you do not use this procedure to add User-Defined Function (UDF) JARs. If you do, then you must restart HiveServer2 or reload the UDF. For more information about reloading functions, see the Hive Data Definition Language (DDL) manual.
You have the EnvironmentAdmin role permissions to upload the JAR to your object storage.
  1. Build the archive file.
    The archive file can be either a .jar file or a tar.gz file. For a tar.gz archive file, only JARs present in the top level are considered.

    For example, if the tar.gz file contains these files — test1.jar, test2.jar, and deps\test3.jar, only test1.jar and test2.jar are considered; deps\test3.jar is excluded.

  2. Upload the archive file to the Hive Virtual Warehouse on CDW object storage, such as AWS S3 or Azure Data Lake Storage (ADLS).
  3. Log in to the CDW service and from the Overview page, locate the Hive Virtual Warehouse that uses the bucket or container where you placed the archive file, and click and select Edit.
  4. In the Virtual Warehouse Details page, click Configurations > Hiveserver2.
  5. From the Configuration files drop-down list, select env.
  6. Search for CDW_HIVE_AUX_JARS_PATH and add the archive file to the environment variable.

    Configuration to specify Hive JAR paths
    If you add a directory, the .jar or tar.gz files within the directory are copied and extracted. For a tar.gz file, only the JARs present in the top level are copied.
    Consider the following JAR path - /common-jars/common-jars.tar.gz:/common-jars/single-jar.jar:/serde-specific-jar/serde.jar. In this example, common-jars.tar.gz is extracted and single-jar.jar and serde.jar files are copied.
  7. Repeat the previous step and add the archive file or directory for Query coordinator and Query executor.
    If the CDW_HIVE_AUX_JARS_PATH environment variable is not present, click and add the following custom configuration:
    CDW_HIVE_AUX_JARS_PATH=[***VALUE***]
  8. If you require the JARs for Hive metastore (HMS), go to the corresponding Database Catalog and add the environment variable in Configurations > Metastore.
On saving the configuration, Hive Virtual Warehouse restarts and the archive files are available and added to the end of the Hive classpath.