Configuring Hive Warehouse Connector

Learn how to configure Hive Warehouse Connector (HWC) for use with Cloudera Data Engineering.

You can set up a connection between Cloudera Data Engineering and HWC through Cloudera Data Warehouse or Cloudera Data Hub.

Steps

Make sure that Hive is running in Cloudera Data Warehouse or in Cloudera Data Hub, depending on which one you connect through.
Configure the read and write mode.
- To configure the read mode, see Hive Warehouse Connector read modes.
- To configure the write mode, see Hive Warehouse Connector write modes.
(Optional) Configure HiveServer2 (HS2).
Note: Perform this step only if your configured read or write mode requires an HS2 connection. HS2 is required for all read modes and for the HIVE_WAREHOUSE_CONNECTOR write mode.
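
The rule in the note can be expressed as a small check. This is a minimal sketch in Python; the function name `needs_hs2` and its parameters are hypothetical and are not part of HWC or Cloudera Data Engineering.

```python
from typing import Optional

# The one write mode that the note names as requiring HS2.
HS2_WRITE_MODE = "HIVE_WAREHOUSE_CONNECTOR"


def needs_hs2(reads_through_hwc: bool, write_mode: Optional[str] = None) -> bool:
    """Return True if a job needs an HS2 connection.

    Hypothetical helper: every read mode needs HS2, and so does the
    HIVE_WAREHOUSE_CONNECTOR write mode.
    """
    if reads_through_hwc:
        return True
    return write_mode is not None and write_mode.upper() == HS2_WRITE_MODE
```

A job that only writes, and uses a write mode other than HIVE_WAREHOUSE_CONNECTOR, would return `False` here and could skip this step.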
Based on your infrastructure setup, you can use one of these methods to establish a connection to HS2:
- Cloudera Data Hub cluster
- Cloudera Data Warehouse data service

Cloudera Data Hub cluster

Cloudera Data Engineering can connect to the HS2 provided by the Cloudera Data Hub cluster. It can also enforce the Fine-Grained Access Control (FGAC) policies through Ranger.

Set up the connection to HS2 by setting the spark.sql.hive.hiveserver2.jdbc.url property in the Cloudera Data Engineering job or session configuration. You can obtain the JDBC URL from the Hive configuration files in the Cloudera Data Hub cluster.

Example:
```
spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://dex-dnna1t-master0.dex-a3qn.svbr-nqvp.int.cldr.work:2181/default;httpPath=cliservice;keyStoreType=jks;retries=5;serviceDiscoveryMode=zooKeeper;ssl=true;transportMode=http;trustStoreType=jks;zooKeeperNamespace=hiveserver2;zookeeperKeyStoreType=jks;zookeeperTrustStoreType=jks
```
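
Before pasting a long URL like this into a job configuration, it can help to split it into its parts and confirm that the expected parameters are present. The following Python sketch does that under simplifying assumptions: `parse_hs2_url` and `Hs2Url` are hypothetical names, the parser only handles the `;key=value` session parameters, and it ignores the optional `?` and `#` sections that a Hive JDBC URL may carry. The sample URL is a shortened form of the example above.

```python
from typing import Dict, NamedTuple

PREFIX = "jdbc:hive2://"


class Hs2Url(NamedTuple):
    authority: str          # host[:port], or a comma-separated list of them
    database: str           # database name after the first '/'
    params: Dict[str, str]  # session parameters that follow the first ';'


def parse_hs2_url(url: str) -> Hs2Url:
    """Split an HS2 JDBC URL into authority, database, and ';' parameters.

    Simplified, hypothetical helper; not part of HWC or the Hive JDBC driver.
    """
    if not url.startswith(PREFIX):
        raise ValueError("expected a URL starting with " + PREFIX)
    rest = url[len(PREFIX):]
    location, _, param_text = rest.partition(";")
    authority, _, database = location.partition("/")
    params = {}
    for pair in param_text.split(";"):
        if pair:
            key, _, value = pair.partition("=")
            params[key] = value
    return Hs2Url(authority, database, params)


# Shortened version of the example URL, used only for illustration.
url = (
    "jdbc:hive2://dex-dnna1t-master0.dex-a3qn.svbr-nqvp.int.cldr.work:2181/default;"
    "httpPath=cliservice;serviceDiscoveryMode=zooKeeper;ssl=true;transportMode=http"
)
parsed = parse_hs2_url(url)
print(parsed.authority)
print(parsed.database)
print(parsed.params["serviceDiscoveryMode"])
```

In this example the port 2181 together with `serviceDiscoveryMode=zooKeeper` indicates that the URL points at ZooKeeper for service discovery rather than directly at an HS2 instance.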

Cloudera Data Warehouse data service

Cloudera Data Engineering can connect to HS2 through a Virtual Warehouse in Cloudera Data Warehouse. It can also enforce Fine-Grained Access Control (FGAC) policies through Ranger.

Fetch the JDBC URL from the Cloudera Data Warehouse UI:
1. In the Cloudera Data Warehouse UI, click Overview in the left navigation menu.
2. Select a virtual warehouse.
3. Click the icon.
4. Select Copy JDBC URL from the options displayed.
Set up the connection to HS2 by adding spark.sql.hive.hiveserver2.jdbc.url=[***JDBC-URL***] to the Cloudera Data Engineering job or session configuration. Replace [***JDBC-URL***] with the JDBC URL that you copied from the Cloudera Data Warehouse UI.
Example:
```
spark.sql.hive.hiveserver2.jdbc.url=jdbc:hive2://hs2-cde-hwc-cdw.dw-dex-6yb7am.svbr-nqvp.int.cldr.work/default;transportMode=http;httpPath=cliservice;socketTimeout=60;ssl=true;user=<user>;password=<password>
```
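
If you assemble the configuration line in a script, a small helper can validate the copied URL and keep the password out of logs. This is a sketch only: `hs2_conf_entry` and `redact_password` are hypothetical helper names, and the host `example.invalid` in the usage lines is a placeholder, not a real endpoint.

```python
import re

CONF_KEY = "spark.sql.hive.hiveserver2.jdbc.url"


def hs2_conf_entry(jdbc_url: str) -> str:
    """Return the key=value configuration line for a copied JDBC URL."""
    jdbc_url = jdbc_url.strip()  # copied text often carries stray whitespace
    if not jdbc_url.startswith("jdbc:hive2://"):
        raise ValueError("not a HiveServer2 JDBC URL")
    return f"{CONF_KEY}={jdbc_url}"


def redact_password(text: str) -> str:
    """Mask the value of any 'password=' parameter before logging."""
    return re.sub(r"(password=)[^;]*", r"\1****", text)


# Placeholder URL and credentials, for illustration only.
entry = hs2_conf_entry(
    " jdbc:hive2://example.invalid/default;transportMode=http;ssl=true;user=u;password=secret\n"
)
print(redact_password(entry))
```

Only the redacted form is printed; the unredacted `entry` is what would go into the job or session configuration.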