Integrating Apache Hive with Apache Spark and BIPDF version

Connecting Hive to BI tools using a JDBC/ODBC driver

To query, analyze, and visualize data stored in Data Hub or in the CDP Private Cloud Base using drivers provided by Cloudera, you connect Apache Hive to Business Intelligence (BI) tools.

How you connect to Hive depends on a number of factors: the location of Hive inside or outside the cluster, the HiveServer deployment, the type of transport, transport-layer security, and authentication. HiveServer is the server interface that enables remote clients to execute queries against Hive and retrieve the results using a JDBC or ODBC connection.
  • Choose a Hive authorization model.
  • Configure authenticated users for querying Hive through JDBC or ODBC driver. For example, set up a Ranger policy.
  1. Obtain the Hive database driver in one of the following ways:
    • For an ODBC connection: Get the Cloudera ODBC driver from the Cloudera Downloads page.
    • For a JDBC connection in CDP Private Cloud Base: Download and extract the Cloudera Hive JDBC driver from the Cloudera Downloads page.
    • For a JDBC connection in CDP Public Cloud: Using the CDW service, in a Virtual Warehouse in the CDW service, select Hive, and from the more options menu, click Download JDBC JAR to download to Apache Hive JDBC jar.

      For a JDBC connection in Data Hub, download and extract the Cloudera JDBC driver from the Cloudera Downloads page.

  2. Depending on the type of driver you obtain, proceed as follows:
    • ODBC driver: follow instructions on the ODBC driver download site, and skip the rest of the steps in this procedure.
    • JDBC driver: add the driver to the classpath of your JDBC client, such as Tableau. For example, check the client documentation about where to put the driver.
  3. Find the JDBC URL for HiveServer using one of a number methods. For example:
    • Using the CDW service in a Virtual Warehouse, from the options menu of your Virtual Warehouse, click Copy JDBC URL.
    • In Cloudera Manager (CM), click Clusters > Hive click Actions, and select Download Client Configuration.
      Unpack hive_on_tez-clientconfig.zip, open beeline-site.xml, and copy the value of beeline.hs2.jdbc.url.hive_on_tez . This value is the JDBC URL. For example
      jdbc:hive2://my_hiveserver.com:2181/;serviceDiscoveryMode=zooKeeper; \
         zooKeeperNamespace=hiveserver2
  4. In the BI tool, such as Tableau, configure the JDBC connection using the JDBC URL and driver class name, com.cloudera.hive.jdbc.HS2Driver.