Integrating Apache Hive with Spark and BI

Specify the JDBC connection string

You construct a JDBC URL to connect Hive to a BI tool.

In embedded mode, HiveServer runs within the Hive client, not as a separate process. Consequently, the URL does not need a host or port number to make the JDBC connection. In remote mode, the URL must include a host and port number because HiveServer runs as a separate process on the host and port you specify. The JDBC client and HiveServer communicate through remote procedure calls (RPC) over the Thrift protocol. If HiveServer is configured in remote mode, the JDBC client and HiveServer can use either HTTP or TCP-based transport to exchange RPC messages.
  1. Create a minimal JDBC connection string for connecting Hive to a BI tool.
    • Embedded mode: Create the JDBC connection string for connecting to Hive in embedded mode.
    • Remote mode: Create a JDBC connection string for making an unauthenticated connection to the Hive default database on host myserver, port 10000.
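    Embedded mode: jdbc:hive2://
    Remote mode: jdbc:hive2://myserver:10000/default
    For an unauthenticated connection, pass an empty user name and an empty password together with the connection string.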
  2. Modify the connection string to change the transport mode from TCP (the default) to HTTP using the transportMode and httpPath session configuration variables.
    jdbc:hive2://myserver:10000/default;transportMode=http;httpPath=myendpoint.com;
    You need to specify httpPath when using the HTTP transport mode. The httpPath value must match the HTTP endpoint configured in hive-site.xml (the hive.server2.thrift.http.path property).
  3. Add the principal parameter to the connection string for Kerberos authentication.
    jdbc:hive2://myserver:10000/default;principal=prin.dom.com@APRINCIPAL.DOM.COM
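
The connection strings from the steps above can be assembled in code. The following Java sketch is illustrative only: the class name HiveJdbcUrls and its methods are hypothetical helpers, not part of Hive, and the host, endpoint, and principal values are the sample values used in the steps. The connect method assumes that the Hive JDBC driver is on the classpath and that HiveServer is reachable. The main method only prints the strings, so it runs without a Hive installation.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Hive.
class HiveJdbcUrls {

    // Embedded mode: HiveServer runs inside the client, so the URL
    // carries no host or port.
    static String embeddedUrl() {
        return "jdbc:hive2://";
    }

    // Remote mode: host, port, and database, followed by optional
    // ";name=value" session variables in insertion order.
    static String remoteUrl(String host, int port, String database,
                            Map<String, String> sessionVars) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port)
                .append('/').append(database);
        for (Map.Entry<String, String> e : sessionVars.entrySet()) {
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }

    // Opens a connection with an empty user name and password.
    // Assumes the Hive JDBC driver is on the classpath.
    static Connection connect(String url) throws SQLException {
        return DriverManager.getConnection(url, "", "");
    }

    public static void main(String[] args) {
        // Step 1: minimal connection strings.
        System.out.println(embeddedUrl());
        System.out.println(remoteUrl("myserver", 10000, "default",
                new LinkedHashMap<String, String>()));

        // Step 2: switch the transport mode from TCP to HTTP.
        Map<String, String> http = new LinkedHashMap<String, String>();
        http.put("transportMode", "http");
        http.put("httpPath", "myendpoint.com");
        System.out.println(remoteUrl("myserver", 10000, "default", http));

        // Step 3: add the principal for Kerberos authentication.
        Map<String, String> kerberos = new LinkedHashMap<String, String>();
        kerberos.put("principal", "prin.dom.com@APRINCIPAL.DOM.COM");
        System.out.println(remoteUrl("myserver", 10000, "default", kerberos));
    }
}
```

A LinkedHashMap is used so that the session variables appear in the URL in the order in which they were added, which keeps the generated strings predictable.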