This is the documentation for CDH 5.0.x. Documentation for other versions is available at Cloudera Documentation.

Configuring Impala to Work with JDBC

Impala supports JDBC integration. The JDBC driver allows you to access Impala from a Java program that you write, or a Business Intelligence or similar tool that uses JDBC to communicate with various database products. Setting up a JDBC connection to Impala involves the following steps:

- Specifying an available communication port. See Configuring the JDBC Port.
- Installing the JDBC driver on every system that runs the JDBC-enabled application. See Enabling Impala JDBC Support on Client Systems.
- Specifying a connection string for the JDBC application to access one of the servers running the impalad daemon, with the appropriate security settings. See Establishing JDBC Connections.

Configuring the JDBC Port

By default, the Impala server accepts JDBC 2.0 connections through port 21050. Make sure this port is available for communication with other hosts on your network; for example, check that firewall software does not block it. If your JDBC client software connects to a different port, specify that port number with the --hs2_port option when starting impalad. See Starting Impala for details.
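
Before configuring a JDBC client, it can help to confirm that the port is reachable from the client machine. The following is a minimal sketch, not part of the Impala distribution: the class name ImpalaPortCheck, the method isReachable, and the 3-second timeout are illustrative assumptions, and the host name is the placeholder used elsewhere on this page. It only attempts a plain TCP connection, so a successful result shows that something is listening on the port, not that it is impalad.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP port accepts connections.
final class ImpalaPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts, and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass your own host and port as arguments.
        String host = args.length > 0 ? args[0] : "myhost.example.com";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 21050;
        boolean ok = isReachable(host, port, 3000);
        System.out.println(host + ":" + port + (ok ? " is reachable" : " is NOT reachable"));
    }
}
```

If the check fails for a host where impalad is running, a firewall rule or a non-default --hs2_port setting is a likely cause.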

Enabling Impala JDBC Support on Client Systems

The Impala JDBC integration is made possible by a client-side JDBC driver, which is contained in JAR files within a zip file. Download this zip file to each client machine that will use JDBC with Impala.

To enable JDBC support for Impala on the system where you run the JDBC application:

Download the Impala JDBC zip file to the client machine that you will use to connect to Impala servers.
Note: For Maven users, see this sample GitHub page for an example of the dependencies you could add to a pom file instead of downloading the individual JARs.
Extract the contents of the zip file to a location of your choosing. For example:
- On Linux, you might extract this to a location such as /opt/jars/.
- On Windows, you might extract this to a subdirectory of C:\Program Files.
To successfully load the Impala JDBC driver, client programs must be able to locate the associated JAR files. This often means setting the CLASSPATH for the client process to include the JARs. Consult the documentation for your JDBC client for details on installing new JDBC drivers. Some examples of how to set the CLASSPATH variable:
- On Linux, if you extracted the JARs to /opt/jars/, you might issue the following command to prepend the JAR files path to an existing classpath:
```
# Java expands a trailing /* to every JAR in the directory; *.jar is not a valid classpath wildcard.
export CLASSPATH="/opt/jars/*:$CLASSPATH"
```
- On Windows, use the System Properties control panel item to modify the Environment Variables for your system. Add the path to which you extracted the files to the CLASSPATH environment variable.
  Note: If the existing CLASSPATH on your client machine refers to some older version of the Hive JARs, ensure that the new JARs are the first ones listed. Either put the new JAR files earlier in the listings, or delete the other references to Hive JAR files.
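
One way to verify the CLASSPATH setup is to check whether the driver class can be loaded by the client JVM. The sketch below is an illustration only: the class name DriverCheck and the method isClassAvailable are hypothetical, and the only name taken from this page is the driver class org.apache.hive.jdbc.HiveDriver.

```java
// Hypothetical helper: reports whether a class is visible on the current classpath.
final class DriverCheck {

    // Returns true if the named class can be loaded, false otherwise.
    static boolean isClassAvailable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the JAR is missing, or one of its dependencies could not be loaded.
            return false;
        }
    }

    public static void main(String[] args) {
        String driver = "org.apache.hive.jdbc.HiveDriver";
        if (isClassAvailable(driver)) {
            System.out.println("Found " + driver + " on the classpath.");
        } else {
            System.out.println("Could not load " + driver + "; check the CLASSPATH setting.");
        }
    }
}
```

Run the check with the same CLASSPATH that the JDBC application uses, so that the result reflects what the application will see.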

Establishing JDBC Connections

The JDBC driver class is org.apache.hive.jdbc.HiveDriver. Once you have configured Impala to work with JDBC, you can establish connections from your client application. For a cluster that does not use Kerberos authentication, use a connection string of the form jdbc:hive2://host:port/;auth=noSasl. For example, you might use:

jdbc:hive2://myhost.example.com:21050/;auth=noSasl
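
A Java client might use this connection string roughly as follows. This is a minimal sketch under stated assumptions: the class name ImpalaNoSaslExample and the helper noSaslUrl are hypothetical, the host is the placeholder from the example above, and the query SELECT 1 is only an illustrative statement. It relies on the standard java.sql API and requires the driver JARs on the classpath and a reachable impalad to run.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical example class; not part of the Impala JDBC driver.
final class ImpalaNoSaslExample {

    static final String DRIVER_CLASS = "org.apache.hive.jdbc.HiveDriver";

    // Builds a connection string of the form jdbc:hive2://host:port/;auth=noSasl.
    static String noSaslUrl(String host, int port) {
        return "jdbc:hive2://" + host + ":" + port + "/;auth=noSasl";
    }

    public static void main(String[] args) throws ClassNotFoundException, SQLException {
        // Load the driver class explicitly so a classpath problem fails early.
        Class.forName(DRIVER_CLASS);

        String url = noSaslUrl("myhost.example.com", 21050);

        // try-with-resources closes the result set, statement, and connection in order.
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}
```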

To connect to an instance of Impala that requires Kerberos authentication, use a connection string of the form jdbc:hive2://host:port/;principal=principal_name. The principal must be the same user principal you used when starting Impala. For example, you might use:

jdbc:hive2://myhost.example.com:21050/;principal=impala/myhost.example.com@H2.EXAMPLE.COM
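
The Kerberos form of the connection string can be assembled the same way. The sketch below only builds the string; the class name ImpalaKerberosUrl and both helper methods are hypothetical, and the host and realm are the placeholders from the example above. Passing the result to DriverManager.getConnection would additionally assume that the client environment already has valid Kerberos credentials, which this page does not cover.

```java
// Hypothetical helper for building Kerberos connection strings; not part of the driver.
final class ImpalaKerberosUrl {

    // Combines the parts of a principal into the form service/host@REALM.
    static String principal(String service, String host, String realm) {
        return service + "/" + host + "@" + realm;
    }

    // Builds a connection string of the form jdbc:hive2://host:port/;principal=principal_name.
    static String kerberosUrl(String host, int port, String principalName) {
        return "jdbc:hive2://" + host + ":" + port + "/;principal=" + principalName;
    }

    public static void main(String[] args) {
        String host = "myhost.example.com";
        String url = kerberosUrl(host, 21050, principal("impala", host, "H2.EXAMPLE.COM"));
        System.out.println(url);
    }
}
```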

Page generated September 3, 2015.