Impala Client Access

Application developers have several options for interfacing with Impala. The core development language for Impala is SQL. You can also use Java or other languages to interact with Impala through the standard JDBC and ODBC interfaces used by many business intelligence tools. For specialized kinds of analysis, you can supplement the Impala built-in functions by writing user-defined functions in C++ or Java.

You can connect and submit requests to Impala through:

  • The impala-shell interactive command interpreter
  • The Hue web-based user interface
  • JDBC
  • ODBC

Each impalad daemon process, running on a separate node in the cluster, listens on several ports for incoming requests:

  • Requests from impala-shell and Hue are routed to the impalad daemons through the same port.
  • The impalad daemons listen on separate ports for JDBC and ODBC requests.
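Because each impalad listens on more than one port, a quick reachability check can help when a client fails to connect. The sketch below only tests whether a TCP connection can be opened; it does not speak any Impala protocol. The port numbers 21000 (Beeswax) and 21050 (HiveServer2) are assumptions used for illustration, as are the names ASSUMED_PORTS, port_is_open, and check_impalad; confirm the actual ports against your own impalad configuration.

```python
import socket

# Assumed port numbers, for illustration only. Check your impalad
# startup flags for the values used in your cluster.
ASSUMED_PORTS = {
    "beeswax": 21000,
    "hiveserver2": 21050,
}


def port_is_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


def check_impalad(host, ports=None):
    """Map each service label to whether its port accepts TCP connections."""
    ports = ASSUMED_PORTS if ports is None else ports
    return {name: port_is_open(host, port) for name, port in ports.items()}
```

For example, check_impalad("impalad-host.example.com") returns a dictionary keyed by the labels above, where the host name is a placeholder for one of your own nodes.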

Impala Startup Options for Client Connections

The following options control client connections to Impala.

--fe_service_threads
Specifies the maximum number of concurrent client connections allowed. The default value is 64, which allows 64 queries to run simultaneously.

If more clients try to connect to Impala than this setting allows, the later-arriving clients wait for up to the duration specified by --accepted_client_cnxn_timeout. You can increase this value to allow more client connections. However, a large value means more threads to maintain even if most of the connections are idle, which could negatively impact query latency. Client applications should use a connection pool to avoid needing a large number of sessions.
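The connection pool recommendation can be sketched as follows. This is a minimal, driver-agnostic illustration rather than a production pool: the class name ConnectionPool and the connect factory are hypothetical, and connect stands in for whatever call your JDBC, ODBC, or other driver uses to open a session. The pool opens a fixed number of connections up front and hands them out for reuse, so the application never holds more than size sessions against --fe_service_threads.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool that reuses connections instead of opening new ones."""

    def __init__(self, connect, size):
        # `connect` is a hypothetical zero-argument factory that returns an
        # open connection object from the driver of your choice.
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout_s=None):
        # Block until a pooled connection is free. If none becomes free
        # within timeout_s seconds, queue.Empty is raised.
        conn = self._idle.get(timeout=timeout_s)
        try:
            yield conn
        finally:
            # Always return the connection so that later callers can reuse it.
            self._idle.put(conn)
```

A caller would wrap each unit of work in "with pool.connection() as conn:" and run its queries on conn. Keeping size well below the value of --fe_service_threads leaves room for other clients.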

--accepted_client_cnxn_timeout
Controls how Impala treats new connection requests when all of the threads configured by --fe_service_threads are in use.

If --accepted_client_cnxn_timeout > 0, new connection requests are rejected if Impala cannot get a server thread within the specified timeout (in seconds).

If --accepted_client_cnxn_timeout=0 (no timeout), clients wait indefinitely to open the new session until more threads become available.

The default timeout is 5 minutes.

The timeout applies only to the client-facing Thrift servers, that is, the HS2 and Beeswax servers.
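The timeout rules above can be summarized as a small decision function. This is a toy model of the described behavior, not Impala code: the function name connection_outcome, its parameters, and the returned labels are hypothetical, the unit (seconds) is taken from the text above, and treating a thread that frees up exactly at the timeout as accepted is an assumption.

```python
import math

DEFAULT_TIMEOUT_S = 5 * 60  # the documented default of 5 minutes, in seconds


def connection_outcome(wait_for_thread_s, timeout_s=DEFAULT_TIMEOUT_S):
    """Model a new connection arriving while all service threads are busy.

    wait_for_thread_s: seconds until a service thread frees up
                       (math.inf if none ever does).
    timeout_s:         modeled value of --accepted_client_cnxn_timeout.
    """
    if timeout_s < 0:
        # Negative values are not described in the text, so reject them here.
        raise ValueError("timeout_s must be >= 0")
    if timeout_s == 0:
        # No timeout: the client waits until a thread becomes available.
        return "waiting" if math.isinf(wait_for_thread_s) else "accepted"
    # Positive timeout: reject if no thread frees up in time.
    return "accepted" if wait_for_thread_s <= timeout_s else "rejected"
```

Under this model, a client that would need to wait 301 seconds is rejected with the default setting, while the same client is eventually accepted when the timeout is 0.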