Connecting to Impala Daemon in Impala Shell
In an impala-shell session, you need to connect
to an impalad daemon to issue queries. When you connect
to an impalad
, and that daemon coordinates the execution of
all queries sent to it.
- Through command-line options when you run the impala-shell command.
- Through a configuration file that is read when you run the impala-shell command.
- During an impala-shell session, by issuing a
CONNECT
command.
To connect the Impala shell during shell startup:
- Locate the hostname that is running an instance of the impalad daemon. If that impalad uses a non-default port (something other than port 21000) for impala-shell connections, find out the port number also.
- Use the
-i
option to the impala-shell interpreter to specify the connection information for that instance of impalad:# When you are connecting to an impalad running on the same machine. # The prompt will reflect the current hostname. $ impala-shell # When you are connecting to an impalad running on a remote machine, and impalad is listening # on a non-default port over the HTTP HiveServer2 protocol. $ impala-shell -i some.other.hostname:port_number --protocol='hs2-http' # When you are connecting to an impalad running on a remote machine, and impalad is listening # on a non-default port. $ impala-shell -i some.other.hostname:port_number
To connect to an Impala in theimpala-shell session:
- Start the Impala shell with no connection:
impala-shell
- Locate the hostname that is running the impalad daemon. If that impalad uses a non-default port (something other than port 21000) for impala-shell connections, find out the port number also.
- Use the
connect
command to connect to an Impala instance. Enter a command and replace impalad-host with the hostname you have configured to run Impala in your environment.[Not connected] > connect impalad-host [impalad-host:21000] >
To start impala-shell in a specific database:
You can use all the same connection options as in previous examples. For simplicity, these examples assume that you are logged into one of the Impala daemons.
- Find the name of the database containing the relevant tables, views, and so on that you want to operate on.
- Use the
-d
option to the impala-shell interpreter to connect and immediately switch to the specified database, without the need for aUSE
statement or fully qualified names:# Subsequent queries with unqualified names operate on # tables, views, and so on inside the database named 'staging'. $ impala-shell -i localhost -d staging # It is common during development, ETL, benchmarking, and so on # to have different databases containing the same table names # but with different contents or layouts. $ impala-shell -i localhost -d parquet_snappy_compression $ impala-shell -i localhost -d parquet_gzip_compression
To run one or several statements in non-interactive mode:
You can use all the same connection options as in previous examples. For simplicity, these examples assume that you are logged into one of the Impala daemons.
- Construct a statement, or a file containing a sequence of statements, that you want to run in an automated way, without typing or copying and pasting each time.
- Invoke impala-shell with the
-q
option to run a single statement, or the-f
option to run a sequence of statements from a file. The impala-shell command returns immediately, without going into the interactive interpreter.# A utility command that you might run while developing shell scripts # to manipulate HDFS files. $ impala-shell -i localhost -d database_of_interest -q 'show tables' # A sequence of CREATE TABLE, CREATE VIEW, and similar DDL statements # can go into a file to make the setup process repeatable. $ impala-shell -i localhost -d database_of_interest -f recreate_tables.sql