Running Commands and SQL Statements in impala-shell
For information on available commands, see impala-shell Command Reference. You can see the full set of available commands by pressing TAB twice, for example:
[impalad-host:21000] > connect describe explain help history insert quit refresh select set shell show use version [impalad-host:21000] >
For example:
[localhost:21000] > select * > from t1 > limit 5; +---------+-----------+ | s1 | s2 | +---------+-----------+ | hello | world | | goodbye | cleveland | +---------+-----------+
A comment is considered part of the statement it precedes, so when you enter a -- or /* */ comment, you get a continuation prompt until you finish entering a statement ending with a semicolon:
[localhost:21000] > -- This is a test comment > show tables like 't*'; +--------+ | name | +--------+ | t1 | | t2 | | tab1 | | tab2 | | tab3 | | text_t | +--------+
Use the up-arrow and down-arrow keys to cycle through and edit previous commands. impala-shell uses the readline library and so supports a standard set of keyboard shortcuts for editing and cursor movement, such as Ctrl-A for beginning of line and Ctrl-E for end of line.
In CDH 5.7 / Impala 2.5 and higher, you can define substitution variables to be used within SQL statements processed by impala-shell. On the command line, you specify the option --var=variable_name=value. Within an interactive session or a script file processed by the -f option, you specify a SET command using the notation SET VAR:variable_name=value. Within a SQL statement, you substitute the value by using the notation ${var:variable_name}.
For example, here are some impala-shell commands that define substitution variables and then use them in SQL statements executed through the -q and -f options. Notice how the -q argument strings are single-quoted to prevent shell expansion of the ${var:value} notation, and any string literals within the queries are enclosed by double quotation marks.
$ impala-shell --var=tname=table1 --var=colname=x --var=coltype=string -q 'create table ${var:tname} (${var:colname} ${var:coltype}) stored as parquet' Starting Impala Shell without Kerberos authentication Connected to hostname Server version: impalad_version Query: create table table1 (x string) stored as parquet $ NEW_STRING="hello world" $ impala-shell --var=tname=table1 --var=insert_val="$NEW_STRING" -q 'insert into ${var:tname} values ("${var:insert_val}")' Starting Impala Shell without Kerberos authentication Connected to hostname Server version: impalad_version Query: insert into table1 values ("hello world") Inserted 1 row(s) in 1.40s $ for VAL in foo bar bletch do impala-shell --var=tname=table1 --var=insert_val="$VAL" -q 'insert into ${var:tname} values ("${var:insert_val}")' done ... Query: insert into table1 values ("foo") Inserted 1 row(s) in 0.22s Query: insert into table1 values ("bar") Inserted 1 row(s) in 0.11s Query: insert into table1 values ("bletch") Inserted 1 row(s) in 0.21s $ echo "Search for what substring?" ; read answer Search for what substring? b $ impala-shell --var=tname=table1 -q 'select x from ${var:tname} where x like "%${var:answer}%"' Starting Impala Shell without Kerberos authentication Connected to hostname Server version: impalad_version Query: select x from table1 where x like "%b%" +--------+ | x | +--------+ | bletch | | bar | +--------+ Fetched 2 row(s) in 0.83s
Here is a substitution variable passed in by the --var option, and then referenced by statements issued interactively. Then the variable is cleared with the UNSET command, and defined again with the SET command.
$ impala-shell --quiet --var=tname=table1 Starting Impala Shell without Kerberos authentication *********************************************************************************** banner_message *********************************************************************************** [hostname:21000] > select count(*) from ${var:tname}; +----------+ | count(*) | +----------+ | 4 | +----------+ [hostname:21000] > unset var:tname; Unsetting variable TNAME [hostname:21000] > select count(*) from ${var:tname}; Error: Unknown variable TNAME [hostname:21000] > set var:tname=table1; [hostname:21000] > select count(*) from ${var:tname}; +----------+ | count(*) | +----------+ | 4 | +----------+
The following example shows how the SOURCE command can execute a series of statements from a file:
$ cat commands.sql show databases; show tables in default; show functions in _impala_builtins like '*minute*'; $ impala-shell -i localhost ... [localhost:21000] > source commands.sql; Query: show databases +------------------+----------------------------------------------+ | name | comment | +------------------+----------------------------------------------+ | _impala_builtins | System database for Impala builtin functions | | default | Default Hive database | +------------------+----------------------------------------------+ Fetched 2 row(s) in 0.06s Query: show tables in default +-----------+ | name | +-----------+ | customers | | sample_07 | | sample_08 | | web_logs | +-----------+ Fetched 4 row(s) in 0.02s Query: show functions in _impala_builtins like '*minute*' +-------------+--------------------------------+-------------+---------------+ | return type | signature | binary type | is persistent | +-------------+--------------------------------+-------------+---------------+ | INT | minute(TIMESTAMP) | BUILTIN | true | | TIMESTAMP | minutes_add(TIMESTAMP, BIGINT) | BUILTIN | true | | TIMESTAMP | minutes_add(TIMESTAMP, INT) | BUILTIN | true | | TIMESTAMP | minutes_sub(TIMESTAMP, BIGINT) | BUILTIN | true | | TIMESTAMP | minutes_sub(TIMESTAMP, INT) | BUILTIN | true | +-------------+--------------------------------+-------------+---------------+ Fetched 5 row(s) in 0.03s
The following example shows how a file that is run by the SOURCE command, or through the -q or -f options of impala-shell, can contain additional SOURCE commands. The first file, nested1.sql, runs an impala-shell command and then also runs the commands from nested2.sql. This ability for scripts to call each other is often useful for code that sets up schemas for applications or test environments.
$ cat nested1.sql show functions in _impala_builtins like '*minute*'; source nested2.sql $ cat nested2.sql show functions in _impala_builtins like '*hour*' $ impala-shell -i localhost -f nested1.sql Starting Impala Shell without Kerberos authentication Connected to localhost:21000 ... Query: show functions in _impala_builtins like '*minute*' +-------------+--------------------------------+-------------+---------------+ | return type | signature | binary type | is persistent | +-------------+--------------------------------+-------------+---------------+ | INT | minute(TIMESTAMP) | BUILTIN | true | | TIMESTAMP | minutes_add(TIMESTAMP, BIGINT) | BUILTIN | true | | TIMESTAMP | minutes_add(TIMESTAMP, INT) | BUILTIN | true | | TIMESTAMP | minutes_sub(TIMESTAMP, BIGINT) | BUILTIN | true | | TIMESTAMP | minutes_sub(TIMESTAMP, INT) | BUILTIN | true | +-------------+--------------------------------+-------------+---------------+ Fetched 5 row(s) in 0.01s Query: show functions in _impala_builtins like '*hour*' +-------------+------------------------------+-------------+---------------+ | return type | signature | binary type | is persistent | +-------------+------------------------------+-------------+---------------+ | INT | hour(TIMESTAMP) | BUILTIN | true | | TIMESTAMP | hours_add(TIMESTAMP, BIGINT) | BUILTIN | true | | TIMESTAMP | hours_add(TIMESTAMP, INT) | BUILTIN | true | | TIMESTAMP | hours_sub(TIMESTAMP, BIGINT) | BUILTIN | true | | TIMESTAMP | hours_sub(TIMESTAMP, INT) | BUILTIN | true | +-------------+------------------------------+-------------+---------------+ Fetched 5 row(s) in 0.01s