SET Statement
Specifies values for query options that control the runtime behavior of other statements within the same session.
In CDH 5.7 / Impala 2.5 and higher, SET also defines user-specified substitution variables for the impala-shell interpreter. This feature uses the SET command built into impala-shell instead of the SQL SET statement. Therefore the substitution mechanism only works with queries processed by impala-shell, not with queries submitted through JDBC or ODBC.
Syntax:
SET [query_option=option_value]
SET with no arguments returns a result set consisting of all available query options and their current values.
The query option name and any string argument values are case-insensitive.
Each query option has a specific allowed notation for its arguments. Boolean options can be enabled or disabled by assigning either true or false, or equivalently 1 or 0. Some numeric options accept a final character signifying the unit, such as 2g for 2 gigabytes or 100m for 100 megabytes. See Query Options for the SET Statement for the details of each query option.
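As a brief sketch of the notation described above, the following statements show the Boolean and unit-suffix forms side by side. The option names are the same ones used in the Examples section of this topic; the specific values are arbitrary and chosen only for illustration.

```sql
-- Boolean option: true/false and 1/0 are interchangeable notations.
set disable_unsafe_spills=true;
set disable_unsafe_spills=1;
set disable_unsafe_spills=false;
set disable_unsafe_spills=0;

-- Numeric options with a unit suffix: g for gigabytes, m for megabytes.
set mem_limit=2g;
set parquet_file_size=100m;

-- Option names are case-insensitive, so these two statements name the same option.
set mem_limit=2g;
SET Mem_Limit=2g;

-- With no arguments, SET lists all query options and their current values.
set;
```

Running SET with no arguments after changing an option is a convenient way to confirm the value that is actually in effect for the session.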
User-specified substitution variables:
In CDH 5.7 / Impala 2.5 and higher, you can specify your own names and string substitution values within the impala-shell interpreter. Once a substitution variable is set up, its value is inserted into any SQL statement in that same impala-shell session that contains the notation ${var:varname}. Using SET in an interactive impala-shell session overrides any value for that same variable passed in through the --var=varname=value command-line option.
For example, to set up some default parameters for report queries, but then override those defaults within an impala-shell session, you might issue commands and statements such as the following:
-- Initial setup for this example.
create table staging_table (s string);
insert into staging_table values ('foo'), ('bar'), ('bletch');
create table production_table (s string);
insert into production_table values ('North America'), ('EMEA'), ('Asia');
quit;

-- Start impala-shell with user-specified substitution variables,
-- run a query, then override the variables with SET and run the query again.
$ impala-shell --var=table_name=staging_table --var=cutoff=2
... banner message ...
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Query: select s from staging_table order by s limit 2
+--------+
| s      |
+--------+
| bar    |
| bletch |
+--------+
Fetched 2 row(s) in 1.06s
[localhost:21000] > set var:table_name=production_table;
Variable TABLE_NAME set to production_table
[localhost:21000] > set var:cutoff=3;
Variable CUTOFF set to 3
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Query: select s from production_table order by s limit 3
+---------------+
| s             |
+---------------+
| Asia          |
| EMEA          |
| North America |
+---------------+
The following example shows how SET with no parameters displays all user-specified substitution variables, and how UNSET removes the substitution variable entirely:
[localhost:21000] > set;
Query options (defaults shown in []):
ABORT_ON_DEFAULT_LIMIT_EXCEEDED: [0]
...
V_CPU_CORES: [0]

Shell Options
LIVE_PROGRESS: False
LIVE_SUMMARY: False

Variables:
CUTOFF: 3
TABLE_NAME: staging_table

[localhost:21000] > unset var:cutoff;
Unsetting variable CUTOFF
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Error: Unknown variable CUTOFF
See Running Commands and SQL Statements in impala-shell for more examples of using the --var, SET, and ${var:varname} substitution technique in impala-shell.
Usage notes:
MEM_LIMIT is probably the most commonly used query option. You can specify a high value to allow a resource-intensive query to complete. For testing how queries would work on memory-constrained systems, you might specify an artificially low value.
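The two uses of MEM_LIMIT mentioned above might look like the following sketch. The table text_table and its columns are borrowed from the Examples section of this topic, and the 100m value is an arbitrary assumption standing in for "artificially low"; an appropriate value depends on the query and the cluster.

```sql
-- Allow a resource-intensive query plenty of memory.
set mem_limit=64g;
select c1, c2, count(c3) from text_table group by c1, c2;

-- Simulate a memory-constrained system by lowering the limit,
-- then rerun the same query. Depending on its actual memory needs,
-- the query might now fail or behave differently.
set mem_limit=100m;
select c1, c2, count(c3) from text_table group by c1, c2;
```

Because the option applies to the whole session, the low limit stays in effect for later statements until you issue another SET for MEM_LIMIT.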
Complex type considerations:
Examples:
The following example sets some numeric and some Boolean query options to control usage of memory, disk space, and timeout periods, then runs a query whose success could depend on the options in effect:
set mem_limit=64g;
set DISABLE_UNSAFE_SPILLS=true;
set parquet_file_size=400m;
set RESERVATION_REQUEST_TIMEOUT=900000;
insert overwrite parquet_table select c1, c2, count(c3) from text_table group by c1, c2, c3;
Added in: CDH 5.2.0 (Impala 2.0.0)
SET has always been available as an impala-shell command. Promoting it to a SQL statement lets you use this feature in client applications through the JDBC and ODBC APIs.
Cancellation: Cannot be cancelled.
HDFS permissions: This statement does not touch any HDFS files or directories, therefore no HDFS permissions are required.
Related information:
See Query Options for the SET Statement for the query options you can adjust using this statement.