SET Statement

Specifies values for query options that control the runtime behavior of other statements within the same session.

In CDH 5.7 / Impala 2.5 and higher, SET also defines user-specified substitution variables for the impala-shell interpreter. This feature uses the SET command built into impala-shell instead of the SQL SET statement. Therefore the substitution mechanism only works with queries processed by impala-shell, not with queries submitted through JDBC or ODBC.

Syntax:

SET [query_option=option_value]

SET with no arguments returns a result set consisting of all available query options and their current values.

The query option name and any string argument values are case-insensitive.

Each query option has a specific allowed notation for its arguments. Boolean options can be enabled and disabled by assigning values of either true and false, or 1 and 0. Some numeric options accept a final character signifying the unit, such as 2g for 2 gigabytes or 100m for 100 megabytes. See Query Options for the SET Statement for the details of each query option.

User-specified substitution variables:

In CDH 5.7 / Impala 2.5 and higher, you can specify your own names and string substitution values within the impala-shell interpreter. Once a substitution variable is set up, its value is inserted into any SQL statement in that same impala-shell session that contains the notation ${var:varname}. Using SET in an interactive impala-shell session overrides any value for that same variable passed in through the --var=varname=value command-line option.

For example, to set up some default parameters for report queries, but then override those default within an impala-shell session, you might issue commands and statements such as the following:

-- Initial setup for this example.
create table staging_table (s string);
insert into staging_table values ('foo'), ('bar'), ('bletch');

create table production_table (s string);
insert into production_table values ('North America'), ('EMEA'), ('Asia');
quit;

-- Start impala-shell with user-specified substitution variables,
-- run a query, then override the variables with SET and run the query again.
$ impala-shell --var=table_name=staging_table --var=cutoff=2
... banner message ...
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Query: select s from staging_table order by s limit 2
+--------+
| s      |
+--------+
| bar    |
| bletch |
+--------+
Fetched 2 row(s) in 1.06s

[localhost:21000] > set var:table_name=production_table;
Variable TABLE_NAME set to production_table
[localhost:21000] > set var:cutoff=3;
Variable CUTOFF set to 3

[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Query: select s from production_table order by s limit 3
+---------------+
| s             |
+---------------+
| Asia          |
| EMEA          |
| North America |
+---------------+

The following example shows how SET with no parameters displays all user-specified substitution variables, and how UNSET removes the substitution variable entirely:

[localhost:21000] > set;
Query options (defaults shown in []):
  ABORT_ON_DEFAULT_LIMIT_EXCEEDED: [0]
  ...
  V_CPU_CORES: [0]

Shell Options
  LIVE_PROGRESS: False
  LIVE_SUMMARY: False

Variables:
  CUTOFF: 3
  TABLE_NAME: staging_table

[localhost:21000] > unset var:cutoff;
Unsetting variable CUTOFF
[localhost:21000] > select s from ${var:table_name} order by s limit ${var:cutoff};
Error: Unknown variable CUTOFF

See Running Commands and SQL Statements in impala-shell for more examples of using the --var, SET, and ${var:varname} substitution technique in impala-shell.

Usage notes:

MEM_LIMIT is probably the most commonly used query option. You can specify a high value to allow a resource-intensive query to complete. For testing how queries would work on memory-constrained systems, you might specify an artificially low value.

Complex type considerations:

Examples:

The following example sets some numeric and some Boolean query options to control usage of memory, disk space, and timeout periods, then runs a query whose success could depend on the options in effect:

set mem_limit=64g;
set DISABLE_UNSAFE_SPILLS=true;
set parquet_file_size=400m;
set RESERVATION_REQUEST_TIMEOUT=900000;
insert overwrite parquet_table select c1, c2, count(c3) from text_table group by c1, c2, c3;

Added in: CDH 5.2.0 (Impala 2.0.0)

SET has always been available as an impala-shell command. Promoting it to a SQL statement lets you use this feature in client applications through the JDBC and ODBC APIs.

Cancellation: Cannot be cancelled.

HDFS permissions: This statement does not touch any HDFS files or directories, therefore no HDFS permissions are required.

Related information:

See Query Options for the SET Statement for the query options you can adjust using this statement.