WINDOW
Certain analytic functions accept an optional window
clause, which makes the function analyze only certain
rows around
the current row rather than all rows in the
partition. For example, you can get a moving average by specifying
some number of preceding and following rows, or a running count or
running total by specifying all rows up to the current position.
This clause can result in different analytic results for rows
within the same partition.
The window clause is supported with the AVG()
,
COUNT()
, FIRST_VALUE()
,
LAST_VALUE()
, and SUM()
functions.
For
MAX()
and MIN()
, the window
clause only allowed if the start bound is UNBOUNDED
PRECEDING
Syntax:
ROWS BETWEEN [ { m | UNBOUNDED } PRECEDING | CURRENT ROW] [ AND [CURRENT ROW | { UNBOUNDED | n } FOLLOWING] ]
RANGE BETWEEN [ {m | UNBOUNDED } PRECEDING | CURRENT ROW] [ AND [CURRENT ROW | { UNBOUNDED | n } FOLLOWING] ]
ROWS BETWEEN
defines the size of the window in
terms of the indexes of the rows in the result set. The size of
the window is predictable based on the clauses the position within
the result set.
RANGE BETWEEN
does not currently support numeric
arguments to define a variable-size sliding window.
Currently, Impala supports only some combinations of arguments
to the RANGE
clause:
-
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
(the default whenORDER BY
is specified and the window clause is omitted) -
RANGE BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
-
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
When RANGE
is used, CURRENT
ROW
includes not just the current row but all rows that
are tied with the current row based on the ORDER
BY
expressions.
Examples:
The following examples show financial data for a fictional stock
symbol JDR
. The closing price moves up and down
each day.
create table stock_ticker (stock_symbol string, closing_price decimal(8,2), closing_date timestamp);
...load some data...
select * from stock_ticker order by stock_symbol, closing_date
+--------------+---------------+---------------------+
| stock_symbol | closing_price | closing_date |
+--------------+---------------+---------------------+
| JDR | 12.86 | 2014-10-02 00:00:00 |
| JDR | 12.89 | 2014-10-03 00:00:00 |
| JDR | 12.94 | 2014-10-04 00:00:00 |
| JDR | 12.55 | 2014-10-05 00:00:00 |
| JDR | 14.03 | 2014-10-06 00:00:00 |
| JDR | 14.75 | 2014-10-07 00:00:00 |
| JDR | 13.98 | 2014-10-08 00:00:00 |
+--------------+---------------+---------------------+
The queries use analytic functions with window clauses to
compute moving averages of the closing price. For example,
ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
produces an average of the value from a 3-day span, producing a
different value for each row. The first row, which has no
preceding row, only gets averaged with the row following it. If
the table contained more than one stock symbol, the
PARTITION BY
clause would limit the window for
the moving average to only consider the prices for a single stock.
select stock_symbol, closing_date, closing_price,
avg(closing_price) over (partition by stock_symbol order by closing_date
rows between 1 preceding and 1 following) as moving_average
from stock_ticker;
+--------------+---------------------+---------------+----------------+
| stock_symbol | closing_date | closing_price | moving_average |
+--------------+---------------------+---------------+----------------+
| JDR | 2014-10-02 00:00:00 | 12.86 | 12.87 |
| JDR | 2014-10-03 00:00:00 | 12.89 | 12.89 |
| JDR | 2014-10-04 00:00:00 | 12.94 | 12.79 |
| JDR | 2014-10-05 00:00:00 | 12.55 | 13.17 |
| JDR | 2014-10-06 00:00:00 | 14.03 | 13.77 |
| JDR | 2014-10-07 00:00:00 | 14.75 | 14.25 |
| JDR | 2014-10-08 00:00:00 | 13.98 | 14.36 |
+--------------+---------------------+---------------+----------------+
The clause ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT
ROW
produces a cumulative moving average, from the
earliest data up to the value for each day.
select stock_symbol, closing_date, closing_price,
avg(closing_price) over (partition by stock_symbol order by closing_date
rows between unbounded preceding and current row) as moving_average
from stock_ticker;
+--------------+---------------------+---------------+----------------+
| stock_symbol | closing_date | closing_price | moving_average |
+--------------+---------------------+---------------+----------------+
| JDR | 2014-10-02 00:00:00 | 12.86 | 12.86 |
| JDR | 2014-10-03 00:00:00 | 12.89 | 12.87 |
| JDR | 2014-10-04 00:00:00 | 12.94 | 12.89 |
| JDR | 2014-10-05 00:00:00 | 12.55 | 12.81 |
| JDR | 2014-10-06 00:00:00 | 14.03 | 13.05 |
| JDR | 2014-10-07 00:00:00 | 14.75 | 13.33 |
| JDR | 2014-10-08 00:00:00 | 13.98 | 13.42 |
+--------------+---------------------+---------------+----------------+