OFFSET clause in a
SELECT query causes the result
set to start some number of rows after the logical first item. The result set is numbered
starting from zero, so
OFFSET 0 produces the same result as leaving out the
OFFSET clause. Always use this clause in combination with
BY (so that it is clear which item should be first, second, and so on) and
LIMIT (so that the result set covers a bounded range, such as items 0-9,
100-199, and so on).
In Impala 1.2.1 and higher, you can combine a
LIMIT clause with an
clause to produce a small result set that is different from a top-N query, for example, to return items 11
through 20. This technique can be used to simulate “paged” results. Because Impala queries typically
involve substantial amounts of I/O, use this technique only for compatibility in cases where you cannot
rewrite the application logic. For best performance and scalability, wherever practical, query as many
items as you expect to need, cache them on the application side, and display small groups of results to
users using application logic.
The following example shows how you could run a “paging” query originally written for a traditional database application. Because typical Impala queries process megabytes or gigabytes of data and read large data files from disk each time, it is inefficient to run a separate query to retrieve each small group of items. Use this technique only for compatibility while porting older applications, then rewrite the application code to use a single query with a large result set, and display pages of results from the cached result set.
[localhost:21000] > create table numbers (x int); [localhost:21000] > insert into numbers select x from very_long_sequence; Inserted 1000000 rows in 1.34s [localhost:21000] > select x from numbers order by x limit 5 offset 0; +----+ | x | +----+ | 1 | | 2 | | 3 | | 4 | | 5 | +----+ [localhost:21000] > select x from numbers order by x limit 5 offset 5; +----+ | x | +----+ | 6 | | 7 | | 8 | | 9 | | 10 | +----+