Understanding Performance using SUMMARY Report
For an overview of the physical performance characteristics for a
query, issue the SUMMARY
command in
impala-shell
immediately after executing a query. This
condensed information shows which phases of execution took the most time,
and how the estimates for memory usage and number of rows at each phase
compare to the actual values.
Like the EXPLAIN
plan, it is easy to see potential
performance bottlenecks in the SUMMARY
report. Like the
PROFILE
output, the SUMMARY
report is
available after the query is run, and it displays actual timing
statistics.
The SUMMARY
report is also printed at the beginning of
the query profile report described in the PROFILE
output,
for convenience in examining high-level and low-level aspects of the query
side-by-side.
For example, here is a query involving an aggregate function on a
single-node. The different stages of the query and their timings are shown
(rolled up for all nodes), along with estimated and actual values used in
planning the query. In this case, the AVG()
function is
computed for a subset of data on each node (stage 01) and then the
aggregated results from all nodes are combined at the end (stage 03). You
can see which stages took the most time, and whether any estimates were
substantially different than the actual data distribution. (When examining
the time values, be sure to consider the suffixes such as
us
for microseconds and ms
for
milliseconds, rather than just looking for the largest numbers.)
[localhost:21000] > select avg(ss_sales_price) from store_sales where ss_coupon_amt = 0;
+---------------------+
| avg(ss_sales_price) |
+---------------------+
| 37.80770926328327 |
+---------------------+
[localhost:21000] > summary;
+--------------+--------+----------+----------+-------+------------+----------+---------------+-----------------+
| Operator | #Hosts | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail |
+--------------+--------+----------+----------+-------+------------+----------+---------------+-----------------+
| 03:AGGREGATE | 1 | 1.03ms | 1.03ms | 1 | 1 | 48.00 KB | -1 B | MERGE FINALIZE |
| 02:EXCHANGE | 1 | 0ns | 0ns | 1 | 1 | 0 B | -1 B | UNPARTITIONED |
| 01:AGGREGATE | 1 | 30.79ms | 30.79ms | 1 | 1 | 80.00 KB | 10.00 MB | |
| 00:SCAN HDFS | 1 | 5.45s | 5.45s | 2.21M | -1 | 64.05 MB | 432.00 MB | tpc.store_sales |
+--------------+--------+----------+----------+-------+------------+----------+---------------+-----------------+
Notice how the longest initial phase of the query is measured in seconds (s), while later phases working on smaller intermediate results are measured in milliseconds (ms) or even nanoseconds (ns).
Here is an example from a more complicated query, as it would appear in
the PROFILE
output: This example taken from:
https://github.com/cloudera/Impala/commit/af85d3b518089b8840ddea4356947e40d1aca9bd
Operator #Hosts Avg Time Max Time #Rows Est. #Rows Peak Mem Est. Peak Mem Detail
------------------------------------------------------------------------------------------------------------------------
09:MERGING-EXCHANGE 1 79.738us 79.738us 5 5 0 -1.00 B UNPARTITIONED
05:TOP-N 3 84.693us 88.810us 5 5 12.00 KB 120.00 B
04:AGGREGATE 3 5.263ms 6.432ms 5 5 44.00 KB 10.00 MB MERGE FINALIZE
08:AGGREGATE 3 16.659ms 27.444ms 52.52K 600.12K 3.20 MB 15.11 MB MERGE
07:EXCHANGE 3 2.644ms 5.1ms 52.52K 600.12K 0 0 HASH(o_orderpriority)
03:AGGREGATE 3 342.913ms 966.291ms 52.52K 600.12K 10.80 MB 15.11 MB
02:HASH JOIN 3 2s165ms 2s171ms 144.87K 600.12K 13.63 MB 941.01 KB INNER JOIN, BROADCAST
|--06:EXCHANGE 3 8.296ms 8.692ms 57.22K 15.00K 0 0 BROADCAST
| 01:SCAN HDFS 2 1s412ms 1s978ms 57.22K 15.00K 24.21 MB 176.00 MB tpch.orders o
00:SCAN HDFS 3 8s032ms 8s558ms 3.79M 600.12K 32.29 MB 264.00 MB tpch.lineitem l