Understanding Performance using Query Profile
To understand the detailed performance characteristics for a
query, issue the PROFILE
command in
impala-shell immediately after executing a query. This
low-level information includes physical details about memory, CPU, I/O, and
network usage, and thus is only available after the query is actually run.
The PROFILE
command, available in
impala-shell, produces a detailed low-level report
showing how the most recent query was executed. Unlike the
EXPLAIN
plan, this information is only available after
the query has finished. It shows physical details such as the number of
bytes read, maximum memory usage, and so on for each node. You can use
this information to determine if the query is I/O-bound or CPU-bound,
whether some network condition is imposing a bottleneck, whether a
slowdown is affecting some nodes but not others, and to check that
recommended configuration settings such as short-circuit local reads are
in effect.
By default, time values in the profile output reflect the wall-clock
time taken by an operation. For values denoting system time or user time,
the measurement unit is reflected in the metric name, such as
ScannerThreadsSysTime
or
ScannerThreadsUserTime
. For example, a multi-threaded
I/O operation might show a small figure for wall-clock time, while the
corresponding system time is larger, representing the sum of the CPU time
taken by each thread. Or a wall-clock time figure might be larger because
it counts time spent waiting, while the corresponding system and user time
figures only measure the time while the operation is actively using CPU
cycles.
The EXPLAIN
plan is also printed at the beginning of
the query profile report, for convenience in examining both the logical
and physical aspects of the query side-by-side. The
EXPLAIN_LEVEL
query option also controls the verbosity
of the EXPLAIN
output printed by the
PROFILE
command.
The Per Node Profiles
section in the profile output
includes the following metrics that can be controlled by the
RESOURCE_TRACE_RATIO
query option.
CpuIoWaitPercentage
CpuSysPercentage
CpuUserPercentage
HostDiskReadThroughput
: All data read by the host as part of the execution of this query (spilling), by the HDFS data node, and by other processes running on the same system.HostDiskWriteThroughput
: All data written by the host as part of the execution of this query (spilling), by the HDFS data node, and by other processes running on the same system.HostNetworkRx
: All data received by the host as part of the execution of this query, other queries, and other processes running on the same system.HostNetworkTx
: All data transmitted by the host as part of the execution of this query, other queries, and other processes running on the same system.