Query history table format

Learn about the query history table format, which provides detailed information on query execution, including column names, data types, and descriptions.

Table Format

The following table provides an overview of the columns in the query history table, including their names, data types, and descriptions:
Column Name Data type Description
query_history_schema_version int Schema version of the query history record.
hive_version string Hive version running in HiveServer2 when the query was executed.
query_id string Hive-assigned identifier for the query.
session_id string Hive-assigned identifier for the session.
operation_id string Hive-assigned identifier for the operation.
execution_engine string Execution engine used to run the query (typically Tez).
execution_mode string Indicates whether the query ran in LLAP or Tez container mode.
tez_dag_id string Tez DAG ID used for the query, if applicable.
tez_app_id string Tez application ID used for the query, if applicable.
tez_session_id string Tez session ID used for the query, if applicable.
cluster_id string Unique identifier for the cluster instance.
sql string SQL query string submitted by the user.
session_type string Session type (HIVESERVER2 or OTHER).
hiveserver2_protocol_version int Protocol version used by the client, as defined in TCLIService.
cluster_user string Effective user on the cluster (typically hive if doAs is disabled).
end_user string Authenticated client username.
db_name string Database selected when the query was run.
tez_coordinator string Address of the Tez coordinator used for the query.
query_state string Query state as an OperationState value.
query_type string Query type identified by the semantic analyzer (DQL, DDL, DML, DCL, STATS, or empty).
operation string Hive operation name based on syntax or semantic analysis.
server_address string Address of the HiveServer2 instance the client connected to.
server_port int TCP port of the HiveServer2 instance.
client_address string IP address of the client that initiated the connection.
start_time_utc timestamp UTC timestamp when the query started.
end_time_utc timestamp UTC timestamp when the query finished.
start_time timestamp Server-local timestamp when the query started.
end_time timestamp Server-local timestamp when the query finished.
total_time_ms bigint Total execution time in milliseconds.
planning_duration bigint Time spent in query compilation and planning (ms).
planning_start_time timestamp Timestamp when query planning started.
prepare_plan_duration bigint Time spent preparing the DAG for execution (ms).
prepare_plan_start_time timestamp Timestamp when the DAG preparation started.
get_session_duration bigint Time taken to acquire a Tez session (ms).
get_session_start_time timestamp Timestamp when session acquisition started.
execution_duration bigint Duration of DAG execution (ms).
execution_start_time timestamp Timestamp when DAG execution started.
failure_reason string Error message for query failure, if any.
num_rows_fetched int Number of rows fetched by the query.
plan string Full text of the query plan.
used_tables string Comma-separated list of tables used by the query.
exec_summary string Full text of the execution summary.
configuration_options_changed string Configuration options changed during the query session.
total_tasks int Total number of Tez tasks started for the query.
succeeded_tasks int Number of successful tasks.
killed_tasks int Number of killed tasks.
failed_tasks int Number of failed task attempts.
task_duration_millis bigint Total task execution time in milliseconds.
node_used_count int Number of nodes used to run the query.
node_total_count int Total number of nodes visible during query execution.
reduce_input_groups bigint Number of reducer input groups.
reduce_input_records bigint Number of input records seen by reducers.
spilled_records bigint Number of records spilled during shuffle.
num_shuffled_inputs bigint Number of physical inputs used during shuffle.
num_failed_shuffle_inputs bigint Number of failed attempts to fetch shuffle inputs.
input_records_processed bigint Number of input records actually processed.
input_split_length_bytes bigint Total size (bytes) of input splits.
output_records bigint Number of output records from all vertices.
output_bytes_physical bigint Actual bytes written, including compression.
shuffle_chunk_count bigint Number of shuffled files processed.
shuffle_bytes bigint Total shuffled data size in bytes.
shuffle_bytes_disk_direct bigint Bytes shuffled using direct disk access.
shuffle_phase_time bigint Time spent in the shuffle phase (ms).
merge_phase_time bigint Time spent merging shuffled data (ms).