Query history table format

Learn about the query history table format, which provides detailed information on query execution, including column names, data types, and descriptions.

Table Format

The following table provides an overview of the columns in the query history table, including their names, data types, and descriptions:


Column Name	Data type	Description
query_history_schema_version	int	Schema version of the query history record.
hive_version	string	Hive version running in HiveServer2 when the query was executed.
query_id	string	Hive-assigned identifier for the query.
session_id	string	Hive-assigned identifier for the session.
operation_id	string	Hive-assigned identifier for the operation.
execution_engine	string	Execution engine used to run the query (typically Tez).
execution_mode	string	Indicates whether the query ran in LLAP or Tez container mode.
tez_dag_id	string	Tez DAG ID used for the query, if applicable.
tez_app_id	string	Tez application ID used for the query, if applicable.
tez_session_id	string	Tez session ID used for the query, if applicable.
cluster_id	string	Unique identifier for the cluster instance.
sql	string	SQL query string submitted by the user.
session_type	string	Session type (HIVESERVER2 or OTHER).
hiveserver2_protocol_version	int	Protocol version used by the client, as defined in TCLIService.
cluster_user	string	Effective user on the cluster (typically hive if doAs is disabled).
end_user	string	Authenticated client username.
db_name	string	Database selected when the query was run.
tez_coordinator	string	Address of the Tez coordinator used for the query.
query_state	string	Query state as an OperationState value.
query_type	string	Query type identified by the semantic analyzer (DQL, DDL, DML, DCL, STATS, or empty).
operation	string	Hive operation name based on syntax or semantic analysis.
server_address	string	Address of the HiveServer2 instance the client connected to.
server_port	int	TCP port of the HiveServer2 instance.
client_address	string	IP address of the client that initiated the connection.
start_time_utc	timestamp	UTC timestamp when the query started.
end_time_utc	timestamp	UTC timestamp when the query finished.
start_time	timestamp	Server-local timestamp when the query started.
end_time	timestamp	Server-local timestamp when the query finished.
total_time_ms	bigint	Total execution time in milliseconds.
planning_duration	bigint	Time spent in query compilation and planning (ms).
planning_start_time	timestamp	Timestamp when query planning started.
prepare_plan_duration	bigint	Time spent preparing the DAG for execution (ms).
prepare_plan_start_time	timestamp	Timestamp when the DAG preparation started.
get_session_duration	bigint	Time taken to acquire a Tez session (ms).
get_session_start_time	timestamp	Timestamp when session acquisition started.
execution_duration	bigint	Duration of DAG execution (ms).
execution_start_time	timestamp	Timestamp when DAG execution started.
failure_reason	string	Error message for query failure, if any.
num_rows_fetched	int	Number of rows fetched by the query.
plan	string	Full text of the query plan.
used_tables	string	Comma-separated list of tables used by the query.
exec_summary	string	Full text of the execution summary.
configuration_options_changed	string	Configuration options changed during the query session.
total_tasks	int	Total number of Tez tasks started for the query.
succeeded_tasks	int	Number of successful tasks.
killed_tasks	int	Number of killed tasks.
failed_tasks	int	Number of failed task attempts.
task_duration_millis	bigint	Total task execution time in milliseconds.
node_used_count	int	Number of nodes used to run the query.
node_total_count	int	Total number of nodes visible during query execution.
reduce_input_groups	bigint	Number of reducer input groups.
reduce_input_records	bigint	Number of input records seen by reducers.
spilled_records	bigint	Number of records spilled during shuffle.
num_shuffled_inputs	bigint	Number of physical inputs used during shuffle.
num_failed_shuffle_inputs	bigint	Number of failed attempts to fetch shuffle inputs.
input_records_processed	bigint	Number of input records actually processed.
input_split_length_bytes	bigint	Total size (bytes) of input splits.
output_records	bigint	Number of output records from all vertices.
output_bytes_physical	bigint	Actual bytes written, including compression.
shuffle_chunk_count	bigint	Number of shuffled files processed.
shuffle_bytes	bigint	Total shuffled data size in bytes.
shuffle_bytes_disk_direct	bigint	Bytes shuffled using direct disk access.
shuffle_phase_time	bigint	Time spent in the shuffle phase (ms).
merge_phase_time	bigint	Time spent merging shuffled data (ms).