OpenTelemetry support for Impala

The Impala OpenTelemetry integration enables real-time query observability and centralized telemetry data collection, including lifecycle events and resource usage.

Overview

OpenTelemetry (OTel) provides an open-source solution for collecting, processing, and exporting telemetry data, including metrics from applications. OTel helps users gain visibility into query performance and troubleshoot query failures. Impala supports OTel in Cloudera Runtime 7.3.2.

Impala sends its telemetry data to OTel-compatible collectors. This provides a centralized flow of live query insights, with SELECT queries, DML statements, and DDL statements represented as OTel traces, and reduces the friction of sourcing data from multiple places.

Impala integration with OTel

Impala integrates the OTel C++ SDK to emit query lifecycle data as OpenTelemetry traces. Impala already tracks specific phases and events for each query and records them in the timeline section of the query profile. When Impala emits these events to an OTel collector, observability systems can track active queries in near real time.
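
The following Python sketch illustrates the general idea of representing one query as a trace span whose events mirror timeline entries. It uses only the standard library and does not call the OTel SDK. The class names, the span name, and the phase names are hypothetical and chosen for illustration; they are not the names that Impala emits.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import secrets
import time


@dataclass
class SpanEvent:
    """One point-in-time lifecycle event attached to a span."""
    name: str             # hypothetical phase name
    time_unix_nano: int   # wall-clock timestamp in nanoseconds


@dataclass
class QuerySpan:
    """A simplified stand-in for an OTel span that covers one query."""
    name: str
    # OTel trace IDs are 16 bytes and span IDs are 8 bytes, shown here as hex.
    trace_id: str = field(default_factory=lambda: secrets.token_hex(16))
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    start_unix_nano: int = field(default_factory=time.time_ns)
    end_unix_nano: Optional[int] = None
    events: List[SpanEvent] = field(default_factory=list)

    def add_event(self, name: str) -> None:
        # Record the phase with the time at which it was observed.
        self.events.append(SpanEvent(name, time.time_ns()))

    def end(self) -> None:
        self.end_unix_nano = time.time_ns()


def trace_query(phases: List[str]) -> QuerySpan:
    """Build a span for one query and attach each phase as an event."""
    span = QuerySpan(name="query")  # hypothetical span name
    for phase in phases:
        span.add_event(phase)
    span.end()
    return span


# Hypothetical phase names, loosely modeled on a query timeline.
span = trace_query(["Query submitted", "Planning finished", "Rows available"])
print(span.trace_id, [event.name for event in span.events])
```

In a real deployment, the SDK handles ID generation, batching, and export to the collector; the sketch only shows how timeline entries could map onto span events.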

Collected telemetry data

Telemetry data emitted from Impala carries information that is otherwise available only in the query profile and the workload management tables. It includes the following:
  1. The initiating user
  2. The SQL statement
  3. Memory estimates and actual use
  4. Other important data related to the query lifecycle
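
As a rough illustration of how fields like those listed above might travel with a trace, the following Python sketch encodes a flat dictionary as OTLP/JSON-style key-value attributes. The attribute keys and sample values are hypothetical; this document does not specify the attribute names that Impala uses. The encoding follows the OTLP JSON convention as commonly documented, in which 64-bit integers are written as strings.

```python
import json


def to_otlp_attributes(attrs):
    """Encode a flat dict as a list of OTLP/JSON-style KeyValue objects."""
    encoded = []
    for key, value in attrs.items():
        if isinstance(value, bool):
            # Check bool before int, because bool is a subclass of int.
            any_value = {"boolValue": value}
        elif isinstance(value, int):
            # 64-bit integers are represented as strings in the JSON mapping.
            any_value = {"intValue": str(value)}
        elif isinstance(value, float):
            any_value = {"doubleValue": value}
        else:
            any_value = {"stringValue": str(value)}
        encoded.append({"key": key, "value": any_value})
    return encoded


# Hypothetical attribute keys and values, for illustration only.
query_attributes = {
    "user": "alice",
    "sql": "SELECT count(*) FROM sales",
    "memory_estimate_bytes": 268435456,
    "memory_peak_bytes": 201326592,
}
print(json.dumps(to_otlp_attributes(query_attributes), indent=2))
```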

Availability

OTel support for Impala is available starting with Cloudera Data Warehouse on cloud version 1.12.1. After upgrading Cloudera Data Warehouse, you must also upgrade existing Impala Virtual Warehouses to enable and configure the OTel integration.