Impala health checks
This topic lists the Impala health checks that Cloudera Observability performs at the end of an Apache Impala job. The checks provide performance and query insights, such as identifying queries that may be causing bottlenecks. You can find the Impala health checks on the Impala Queries page in the Health Check list.
Execution completion health checks
The execution metric determines whether a job failed or passed the Cloudera Observability health check.
Health Check | Description |
---|---|
Failed - Any Healthcheck | Displays jobs that failed at least one health check. |
Metadata/Statistics health checks
The metadata/statistics metrics test the distribution of values in one or more columns of the data table for query optimization.
Health Check | Description | Recommendation |
---|---|---|
Corrupt Table Statistics | Indicates that these queries contain table statistics that were incorrectly computed and therefore cannot be used. | To address this condition, consider recomputing the table statistics. For more information, see the Impala documentation. |
Missing Table Statistics | Indicates that no table statistics were computed for query optimization. | To address this condition, consider computing the table statistics. For more information, see the Impala documentation. |
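
In Impala, table and column statistics are gathered with the `COMPUTE STATS` statement. The following is a minimal sketch in Python that builds that statement for a set of flagged tables. The helper name `compute_stats_statement` and the database and table names are hypothetical, and the sketch only produces the SQL text; how you submit it to Impala is left to your own client.

```python
import re

# Hypothetical helper, for illustration only. It builds the text of an Impala
# COMPUTE STATS statement; it does not connect to Impala or run anything.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def compute_stats_statement(database: str, table: str) -> str:
    """Return a COMPUTE STATS statement for database.table."""
    for name in (database, table):
        # Reject anything that is not a plain identifier so that the
        # generated SQL cannot contain unexpected characters.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple identifier: {name!r}")
    return f"COMPUTE STATS `{database}`.`{table}`"


# Example: tables flagged by the health checks (names are made up).
flagged = [("sales", "orders"), ("sales", "customers")]
statements = [compute_stats_statement(db, tbl) for db, tbl in flagged]
```

Each generated statement can then be run in your usual Impala client, for example `impala-shell`.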
Optimal configuration health checks
The optimal configuration metrics determine whether the query's operation performance was impacted by insufficient resources.
Health Check | Description | Recommendation |
---|---|---|
Aggregation Spilled Partitions | Indicates that during the query's aggregation operation, data was spilled to disk. This health check is triggered when there is not enough memory to complete the operation. | To address this condition, consider: For more information, see the Impala documentation. |
HashJoin Spilled Partitions | Indicates that during the query's hash join operation, data was spilled to disk. This health check is triggered when there is not enough memory to complete the operation. | To address this condition, consider: |
Slow Client | Indicates that the client consumed the query results slower than expected. | How to address this condition depends on the root cause. For example: For more information about setting timeout periods for daemons, queries, and sessions, see the Impala documentation. |
Performance health checks
The performance metrics measure the query's execution times.
Health Check | Description | Recommendation |
---|---|---|
Slow Aggregate | Indicates that the aggregation operations were slower than expected. This health check is triggered when the observed throughput is less than ten million rows per second. | How to address this condition depends on the root cause. For example: |
Slow Code Generation | Indicates that the compiled code was generated slower than expected. This health check is triggered when the generation time exceeds 20% of the overall query execution time. | This condition may be triggered due to an overly complex query. For example, if the query has too many predicates in its To address this condition, consider using the |
Slow HDFS Scan | Indicates that scanning data from HDFS took longer than expected. | This condition is caused by a slow disk, extremely complex scan predicates, or a busy HDFS NameNode. Depending on the cause, to address this condition consider the following: |
Slow Hash Join |
Indicates that the hash join operations were slower than expected. This health check is triggered when the observed throughput is less than five million rows per second. |
This condition may be triggered when there are overly complex join predicates or a hash join is causing data to spill to disk. To address this condition, consider simplifying the join predicates or reducing the size of the right-hand side of the join. |
Slow Query Planning |
Indicates that the query plan was generated slower than expected. This health check is triggered when the query planning time exceeds 30% of the overall query execution time. |
This condition may be caused by overly complex queries or by a metadata refresh that occurred while the query was executing. To address this condition, consider simplifying your queries. For example, reduce the number of columns returned, reduce the number of filters, or reduce the number of joins. |
Slow Row Materialization | Indicates that rows were returned slower than expected. This health check is triggered when it takes more than 20% of the query execution time to return rows. | This condition may be caused when overly complex expressions are used in the To address this condition, simplify the query by either reducing the number of columns in the selected list or reducing the number of requested rows. |
Slow Sorting | Indicates that the sorting operations were slower than expected. This health check is triggered when the observed throughput is less than ten million rows per second. | To address this condition, consider the following: |
Slow Write Speed | Indicates that the query's write speed is slower than expected. This health check is triggered when the difference between the actual write time and the expected write time is more than 20% of the query execution time. | This condition may be caused when overly complex expressions are used, too many columns are specified, or too many rows are requested from the Depending on the cause, to address this condition consider the following: |
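
The trigger conditions in the performance table can be read as simple threshold predicates. The following is a minimal sketch in Python, not Cloudera Observability's implementation: the function names, dictionary keys, and the assumption that rows and seconds are available as plain numbers are all made up for illustration. Only the numeric thresholds come from the table.

```python
# Illustrative only. Thresholds are taken from the performance table; the
# names and input shapes are hypothetical.

# Minimum expected throughput, in rows per second, per operation type.
MIN_ROWS_PER_SECOND = {
    "aggregate": 10_000_000,
    "hash_join": 5_000_000,
    "sort": 10_000_000,
}

# Maximum share of the overall query execution time per phase.
MAX_TIME_FRACTION = {
    "code_generation": 0.20,
    "query_planning": 0.30,
    "row_materialization": 0.20,
}


def is_slow_operation(kind: str, rows: int, seconds: float) -> bool:
    """True when observed throughput is below the threshold for `kind`."""
    if seconds <= 0:
        return False  # No measurable time spent, so nothing to flag.
    return rows / seconds < MIN_ROWS_PER_SECOND[kind]


def is_slow_phase(kind: str, phase_seconds: float, query_seconds: float) -> bool:
    """True when a phase exceeds its allowed share of total execution time."""
    if query_seconds <= 0:
        return False
    return phase_seconds / query_seconds > MAX_TIME_FRACTION[kind]


def is_slow_write(actual_seconds: float, expected_seconds: float,
                  query_seconds: float) -> bool:
    """True when the write-time overrun is more than 20% of execution time."""
    if query_seconds <= 0:
        return False
    return (actual_seconds - expected_seconds) / query_seconds > 0.20
```

For example, a hash join that processes 40 million rows in 10 seconds runs at 4 million rows per second, so `is_slow_operation("hash_join", 40_000_000, 10.0)` returns `True`.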
Query/Schema design health checks
The query/schema design metrics determine whether the query contains inefficient code.
Health Check | Description | Recommendation |
---|---|---|
Insufficient Partitioning | Indicates that there is an insufficient number of partitions to enable parallel processing. This health check is triggered when the system reads rows that are not required for the query's operation, which increases the query's run-time duration and depletes resources. | To address this condition, consider: For more information, see the Impala documentation. |
Many Materialized Columns | Indicates that an unusually large number of columns were returned for the query. This health check is triggered when the query reads more than 15 columns. | To address this condition, consider rewriting the query to return 15 columns or fewer. |
Skew health checks
The skew metrics compare the performance of the query's operations to other operations within the same job. For optimal performance, operations within the same job should perform the same amount of processing.
Health Check | Description | Recommendation |
---|---|---|
Bytes Read Skew | Indicates that one of the cluster nodes is reading a significantly larger amount of data than the other nodes in the cluster. | To address this condition, consider rebalancing the data or using the Impala For more information, see the Impala documentation. |
Duration Skew | Indicates that one or more cluster nodes are taking longer to execute the query than others. The skew indicates an uneven distribution of data across cluster nodes. The more evenly the data is distributed, the faster the operations will run on the cluster. Operations that use JOIN and GROUP BY clauses may require rewriting the query or changing the underlying data partitioning to use columns with the most evenly distributed values. | To address this condition, as a starting point, consider configuring the query so that its processing is distributed evenly across operations. |
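
One way to picture the skew checks is to compare each node's value (bytes read, or time spent) against the cluster average. The following Python sketch does that under stated assumptions: the function name, the node names, and especially the ratio of 2.0 are hypothetical, because this page does not state the threshold that Cloudera Observability actually uses.

```python
from statistics import mean

# Assumed value for illustration only: flag a node whose metric is more than
# twice the cluster average. The real threshold is not documented here.
ASSUMED_SKEW_RATIO = 2.0


def skewed_nodes(per_node: dict[str, float],
                 ratio: float = ASSUMED_SKEW_RATIO) -> list[str]:
    """Return the nodes whose metric exceeds `ratio` times the cluster mean."""
    if not per_node:
        return []
    average = mean(per_node.values())
    if average == 0:
        return []  # Nothing was read or no time was spent on any node.
    return sorted(node for node, value in per_node.items()
                  if value > ratio * average)


# Example: bytes read per node for one query (made-up numbers).
bytes_read = {"node1": 100.0, "node2": 110.0, "node3": 900.0}
flagged = skewed_nodes(bytes_read)
```

The same function can be applied to per-node durations to approximate a duration skew check.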