Identifying and Correcting Inefficient SQL Code

Cloudera Observability dissects inefficient Impala SQL code and provides prescriptive tuning recommendations.

You can immediately improve the processing performance of an inefficient Impala query by rewriting any poor SQL code. For example, to prevent skewness, SQL operations that use JOINS clauses may require changes to the SQL code by selecting columns with the most evenly distributed values. Or, as in the following example, having too many joins and inline views are characteristic of inefficiently written SQL code.

This example uses the Queries chart widget, which provides the Duration, Complexity, CPU, and Memory options that help you identify SQL problems.

In this Impala engine case, several queries displayed a larger than average duration time in the Duration option of the Queries chart widget. When the Complexity option was selected it displayed two queries that contain a large number of joins, as shown in the following image:


Even though Impala can process hundreds of joins in a minute, reducing an inordinate number of joins in an SQL statement will improve performance.

When we drill down further in Cloudera Observability by selecting one of the queries of interest, the Performance Issues, which are related to the SQL statement, display the Cloudera Observability prescriptive tuning recommendations. In this case, the recommendation is to denormalize the tables and materialize the inline views: