Workload Optimization Recommendations
After you upload a workload to Navigator Optimizer, identify a set of queries to analyze, and then click Add to Design. This adds the queries to a design block, which is a section of the Design page that enables you to analyze the queries in depth. Design blocks generate workload optimization recommendations to better exploit the features offered on SQL-on-Hadoop platforms which results in optimized query performance. They are generated based on the specific queries you add. For example, if you add a group of queries that use complex joins, Navigator Optimizer might generate denormalization recommendations. The following sections describe the different types of recommendations that can be generated in design blocks.
Cost-Based Denormalization Recommendations
When workloads are analyzed, Navigator Optimizer assigns "costs" to the underlying join patterns exhibited by workloads. These costs are based on the time it takes for joins to execute. Examples of characteristics that drive up cost are 1) the size of the tables participating in the joins, 2) the number of tables participating in the joins, or 3) the number of queries which are affected by the joins. Navigator Optimizer provides denormalization recommendations on the Design page for high cost tables. Denormalization improves the read performance of queries by eliminating run-time joins, thus avoiding multiple table scans. After underlying tables are denormalized by using the recommendations and queries are adjusted, future workloads that access these tables execute more efficiently. Based on this heuristic, Navigator Optimizer ranks its denormalization candidates, pushing the most beneficial candidates to the top.
Partition Key and Table Aggregation Recommendations
Navigator Optimizer recommends partitioning key columns based on table and column statistics and analysis of the filters used in the SQL queries that are included in the workload. Navigator Optimizer requires table row counts and column statistics to identify the right partitioning keys. For example, creating a partition on a column that contains a high number of distinct values (NDVs) can cause the Hive Metastore to run out of memory, which can bring down the cluster. On the other hand, partitioning a small table might introduce unnecessary overhead. Uploading statistics ensures that Navigator Optimizer has the appropriate information to make effective recommendations. After statistics are uploaded, two panels display where you can generate view the partitioning strategies and the potential partition key candidates for the tables that are accessed by the queries in your design block. Then you can choose the partitioning strategy you want and download the DDL for it.
Aggregate table recommendations can also appear in a design block for specific queries. If you receive an Aggregate Tables recommendation in a design block, it provides the DDL for creating a summary table derived from common characteristics across the set of queries. For example, it might examine FROM, WHERE, or GROUP BY clauses across the set of queries. Implementing this summary table can improve performance for those queries.
Inline View Materialization Recommendations
SELECT * FROM ( SELECT deptno, count(*) emp_count FROM emp GROUP BY deptno ) emp, dept WHERE dept.deptno = emp.deptno;Each time the query runs, the subquery must execute to retrieve the data. If you "materialize" the inline view by creating a table from the subquery, then you can reduce the amount of processing required to retrieve the result set. Navigator Optimizer can show the popular inline views used in a query set and then provides DDL you can download to materialize the popular inline views in your workload.
Click the YouTube link at the lower right corner of the video player to view the video at full-screen on YouTube.