How Navigator Optimizer Fits into Your Enterprise
Enterprise data warehouses (EDWs) are relational databases used primarily for reporting and online analytical processing (OLAP). EDWs aggregate data from online transaction processing (OLTP) databases, which enables analysis and reporting on large aggregated data sets.
To enhance the capabilities of their current EDW workloads while reducing costs, more and more enterprises are attracted to Apache Hadoop’s low-cost, highly scalable frameworks such as Impala and Apache Hive. A popular workload-optimization strategy is to process portions of SQL workloads on Impala or Hive, while retaining operational queries on existing EDW systems.
However, EDW workloads can consist of millions of SQL queries, so manually identifying the queries that could benefit from migrating to Hadoop is not practical. Even after the queries are identified, deploying them to Hadoop unchanged might not work as expected because of the underlying architectural differences between EDW and Hadoop systems.
Successfully offloading EDW workloads to Hadoop involves several challenges:
- Identifying the queries that could benefit from migrating to Hadoop
- Choosing the right data models
- Redesigning queries to break their complexity down into simpler, more efficient constructs
- Coping with the large size of most EDW queries
How Cloudera Navigator Optimizer Can Help
Navigator Optimizer profiles and analyzes the SQL text in large, complex SQL or Hive workloads to identify the queries that are best suited for Hadoop. In addition, Navigator Optimizer can be used to redesign queries for optimal efficiency on Hadoop. For more information about how Navigator Optimizer can help, see Benefits of Using Navigator Optimizer.