Apache Hive Performance Tuning

Chapter 2. Hive LLAP on Your Cluster

After setup, Hive LLAP is transparent to Apache Hive users and business intelligence tools. Interactive queries run on Apache Hadoop YARN as an Apache Slider application. You can monitor the real-time performance of the queries through the YARN ResourceManager Web UI or by using the Slider and YARN command-line tools. Running LLAP as a Slider application makes it easy to bring the LLAP cluster up, share resources with other applications, take the LLAP cluster down, and resize it as demand changes. For example, you could run a large Hive LLAP cluster during the day for BI tools, and then reduce its size during nonbusiness hours to free cluster resources for ETL processing.

Figure 2.1. LLAP on Your Cluster
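
Besides the Web UI and command-line tools, the ResourceManager also exposes running applications through its REST API. The following is a minimal Python sketch, not a definitive monitoring tool. It assumes the ResourceManager is reachable over plain HTTP at an address you supply (`rm_url` is a placeholder, and the function names are illustrative); secured clusters need additional authentication that is not shown here.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_running_apps(rm_url):
    """Fetch running applications from the ResourceManager REST API.

    rm_url is an assumption supplied by the caller, for example
    "http://rm-host.example.com:8088" (hypothetical host name).
    """
    with urlopen(f"{rm_url}/ws/v1/cluster/apps?states=RUNNING") as resp:
        return json.load(resp)


def apps_by_type(payload):
    """Count applications per applicationType in the API response."""
    # The API returns {"apps": null} when no application matches,
    # so fall back to an empty list instead of failing.
    apps = (payload.get("apps") or {}).get("app") or []
    return Counter(app.get("applicationType", "UNKNOWN") for app in apps)
```

Calling `apps_by_type(fetch_running_apps(rm_url))` returns a per-type count of running applications, which is one way to see at a glance whether the Slider application and the Tez ApplicationMasters are up.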

On your cluster, an extra HiveServer2 instance is installed that is dedicated to interactive queries. You can see this HiveServer2 instance listed in the Hive Summary page of Ambari:

Figure 2.2. Hive Summary

In the YARN ResourceManager Web UI, you can see the queue of Hive LLAP daemons or running queries:

Figure 2.3. ResourceManager Web UI

The number of Apache Tez ApplicationMasters equals the selected concurrency. For example, if you select a total concurrency of 5, you see 5 Tez ApplicationMasters. The following example shows a concurrency setting of 2:

Figure 2.4. Concurrency Setting
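
The relationship between the concurrency setting and the number of Tez ApplicationMasters can be checked against the ResourceManager's application list. The sketch below is illustrative only: it works on an already parsed response from the ResourceManager REST API (`/ws/v1/cluster/apps`), the function names are made up for this example, and the queue name `llap` used in the usage note is an assumption about how your cluster is configured. Other Tez applications on the same cluster would also be counted unless you filter by queue.

```python
def count_tez_ams(payload, queue=None):
    """Count running Tez applications, optionally limited to one queue."""
    # {"apps": null} is returned when nothing matches; treat it as empty.
    apps = (payload.get("apps") or {}).get("app") or []
    return sum(
        1
        for app in apps
        if app.get("applicationType") == "TEZ"
        and app.get("state") == "RUNNING"
        and (queue is None or app.get("queue") == queue)
    )


def matches_concurrency(payload, concurrency, queue=None):
    """Return True if the Tez ApplicationMaster count equals the concurrency."""
    return count_tez_ams(payload, queue) == concurrency
```

With a concurrency setting of 2, `matches_concurrency(payload, 2, queue="llap")` should return `True` once both Tez ApplicationMasters are running in that (assumed) queue.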