Tracking Hive on Tez query execution

You need to know how to monitor Hive on Tez queries during execution. Several tools provide query details, such as execution time.

You can retrieve local fetch details about queries from the HiveServer (HS2) logs, provided that fetch task conversion is enabled. Configure the following properties:
hive.fetch.task.conversion
Value: minimal
Some SELECT queries can be converted to a single FETCH task instead of a MapReduce task, which minimizes latency. The value none disables all conversion; minimal converts simple queries, such as SELECT * and queries that filter on partition columns; more converts SELECT queries that include filters.
hive.fetch.task.conversion.threshold
Value: 1 GiB

Queries whose input size is at or below this threshold can be converted to fetch tasks; queries with larger input are not converted.
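The interaction of the two properties can be sketched as follows. This is a minimal illustrative model in Python, not Hive's implementation: it assumes the threshold acts as an upper bound on input size (as in Apache Hive), and the names threshold_bytes and may_convert_to_fetch are hypothetical helpers introduced only for this example.

```python
# Illustrative model of the fetch-task properties (assumed behavior,
# not Hive's actual code).

GIB = 1024 ** 3  # 1 GiB expressed in bytes


def threshold_bytes(gib: float) -> int:
    """Convert a threshold given in GiB to a byte count."""
    return int(gib * GIB)


def may_convert_to_fetch(mode: str, input_bytes: int, threshold: int = GIB) -> bool:
    """Return True if a query with this input size is eligible for conversion.

    Assumes the query shape already qualifies for the chosen mode
    (for example, SELECT * under 'minimal').
    """
    if mode not in ("none", "minimal", "more"):
        raise ValueError(f"unknown conversion mode: {mode}")
    if mode == "none":
        # 'none' disables all conversion regardless of input size.
        return False
    # Assumption: the threshold is an upper bound on the input size.
    return input_bytes <= threshold


print(threshold_bytes(1))  # 1073741824
```

Under these assumptions, a 500 MiB input with mode minimal is eligible for conversion, while a 2 GiB input is not.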

Increasing the static pool does not speed up reads and is therefore not recommended.

  1. In Cloudera Manager, click Clusters > Hive on Tez > Configuration, and search for fetch.
  2. Accept, or change, the default values of the fetch task properties.
  3. Navigate to the HiveServer log directory and look at the log files.
    In Cloudera Manager, you can find the location of this directory as the value of HiveServer2 Log Directory.
  4. In Cloudera Manager, click Clusters > Hive on Tez > Configuration, and go to the HiveServer Web UI.
  5. Use Data Analytics Studio to track query progress.
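
Looking through the log files in step 3 can be partly automated. The following is a minimal sketch, assuming the logs are plain-text files stored directly in the directory configured as HiveServer2 Log Directory. The function name find_log_lines, the directory argument, and the search string (for example, a query ID you already know) are hypothetical; the sketch makes no assumption about the layout of individual log lines.

```python
import sys
from pathlib import Path


def find_log_lines(log_dir, needle):
    """Yield (file name, line number, text) for each line containing needle."""
    for path in sorted(Path(log_dir).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories
        # errors="replace" keeps the scan going if a file holds non-text bytes
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line:
                    yield path.name, number, line.rstrip("\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python find_log_lines.py <log directory> <search string>
    for name, number, text in find_log_lines(sys.argv[1], sys.argv[2]):
        print(f"{name}:{number}: {text}")
```

A plain substring match is used so that the sketch does not depend on any particular log format; compressed or rotated archives would need separate handling.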