Managing Apache Hive

Check query execution

To determine whether Hive vectorizes a query, run the EXPLAIN VECTORIZATION statement on that query and look for the vectorized label in the resulting plan.

  1. Start Hive from Beeline.
  2. Run the EXPLAIN VECTORIZATION statement on the query you want CDP to process using vectorization.
    EXPLAIN VECTORIZATION SELECT COUNT(*) FROM employees WHERE emp_no > 10;
    The following output confirms that vectorization occurs:
    +----------------------------------------------------+
    |                      Explain                       |
    +----------------------------------------------------+
    | Plan optimized by CBO.                             |
    |                                                    |
    | Vertex dependency in root stage                    |
    | Reducer 2 <- Map 1 [CUSTOM_SIMPLE_EDGE] *vectorized*|
    |       File Output Operator [FS_14]                 |
    |         Group By Operator [GBY_13] (rows=1 width=12) |
    |           Output:["_col0"],aggregations:["count(VALUE._col0)"] |
    |         <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized    | 
    |           PARTITION_ONLY_SHUFFLE [RS_12]           |
    |             Group By Operator [GBY_11] (rows=1 width=12) |
    |               Output:["_col0"],aggregations:["count()"] |
    |               Select Operator [SEL_10] (rows=1 width=4) |
    |                 Filter Operator [FIL_9] (rows=1 width=4) |
    |                   predicate:(emp_no > 10)          |
    |                   TableScan [TS_0] (rows=1 width=4) |
    |                     default@employees,employees,Tbl:COMPLETE,Col:NONE,Output:["emp_no"] |
    |                                                    |
    +----------------------------------------------------+
    23 rows selected (1.542 seconds)
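
If the vectorized label does not appear in the plan, a more detailed report can help locate the cause. The following is a minimal sketch, not a definitive procedure. It assumes the same employees table used above and a Hive release that accepts the optional EXPLAIN VECTORIZATION keywords (ONLY, and one of SUMMARY, OPERATOR, EXPRESSION, or DETAIL). It also assumes that hive.vectorized.execution.enabled, the standard Hive property that turns vectorized execution on, can be changed at session level on your cluster, which might not be the case.

```sql
-- Show the current value of the property that enables vectorized execution.
SET hive.vectorized.execution.enabled;

-- Enable vectorization for this session.
-- Assumption: your cluster does not restrict changes to this property.
SET hive.vectorized.execution.enabled=true;

-- Print only the vectorization-related parts of the plan, at the most
-- detailed level, for the same example query.
EXPLAIN VECTORIZATION ONLY DETAIL
SELECT COUNT(*) FROM employees WHERE emp_no > 10;
```

In the detailed output, each Map or Reducer vertex typically reports whether it was vectorized and, when it was not, a reason. The exact fields vary by Hive version.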
             
