org.apache.hadoop.hive.ql.optimizer.stats.annotation
Class StatsRulesProcFactory.SelectStatsRule

java.lang.Object
  extended by org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory.DefaultStatsRule
      extended by org.apache.hadoop.hive.ql.optimizer.stats.annotation.StatsRulesProcFactory.SelectStatsRule
All Implemented Interfaces:
NodeProcessor
Enclosing class:
StatsRulesProcFactory

public static class StatsRulesProcFactory.SelectStatsRule
extends StatsRulesProcFactory.DefaultStatsRule
implements NodeProcessor

The SELECT operator does not change the number of rows emitted by the parent operator; it changes the size of each emitted tuple. In the typical case, where only a subset of columns is selected, the average row size decreases because some columns are pruned. Computing the average row size accurately requires column-level statistics. Column-level statistics store the average size of the values in each column, which allows a more reliable estimate of the reduction in tuple size. In the absence of column-level statistics, column sizes are based on data type: for primitive data types the size from JavaDataModel is used, and for variable-length data types the worst case is assumed.
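The estimation described above can be illustrated with a minimal, self-contained sketch. It is not the Hive implementation: the class SelectRowSizeSketch, its methods, and the per-type fallback sizes (including the 256-byte worst case for strings) are hypothetical and chosen only for illustration. The sketch keeps the row count unchanged and sums a per-column size, preferring the average length from column statistics when it is available and falling back to a type-based size otherwise.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; names and sizes are assumptions, not Hive APIs.
final class SelectRowSizeSketch {

    // Hypothetical fallback size in bytes for a column without statistics.
    // Fixed-width types use their width; variable-length types use an
    // assumed worst case.
    static double fallbackSize(String type) {
        switch (type) {
            case "int":    return 4.0;
            case "bigint": return 8.0;
            case "double": return 8.0;
            case "string": return 256.0; // assumed worst case
            default:
                throw new IllegalArgumentException("Unknown type: " + type);
        }
    }

    // Average row size after SELECT: sum over the selected columns of the
    // average column length from statistics, or the type-based fallback.
    static double estimateAvgRowSize(List<String> selectedColumns,
                                     Map<String, String> columnTypes,
                                     Map<String, Double> avgColumnLength) {
        double rowSize = 0.0;
        for (String column : selectedColumns) {
            Double avgLength = avgColumnLength.get(column);
            if (avgLength != null) {
                rowSize += avgLength;
            } else {
                rowSize += fallbackSize(columnTypes.get(column));
            }
        }
        return rowSize;
    }

    // SELECT leaves the row count unchanged, so the output data size is the
    // parent's row count multiplied by the new average row size.
    static long estimateDataSize(long parentNumRows, double avgRowSize) {
        return Math.round(parentNumRows * avgRowSize);
    }

    public static void main(String[] args) {
        Map<String, String> types = new HashMap<>();
        types.put("id", "int");
        types.put("name", "string");
        types.put("payload", "string");

        // Column statistics exist only for "name" in this example.
        Map<String, Double> stats = new HashMap<>();
        stats.put("name", 12.5);

        // "payload" is pruned by the projection.
        List<String> selected = Arrays.asList("id", "name");

        double avgRowSize = estimateAvgRowSize(selected, types, stats);
        System.out.println("avg row size = " + avgRowSize);
        System.out.println("data size    = " + estimateDataSize(1000L, avgRowSize));
    }
}
```

With statistics for the name column, the estimate uses its observed average length; without them, the same column would contribute the assumed worst-case size, which is why missing column-level statistics tend to inflate the estimated data size.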

For more information, refer to the 'Estimating the Cost of Operations' chapter in "Database Systems: The Complete Book" by Garcia-Molina et al.


Constructor Summary
StatsRulesProcFactory.SelectStatsRule()
           
 
Method Summary
 Object process(Node nd, Stack<Node> stack, NodeProcessorCtx procCtx, Object... nodeOutputs)
          Generic process for all ops that don't have specific implementations.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StatsRulesProcFactory.SelectStatsRule

public StatsRulesProcFactory.SelectStatsRule()
Method Detail

process

public Object process(Node nd,
                      Stack<Node> stack,
                      NodeProcessorCtx procCtx,
                      Object... nodeOutputs)
               throws SemanticException
Description copied from interface: NodeProcessor
Generic process for all ops that don't have specific implementations.

Specified by:
process in interface NodeProcessor
Overrides:
process in class StatsRulesProcFactory.DefaultStatsRule
Parameters:
nd - operator to process
procCtx - operator processor context
nodeOutputs - A variable argument list of outputs from other nodes in the walk
Returns:
Object to be returned by the process call
Throws:
SemanticException


Copyright © 2014 The Apache Software Foundation. All rights reserved.