org.apache.hadoop.hive.ql.exec
Class Operator<T extends OperatorDesc>

java.lang.Object
  extended by org.apache.hadoop.hive.ql.exec.Operator<T>
All Implemented Interfaces:
Serializable, Cloneable, Node
Direct Known Subclasses:
CollectOperator, CommonJoinOperator, DemuxOperator, DummyStoreOperator, ExtractOperator, FilterOperator, ForwardOperator, GroupByOperator, HashTableDummyOperator, LateralViewForwardOperator, LateralViewJoinOperator, LimitOperator, ListSinkOperator, MapOperator, MuxOperator, PTFOperator, ScriptOperator, SelectOperator, TableScanOperator, TerminalOperator, UDTFOperator, UnionOperator

public abstract class Operator<T extends OperatorDesc>
extends Object
implements Serializable, Cloneable, Node

Base operator implementation.

See Also:
Serialized Form

Nested Class Summary
static interface Operator.OperatorFunc
          OperatorFunc.
static class Operator.State
          State.
 
Field Summary
static String HIVECOUNTERCREATEDFILES
           
static String HIVECOUNTERFATAL
           
 
Constructor Summary
Operator()
           
Operator(org.apache.hadoop.mapred.Reporter reporter)
          Create an operator with a reporter.
 
Method Summary
 boolean acceptLimitPushdown()
          Used by LimitPushdownOptimizer. If none of the operators between the limit and the reduce-sink removes input rows within the range of the limit count, the limit can be pushed down to the reduce-sink operator.
 void augmentPlan()
          Called during semantic analysis as operators are being added in order to give them a chance to compute any additional plan information needed.
 void cleanUpInputFileChanged()
           
 void cleanUpInputFileChangedOp()
           
 Operator<? extends OperatorDesc> clone()
           
 Operator<? extends OperatorDesc> cloneOp()
          Clones only the operator.
 Operator<? extends OperatorDesc> cloneRecursiveChildren()
          Recursively clones all the children of the tree and fixes the pointers to children, the pointers to parents, and the pointers to itself coming from the children.
 void close(boolean abort)
           
 boolean columnNamesRowResolvedCanBeObtained()
           
 String dump(int level)
           
 String dump(int level, HashSet<Integer> seenOpts)
           
 void endGroup()
           
 void flush()
           
 List<Operator<? extends OperatorDesc>> getChildOperators()
           
 ArrayList<Node> getChildren()
          Implements the getChildren function for the Node Interface.
 Map<String,ExprNodeDesc> getColumnExprMap()
          Returns a map from output column name to input expression. Note that it currently returns only key columns for ReduceSink and GroupBy operators.
 T getConf()
           
 org.apache.hadoop.conf.Configuration getConfiguration()
           
 boolean getDone()
           
 ExecMapperContext getExecContext()
           
 Object getGroupKeyObject()
           
 ObjectInspector getGroupKeyObjectInspector()
           
 String getIdentifier()
          This function is deliberately not named getId(), so that Java serialization does NOT serialize it.
 ObjectInspector[] getInputObjInspectors()
           
 String getName()
          Implements the getName function for the Node Interface.
 int getNumChild()
           
 int getNumParent()
           
 String getOperatorId()
           
static String getOperatorName()
           
 OpTraits getOpTraits()
           
 ObjectInspector getOutputObjInspector()
           
 List<Operator<? extends OperatorDesc>> getParentOperators()
           
 RowSchema getSchema()
           
 Statistics getStatistics()
           
 Map<Enum<?>,Long> getStats()
           
abstract  org.apache.hadoop.hive.ql.plan.api.OperatorType getType()
          Return the type of the specific operator among the types in OperatorType.
 void initialize(org.apache.hadoop.conf.Configuration hconf, ObjectInspector[] inputOIs)
          Initializes operators only if all parents have been initialized.
 void initializeLocalWork(org.apache.hadoop.conf.Configuration hconf)
           
 void initOperatorId()
           
 boolean isUseBucketizedHiveInputFormat()
           
 void jobClose(org.apache.hadoop.conf.Configuration conf, boolean success)
          Unlike other operator interfaces which are called from map or reduce task, jobClose is called from the jobclient side once the job has completed.
 void jobCloseOp(org.apache.hadoop.conf.Configuration conf, boolean success)
           
 void logStats()
           
 boolean opAllowedAfterMapJoin()
           
 boolean opAllowedBeforeMapJoin()
           
 boolean opAllowedBeforeSortMergeJoin()
           
 boolean opAllowedConvertMapJoin()
           
 void passExecContext(ExecMapperContext execContext)
          Pass the execContext reference to every child operator
 void preorderMap(Operator.OperatorFunc opFunc)
           
 void processGroup(int tag)
           
abstract  void processOp(Object row, int tag)
          Process the row.
 void removeChild(Operator<? extends OperatorDesc> child)
           
 void removeChildAndAdoptItsChildren(Operator<? extends OperatorDesc> child)
          Removes a child and inserts all of that child's children at the position the child occupied.
 boolean removeChildren(int depth)
           
 void removeParent(Operator<? extends OperatorDesc> parent)
           
 void replaceChild(Operator<? extends OperatorDesc> child, Operator<? extends OperatorDesc> newChild)
          Replace one child with another at the same position.
 void replaceParent(Operator<? extends OperatorDesc> parent, Operator<? extends OperatorDesc> newParent)
          Replace one parent with another at the same position.
 void reset()
           
static void resetId()
           
 void resetStats()
           
 void setAlias(String alias)
          Store the alias this operator is working on behalf of.
 void setChildOperators(List<Operator<? extends OperatorDesc>> childOperators)
           
 void setColumnExprMap(Map<String,ExprNodeDesc> colExprMap)
           
 void setConf(T conf)
           
 void setExecContext(ExecMapperContext execContext)
           
 void setGroupKeyObject(Object keyObject)
           
 void setGroupKeyObjectInspector(ObjectInspector keyObjectInspector)
           
 void setId(String id)
           
 void setInputObjInspectors(ObjectInspector[] inputObjInspectors)
           
 void setOperatorId(String operatorId)
           
 void setOpTraits(OpTraits metaInfo)
           
 void setOutputCollector(org.apache.hadoop.mapred.OutputCollector out)
           
 void setParentOperators(List<Operator<? extends OperatorDesc>> parentOperators)
           
 void setReporter(org.apache.hadoop.mapred.Reporter rep)
           
 void setSchema(RowSchema rowSchema)
           
 void setStatistics(Statistics stats)
           
 void setUseBucketizedHiveInputFormat(boolean useBucketizedHiveInputFormat)
           
 void startGroup()
           
 boolean supportAutomaticSortMergeJoin()
          Whether this operator supports automatic sort merge join.
 boolean supportSkewJoinOptimization()
           
 boolean supportUnionRemoveOptimization()
           
 String toString()
           
static String toString(Collection<Operator<? extends OperatorDesc>> top)
           
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

HIVECOUNTERCREATEDFILES

public static final String HIVECOUNTERCREATEDFILES
See Also:
Constant Field Values

HIVECOUNTERFATAL

public static final String HIVECOUNTERFATAL
See Also:
Constant Field Values
Constructor Detail

Operator

public Operator()

Operator

public Operator(org.apache.hadoop.mapred.Reporter reporter)
Create an operator with a reporter.

Parameters:
reporter - Used to report progress of certain operators.
Method Detail

resetId

public static void resetId()

setChildOperators

public void setChildOperators(List<Operator<? extends OperatorDesc>> childOperators)

getConfiguration

public org.apache.hadoop.conf.Configuration getConfiguration()

getChildOperators

public List<Operator<? extends OperatorDesc>> getChildOperators()

getNumChild

public int getNumChild()

getChildren

public ArrayList<Node> getChildren()
Implements the getChildren function for the Node Interface.

Specified by:
getChildren in interface Node
Returns:
List

setParentOperators

public void setParentOperators(List<Operator<? extends OperatorDesc>> parentOperators)

getParentOperators

public List<Operator<? extends OperatorDesc>> getParentOperators()

getNumParent

public int getNumParent()

setConf

public void setConf(T conf)

getConf

public T getConf()

getDone

public boolean getDone()

setSchema

public void setSchema(RowSchema rowSchema)

getSchema

public RowSchema getSchema()

setId

public void setId(String id)

getIdentifier

public String getIdentifier()
This function is deliberately not named getId(), so that Java serialization does NOT serialize it. Some TestParse tests will fail if this field is serialized, since the Operator ID changes with the number of query tests.
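
The naming point above can be illustrated with the standard JavaBeans introspector. This is only a sketch: it assumes the serialization in question is property-based (as with java.beans.XMLEncoder, which persists properties that have both a getter and a setter), and the classes IdHolder and ConventionalHolder are hypothetical stand-ins, not Hive code.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;

class IdPropertyDemo {
  // Hypothetical stand-in for Operator: setId(...) is paired with
  // getIdentifier() instead of getId(), so "id" has no getter.
  public static class IdHolder {
    private String id;
    public void setId(String id) { this.id = id; }
    public String getIdentifier() { return id; }
  }

  // Control case: a conventional getId()/setId(...) pair.
  public static class ConventionalHolder {
    private String id;
    public void setId(String id) { this.id = id; }
    public String getId() { return id; }
  }

  // True if beanClass exposes `name` as a property with both a getter and a
  // setter, which is what property-based persistence looks for.
  static boolean isReadWriteProperty(Class<?> beanClass, String name) {
    try {
      for (PropertyDescriptor pd
          : Introspector.getBeanInfo(beanClass, Object.class).getPropertyDescriptors()) {
        if (pd.getName().equals(name)) {
          return pd.getReadMethod() != null && pd.getWriteMethod() != null;
        }
      }
      return false;
    } catch (IntrospectionException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("IdHolder.id read/write: "
        + isReadWriteProperty(IdHolder.class, "id"));
    System.out.println("ConventionalHolder.id read/write: "
        + isReadWriteProperty(ConventionalHolder.class, "id"));
  }
}
```

Under that assumption, the mismatched getter name is what keeps the identifier out of the persisted form while still letting callers read it.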


setReporter

public void setReporter(org.apache.hadoop.mapred.Reporter rep)

setOutputCollector

public void setOutputCollector(org.apache.hadoop.mapred.OutputCollector out)

setAlias

public void setAlias(String alias)
Store the alias this operator is working on behalf of.


getStats

public Map<Enum<?>,Long> getStats()

initialize

public void initialize(org.apache.hadoop.conf.Configuration hconf,
                       ObjectInspector[] inputOIs)
                throws HiveException
Initializes operators only if all parents have been initialized. Calls operator specific initializer which then initializes child ops.

Parameters:
hconf -
inputOIs - input object inspector array indexed by tag id. A null value is ignored.
Throws:
HiveException
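
The gating rule described for initialize can be sketched with a toy graph. This is not the Hive implementation: InitNode, its State values, and connect are hypothetical names, and configuration and object inspectors are left out entirely. The sketch only shows that a node with several parents initializes once, when the last parent is ready, and then initializes its children.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy node; it stands in for an operator with parents and children.
class InitNode {
  enum State { UNINIT, INIT }

  final String name;
  final List<InitNode> parents = new ArrayList<>();
  final List<InitNode> children = new ArrayList<>();
  State state = State.UNINIT;
  int initCalls = 0; // how many times the node-specific initializer ran

  InitNode(String name) { this.name = name; }

  static void connect(InitNode parent, InitNode child) {
    parent.children.add(child);
    child.parents.add(parent);
  }

  void initialize() {
    if (state == State.INIT) {
      return; // already initialized
    }
    for (InitNode p : parents) {
      if (p.state != State.INIT) {
        return; // wait until the remaining parents are initialized
      }
    }
    state = State.INIT;
    initCalls++; // node-specific initialization would go here
    for (InitNode c : children) {
      c.initialize(); // then initialize the child operators
    }
  }

  public static void main(String[] args) {
    InitNode a = new InitNode("a");
    InitNode b = new InitNode("b");
    InitNode join = new InitNode("join");
    connect(a, join);
    connect(b, join);
    a.initialize();
    System.out.println("after a: join is " + join.state);
    b.initialize();
    System.out.println("after b: join is " + join.state);
  }
}
```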

initializeLocalWork

public void initializeLocalWork(org.apache.hadoop.conf.Configuration hconf)
                         throws HiveException
Throws:
HiveException

passExecContext

public void passExecContext(ExecMapperContext execContext)
Passes the execContext reference to every child operator.


getInputObjInspectors

public ObjectInspector[] getInputObjInspectors()

setInputObjInspectors

public void setInputObjInspectors(ObjectInspector[] inputObjInspectors)

getOutputObjInspector

public ObjectInspector getOutputObjInspector()

processOp

public abstract void processOp(Object row,
                               int tag)
                        throws HiveException
Process the row.

Parameters:
row - The object representing the row.
tag - The tag of the row, which usually indicates which parent the row comes from. Rows with the same tag should always have exactly the same rowInspector.
Throws:
HiveException
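
The contract on tag can be illustrated with a small sketch. TaggedRowSink is a hypothetical class, and the Java class of the row is used here as a crude stand-in for the per-tag row inspector. The sketch only shows that the tag indexes the parent a row came from and that every row arriving under one tag is expected to have the same shape.

```java
// Hypothetical toy sink with one slot per parent (tag).
class TaggedRowSink {
  private final Class<?>[] rowTypeByTag; // stand-in for one inspector per tag
  private final int[] rowsByTag;

  TaggedRowSink(int numParents) {
    rowTypeByTag = new Class<?>[numParents];
    rowsByTag = new int[numParents];
  }

  // Process one row; tag says which parent the row came from.
  void processOp(Object row, int tag) {
    if (rowTypeByTag[tag] == null) {
      rowTypeByTag[tag] = row.getClass(); // first row fixes the shape for this tag
    } else if (rowTypeByTag[tag] != row.getClass()) {
      throw new IllegalStateException(
          "tag " + tag + " expected " + rowTypeByTag[tag].getName()
              + " but got " + row.getClass().getName());
    }
    rowsByTag[tag]++;
  }

  int rowsSeen(int tag) { return rowsByTag[tag]; }

  public static void main(String[] args) {
    TaggedRowSink sink = new TaggedRowSink(2);
    sink.processOp("left-1", 0);
    sink.processOp(42, 1);
    sink.processOp("left-2", 0);
    System.out.println("tag 0 rows: " + sink.rowsSeen(0));
    System.out.println("tag 1 rows: " + sink.rowsSeen(1));
  }
}
```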

startGroup

public void startGroup()
                throws HiveException
Throws:
HiveException

endGroup

public void endGroup()
              throws HiveException
Throws:
HiveException

flush

public void flush()
           throws HiveException
Throws:
HiveException

processGroup

public void processGroup(int tag)
                  throws HiveException
Throws:
HiveException

close

public void close(boolean abort)
           throws HiveException
Throws:
HiveException

jobCloseOp

public void jobCloseOp(org.apache.hadoop.conf.Configuration conf,
                       boolean success)
                throws HiveException
Throws:
HiveException

jobClose

public void jobClose(org.apache.hadoop.conf.Configuration conf,
                     boolean success)
              throws HiveException
Unlike other operator interfaces which are called from map or reduce task, jobClose is called from the jobclient side once the job has completed.

Parameters:
conf - Configuration with which the job was submitted
success - whether the job was completed successfully or not
Throws:
HiveException

replaceChild

public void replaceChild(Operator<? extends OperatorDesc> child,
                         Operator<? extends OperatorDesc> newChild)
Replace one child with another at the same position. The parent of the child is not changed.

Parameters:
child - the old child
newChild - the new child
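
A minimal sketch of an in-place replacement, using a hypothetical ReplaceNode class rather than the Hive Operator. It shows the two properties stated above: the new child takes the old child's index in the child list, and no parent pointer is touched, so the caller remains responsible for wiring the parent list of the new child.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy node with parent and child lists.
class ReplaceNode {
  final String name;
  final List<ReplaceNode> parents = new ArrayList<>();
  final List<ReplaceNode> children = new ArrayList<>();

  ReplaceNode(String name) { this.name = name; }

  // Put newChild at the index currently held by child.
  // Parent pointers of child and newChild are deliberately left alone.
  void replaceChild(ReplaceNode child, ReplaceNode newChild) {
    int idx = children.indexOf(child);
    if (idx < 0) {
      throw new IllegalArgumentException(name + " has no child " + child.name);
    }
    children.set(idx, newChild);
  }

  public static void main(String[] args) {
    ReplaceNode p = new ReplaceNode("p");
    ReplaceNode a = new ReplaceNode("a");
    ReplaceNode b = new ReplaceNode("b");
    ReplaceNode c = new ReplaceNode("c");
    p.children.add(a);
    p.children.add(b);
    p.children.add(c);
    ReplaceNode n = new ReplaceNode("n");
    p.replaceChild(b, n);
    for (ReplaceNode child : p.children) {
      System.out.println(child.name);
    }
  }
}
```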

removeChild

public void removeChild(Operator<? extends OperatorDesc> child)

removeChildAndAdoptItsChildren

public void removeChildAndAdoptItsChildren(Operator<? extends OperatorDesc> child)
                                    throws SemanticException
Removes a child and inserts all of that child's children at the position the child occupied.

Parameters:
child - the child to remove. If this operator is not the only parent of the child, the result can be unpredictable.
Throws:
SemanticException
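
The splice described above can be sketched on a toy graph. AdoptNode and connect are hypothetical names, and the sketch assumes the simple case in which this node is the only parent of the removed child. It is an illustration of the list manipulation, not the Hive code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy node with parent and child lists.
class AdoptNode {
  final String name;
  final List<AdoptNode> parents = new ArrayList<>();
  final List<AdoptNode> children = new ArrayList<>();

  AdoptNode(String name) { this.name = name; }

  static void connect(AdoptNode parent, AdoptNode child) {
    parent.children.add(child);
    child.parents.add(parent);
  }

  // Remove child and splice its children into the slot it occupied.
  void removeChildAndAdoptItsChildren(AdoptNode child) {
    int idx = children.indexOf(child);
    if (idx < 0) {
      throw new IllegalArgumentException(name + " has no child " + child.name);
    }
    children.remove(idx);
    children.addAll(idx, child.children);
    // Re-point each adopted grandchild from the removed child to this node.
    for (AdoptNode grandChild : child.children) {
      int p = grandChild.parents.indexOf(child);
      if (p >= 0) {
        grandChild.parents.set(p, this);
      }
    }
  }

  public static void main(String[] args) {
    AdoptNode root = new AdoptNode("root");
    AdoptNode a = new AdoptNode("a");
    AdoptNode b = new AdoptNode("b");
    AdoptNode c = new AdoptNode("c");
    connect(root, a);
    connect(root, b);
    connect(root, c);
    connect(b, new AdoptNode("x"));
    connect(b, new AdoptNode("y"));
    root.removeChildAndAdoptItsChildren(b);
    for (AdoptNode child : root.children) {
      System.out.println(child.name);
    }
  }
}
```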

removeParent

public void removeParent(Operator<? extends OperatorDesc> parent)

removeChildren

public boolean removeChildren(int depth)

replaceParent

public void replaceParent(Operator<? extends OperatorDesc> parent,
                          Operator<? extends OperatorDesc> newParent)
Replace one parent with another at the same position. Children of the new parent are not updated.

Parameters:
parent - the old parent
newParent - the new parent

resetStats

public void resetStats()

reset

public void reset()

preorderMap

public void preorderMap(Operator.OperatorFunc opFunc)

logStats

public void logStats()

getName

public String getName()
Implements the getName function for the Node Interface.

Specified by:
getName in interface Node
Returns:
the name of the operator

getOperatorName

public static String getOperatorName()

getColumnExprMap

public Map<String,ExprNodeDesc> getColumnExprMap()
Returns a map from output column name to input expression. Note that it currently returns only key columns for ReduceSink and GroupBy operators.

Returns:
null if the operator doesn't change columns
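
Because the map may be null, callers have to handle that case before looking up a column. The sketch below shows one way to do so. It is illustrative only: plain strings stand in for ExprNodeDesc, the helper inputExpressionFor and the column names are hypothetical, and treating a null map as a pass-through is an assumption drawn from the Returns note above.

```java
import java.util.HashMap;
import java.util.Map;

class ColumnExprLookup {
  // Resolve an output column to the input expression that produced it.
  // Strings stand in for ExprNodeDesc in this sketch.
  static String inputExpressionFor(Map<String, String> colExprMap, String outputColumn) {
    if (colExprMap == null) {
      // Assumption: a null map means columns pass through unchanged.
      return outputColumn;
    }
    // May be null when the column is not recorded (for example, a non-key column).
    return colExprMap.get(outputColumn);
  }

  public static void main(String[] args) {
    Map<String, String> colExprMap = new HashMap<>();
    colExprMap.put("out_key", "in_key");
    System.out.println(inputExpressionFor(colExprMap, "out_key"));
    System.out.println(inputExpressionFor(colExprMap, "out_value"));
    System.out.println(inputExpressionFor(null, "any_column"));
  }
}
```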

setColumnExprMap

public void setColumnExprMap(Map<String,ExprNodeDesc> colExprMap)

dump

public String dump(int level)

dump

public String dump(int level,
                   HashSet<Integer> seenOpts)

getOperatorId

public String getOperatorId()

initOperatorId

public void initOperatorId()

setOperatorId

public void setOperatorId(String operatorId)

getType

public abstract org.apache.hadoop.hive.ql.plan.api.OperatorType getType()
Return the type of the specific operator among the types in OperatorType.

Returns:
OperatorType.*

setGroupKeyObject

public void setGroupKeyObject(Object keyObject)

getGroupKeyObject

public Object getGroupKeyObject()

augmentPlan

public void augmentPlan()
Called during semantic analysis as operators are being added in order to give them a chance to compute any additional plan information needed. Does nothing by default.


getExecContext

public ExecMapperContext getExecContext()

setExecContext

public void setExecContext(ExecMapperContext execContext)

cleanUpInputFileChanged

public void cleanUpInputFileChanged()
                             throws HiveException
Throws:
HiveException

cleanUpInputFileChangedOp

public void cleanUpInputFileChangedOp()
                               throws HiveException
Throws:
HiveException

supportSkewJoinOptimization

public boolean supportSkewJoinOptimization()

clone

public Operator<? extends OperatorDesc> clone()
                                       throws CloneNotSupportedException
Overrides:
clone in class Object
Throws:
CloneNotSupportedException

cloneOp

public Operator<? extends OperatorDesc> cloneOp()
                                         throws CloneNotSupportedException
Clones only the operator. The children and parent are set to null.

Returns:
Cloned operator
Throws:
CloneNotSupportedException

cloneRecursiveChildren

public Operator<? extends OperatorDesc> cloneRecursiveChildren()
                                                        throws CloneNotSupportedException
Recursively clones all the children of the tree and fixes the pointers to children, the pointers to parents, and the pointers to itself coming from the children. It does not fix the pointers to itself coming from parents; parents continue to point to the original child.

Returns:
Cloned operator
Throws:
CloneNotSupportedException
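
The pointer fix-up described above can be sketched on a toy tree. CloneNode and connect are hypothetical, and the sketch assumes a tree in which every descendant has a single parent inside the cloned subtree. It shows which pointers end up on the clone side and which stay on the original.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy node with parent and child lists.
class CloneNode {
  final String name;
  final List<CloneNode> parents = new ArrayList<>();
  final List<CloneNode> children = new ArrayList<>();

  CloneNode(String name) { this.name = name; }

  static void connect(CloneNode parent, CloneNode child) {
    parent.children.add(child);
    child.parents.add(parent);
  }

  CloneNode cloneRecursiveChildren() {
    CloneNode copy = new CloneNode(name);
    // The clone keeps pointing at the original parents; those parents are
    // not told about the clone and keep pointing at the original node.
    copy.parents.addAll(parents);
    for (CloneNode child : children) {
      CloneNode childCopy = child.cloneRecursiveChildren();
      // Fix the pointer coming from the cloned child: original -> clone.
      int i = childCopy.parents.indexOf(this);
      if (i >= 0) {
        childCopy.parents.set(i, copy);
      }
      copy.children.add(childCopy);
    }
    return copy;
  }

  public static void main(String[] args) {
    CloneNode p = new CloneNode("p");
    CloneNode a = new CloneNode("a");
    CloneNode b = new CloneNode("b");
    connect(p, a);
    connect(a, b);
    CloneNode copy = a.cloneRecursiveChildren();
    System.out.println("parent still points to original: " + (p.children.get(0) == a));
    System.out.println("cloned child points to clone: "
        + (copy.children.get(0).parents.get(0) == copy));
  }
}
```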

columnNamesRowResolvedCanBeObtained

public boolean columnNamesRowResolvedCanBeObtained()

isUseBucketizedHiveInputFormat

public boolean isUseBucketizedHiveInputFormat()

setUseBucketizedHiveInputFormat

public void setUseBucketizedHiveInputFormat(boolean useBucketizedHiveInputFormat)

supportAutomaticSortMergeJoin

public boolean supportAutomaticSortMergeJoin()
Whether this operator supports automatic sort merge join. The stack is traversed, and this method is invoked for all the operators.

Returns:
true if this operator supports automatic sort merge join, false otherwise.

supportUnionRemoveOptimization

public boolean supportUnionRemoveOptimization()

opAllowedBeforeMapJoin

public boolean opAllowedBeforeMapJoin()

opAllowedAfterMapJoin

public boolean opAllowedAfterMapJoin()

opAllowedConvertMapJoin

public boolean opAllowedConvertMapJoin()

opAllowedBeforeSortMergeJoin

public boolean opAllowedBeforeSortMergeJoin()

acceptLimitPushdown

public boolean acceptLimitPushdown()
Used by LimitPushdownOptimizer. If none of the operators between the limit and the reduce-sink removes input rows within the range of the limit count, the limit can be pushed down to the reduce-sink operator. Operators such as forward and select qualify.
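
The check described above can be sketched as a walk from the limit up to the nearest reduce-sink. This is a simplified illustration, not the optimizer's code: PushdownNode, its kind labels, and findPushdownTarget are hypothetical, each node has a single parent, and the rejecting "FIL" node is an assumed example of an operator that removes rows.

```java
// Hypothetical toy operator in a single-parent chain.
class PushdownNode {
  final String kind;                  // illustrative labels: "RS", "SEL", "FIL", "LIM"
  final boolean acceptsLimitPushdown; // stand-in for acceptLimitPushdown()
  final PushdownNode parent;

  PushdownNode(String kind, boolean acceptsLimitPushdown, PushdownNode parent) {
    this.kind = kind;
    this.acceptsLimitPushdown = acceptsLimitPushdown;
    this.parent = parent;
  }

  // Walk upward from the limit. Return the reduce-sink that can take the
  // limit, or null if an operator in between may remove rows.
  static PushdownNode findPushdownTarget(PushdownNode limit) {
    PushdownNode cur = limit.parent;
    while (cur != null) {
      if (cur.kind.equals("RS")) {
        return cur; // every operator in between accepted the pushdown
      }
      if (!cur.acceptsLimitPushdown) {
        return null; // this operator blocks the pushdown
      }
      cur = cur.parent;
    }
    return null; // no reduce-sink above the limit
  }

  public static void main(String[] args) {
    PushdownNode rs = new PushdownNode("RS", false, null);
    PushdownNode sel = new PushdownNode("SEL", true, rs);
    PushdownNode lim = new PushdownNode("LIM", false, sel);
    System.out.println("select in between: " + (findPushdownTarget(lim) == rs));

    PushdownNode rs2 = new PushdownNode("RS", false, null);
    PushdownNode fil = new PushdownNode("FIL", false, rs2);
    PushdownNode lim2 = new PushdownNode("LIM", false, fil);
    System.out.println("filter in between: " + (findPushdownTarget(lim2) != null));
  }
}
```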


toString

public String toString()
Overrides:
toString in class Object

toString

public static String toString(Collection<Operator<? extends OperatorDesc>> top)

getStatistics

public Statistics getStatistics()

getOpTraits

public OpTraits getOpTraits()

setOpTraits

public void setOpTraits(OpTraits metaInfo)

setStatistics

public void setStatistics(Statistics stats)

setGroupKeyObjectInspector

public void setGroupKeyObjectInspector(ObjectInspector keyObjectInspector)

getGroupKeyObjectInspector

public ObjectInspector getGroupKeyObjectInspector()


Copyright © 2014 The Apache Software Foundation. All rights reserved.