org.apache.hadoop.hive.ql.optimizer
Class MapJoinProcessor

java.lang.Object
  extended by org.apache.hadoop.hive.ql.optimizer.MapJoinProcessor
All Implemented Interfaces:
Transform

public class MapJoinProcessor
extends Object
implements Transform

Implementation of one of the rule-based map join optimizations. The user passes hints to specify map joins, and during this optimization all user-specified map joins are converted to MapJoins: the reduce sink operators above the join are converted to map sink operators. In the future, once statistics are implemented, this transformation could also be done based on cost.


Nested Class Summary
static class MapJoinProcessor.CurrentMapJoin
          CurrentMapJoin.
static class MapJoinProcessor.Default
          Default.
static class MapJoinProcessor.MapJoinDefault
          MapJoinDefault.
static class MapJoinProcessor.MapJoinFS
          MapJoinFS.
static class MapJoinProcessor.MapJoinWalkerCtx
          MapJoinWalkerCtx.
 
Constructor Summary
MapJoinProcessor()
          Empty constructor.
 
Method Summary
static int checkMapJoin(int mapJoinPos, JoinCondDesc[] condns)
           
static MapJoinOperator convertJoinOpMapJoinOp(HiveConf hconf, LinkedHashMap<Operator<? extends OperatorDesc>,OpParseContext> opParseCtxMap, JoinOperator op, QBJoinTree joinTree, int mapJoinPos, boolean noCheckOuterJoin)
           
static MapJoinOperator convertMapJoin(HiveConf conf, LinkedHashMap<Operator<? extends OperatorDesc>,OpParseContext> opParseCtxMap, JoinOperator op, QBJoinTree joinTree, int mapJoinPos, boolean noCheckOuterJoin, boolean validateMapJoinTree)
          Convert a regular join to a map-side join.
static MapJoinOperator convertSMBJoinToMapJoin(HiveConf hconf, Map<Operator<? extends OperatorDesc>,OpParseContext> opParseCtxMap, SMBMapJoinOperator smbJoinOp, QBJoinTree joinTree, int bigTablePos, boolean noCheckOuterJoin)
          Convert a sort-merge join to a map-side join.
 MapJoinOperator generateMapJoinOperator(ParseContext pctx, JoinOperator op, QBJoinTree joinTree, int mapJoinPos)
           
static void genLocalWorkForMapJoin(MapredWork newWork, MapJoinOperator newMapJoinOp, int mapJoinPos)
           
static void genMapJoinOpAndLocalWork(HiveConf conf, MapredWork newWork, JoinOperator op, int mapJoinPos)
          Convert the join to a map-join and also generate any local work needed.
static Set<Integer> getBigTableCandidates(JoinCondDesc[] condns)
          Get the set of big table candidates.
static NodeProcessor getCurrentMapJoin()
           
static NodeProcessor getDefault()
           
static NodeProcessor getMapJoinDefault()
           
static NodeProcessor getMapJoinFS()
           
 ParseContext transform(ParseContext pactx)
          Transform the query tree.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

MapJoinProcessor

public MapJoinProcessor()
Empty constructor.

Method Detail

genMapJoinOpAndLocalWork

public static void genMapJoinOpAndLocalWork(HiveConf conf,
                                            MapredWork newWork,
                                            JoinOperator op,
                                            int mapJoinPos)
                                     throws SemanticException
Convert the join to a map-join and also generate any local work needed.

Parameters:
newWork - MapredWork in which the conversion is to happen
op - The join operator that needs to be converted to map-join
mapJoinPos - position of the big table in the join
Throws:
SemanticException

genLocalWorkForMapJoin

public static void genLocalWorkForMapJoin(MapredWork newWork,
                                          MapJoinOperator newMapJoinOp,
                                          int mapJoinPos)
                                   throws SemanticException
Throws:
SemanticException

convertMapJoin

public static MapJoinOperator convertMapJoin(HiveConf conf,
                                             LinkedHashMap<Operator<? extends OperatorDesc>,OpParseContext> opParseCtxMap,
                                             JoinOperator op,
                                             QBJoinTree joinTree,
                                             int mapJoinPos,
                                             boolean noCheckOuterJoin,
                                             boolean validateMapJoinTree)
                                      throws SemanticException
Convert a regular join to a map-side join.

Parameters:
opParseCtxMap -
op - join operator
joinTree - qb join tree
mapJoinPos - position of the source to be read as part of the map-reduce framework. All other sources are cached in memory.
noCheckOuterJoin -
Throws:
SemanticException
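
The role of mapJoinPos can be illustrated with a small, conceptual hash join. This is a sketch only and does not use any Hive classes: the class name MapSideJoinSketch, the join method, and the row layout (a String[] whose first element is the join key) are all assumptions made for illustration. The source at the big-table position is streamed row by row, while the other source is cached in memory.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Conceptual illustration of a map-side (hash) join; not Hive code. */
class MapSideJoinSketch {

  /**
   * Inner-joins two tables on column 0. Each row is a String[] with the
   * join key at index 0 and a single value at index 1 (hypothetical layout).
   * smallTable is cached fully in memory; bigTable is streamed.
   */
  static List<String> join(List<String[]> bigTable, List<String[]> smallTable) {
    // Build phase: cache the small table in a hash map keyed by join key.
    Map<String, List<String[]>> cache = new HashMap<>();
    for (String[] row : smallTable) {
      cache.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
    }

    // Probe phase: stream the big table and look up matches in the cache.
    // No shuffle or reduce step is needed because the lookup is local.
    List<String> out = new ArrayList<>();
    for (String[] bigRow : bigTable) {
      List<String[]> matches = cache.get(bigRow[0]);
      if (matches == null) {
        continue; // inner join: rows without a match are dropped
      }
      for (String[] smallRow : matches) {
        out.add(bigRow[0] + "," + bigRow[1] + "," + smallRow[1]);
      }
    }
    return out;
  }
}
```

In this sketch the first argument plays the part of the source at mapJoinPos; swapping the arguments changes which side must fit in memory.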

convertJoinOpMapJoinOp

public static MapJoinOperator convertJoinOpMapJoinOp(HiveConf hconf,
                                                     LinkedHashMap<Operator<? extends OperatorDesc>,OpParseContext> opParseCtxMap,
                                                     JoinOperator op,
                                                     QBJoinTree joinTree,
                                                     int mapJoinPos,
                                                     boolean noCheckOuterJoin)
                                              throws SemanticException
Throws:
SemanticException

convertSMBJoinToMapJoin

public static MapJoinOperator convertSMBJoinToMapJoin(HiveConf hconf,
                                                      Map<Operator<? extends OperatorDesc>,OpParseContext> opParseCtxMap,
                                                      SMBMapJoinOperator smbJoinOp,
                                                      QBJoinTree joinTree,
                                                      int bigTablePos,
                                                      boolean noCheckOuterJoin)
                                               throws SemanticException
Convert a sort-merge join to a map-side join.

Parameters:
opParseCtxMap -
smbJoinOp - join operator
joinTree - qb join tree
bigTablePos - position of the source to be read as part of the map-reduce framework. All other sources are cached in memory.
noCheckOuterJoin -
Throws:
SemanticException

generateMapJoinOperator

public MapJoinOperator generateMapJoinOperator(ParseContext pctx,
                                               JoinOperator op,
                                               QBJoinTree joinTree,
                                               int mapJoinPos)
                                        throws SemanticException
Throws:
SemanticException

getBigTableCandidates

public static Set<Integer> getBigTableCandidates(JoinCondDesc[] condns)
Get the set of big table candidates. Only the tables in the returned set can be used as the big table in the join operation. The logic scans the join condition array from left to right. On an inner join, if bigTableCandidates is empty or the last outer join seen was a right outer join, add both sides of the inner join to the big table candidates, but only if they are not in a bad position. On a left outer join, set lastSeenRightOuterJoin to false; if bigTableCandidates is empty, add the left side to it, and if it is not empty, do nothing (which means the existing candidates come from the left side). On a right outer join, set lastSeenRightOuterJoin to true, clear bigTableCandidates, and add the right side to it; the right side of a right outer join always wins. On a full outer join, return an empty set immediately (no table can be the big table, so a map join cannot be done).

Parameters:
condns -
Returns:
set of big table candidates
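
The scan described above can be sketched in self-contained Java. This is an illustration that follows the prose literally, not the Hive implementation, and the actual source may track its state differently. The names BigTableCandidatesSketch, JoinType, and Cond are hypothetical stand-ins for JoinCondDesc and its join types. The description does not define "bad position"; the sketch assumes it means any position seen up to the most recent right outer join, other than that join's right side.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;

/** Illustrative sketch only; not the Hive implementation. */
class BigTableCandidatesSketch {

  /** Hypothetical stand-in for the join types carried by JoinCondDesc. */
  enum JoinType { INNER, LEFT_OUTER, RIGHT_OUTER, FULL_OUTER }

  /** Hypothetical stand-in for JoinCondDesc: two table positions and a join type. */
  static final class Cond {
    final int left;
    final int right;
    final JoinType type;

    Cond(int left, int right, JoinType type) {
      this.left = left;
      this.right = right;
      this.type = type;
    }
  }

  static Set<Integer> candidates(Cond[] conds) {
    Set<Integer> bigTableCandidates = new LinkedHashSet<>();
    Set<Integer> seenPositions = new HashSet<>();
    // Assumption: "bad" positions are those seen up to the most recent
    // right outer join, excluding that join's right side.
    Set<Integer> badPositions = new HashSet<>();
    boolean lastSeenRightOuterJoin = false;

    for (Cond c : conds) {
      seenPositions.add(c.left);
      seenPositions.add(c.right);
      switch (c.type) {
        case FULL_OUTER:
          // No table can be the big table.
          return new LinkedHashSet<>();
        case LEFT_OUTER:
          lastSeenRightOuterJoin = false;
          if (bigTableCandidates.isEmpty()) {
            bigTableCandidates.add(c.left);
          }
          break;
        case RIGHT_OUTER:
          lastSeenRightOuterJoin = true;
          badPositions.clear();
          badPositions.addAll(seenPositions);
          badPositions.remove(c.right);
          // The right side of a right outer join always wins.
          bigTableCandidates.clear();
          bigTableCandidates.add(c.right);
          break;
        case INNER:
          // Condition taken literally from the description.
          if (bigTableCandidates.isEmpty() || lastSeenRightOuterJoin) {
            if (!badPositions.contains(c.left)) {
              bigTableCandidates.add(c.left);
            }
            if (!badPositions.contains(c.right)) {
              bigTableCandidates.add(c.right);
            }
          }
          break;
      }
    }
    return bigTableCandidates;
  }
}
```

Under these assumptions, a right outer join between positions 0 and 1 followed by an inner join between positions 1 and 2 leaves positions 1 and 2 as candidates, while any full outer join yields an empty set.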

checkMapJoin

public static int checkMapJoin(int mapJoinPos,
                               JoinCondDesc[] condns)
Parameters:
mapJoinPos - the position of big table as determined by either hints or auto conversion.
condns - the join conditions
Returns:
the given map-join position if it is a feasible big table position; otherwise -1.
Throws:
SemanticException - if given position is not in the big table candidates.
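
The return contract stated under Returns can be sketched as a membership test against the big table candidates. This is a minimal illustration, not the Hive method: the class CheckMapJoinSketch and its checkPosition helper are hypothetical, and the helper takes an already computed candidate set instead of a JoinCondDesc[] so that it stays self-contained. It follows the Returns clause only and never throws.

```java
import java.util.Set;

/** Illustrative sketch only; not the Hive implementation. */
class CheckMapJoinSketch {

  /**
   * Returns mapJoinPos if it is among the big table candidates, else -1.
   * The candidate set is passed in directly (an assumption of this sketch).
   */
  static int checkPosition(int mapJoinPos, Set<Integer> bigTableCandidates) {
    return bigTableCandidates.contains(mapJoinPos) ? mapJoinPos : -1;
  }
}
```

A caller would compare the result against -1 before going ahead with the conversion.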

transform

public ParseContext transform(ParseContext pactx)
                       throws SemanticException
Transform the query tree. For each join, check if it is a map-side join (user specified). If yes, convert it to a map-side join.

Specified by:
transform in interface Transform
Parameters:
pactx - current parse context
Returns:
ParseContext
Throws:
SemanticException

getMapJoinFS

public static NodeProcessor getMapJoinFS()

getMapJoinDefault

public static NodeProcessor getMapJoinDefault()

getDefault

public static NodeProcessor getDefault()

getCurrentMapJoin

public static NodeProcessor getCurrentMapJoin()


Copyright © 2014 The Apache Software Foundation. All rights reserved.