ListBucketingPruner (Hive Query Language 0.13.0.2.1.2.0-402 API)

org.apache.hadoop.hive.ql.optimizer.listbucketingpruner
Class ListBucketingPruner

java.lang.Object
  org.apache.hadoop.hive.ql.optimizer.listbucketingpruner.ListBucketingPruner

All Implemented Interfaces: Transform

public class ListBucketingPruner
extends Object
implements Transform

The transformation step that does list bucketing pruning.

Nested Class Summary
`static class`	`ListBucketingPruner.DynamicMultiDimensionalCollection` Note: this class is intended only for the list bucketing pruner, not for general use.

Constructor Summary
`ListBucketingPruner()`

Method Summary
`static org.apache.hadoop.fs.Path[]`	`prune(ParseContext ctx, Partition part, ExprNodeDesc pruner)` Prunes to the directories that match the skewed keys in the WHERE clause.
`ParseContext`	`transform(ParseContext pctx)` All transformation steps implement this interface.

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Constructor Detail

ListBucketingPruner

public ListBucketingPruner()

Method Detail

transform

public ParseContext transform(ParseContext pctx)
                       throws SemanticException

Description copied from interface: Transform

All transformation steps implement this interface.

Specified by: transform in interface Transform

Parameters: pctx - input parse context
Returns: ParseContext
Throws: SemanticException

prune

public static org.apache.hadoop.fs.Path[] prune(ParseContext ctx,
                                                Partition part,
                                                ExprNodeDesc pruner)

Prunes to the directories that match the skewed keys in the WHERE clause.

Algorithm
=========
For each possible skewed element combination:
  1. walk through the ExprNode tree
  2. decide a Boolean (True/False/unknown(null))

Go through each skewed element combination again:
  1. if it is a skewed value, skip its directory only if the result is false; otherwise keep it
  2. skip the default directory only if all non-skewed combinations are false

Example
=======
1. skewed column (list): C1, C2
2. skewed value (list of list): (1,a), (2,b), (1,c)

Unique skewed elements for each skewed column (list of list): (1,2,other), (a,b,c,other)
Index: (0,1,2) (0,1,2,3)

Output matches the order of the skewed columns. Output can be read as:
  C1 has unique element list (1,2,other)
  C2 has unique element list (a,b,c,other)

  C1\C2 | a     | b     | c     | other
  1     | (1,a) | X     | (1,c) | X
  2     | X     | (2,b) | X     | X
  other | X     | X     | X     | X

Complete dynamic-multi-dimension collection (* marks a skewed value entry):
  (0,0) (1,a) *       -> T
  (0,1) (1,b)         -> T
  (0,2) (1,c) *       -> F
  (0,3) (1,other)     -> F
  (1,0) (2,a)         -> F
  (1,1) (2,b) *       -> T
  (1,2) (2,c)         -> F
  (1,3) (2,other)     -> F
  (2,0) (other,a)     -> T
  (2,1) (other,b)     -> T
  (2,2) (other,c)     -> T
  (2,3) (other,other) -> T

Expression tree: ((c1=1) and (c2=a)) or ((c1=3) or (c2=b))

               or
             /    \
          and      or
         /   \    /   \
     c1=1  c2=a c1=3  c2=b

For each entry in the dynamic-multi-dimension container:
  1. walk through the tree to decide the value (see the map's values above)
  2. if it is a skewed value
     2.1 remove the entry from the map
     2.2 add the directory to the path unless the value is false
  3. otherwise, add the value to the map

Once that is done, go through the remaining entries in the map to decide on the default directory:
  1. we know that none of them is a skewed value
  2. we skip the default directory only if all values are false

What do we choose at the end?
  1. the directory for (1,a), because it is a skewed value and the match returns true
  2. the directory for (2,b), because it is a skewed value and the match returns true
  3. the default directory, because not all non-skewed values return false
We skip the directory for (1,c), since the match returns false.

Note: unknown is marked in transform(ParseContext) as

 newcd = new ExprNodeConstantDesc(cd.getTypeInfo(), null)

can be checked via

     child_nd instanceof ExprNodeConstantDesc
               && ((ExprNodeConstantDesc) child_nd).getValue() == null
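
The worked example above can be replayed in a few lines of plain Java. This is a minimal sketch and not Hive's implementation: the class `ListBucketingSketch`, its helper methods, and the directory labels (`c1=1/c2=a`, `default`) are hypothetical names chosen for illustration, and the predicate `((c1=1) and (c2=a)) or ((c1=3) or (c2=b))` is hard-coded rather than read from an ExprNodeDesc tree. It also assumes that comparing the `other` element against a literal outside the unique-element list (here `c1=3`) yields unknown (null). The table above shows T for those cells; both outcomes keep the directory, because only a definite false allows a skip.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ListBucketingSketch {
    // Three-valued AND: null stands for "unknown".
    static Boolean and(Boolean a, Boolean b) {
        if (Boolean.FALSE.equals(a) || Boolean.FALSE.equals(b)) return false;
        if (a == null || b == null) return null;
        return true;
    }

    // Three-valued OR: null stands for "unknown".
    static Boolean or(Boolean a, Boolean b) {
        if (Boolean.TRUE.equals(a) || Boolean.TRUE.equals(b)) return true;
        if (a == null || b == null) return null;
        return false;
    }

    // Evaluates "column = literal" for one cell of the collection.
    // Assumption: "other" is definitely not any listed element, and is
    // unknown against a literal that is not in the unique-element list.
    static Boolean eq(String cell, String literal, List<String> uniq) {
        if (!cell.equals("other")) return cell.equals(literal);
        return uniq.contains(literal) ? Boolean.FALSE : null;
    }

    static List<String> selectDirectories() {
        List<String> c1 = Arrays.asList("1", "2", "other");
        List<String> c2 = Arrays.asList("a", "b", "c", "other");
        Set<List<String>> skewed = new HashSet<>(Arrays.asList(
                Arrays.asList("1", "a"),
                Arrays.asList("2", "b"),
                Arrays.asList("1", "c")));

        List<String> kept = new ArrayList<>();
        boolean keepDefault = false;
        for (String v1 : c1) {
            for (String v2 : c2) {
                // ((c1=1) and (c2=a)) or ((c1=3) or (c2=b))
                Boolean match = or(
                        and(eq(v1, "1", c1), eq(v2, "a", c2)),
                        or(eq(v1, "3", c1), eq(v2, "b", c2)));
                boolean notFalse = !Boolean.FALSE.equals(match);
                if (skewed.contains(Arrays.asList(v1, v2))) {
                    // Skewed value: keep its directory unless definitely false.
                    if (notFalse) kept.add("c1=" + v1 + "/c2=" + v2);
                } else if (notFalse) {
                    // Any non-skewed cell that is not false keeps the default directory.
                    keepDefault = true;
                }
            }
        }
        if (keepDefault) kept.add("default");
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(selectDirectories());
    }
}
```

The sketch keeps the directories for (1,a) and (2,b) plus the default directory, and skips (1,c), which agrees with the selection described above.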

Parameters: ctx - parse context; part - partition; pruner - expression node tree
Returns: