org.apache.hadoop.hive.ql.optimizer.physical
Interface BucketingSortingCtx.BucketSortCol

All Known Implementing Classes:
BucketingSortingCtx.BucketCol, BucketingSortingCtx.SortCol
Enclosing class:
BucketingSortingCtx

public static interface BucketingSortingCtx.BucketSortCol

BucketSortCol. Classes that implement this interface store information about equivalent columns, as column names and indexes in the schema change going into and out of operators. The definition of equivalent columns is up to the class that uses these implementations, e.g. BucketingSortingOpProcFactory. For example, two columns are equivalent if they contain exactly the same data. It is possible, though, for two columns to contain exactly the same data without being known to be equivalent.

E.g. SELECT key a, key b FROM (SELECT key, count(*) c FROM src GROUP BY key) s;

In this case, assuming the query runs as a single MapReduce job with the group by operator processed in the reducer, the data coming out of the group by operator is bucketed by key, which is at index 0 in the schema. After the outer select operator, the output can be viewed as bucketed by either the column with alias a or the column with alias b. To represent this, a single BucketSortCol implementation instance could hold names that include both a and b and indexes that include both 0 and 1.

Implementations of this interface should maintain the restriction that the alias getNames().get(i) has index getIndexes().get(i) in the schema.
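The a/b example above can be sketched in code. The following is a minimal illustration, not Hive's implementation: the `BucketSortCol` interface declared here is a local stand-in that mirrors the three documented methods so the sketch compiles without Hive on the classpath, and `EquivalentColumns` and `BucketSortColDemo` are hypothetical names introduced only for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in mirroring the three methods of BucketingSortingCtx.BucketSortCol.
// Assumption: declared here only so the sketch is self-contained.
interface BucketSortCol {
  List<String> getNames();
  List<Integer> getIndexes();
  void addAlias(String name, Integer index);
}

// Hypothetical implementation; not one of Hive's own classes.
final class EquivalentColumns implements BucketSortCol {
  private final List<String> names = new ArrayList<>();
  private final List<Integer> indexes = new ArrayList<>();

  EquivalentColumns(String name, int index) {
    addAlias(name, index);
  }

  @Override
  public List<String> getNames() {
    return names;
  }

  @Override
  public List<Integer> getIndexes() {
    return indexes;
  }

  @Override
  public void addAlias(String name, Integer index) {
    // Append to both lists together so that names.get(i) stays paired
    // with indexes.get(i), as the interface description requires.
    names.add(name);
    indexes.add(index);
  }
}

class BucketSortColDemo {
  public static void main(String[] args) {
    // After the outer select, "key" is visible as alias a (index 0) and alias b (index 1).
    EquivalentColumns key = new EquivalentColumns("a", 0);
    key.addAlias("b", 1);
    // Prints "a -> 0", then "b -> 1".
    for (int i = 0; i < key.getNames().size(); i++) {
      System.out.println(key.getNames().get(i) + " -> " + key.getIndexes().get(i));
    }
  }
}
```

Keeping two parallel lists and appending to both in `addAlias` is one simple way to preserve the pairing between names and indexes; other representations would work as long as the getters return lists in matching order.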


Method Summary
 void addAlias(String name, Integer index)
           
 List<Integer> getIndexes()
           
 List<String> getNames()
           
 

Method Detail

getNames

List<String> getNames()

getIndexes

List<Integer> getIndexes()

addAlias

void addAlias(String name,
              Integer index)
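
The interface does not say whether addAlias validates its arguments. A caller that wants to check the pairing restriction from the class description could use a helper along the following lines. This is only a sketch: `AliasChecks` and `isConsistent` are hypothetical names, and representing the schema as a plain list of column aliases is an assumption made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical helper; not part of Hive.
final class AliasChecks {
  private AliasChecks() {}

  // Returns true when every names.get(i) is the alias found at position
  // indexes.get(i) of the schema. The schema is assumed to be a list of
  // column aliases in output order.
  static boolean isConsistent(List<String> names, List<Integer> indexes, List<String> schema) {
    if (names.size() != indexes.size()) {
      return false;
    }
    for (int i = 0; i < names.size(); i++) {
      Integer idx = indexes.get(i);
      if (idx == null || idx < 0 || idx >= schema.size()) {
        return false;
      }
      if (!Objects.equals(names.get(i), schema.get(idx))) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    List<String> schema = Arrays.asList("a", "b");
    // Aliases a and b at indexes 0 and 1 match the schema.
    System.out.println(isConsistent(Arrays.asList("a", "b"), Arrays.asList(0, 1), schema));
    // Swapped indexes violate the restriction.
    System.out.println(isConsistent(Arrays.asList("a", "b"), Arrays.asList(1, 0), schema));
  }
}
```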


Copyright © 2014 The Apache Software Foundation. All rights reserved.