org.apache.hadoop.hive.ql.index
Interface HiveIndexHandler

All Superinterfaces:
org.apache.hadoop.conf.Configurable
All Known Implementing Classes:
AbstractIndexHandler, AggregateIndexHandler, BitmapIndexHandler, CompactIndexHandler, TableBasedIndexHandler

public interface HiveIndexHandler
extends org.apache.hadoop.conf.Configurable

HiveIndexHandler defines a pluggable interface for adding new index handlers to Hive.


Method Summary
 void analyzeIndexDefinition(org.apache.hadoop.hive.metastore.api.Table baseTable, org.apache.hadoop.hive.metastore.api.Index index, org.apache.hadoop.hive.metastore.api.Table indexTable)
          Requests that the handler validate an index definition and fill in additional information about its stored representation.
 boolean checkQuerySize(long inputSize, HiveConf conf)
          Checks that the input size of a query fits within the handler's bounds.
 List<Task<?>> generateIndexBuildTaskList(Table baseTbl, org.apache.hadoop.hive.metastore.api.Index index, List<Partition> indexTblPartitions, List<Partition> baseTblPartitions, Table indexTbl, Set<ReadEntity> inputs, Set<WriteEntity> outputs)
          Requests that the handler generate a plan for building the index; the plan should read the base table and write out the index representation.
 void generateIndexQuery(List<org.apache.hadoop.hive.metastore.api.Index> indexes, ExprNodeDesc predicate, ParseContext pctx, HiveIndexQueryContext queryContext)
          Generates the list of tasks required to run an index-optimized sub-query for the given predicate, using the given indexes.
 boolean usesIndexTable()
          Determines whether this handler implements indexes by creating an index table.
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Method Detail

usesIndexTable

boolean usesIndexTable()
Determines whether this handler implements indexes by creating an index table.

Returns:
true if index creation implies creation of an index table in Hive; false if the index representation is not stored in a Hive table

analyzeIndexDefinition

void analyzeIndexDefinition(org.apache.hadoop.hive.metastore.api.Table baseTable,
                            org.apache.hadoop.hive.metastore.api.Index index,
                            org.apache.hadoop.hive.metastore.api.Table indexTable)
                            throws HiveException
Requests that the handler validate an index definition and fill in additional information about its stored representation.

Parameters:
baseTable - the definition of the table being indexed
index - the definition of the index being created
indexTable - a partial definition of the index table to be used for storing the index representation, or null if usesIndexTable() returns false; the handler can augment the index's storage descriptor (e.g. with information about input/output format) and/or the index table's definition (typically with additional columns containing the index representation, e.g. pointers into HDFS).
Throws:
HiveException - if the index definition is invalid with respect to either the base table or the supplied index table definition
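
The validate-then-augment contract described above can be sketched as follows. This is a minimal illustration and not Hive code: TableDef and InvalidIndexException are hypothetical stand-ins for the metastore Table and HiveException types, the index is reduced to a list of indexed column names, and the appended column names are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

final class AnalyzeIndexDefinitionSketch {

    /** Hypothetical stand-in for a metastore table definition: a name plus column names. */
    static final class TableDef {
        final String name;
        final List<String> columns;

        TableDef(String name, List<String> columns) {
            this.name = name;
            this.columns = new ArrayList<>(columns);
        }
    }

    /** Hypothetical stand-in for HiveException. */
    static final class InvalidIndexException extends Exception {
        InvalidIndexException(String message) {
            super(message);
        }
    }

    /**
     * Validates the indexed columns against the base table, then augments the
     * index table definition when one is supplied (indexTable is null when the
     * handler does not store its index in a table).
     */
    static void analyze(TableDef baseTable, List<String> indexedColumns, TableDef indexTable)
            throws InvalidIndexException {
        // Validation: every indexed column must exist in the base table.
        for (String column : indexedColumns) {
            if (!baseTable.columns.contains(column)) {
                throw new InvalidIndexException(
                        "Column " + column + " not found in table " + baseTable.name);
            }
        }
        // Augmentation: add columns that hold the index representation.
        // The names below are illustrative assumptions.
        if (indexTable != null) {
            indexTable.columns.add("_bucketname");
            indexTable.columns.add("_offsets");
        }
    }
}
```

A real handler would perform the same two steps against the metastore objects and could also adjust the index's storage descriptor.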

generateIndexBuildTaskList

List<Task<?>> generateIndexBuildTaskList(Table baseTbl,
                                         org.apache.hadoop.hive.metastore.api.Index index,
                                         List<Partition> indexTblPartitions,
                                         List<Partition> baseTblPartitions,
                                         Table indexTbl,
                                         Set<ReadEntity> inputs,
                                         Set<WriteEntity> outputs)
                                         throws HiveException
Requests that the handler generate a plan for building the index; the plan should read the base table and write out the index representation.

Parameters:
baseTbl - the definition of the table being indexed
index - the definition of the index
indexTblPartitions - list of index table partitions
baseTblPartitions - list of base table partitions, each element corresponding to the element at the same position in indexTblPartitions
indexTbl - the definition of the index table, or null if usesIndexTable() returns false
inputs - read entities for hooks; populated by the handler as a supplemental output accompanying the return value
outputs - write entities for hooks; populated by the handler as a supplemental output accompanying the return value
Returns:
list of tasks to be executed in parallel for building the index
Throws:
HiveException - if plan generation fails
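
Because baseTblPartitions and indexTblPartitions mirror each other position by position, a handler can pair them up and emit one build task per pair. The sketch below assumes this one-task-per-partition layout. BuildTask is a hypothetical stand-in for Hive's Task type, and partitions are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.List;

final class IndexBuildPlanSketch {

    /** Hypothetical stand-in for a build task: read one base partition, write one index partition. */
    static final class BuildTask {
        final String basePartition;
        final String indexPartition;

        BuildTask(String basePartition, String indexPartition) {
            this.basePartition = basePartition;
            this.indexPartition = indexPartition;
        }
    }

    /** Pairs partitions by position and returns one independent task per pair. */
    static List<BuildTask> plan(List<String> indexTblPartitions, List<String> baseTblPartitions) {
        if (indexTblPartitions.size() != baseTblPartitions.size()) {
            throw new IllegalArgumentException("partition lists must have the same length");
        }
        List<BuildTask> tasks = new ArrayList<>();
        for (int i = 0; i < baseTblPartitions.size(); i++) {
            tasks.add(new BuildTask(baseTblPartitions.get(i), indexTblPartitions.get(i)));
        }
        return tasks;
    }
}
```

Since each task touches a distinct pair of partitions, the resulting list is suitable for parallel execution, as the return value description requires.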

generateIndexQuery

void generateIndexQuery(List<org.apache.hadoop.hive.metastore.api.Index> indexes,
                        ExprNodeDesc predicate,
                        ParseContext pctx,
                        HiveIndexQueryContext queryContext)
Generates the list of tasks required to run an index-optimized sub-query for the given predicate, using the given indexes. If multiple indexes are provided, it is up to the handler whether to use none, one, some, or all of them. The supplied predicate may reference any of the columns from any of the indexes. If the handler decides to use more than one index, it is responsible for generating tasks to combine their search results (e.g. by performing a JOIN on the results).

Parameters:
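indexes - the indexes available to the handler for this query
predicate - the predicate to evaluate; may reference columns from any of the supplied indexes
pctx - the parse context of the query being compiled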
queryContext - contains results, such as query tasks and input configuration
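
One possible selection policy is sketched below: use every index that covers at least one column referenced by the predicate, and note when a combine step is needed. This policy is an assumption for illustration, not the behavior of any particular Hive handler. Indexes are modeled as a map from a hypothetical index name to its column set, and the predicate as the set of columns it references.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class IndexSelectionSketch {

    /** Returns the names of the indexes that share at least one column with the predicate. */
    static List<String> selectIndexes(Map<String, Set<String>> indexColumns,
                                      Set<String> predicateColumns) {
        List<String> chosen = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : indexColumns.entrySet()) {
            if (!Collections.disjoint(entry.getValue(), predicateColumns)) {
                chosen.add(entry.getKey());
            }
        }
        return chosen;
    }

    /** When more than one index is used, the handler must also plan a step that combines the results. */
    static boolean needsCombineStep(List<String> chosenIndexes) {
        return chosenIndexes.size() > 1;
    }
}
```

An empty result corresponds to the "use none" case, in which the handler would leave the query context without an index-based plan.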

checkQuerySize

boolean checkQuerySize(long inputSize,
                       HiveConf conf)
Checks that the input size of a query fits within the handler's bounds.

Parameters:
inputSize - size (in bytes) of the input to the query in question
conf - the Hive configuration
Returns:
true if query is within the bounds
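
A bounds check of this kind might look like the sketch below. It assumes the bounds are a minimum and a maximum size read from configuration, that the bounds are inclusive, and that a negative maximum means "no upper bound". The configuration keys are hypothetical, and a plain Map stands in for HiveConf.

```java
import java.util.Map;

final class QuerySizeCheckSketch {

    // Hypothetical configuration keys; these are not real Hive property names.
    static final String MIN_SIZE_KEY = "example.index.query.minsize";
    static final String MAX_SIZE_KEY = "example.index.query.maxsize";

    /** Returns true if inputSize (in bytes) lies within the configured bounds, inclusive. */
    static boolean checkQuerySize(long inputSize, Map<String, Long> conf) {
        long minSize = conf.getOrDefault(MIN_SIZE_KEY, 0L);
        long maxSize = conf.getOrDefault(MAX_SIZE_KEY, -1L);
        if (maxSize < 0) {
            // Assumption: a negative maximum disables the upper bound.
            maxSize = Long.MAX_VALUE;
        }
        return inputSize >= minSize && inputSize <= maxSize;
    }
}
```

Skipping the index for very small inputs can make sense because the extra index lookup may cost more than a direct scan.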


Copyright © 2014 The Apache Software Foundation. All rights reserved.