org.apache.hadoop.hive.ql.exec.vector
Class VectorizedRowBatchCtx

java.lang.Object
  extended by org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatchCtx

public class VectorizedRowBatchCtx
extends Object

Context for a vectorized row batch. This class eagerly deserializes row data using the SerDe in the RecordReader layer. It also supports partitions in this layer, so that the vectorized batch is populated correctly with the partition columns.


Constructor Summary
VectorizedRowBatchCtx()
          Constructor for VectorizedRowBatchCtx
VectorizedRowBatchCtx(StructObjectInspector rawRowOI, StructObjectInspector rowOI, Deserializer deserializer, Map<String,Object> partitionValues, Map<String,PrimitiveObjectInspector.PrimitiveCategory> partitionTypes)
          Constructor for VectorizedRowBatchCtx
 
Method Summary
 void addPartitionColsToBatch(VectorizedRowBatch batch)
          Adds the partition values to the batch
 void addRowToBatch(int rowIndex, org.apache.hadoop.io.Writable rowBlob, VectorizedRowBatch batch, org.apache.hadoop.io.DataOutputBuffer buffer)
          Adds the row to the batch after deserializing the row
 void convertRowBatchBlobToVectorizedBatch(Object rowBlob, int rowsInBlob, VectorizedRowBatch batch)
          Deserializes a set of rows and populates the batch
 VectorizedRowBatch createVectorizedRowBatch()
          Creates a Vectorized row batch and the column vectors.
 void init(org.apache.hadoop.conf.Configuration hiveConf, org.apache.hadoop.mapred.FileSplit split)
          Initializes VectorizedRowBatch context based on the split and Hive configuration (Job conf with hive Plan).
 void init(org.apache.hadoop.conf.Configuration hiveConf, String fileKey, StructObjectInspector rowOI)
          Initializes the VectorizedRowBatch context based on an arbitrary object inspector. Used by non-tablescan operators when they change the vectorization context.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

VectorizedRowBatchCtx

public VectorizedRowBatchCtx(StructObjectInspector rawRowOI,
                             StructObjectInspector rowOI,
                             Deserializer deserializer,
                             Map<String,Object> partitionValues,
                             Map<String,PrimitiveObjectInspector.PrimitiveCategory> partitionTypes)
Constructor for VectorizedRowBatchCtx

Parameters:
rawRowOI - OI for raw row data (i.e. without partition cols)
rowOI - OI for the row (Raw row OI + partition OI)
deserializer - Deserializer for the row data
partitionValues - Hash map of partition values. Key=TblColName value=PartitionValue
partitionTypes - Hash map of partition column types. Key=TblColName value=PrimitiveCategory

VectorizedRowBatchCtx

public VectorizedRowBatchCtx()
Constructor for VectorizedRowBatchCtx

Method Detail

init

public void init(org.apache.hadoop.conf.Configuration hiveConf,
                 String fileKey,
                 StructObjectInspector rowOI)
Initializes the VectorizedRowBatch context based on an arbitrary object inspector. Used by non-tablescan operators when they change the vectorization context.

Parameters:
hiveConf - Hive configuration
fileKey - The key used to retrieve the extra column mapping from the map scratch
rowOI - Object inspector that shapes the column types

init

public void init(org.apache.hadoop.conf.Configuration hiveConf,
                 org.apache.hadoop.mapred.FileSplit split)
          throws ClassNotFoundException,
                 IOException,
                 SerDeException,
                 InstantiationException,
                 IllegalAccessException,
                 HiveException
Initializes VectorizedRowBatch context based on the split and Hive configuration (Job conf with hive Plan).

Parameters:
hiveConf - Hive configuration from which the Hive plan is extracted
split - File split of the file being read
Throws:
ClassNotFoundException
IOException
SerDeException
InstantiationException
IllegalAccessException
HiveException

createVectorizedRowBatch

public VectorizedRowBatch createVectorizedRowBatch()
                                            throws HiveException
Creates a Vectorized row batch and the column vectors.

Returns:
VectorizedRowBatch
Throws:
HiveException
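
As a rough illustration of what "creating a batch and its column vectors" can mean, the sketch below allocates one array per column according to a schema. It assumes nothing about Hive's internals: `CreateBatchSketch`, `ColType`, and `createColumns` are hypothetical stand-ins, not part of this API.

```java
import java.util.Arrays;
import java.util.List;

public class CreateBatchSketch {
    /** Hypothetical column types; not Hive's PrimitiveCategory. */
    enum ColType { LONG, DOUBLE, BYTES }

    /** Allocates one fixed-capacity column array per schema entry. */
    static Object[] createColumns(List<ColType> schema, int capacity) {
        Object[] columns = new Object[schema.size()];
        for (int i = 0; i < schema.size(); i++) {
            switch (schema.get(i)) {
                case LONG:
                    columns[i] = new long[capacity];
                    break;
                case DOUBLE:
                    columns[i] = new double[capacity];
                    break;
                case BYTES:
                    columns[i] = new byte[capacity][];
                    break;
                default:
                    throw new IllegalArgumentException("Unsupported type: " + schema.get(i));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        Object[] columns = createColumns(Arrays.asList(ColType.LONG, ColType.BYTES), 1024);
        System.out.println("Allocated " + columns.length + " columns");
    }
}
```

The point of the sketch is only that columns are allocated once, up front, and then reused as rows are written into them.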

addRowToBatch

public void addRowToBatch(int rowIndex,
                          org.apache.hadoop.io.Writable rowBlob,
                          VectorizedRowBatch batch,
                          org.apache.hadoop.io.DataOutputBuffer buffer)
                   throws HiveException,
                          SerDeException
Adds the row to the batch after deserializing the row

Parameters:
rowIndex - Row index in the batch to which the row is added
rowBlob - Row blob (serialized version of row)
batch - Vectorized batch to which the row is added
buffer - a buffer to copy strings into
Throws:
HiveException
SerDeException
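
The idea of deserializing one row and writing it at a given row index can be sketched without any Hive classes. Everything below is an assumption made for illustration: `MiniBatch` is a hypothetical two-column batch, the "id,name" text format stands in for a real SerDe, and the string buffer parameter is left out.

```java
import java.nio.charset.StandardCharsets;

public class RowToBatchSketch {
    /** Hypothetical two-column batch; not Hive's VectorizedRowBatch. */
    static final class MiniBatch {
        final long[] ids;      // column 0: one long per row
        final byte[][] names;  // column 1: one byte array per row
        int size;              // number of populated rows

        MiniBatch(int capacity) {
            ids = new long[capacity];
            names = new byte[capacity][];
        }
    }

    /** Parses a hypothetical "id,name" row blob and writes each field into its column. */
    static void addRowToBatch(int rowIndex, byte[] rowBlob, MiniBatch batch) {
        String[] fields = new String(rowBlob, StandardCharsets.UTF_8).split(",", 2);
        batch.ids[rowIndex] = Long.parseLong(fields[0]);
        batch.names[rowIndex] = fields[1].getBytes(StandardCharsets.UTF_8);
        batch.size = Math.max(batch.size, rowIndex + 1);
    }

    public static void main(String[] args) {
        MiniBatch batch = new MiniBatch(1024);
        addRowToBatch(0, "7,alice".getBytes(StandardCharsets.UTF_8), batch);
        addRowToBatch(1, "9,bob".getBytes(StandardCharsets.UTF_8), batch);
        System.out.println(batch.size + " rows, ids[1]=" + batch.ids[1]);
    }
}
```

Each field of the row lands in a different column at the same index, which is what distinguishes a vectorized batch from a list of row objects.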

convertRowBatchBlobToVectorizedBatch

public void convertRowBatchBlobToVectorizedBatch(Object rowBlob,
                                                 int rowsInBlob,
                                                 VectorizedRowBatch batch)
                                          throws SerDeException
Deserializes a set of rows and populates the batch

Parameters:
rowBlob - Row batch blob to deserialize
rowsInBlob - Number of rows in the blob
batch - Vectorized row batch that receives the deserialized data
Throws:
SerDeException

addPartitionColsToBatch

public void addPartitionColsToBatch(VectorizedRowBatch batch)
                             throws HiveException
Adds the partition values to the batch

Parameters:
batch - Vectorized row batch to which the partition values are added
Throws:
HiveException
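
A partition value is the same for every row read from one split, so a columnar engine can store it once rather than per row. The sketch below shows that general technique with hypothetical types (`LongColumn`, `addPartitionCols`); whether this class fills partition columns this way is not stated on this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionColsSketch {
    /** Hypothetical long column that can represent a constant compactly. */
    static final class LongColumn {
        final long[] vector;
        boolean isRepeating; // when true, vector[0] applies to every row

        LongColumn(int capacity) {
            vector = new long[capacity];
        }

        long get(int row) {
            return isRepeating ? vector[0] : vector[row];
        }
    }

    /** Marks each partition column as a repeated constant taken from partitionValues. */
    static void addPartitionCols(Map<String, LongColumn> columns, Map<String, Long> partitionValues) {
        for (Map.Entry<String, Long> entry : partitionValues.entrySet()) {
            LongColumn column = columns.get(entry.getKey());
            if (column == null) {
                throw new IllegalArgumentException("No column for partition key " + entry.getKey());
            }
            column.vector[0] = entry.getValue();
            column.isRepeating = true;
        }
    }

    public static void main(String[] args) {
        Map<String, LongColumn> columns = new LinkedHashMap<>();
        columns.put("year", new LongColumn(1024));
        Map<String, Long> partitionValues = new LinkedHashMap<>();
        partitionValues.put("year", 2014L);
        addPartitionCols(columns, partitionValues);
        System.out.println("year at row 500 = " + columns.get("year").get(500));
    }
}
```

Storing the value once keeps the cost of adding partition columns independent of the number of rows in the batch.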


Copyright © 2014 The Apache Software Foundation. All rights reserved.