org.apache.hadoop.hive.ql.exec.persistence
Class PTFRowContainer<Row extends List<Object>>
java.lang.Object
org.apache.hadoop.hive.ql.exec.persistence.RowContainer<Row>
org.apache.hadoop.hive.ql.exec.persistence.PTFRowContainer<Row>
- All Implemented Interfaces:
- AbstractRowContainer<Row>, AbstractRowContainer.RowIterator<Row>
public class PTFRowContainer<Row extends List<Object>>
- extends RowContainer<Row>
Extends the RowContainer functionality to provide random access via getAt(i).
It extends RowContainer behavior in the following ways:
- You must continue to call first to signal the transition from writing to the
Container to reading from it.
- As rows are being added, the position at which a spill occurs is captured as a
BlockInfo object. The BlockInfo records the offset in the File at which the current
Block will be written.
- When first is called, we associate with each BlockInfo the File Split that it
occurs in.
- To read a random row from the Container, we do the following:
- Convert the row index into a block number. This is easy because all blocks are
the same size, given by blockSize.
- The corresponding BlockInfo tells us the Split that this block starts in; by
looking at the next BlockInfo in the list, we also know which Split this block ends in.
- We then arrange to read all the Splits that contain rows for this block. For the
first Split we seek to the startOffset that was captured in the BlockInfo.
- After reading the Splits, all rows in this block are in the 'currentReadBlock'.
- We track the span of the currentReadBlock using currentReadBlockStartRow and
blockSize, so if a row is requested within this span we don't need to read rows
from disk.
- If the requested row is in the 'last' block, we point the currentReadBlock to
the currentWriteBlock, the same as what RowContainer does.
- A getAt call leaves the Container in the same state as a next call, so getAt
and next calls can be interspersed.
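The index arithmetic described above can be sketched as follows. This is a hypothetical standalone illustration, not Hive's implementation: the class name BlockIndexSketch is invented, while blockSize and currentReadBlockStartRow come from the field names the description mentions.

```java
// Hypothetical sketch of the row-to-block arithmetic described above.
// The real PTFRowContainer fields and methods may differ.
class BlockIndexSketch {
    final int blockSize;
    int currentReadBlockStartRow = -1; // start row of the block cached in memory

    BlockIndexSketch(int blockSize) {
        this.blockSize = blockSize;
    }

    // Every block holds exactly blockSize rows, so the block number for a
    // row index is a simple integer division.
    int blockFor(int rowIdx) {
        return rowIdx / blockSize;
    }

    // A requested row can be served from memory when it falls inside the span
    // [currentReadBlockStartRow, currentReadBlockStartRow + blockSize);
    // otherwise the block must be read from its Splits on disk.
    boolean inCurrentReadBlock(int rowIdx) {
        return rowIdx >= currentReadBlockStartRow
            && rowIdx < currentReadBlockStartRow + blockSize;
    }
}
```
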
Constructor Summary
PTFRowContainer(int bs,
org.apache.hadoop.conf.Configuration jc,
org.apache.hadoop.mapred.Reporter reporter)
PTFRowContainer
public PTFRowContainer(int bs,
org.apache.hadoop.conf.Configuration jc,
org.apache.hadoop.mapred.Reporter reporter)
throws HiveException
- Throws:
HiveException
addRow
public void addRow(Row t)
throws HiveException
- Description copied from interface:
AbstractRowContainer
- Add a row into the RowContainer.
- Specified by:
addRow
in interface AbstractRowContainer<Row extends List<Object>>
- Overrides:
addRow
in class RowContainer<Row extends List<Object>>
- Parameters:
t
- row
- Throws:
HiveException
first
public Row first()
throws HiveException
- Specified by:
first
in interface AbstractRowContainer.RowIterator<Row extends List<Object>>
- Overrides:
first
in class RowContainer<Row extends List<Object>>
- Throws:
HiveException
next
public Row next()
throws HiveException
- Specified by:
next
in interface AbstractRowContainer.RowIterator<Row extends List<Object>>
- Overrides:
next
in class RowContainer<Row extends List<Object>>
- Throws:
HiveException
clearRows
public void clearRows()
throws HiveException
- Description copied from class:
RowContainer
- Remove all elements in the RowContainer.
- Specified by:
clearRows
in interface AbstractRowContainer<Row extends List<Object>>
- Overrides:
clearRows
in class RowContainer<Row extends List<Object>>
- Throws:
HiveException
close
public void close()
throws HiveException
- Throws:
HiveException
getAt
public Row getAt(int rowIdx)
throws HiveException
- Throws:
HiveException
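The contract stated in the class description, that getAt leaves the Container in the same state as a next call, can be illustrated with a minimal in-memory stand-in. This is a hypothetical sketch of the iteration contract only, not the Hive class, which spills blocks to disk:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal in-memory stand-in illustrating the getAt/next contract:
// getAt(i) repositions the cursor as if next() had just returned row i,
// so getAt and next calls can be freely interspersed.
class RowIteratorSketch<Row> {
    private final List<Row> rows = new ArrayList<>();
    private int readIdx;

    void addRow(Row r) { rows.add(r); }

    // first() resets the cursor and returns the first row.
    Row first() { readIdx = 0; return next(); }

    // next() returns the row at the cursor and advances it, or null at the end.
    Row next() { return readIdx < rows.size() ? rows.get(readIdx++) : null; }

    // getAt(i) moves the cursor to row i, then behaves like next().
    Row getAt(int rowIdx) {
        readIdx = rowIdx;
        return next();
    }
}
```

With rows "a", "b", "c": first() returns "a", getAt(2) returns "c", and after getAt(0) a subsequent next() returns "b", since the cursor was left just past row 0.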
createTableDesc
public static TableDesc createTableDesc(StructObjectInspector oI)
Copyright © 2014 The Apache Software Foundation. All rights reserved.