MultiTableInputFormatBase (Hortonworks Data Platform Apache HBase Java API Reference)

java.lang.Object
- org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
- - org.apache.hadoop.hbase.mapreduce.MultiTableInputFormatBase

Direct Known Subclasses:

MultiTableInputFormat
```
public abstract class MultiTableInputFormatBase
extends InputFormat<ImmutableBytesWritable,Result>
```
A base class for multi-table input formats such as MultiTableInputFormat. It receives a list of Scan instances that define the input tables, filters, and so on. Subclasses may supply a different TableRecordReader implementation.

Constructor Summary

Constructors
Constructor and Description
`MultiTableInputFormatBase()`

Method Summary

Modifier and Type	Method and Description
`RecordReader<ImmutableBytesWritable,Result>`	`createRecordReader(InputSplit split, TaskAttemptContext context)` Builds a TableRecordReader.
`protected java.util.List<Scan>`	`getScans()` Allows subclasses to get the list of `Scan` objects.
`java.util.List<InputSplit>`	`getSplits(JobContext context)` Calculates the splits that will serve as input for the map tasks.
`protected boolean`	`includeRegionInSplit(byte[] startKey, byte[] endKey)` Test if the given region is to be included in the InputSplit while splitting the regions of a table.
`protected void`	`setScans(java.util.List<Scan> scans)` Allows subclasses to set the list of `Scan` objects.
`protected void`	`setTableRecordReader(TableRecordReader tableRecordReader)` Allows subclasses to set the `TableRecordReader`.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - MultiTableInputFormatBase
```
public MultiTableInputFormatBase()
```
- Method Detail
  - createRecordReader
```
public RecordReader<ImmutableBytesWritable,Result> createRecordReader(InputSplit split,
                                                                      TaskAttemptContext context)
                                                               throws java.io.IOException,
                                                                      java.lang.InterruptedException
```
    Builds a TableRecordReader. If no TableRecordReader was provided, uses the default.
    
    Parameters:
    
    split - The split to work with.
    
    context - The current context.
    
    Returns:
    
    The newly created record reader.
    
    Throws:
    
    java.io.IOException - When creating the reader fails.
    
    java.lang.InterruptedException - When record reader initialization fails.
    
    See Also:
    
    org.apache.hadoop.mapreduce.InputFormat#createRecordReader( org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext)
  - getSplits
```
public java.util.List<InputSplit> getSplits(JobContext context)
                                     throws java.io.IOException
```
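    Calculates the splits that will serve as input for the map tasks. By default, one split is produced per region of each input table, so the number of splits matches the number of regions; regions rejected by includeRegionInSplit produce no split.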
    
    Parameters:
    
    context - The current job context.
    
    Returns:
    
    The list of input splits.
    
    Throws:
    
    java.io.IOException - When creating the list of splits fails.
    
    See Also:
    
    org.apache.hadoop.mapreduce.InputFormat#getSplits(org.apache.hadoop.mapreduce.JobContext)
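    
    The one-split-per-region rule can be pictured with a small stand-alone model. This is only a sketch in plain Java, not the HBase implementation: the class SplitPlanSketch, its nested Split type, and the planSplits helper are hypothetical names invented for illustration, and the model ignores any row range set on a Scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Stand-alone model of split planning; none of these types are HBase classes. */
final class SplitPlanSketch {

    /** One planned split: the table it reads plus the region's [startKey, endKey) range. */
    static final class Split {
        final String table;
        final byte[] startKey;
        final byte[] endKey;

        Split(String table, byte[] startKey, byte[] endKey) {
            this.table = table;
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /**
     * Emits one split per region of every table, skipping regions the filter rejects.
     * Each region is given as a two-element array: {startKey, endKey}.
     * The filter plays the role of includeRegionInSplit(startKey, endKey).
     */
    static List<Split> planSplits(Map<String, List<byte[][]>> regionsByTable,
                                  BiPredicate<byte[], byte[]> includeRegion) {
        List<Split> splits = new ArrayList<>();
        for (Map.Entry<String, List<byte[][]>> table : regionsByTable.entrySet()) {
            for (byte[][] region : table.getValue()) {
                if (includeRegion.test(region[0], region[1])) {
                    splits.add(new Split(table.getKey(), region[0], region[1]));
                }
            }
        }
        return splits;
    }
}
```

    With a filter that accepts everything, the model yields exactly as many splits as there are regions across the input tables; a stricter filter yields fewer.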
  - includeRegionInSplit
```
protected boolean includeRegionInSplit(byte[] startKey,
                                       byte[] endKey)
```
    Tests whether the given region should be included in the InputSplit while splitting the regions of a table.
    This optimization is effective when there is a specific reason to exclude an entire region from the MapReduce job (so that it contributes no InputSplit), based on the region's start and end keys.
    It is useful when a job must remember the last-processed top record and continuously revisit the [last, current) interval. Besides reducing the number of InputSplits, it also reduces the load on the region server, due to the ordering of the keys.
    
    Note: endKey.length == 0 is possible for the last (most recent) region.
    Override this method to exclude whole regions from the MapReduce job in bulk. By default, no region is excluded (i.e. all regions are included).
    
    Parameters:
    
    startKey - Start key of the region
    
    endKey - End key of the region
    
    Returns:
    
    true, if this region needs to be included as part of the input (default).
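    
    As an illustration of the "last-processed record" use case described above, the sketch below models the decision an override might make. It is a stand-alone class in plain Java (Java 9 or later, for Arrays.compareUnsigned) rather than an actual subclass; RegionFilterSketch and lastProcessedKey are hypothetical names, and it assumes row keys are compared as unsigned bytes in lexicographic order.

```java
import java.util.Arrays;

/** Stand-alone model of a region filter; not an HBase class. */
final class RegionFilterSketch {

    // Hypothetical checkpoint: the last row key that a previous job finished processing.
    private final byte[] lastProcessedKey;

    RegionFilterSketch(byte[] lastProcessedKey) {
        this.lastProcessedKey = lastProcessedKey.clone();
    }

    /**
     * Mirrors the shape of includeRegionInSplit(byte[] startKey, byte[] endKey).
     * A region covers [startKey, endKey); an empty endKey means "no upper bound".
     */
    boolean includeRegionInSplit(byte[] startKey, byte[] endKey) {
        if (endKey.length == 0) {
            // The last region is unbounded above, so it may always hold newer keys.
            return true;
        }
        // The region holds some key >= lastProcessedKey only if its exclusive
        // end key sorts strictly after lastProcessedKey.
        return Arrays.compareUnsigned(endKey, lastProcessedKey) > 0;
    }
}
```

    In a real subclass the same comparison would live in the overridden method, and regions that lie wholly before the checkpoint would then produce no split.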
  - getScans
```
protected java.util.List<Scan> getScans()
```
    Allows subclasses to get the list of Scan objects.
  - setScans
```
protected void setScans(java.util.List<Scan> scans)
```
    Allows subclasses to set the list of Scan objects.
    
    Parameters:
    
    scans - The list of Scan objects that define the input.
  - setTableRecordReader
```
protected void setTableRecordReader(TableRecordReader tableRecordReader)
```
    Allows subclasses to set the TableRecordReader.
    
    Parameters:
    
    tableRecordReader - A different TableRecordReader implementation.
