SymlinkTextInputFormat (Hive Query Language 0.13.0.2.1.2.0-402 API)

org.apache.hadoop.hive.ql.io
Class SymlinkTextInputFormat

java.lang.Object
  org.apache.hadoop.hive.ql.io.SymbolicInputFormat
      org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat

All Implemented Interfaces: ContentSummaryInputFormat, ReworkMapredInputFormat, org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>, org.apache.hadoop.mapred.JobConfigurable

public class SymlinkTextInputFormat
extends SymbolicInputFormat
implements org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>, org.apache.hadoop.mapred.JobConfigurable, ContentSummaryInputFormat, ReworkMapredInputFormat

A symlink file is a text file that contains a list of file names and directory names. This input format reads symlink files from the specified job input paths and uses the files and directories listed in those symlink files as the actual map-reduce input. The target input data should be in TextInputFormat.
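
The resolution step described above can be illustrated without Hadoop. The following is a minimal sketch in plain Java, not Hive's implementation: the class `SymlinkFileSketch` and its methods `parseTargets` and `resolveTargets` are hypothetical names, the paths are made up, and the sketch assumes one target path per line with blank lines skipped.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; this is not Hive's implementation.
class SymlinkFileSketch {

    // Turns the lines of one symlink file into a list of target paths.
    // Assumption: one target path per line; blank lines are ignored.
    static List<String> parseTargets(List<String> lines) {
        List<String> targets = new ArrayList<>();
        for (String line : lines) {
            String target = line.trim();
            if (!target.isEmpty()) {
                targets.add(target);
            }
        }
        return targets;
    }

    // Reads every symlink file and collects all target paths they list.
    static List<String> resolveTargets(List<Path> symlinkFiles) throws IOException {
        List<String> targets = new ArrayList<>();
        for (Path symlinkFile : symlinkFiles) {
            targets.addAll(parseTargets(Files.readAllLines(symlinkFile, StandardCharsets.UTF_8)));
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        // Write a temporary symlink file that lists one file and one directory.
        Path symlinkFile = Files.createTempFile("symlink", ".txt");
        Files.write(symlinkFile,
                Arrays.asList("/data/logs/part-00000", "", "/data/archive/"),
                StandardCharsets.UTF_8);

        // The symlink file is the job input; the listed paths are the real input.
        System.out.println(resolveTargets(Arrays.asList(symlinkFile)));
        Files.delete(symlinkFile);
    }
}
```

In this sketch the symlink file plays the role of the job input path, while the returned list stands for the data that would actually be read.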

Nested Class Summary
`static class`	`SymlinkTextInputFormat.SymlinkTextInputSplit` This input split wraps the FileSplit generated by TextInputFormat.getSplits() and sets the original symlink file path as the job input path.

Constructor Summary
`SymlinkTextInputFormat()`

Method Summary
`void`	`configure(org.apache.hadoop.mapred.JobConf job)`
`org.apache.hadoop.fs.ContentSummary`	`getContentSummary(org.apache.hadoop.fs.Path p, org.apache.hadoop.mapred.JobConf job)`
`org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>`	`getRecordReader(org.apache.hadoop.mapred.InputSplit split, org.apache.hadoop.mapred.JobConf job, org.apache.hadoop.mapred.Reporter reporter)`
`org.apache.hadoop.mapred.InputSplit[]`	`getSplits(org.apache.hadoop.mapred.JobConf job, int numSplits)` Parses all target paths from the symlink files in the job input directory, and splits the target data using TextInputFormat.
`void`	`validateInput(org.apache.hadoop.mapred.JobConf job)` For backward compatibility with Hadoop 0.17.

Methods inherited from class org.apache.hadoop.hive.ql.io.SymbolicInputFormat
`rework`

Methods inherited from class java.lang.Object
`equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Methods inherited from interface org.apache.hadoop.hive.ql.io.ReworkMapredInputFormat
`rework`

Constructor Detail

SymlinkTextInputFormat

public SymlinkTextInputFormat()

Method Detail

getRecordReader

public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
    getRecordReader(org.apache.hadoop.mapred.InputSplit split,
                    org.apache.hadoop.mapred.JobConf job,
                    org.apache.hadoop.mapred.Reporter reporter)
             throws IOException

Specified by: getRecordReader in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>

Throws: IOException

getSplits

public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
                                                       int numSplits)
                                                throws IOException

Parses all target paths from the symlink files in the job input directory, and splits the target data using TextInputFormat.

Specified by: getSplits in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>

Throws: IOException
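
To illustrate the idea of splitting target data while keeping track of the symlink file that referenced it, here is a minimal plain-Java sketch. It does not use Hadoop and does not reproduce Hive's logic: `SymlinkSplitSketch`, `TargetSplit`, and `splitTarget` are hypothetical names, the paths are made up, and the fixed-size byte ranges are a simplifying assumption standing in for whatever TextInputFormat actually computes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Hive's implementation.
class SymlinkSplitSketch {

    // Cuts one target file into byte ranges of at most splitSize bytes.
    // Each range remembers the symlink file that listed the target.
    static List<TargetSplit> splitTarget(String symlinkPath, String targetPath,
                                         long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<TargetSplit> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            splits.add(new TargetSplit(symlinkPath, targetPath, start, length));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<TargetSplit> splits =
                splitTarget("/warehouse/t/links.txt", "/data/logs/part-00000", 250, 100);
        for (TargetSplit split : splits) {
            System.out.println(split);
        }
    }
}

// Hypothetical stand-in for a split that wraps a byte range of a target file.
final class TargetSplit {
    final String symlinkPath; // symlink file that listed the target
    final String targetPath;  // file that actually holds the data
    final long start;         // byte offset where this split begins
    final long length;        // number of bytes covered by this split

    TargetSplit(String symlinkPath, String targetPath, long start, long length) {
        this.symlinkPath = symlinkPath;
        this.targetPath = targetPath;
        this.start = start;
        this.length = length;
    }

    @Override
    public String toString() {
        return targetPath + "[" + start + "+" + length + "] via " + symlinkPath;
    }
}
```

Carrying the symlink path alongside the target range mirrors, in spirit, what the nested SymlinkTextInputSplit class is described as doing: the split points at the target data but still identifies the symlink file as the job input.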

configure

public void configure(org.apache.hadoop.mapred.JobConf job)

Specified by: configure in interface org.apache.hadoop.mapred.JobConfigurable

validateInput

public void validateInput(org.apache.hadoop.mapred.JobConf job)
                   throws IOException

For backward compatibility with Hadoop 0.17.

Throws: IOException

getContentSummary

public org.apache.hadoop.fs.ContentSummary getContentSummary(org.apache.hadoop.fs.Path p,
                                                             org.apache.hadoop.mapred.JobConf job)
                                                      throws IOException

Specified by: getContentSummary in interface ContentSummaryInputFormat

Throws: IOException