org.apache.hadoop.hive.ql.io
Class SymlinkTextInputFormat
java.lang.Object
org.apache.hadoop.hive.ql.io.SymbolicInputFormat
org.apache.hadoop.hive.ql.io.SymlinkTextInputFormat
- All Implemented Interfaces:
- ContentSummaryInputFormat, ReworkMapredInputFormat, org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>, org.apache.hadoop.mapred.JobConfigurable
public class SymlinkTextInputFormat
- extends SymbolicInputFormat
- implements org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>, org.apache.hadoop.mapred.JobConfigurable, ContentSummaryInputFormat, ReworkMapredInputFormat
A symlink file is a text file that contains a list of file names and directory names.
This input format reads the symlink files found in the specified job input paths and
uses the files and directories listed in those symlink files as the
actual map-reduce input. The target input data should be readable by TextInputFormat.
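The resolution step described above can be sketched in plain Java. This is a minimal illustration, not Hive's implementation: the class `SymlinkResolutionSketch` and the method `resolveTargets` are hypothetical names, the local file system stands in for the Hadoop file system, and the sketch assumes one target per line with blank lines ignored.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; not part of Hive or Hadoop. */
class SymlinkResolutionSketch {

    /**
     * Treats every regular file in inputDir as a symlink file and returns
     * the targets listed in those files, visiting the files in path order.
     */
    static List<String> resolveTargets(Path inputDir) throws IOException {
        // Collect the symlink files found directly under the input directory.
        List<Path> symlinkFiles = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir)) {
            for (Path candidate : stream) {
                if (Files.isRegularFile(candidate)) {
                    symlinkFiles.add(candidate);
                }
            }
        }
        // Directory listing order is unspecified, so sort for a stable result.
        Collections.sort(symlinkFiles);

        // Assumption: each non-blank line names one target file or directory.
        List<String> targets = new ArrayList<>();
        for (Path symlinkFile : symlinkFiles) {
            for (String line : Files.readAllLines(symlinkFile, StandardCharsets.UTF_8)) {
                String target = line.trim();
                if (!target.isEmpty()) {
                    targets.add(target);
                }
            }
        }
        return targets;
    }
}
```

In this sketch the returned strings play the role of the actual input: a job would read those targets rather than the symlink files themselves.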
Nested Class Summary
static class
SymlinkTextInputFormat.SymlinkTextInputSplit
This input split wraps the FileSplit generated from
TextInputFormat.getSplits(), while setting the original link file path
as job input path.
Method Summary
void
configure(org.apache.hadoop.mapred.JobConf job)
org.apache.hadoop.fs.ContentSummary
getContentSummary(org.apache.hadoop.fs.Path p,
org.apache.hadoop.mapred.JobConf job)
org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
getRecordReader(org.apache.hadoop.mapred.InputSplit split,
org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.mapred.Reporter reporter)
org.apache.hadoop.mapred.InputSplit[]
getSplits(org.apache.hadoop.mapred.JobConf job,
int numSplits)
Reads the symlink files in the job input directories, collects the target
paths they list, and splits the target data using TextInputFormat.
void
validateInput(org.apache.hadoop.mapred.JobConf job)
For backward compatibility with Hadoop 0.17.
SymlinkTextInputFormat
public SymlinkTextInputFormat()
getRecordReader
public org.apache.hadoop.mapred.RecordReader<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text> getRecordReader(org.apache.hadoop.mapred.InputSplit split,
org.apache.hadoop.mapred.JobConf job,
org.apache.hadoop.mapred.Reporter reporter)
throws IOException
- Specified by:
getRecordReader
in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
- Throws:
IOException
getSplits
public org.apache.hadoop.mapred.InputSplit[] getSplits(org.apache.hadoop.mapred.JobConf job,
int numSplits)
throws IOException
- Reads the symlink files in the job input directories, collects the target
paths they list, and splits the target data using TextInputFormat.
- Specified by:
getSplits
in interface org.apache.hadoop.mapred.InputFormat<org.apache.hadoop.io.LongWritable,org.apache.hadoop.io.Text>
- Throws:
IOException
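The nested class summary notes that each split wraps a FileSplit over the target data while keeping the original link file path. A rough sketch of that pairing follows. The names `SymlinkSplitSketch`, `Split`, and `splitTarget` are hypothetical, and cutting a file into fixed-size byte ranges is a simplification assumed for illustration; it is not how TextInputFormat actually computes splits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of Hive or Hadoop. */
class SymlinkSplitSketch {

    /** A byte range of a target file, tagged with the symlink file that listed it. */
    static final class Split {
        final String symlinkPath; // the original link file path
        final String targetPath;  // the file that is actually read
        final long start;         // offset of the first byte in the range
        final long length;        // number of bytes in the range

        Split(String symlinkPath, String targetPath, long start, long length) {
            this.symlinkPath = symlinkPath;
            this.targetPath = targetPath;
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Cuts one target file into ranges of at most splitSize bytes.
     * Assumption: fixed-size ranges stand in for the real split logic.
     */
    static List<Split> splitTarget(String symlinkPath, String targetPath,
                                   long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Split> splits = new ArrayList<>();
        for (long start = 0; start < fileLength; start += splitSize) {
            long length = Math.min(splitSize, fileLength - start);
            splits.add(new Split(symlinkPath, targetPath, start, length));
        }
        return splits;
    }
}
```

Carrying the symlink path on every split lets later stages relate a split back to the configured job input path even though the bytes come from a different location.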
configure
public void configure(org.apache.hadoop.mapred.JobConf job)
- Specified by:
configure
in interface org.apache.hadoop.mapred.JobConfigurable
validateInput
public void validateInput(org.apache.hadoop.mapred.JobConf job)
throws IOException
- For backward compatibility with Hadoop 0.17.
- Throws:
IOException
getContentSummary
public org.apache.hadoop.fs.ContentSummary getContentSummary(org.apache.hadoop.fs.Path p,
org.apache.hadoop.mapred.JobConf job)
throws IOException
- Specified by:
getContentSummary
in interface ContentSummaryInputFormat
- Throws:
IOException
Copyright © 2014 The Apache Software Foundation. All rights reserved.