org.apache.hadoop.hive.ql.io
Class AcidUtils

java.lang.Object
  extended by org.apache.hadoop.hive.ql.io.AcidUtils

public class AcidUtils
extends Object

Utilities that are shared by all of the ACID input and output formats. They are used by the compactor and cleaner and thus must be format agnostic.


Nested Class Summary
static interface AcidUtils.Directory
           
static class AcidUtils.ParsedDelta
           
 
Field Summary
static String BASE_PREFIX
           
static Pattern BUCKET_DIGIT_PATTERN
           
static String BUCKET_DIGITS
           
static String BUCKET_PREFIX
           
static org.apache.hadoop.fs.PathFilter bucketFileFilter
           
static String DELTA_DIGITS
           
static String DELTA_PREFIX
           
static org.apache.hadoop.fs.PathFilter hiddenFileFilter
           
static Pattern LEGACY_BUCKET_DIGIT_PATTERN
           
 
Method Summary
static org.apache.hadoop.fs.Path createBucketFile(org.apache.hadoop.fs.Path subdir, int bucket)
          Create the bucket filename.
static org.apache.hadoop.fs.Path createFilename(org.apache.hadoop.fs.Path directory, AcidOutputFormat.Options options)
          Create a filename for a bucket file.
static org.apache.hadoop.fs.Path[] deserializeDeltas(org.apache.hadoop.fs.Path root, List<Long> deltas)
          Convert the list of begin/end transaction id pairs to a list of delta directories.
static AcidUtils.Directory getAcidState(org.apache.hadoop.fs.Path directory, org.apache.hadoop.conf.Configuration conf, ValidTxnList txnList)
          Get the ACID state of the given directory.
static org.apache.hadoop.fs.Path[] getPaths(List<AcidUtils.ParsedDelta> deltas)
          Convert a list of deltas to a list of delta directories.
static AcidOutputFormat.Options parseBaseBucketFilename(org.apache.hadoop.fs.Path bucketFile, org.apache.hadoop.conf.Configuration conf)
          Parse a bucket filename back into the options that would have created the file.
static List<Long> serializeDeltas(List<AcidUtils.ParsedDelta> deltas)
          Convert the list of deltas into an equivalent list of begin/end transaction id pairs.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

BASE_PREFIX

public static final String BASE_PREFIX
See Also:
Constant Field Values

DELTA_PREFIX

public static final String DELTA_PREFIX
See Also:
Constant Field Values

BUCKET_PREFIX

public static final String BUCKET_PREFIX
See Also:
Constant Field Values

BUCKET_DIGITS

public static final String BUCKET_DIGITS
See Also:
Constant Field Values

DELTA_DIGITS

public static final String DELTA_DIGITS
See Also:
Constant Field Values

BUCKET_DIGIT_PATTERN

public static final Pattern BUCKET_DIGIT_PATTERN

LEGACY_BUCKET_DIGIT_PATTERN

public static final Pattern LEGACY_BUCKET_DIGIT_PATTERN

hiddenFileFilter

public static final org.apache.hadoop.fs.PathFilter hiddenFileFilter

bucketFileFilter

public static final org.apache.hadoop.fs.PathFilter bucketFileFilter

Method Detail

createBucketFile

public static org.apache.hadoop.fs.Path createBucketFile(org.apache.hadoop.fs.Path subdir,
                                                         int bucket)
Create the bucket filename.

Parameters:
subdir - the subdirectory for the bucket.
bucket - the bucket number
Returns:
the filename
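
Example (illustrative sketch only): the stand-alone code below shows the kind of name this method produces, without any Hadoop dependency. The literals "bucket_" and "%05d" are assumptions standing in for BUCKET_PREFIX and BUCKET_DIGITS; consult Constant Field Values for the actual values. The class and method names are hypothetical and are not part of AcidUtils.

```java
public class BucketFileNameSketch {
    // Assumed stand-ins for AcidUtils.BUCKET_PREFIX and AcidUtils.BUCKET_DIGITS.
    static final String BUCKET_PREFIX = "bucket_";
    static final String BUCKET_DIGITS = "%05d";

    /** Joins a subdirectory and a zero-padded bucket number into a path string. */
    static String bucketFile(String subdir, int bucket) {
        return subdir + "/" + BUCKET_PREFIX + String.format(BUCKET_DIGITS, bucket);
    }

    public static void main(String[] args) {
        // Build the name of bucket 3 inside a hypothetical base directory.
        System.out.println(bucketFile("/warehouse/t/base_0000005", 3));
    }
}
```

Zero padding keeps bucket files in numeric order when a directory listing is sorted lexicographically.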

createFilename

public static org.apache.hadoop.fs.Path createFilename(org.apache.hadoop.fs.Path directory,
                                                       AcidOutputFormat.Options options)
Create a filename for a bucket file.

Parameters:
directory - the partition directory
options - the options for writing the bucket
Returns:
the filename that should store the bucket
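
Example (illustrative sketch only): the code below models how a partition directory, a transaction range, and a bucket number might combine into a full bucket path. It assumes a base directory is named after its highest transaction id and a delta directory after its begin/end transaction ids. The Options class here is a reduced, hypothetical stand-in for AcidOutputFormat.Options, and all literal prefixes and digit formats are assumed values rather than confirmed constants.

```java
public class AcidFilenameSketch {
    // Assumed stand-ins for the AcidUtils constants of the same names.
    static final String BASE_PREFIX = "base_";
    static final String DELTA_PREFIX = "delta_";
    static final String BUCKET_PREFIX = "bucket_";
    static final String BUCKET_DIGITS = "%05d";
    static final String DELTA_DIGITS = "%07d";

    /** Hypothetical, reduced stand-in for AcidOutputFormat.Options. */
    static final class Options {
        final boolean writingBase;
        final long minTxn;
        final long maxTxn;
        final int bucket;

        Options(boolean writingBase, long minTxn, long maxTxn, int bucket) {
            this.writingBase = writingBase;
            this.minTxn = minTxn;
            this.maxTxn = maxTxn;
            this.bucket = bucket;
        }
    }

    /** Builds directory/subdir/bucket, where subdir is a base or delta name. */
    static String createFilename(String directory, Options o) {
        String subdir;
        if (o.writingBase) {
            // A base is identified by the highest transaction it contains.
            subdir = BASE_PREFIX + String.format(DELTA_DIGITS, o.maxTxn);
        } else {
            // A delta is identified by its begin/end transaction ids.
            subdir = DELTA_PREFIX + String.format(DELTA_DIGITS, o.minTxn)
                    + "_" + String.format(DELTA_DIGITS, o.maxTxn);
        }
        return directory + "/" + subdir + "/"
                + BUCKET_PREFIX + String.format(BUCKET_DIGITS, o.bucket);
    }

    public static void main(String[] args) {
        System.out.println(createFilename("/p", new Options(true, 0, 5, 1)));
        System.out.println(createFilename("/p", new Options(false, 6, 8, 0)));
    }
}
```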

parseBaseBucketFilename

public static AcidOutputFormat.Options parseBaseBucketFilename(org.apache.hadoop.fs.Path bucketFile,
                                                               org.apache.hadoop.conf.Configuration conf)
Parse a bucket filename back into the options that would have created the file.

Parameters:
bucketFile - the path to a bucket file
conf - the configuration
Returns:
the options used to create that filename
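
Example (illustrative sketch only): the code below recovers a transaction id and a bucket number from a path string, as a simplified model of parsing a bucket filename back into its options. It assumes the layout .../base_<txn>/bucket_<number>; the regular expression, class, and method names are hypothetical, and legacy filename formats are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketFilenameParseSketch {
    // Assumed layout: <anything>/base_<digits>/bucket_<digits>
    private static final Pattern BASE_BUCKET =
            Pattern.compile("(?:^|/)base_(\\d+)/bucket_(\\d+)$");

    /**
     * Returns {transactionId, bucket} for a path that matches the assumed
     * layout, or null when the path does not match.
     */
    static long[] parse(String bucketFile) {
        Matcher m = BASE_BUCKET.matcher(bucketFile);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2))
        };
    }

    public static void main(String[] args) {
        long[] parsed = parse("/p/base_0000005/bucket_00003");
        System.out.println("txn=" + parsed[0] + " bucket=" + parsed[1]);
    }
}
```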

getPaths

public static org.apache.hadoop.fs.Path[] getPaths(List<AcidUtils.ParsedDelta> deltas)
Convert a list of deltas to a list of delta directories.

Parameters:
deltas - the list of deltas out of a Directory object.
Returns:
a list of delta directory paths that need to be read

serializeDeltas

public static List<Long> serializeDeltas(List<AcidUtils.ParsedDelta> deltas)
Convert the list of deltas into an equivalent list of begin/end transaction id pairs.

Parameters:
deltas - the list of deltas to convert
Returns:
the list of transaction ids to serialize

deserializeDeltas

public static org.apache.hadoop.fs.Path[] deserializeDeltas(org.apache.hadoop.fs.Path root,
                                                            List<Long> deltas)
Convert the list of begin/end transaction id pairs to a list of delta directories.

Parameters:
root - the root directory
deltas - list of begin/end transaction id pairs
Returns:
the list of delta paths
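
Example (illustrative sketch only): serializeDeltas and deserializeDeltas are described as inverse conversions between deltas and begin/end transaction id pairs. The code below models that round trip with plain strings. It assumes the pairs are flattened as [begin1, end1, begin2, end2, ...] and that a delta directory is named delta_<begin>_<end> with seven-digit padding; the Delta class is a hypothetical stand-in for AcidUtils.ParsedDelta.

```java
import java.util.ArrayList;
import java.util.List;

public class DeltaSerializationSketch {
    // Assumed stand-ins for AcidUtils.DELTA_PREFIX and AcidUtils.DELTA_DIGITS.
    static final String DELTA_PREFIX = "delta_";
    static final String DELTA_DIGITS = "%07d";

    /** Hypothetical stand-in for AcidUtils.ParsedDelta: one transaction range. */
    static final class Delta {
        final long min;
        final long max;

        Delta(long min, long max) {
            this.min = min;
            this.max = max;
        }
    }

    /** Flattens each delta into two consecutive entries: begin, then end. */
    static List<Long> serialize(List<Delta> deltas) {
        List<Long> result = new ArrayList<>(deltas.size() * 2);
        for (Delta d : deltas) {
            result.add(d.min);
            result.add(d.max);
        }
        return result;
    }

    /** Rebuilds one delta directory path per begin/end pair under root. */
    static String[] deserialize(String root, List<Long> pairs) {
        String[] paths = new String[pairs.size() / 2];
        for (int i = 0; i < paths.length; i++) {
            paths[i] = root + "/" + DELTA_PREFIX
                    + String.format(DELTA_DIGITS, pairs.get(2 * i))
                    + "_" + String.format(DELTA_DIGITS, pairs.get(2 * i + 1));
        }
        return paths;
    }

    public static void main(String[] args) {
        List<Delta> deltas = new ArrayList<>();
        deltas.add(new Delta(6, 8));
        deltas.add(new Delta(9, 9));
        for (String path : deserialize("/p", serialize(deltas))) {
            System.out.println(path);
        }
    }
}
```

A flat list of longs is compact enough to pass through a job configuration, which is one plausible reason for serializing deltas this way.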

getAcidState

public static AcidUtils.Directory getAcidState(org.apache.hadoop.fs.Path directory,
                                               org.apache.hadoop.conf.Configuration conf,
                                               ValidTxnList txnList)
                                        throws IOException
Get the ACID state of the given directory by finding the minimal set of base and delta directories that must be read. Because major compactions do not preserve history, a base directory cannot be used if it includes a transaction id that must be excluded.

Parameters:
directory - the partition directory to analyze
conf - the configuration
txnList - the list of transactions that we are reading
Returns:
the state of the directory
Throws:
IOException
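
Example (illustrative sketch only): the code below is a simplified model of the selection rule described above, not the Hive implementation. It assumes that a base with id N contains every transaction up to N, that excluded transactions are given as a plain set (a hypothetical stand-in for ValidTxnList), and that a delta is needed only if it reaches past the transactions already covered. It does not filter individual deltas against the excluded set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class AcidStateSketch {
    /** Hypothetical stand-in for a parsed delta directory. */
    static final class Delta {
        final long min;
        final long max;

        Delta(long min, long max) {
            this.min = min;
            this.max = max;
        }

        @Override
        public String toString() {
            return "delta(" + min + "," + max + ")";
        }
    }

    /**
     * Picks the largest base id that has no excluded transaction at or below
     * it. Returns 0 when no base is usable.
     */
    static long chooseBase(List<Long> baseTxns, TreeSet<Long> excluded) {
        long best = 0;
        for (long base : baseTxns) {
            boolean coversExcluded = !excluded.isEmpty() && excluded.first() <= base;
            if (!coversExcluded && base > best) {
                best = base;
            }
        }
        return best;
    }

    /** Keeps only the deltas that extend coverage beyond the chosen base. */
    static List<Delta> chooseDeltas(long base, List<Delta> deltas) {
        List<Delta> sorted = new ArrayList<>(deltas);
        // Ascending by begin id; for equal begin ids, the widest delta first.
        sorted.sort((a, b) -> a.min != b.min
                ? Long.compare(a.min, b.min)
                : Long.compare(b.max, a.max));
        List<Delta> kept = new ArrayList<>();
        long current = base;
        for (Delta d : sorted) {
            if (d.max > current) {
                kept.add(d);
                current = d.max;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        TreeSet<Long> excluded = new TreeSet<>(Arrays.asList(7L));
        long base = chooseBase(Arrays.asList(5L, 10L), excluded);
        List<Delta> deltas = Arrays.asList(
                new Delta(6, 6), new Delta(7, 7), new Delta(6, 8),
                new Delta(3, 4), new Delta(9, 9));
        System.out.println("base=" + base + " deltas=" + chooseDeltas(base, deltas));
    }
}
```

In this model, base 10 is rejected because it would include excluded transaction 7, so base 5 is chosen and the deltas covering transactions 6 through 9 must be read on top of it.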


Copyright © 2014 The Apache Software Foundation. All rights reserved.