org.apache.hadoop.hive.ql.metadata
Interface HiveStorageHandler

All Superinterfaces:
org.apache.hadoop.conf.Configurable
All Known Implementing Classes:
DefaultStorageHandler

public interface HiveStorageHandler
extends org.apache.hadoop.conf.Configurable

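HiveStorageHandler defines a pluggable interface for adding new storage handlers to Hive. A storage handler bundles an InputFormat, an OutputFormat, a SerDe, an optional metadata hook (HiveMetaHook), an authorization provider, and rules for configuring the jobs that access tables stored by the handler.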

Storage handler classes are plugged in using the STORED BY 'classname' clause in CREATE TABLE.
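
As a rough illustration of the clause, the sketch below assembles a CREATE TABLE statement in Java. The table name, column list, and handler class name (com.example.ExampleStorageHandler) are hypothetical; only the STORED BY 'classname' form comes from the text above.

```java
// Sketch only: builds the DDL text; it does not connect to Hive.
class StoredByDdlSketch {

    // Returns a CREATE TABLE statement that plugs in the given handler class.
    static String createTableDdl(String table, String columns, String handlerClass) {
        return "CREATE TABLE " + table + " (" + columns + ") "
             + "STORED BY '" + handlerClass + "'";
    }

    public static void main(String[] args) {
        // All three arguments are made-up example values.
        System.out.println(createTableDdl(
            "example_table",
            "key INT, value STRING",
            "com.example.ExampleStorageHandler"));
    }
}
```

The handler class named in the clause would need to be on Hive's classpath when the statement runs.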


Method Summary
 void configureInputJobProperties(TableDesc tableDesc, Map<String,String> jobProperties)
          Called to give the storage handler a chance to populate JobContext.getConfiguration() with properties that may be needed by the handler's bundled artifacts (e.g. InputFormat, SerDe).
 void configureJobConf(TableDesc tableDesc, org.apache.hadoop.mapred.JobConf jobConf)
          Called just before the MapReduce job is submitted.
 void configureOutputJobProperties(TableDesc tableDesc, Map<String,String> jobProperties)
          Called to give the storage handler a chance to populate JobContext.getConfiguration() with properties that may be needed by the handler's bundled artifacts (e.g. OutputFormat, SerDe).
 void configureTableJobProperties(TableDesc tableDesc, Map<String,String> jobProperties)
          Deprecated. 
 HiveAuthorizationProvider getAuthorizationProvider()
          Returns the implementation-specific authorization provider.
 Class<? extends org.apache.hadoop.mapred.InputFormat> getInputFormatClass()
           
 HiveMetaHook getMetaHook()
           
 Class<? extends org.apache.hadoop.mapred.OutputFormat> getOutputFormatClass()
           
 Class<? extends SerDe> getSerDeClass()
           
 
Methods inherited from interface org.apache.hadoop.conf.Configurable
getConf, setConf
 

Method Detail

getInputFormatClass

Class<? extends org.apache.hadoop.mapred.InputFormat> getInputFormatClass()
Returns:
Class providing an implementation of InputFormat

getOutputFormatClass

Class<? extends org.apache.hadoop.mapred.OutputFormat> getOutputFormatClass()
Returns:
Class providing an implementation of OutputFormat

getSerDeClass

Class<? extends SerDe> getSerDeClass()
Returns:
Class providing an implementation of SerDe

getMetaHook

HiveMetaHook getMetaHook()
Returns:
metadata hook implementation, or null if this storage handler does not need any metadata notifications

getAuthorizationProvider

HiveAuthorizationProvider getAuthorizationProvider()
                                                   throws HiveException
Returns the implementation-specific authorization provider.

Returns:
authorization provider
Throws:
HiveException

configureInputJobProperties

void configureInputJobProperties(TableDesc tableDesc,
                                 Map<String,String> jobProperties)
Called to give the storage handler a chance to populate JobContext.getConfiguration() with properties that may be needed by the handler's bundled artifacts (e.g. InputFormat, SerDe). Key-value pairs put into jobProperties are guaranteed to be set in the job's configuration object. Users can retrieve "context" information from tableDesc, but should avoid mutating tableDesc and make changes only in jobProperties. This method is expected to be idempotent: called with the same tableDesc values, it should produce the same key-value pairs in jobProperties, and any external state it sets should remain the same if it is called again. It is up to the user to determine how best to guarantee this invariant. This method specifically creates the configuration for input.

Parameters:
tableDesc - descriptor for the table being accessed
jobProperties - receives properties copied or transformed from the table properties
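
The contract above (write only to jobProperties, leave the table descriptor untouched, produce the same pairs for the same input) might be satisfied along the lines of the following sketch. To stay self-contained it takes a plain java.util.Properties in place of TableDesc; in a real handler those values would come from the table descriptor. The class name and the "example.input." key prefix are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class InputJobPropertiesSketch {

    // Hypothetical prefix marking the table properties this handler cares about.
    static final String PREFIX = "example.input.";

    // Copies matching table properties into jobProperties.
    // It only reads tableProps and depends on no other state, so repeated
    // calls with equal input yield equal output (idempotent).
    static void configureInputJobProperties(Properties tableProps,
                                            Map<String, String> jobProperties) {
        for (String name : tableProps.stringPropertyNames()) {
            if (name.startsWith(PREFIX)) {
                jobProperties.put(name, tableProps.getProperty(name));
            }
        }
    }

    public static void main(String[] args) {
        Properties tableProps = new Properties();
        tableProps.setProperty("example.input.table", "events");
        tableProps.setProperty("unrelated.key", "ignored");

        Map<String, String> jobProperties = new HashMap<>();
        configureInputJobProperties(tableProps, jobProperties);
        // A second call with the same input leaves the map unchanged.
        configureInputJobProperties(tableProps, jobProperties);
        System.out.println(jobProperties);
    }
}
```

Deriving every value purely from the table properties, rather than from timestamps, random identifiers, or counters, is one simple way to keep the method idempotent.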

configureOutputJobProperties

void configureOutputJobProperties(TableDesc tableDesc,
                                  Map<String,String> jobProperties)
Called to give the storage handler a chance to populate JobContext.getConfiguration() with properties that may be needed by the handler's bundled artifacts (e.g. OutputFormat, SerDe). Key-value pairs put into jobProperties are guaranteed to be set in the job's configuration object. Users can retrieve "context" information from tableDesc, but should avoid mutating tableDesc and make changes only in jobProperties. This method is expected to be idempotent: called with the same tableDesc values, it should produce the same key-value pairs in jobProperties, and any external state it sets should remain the same if it is called again. It is up to the user to determine how best to guarantee this invariant. This method specifically creates the configuration for output.

Parameters:
tableDesc - descriptor for the table being accessed
jobProperties - receives properties copied or transformed from the table properties

configureTableJobProperties

@Deprecated
void configureTableJobProperties(TableDesc tableDesc,
                                            Map<String,String> jobProperties)
Deprecated. 

Use configureInputJobProperties/configureOutputJobProperties instead. Configures properties for a job based on the definition of the source or target table it accesses.

Parameters:
tableDesc - descriptor for the table being accessed
jobProperties - receives properties copied or transformed from the table properties
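
One way a handler might keep this method working for older callers while moving its logic to the two replacement methods is to delegate to them, as in the sketch below. This is an assumption about a reasonable migration path, not behavior documented here. The sketch uses java.util.Properties in place of TableDesc so that it runs on its own, and the keys "example.source" and "example.target" are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class DeprecatedDelegationSketch {

    // Input-side configuration: copies the (hypothetical) source setting.
    static void configureInputJobProperties(Properties tableProps,
                                            Map<String, String> jobProperties) {
        copyIfPresent(tableProps, "example.source", jobProperties);
    }

    // Output-side configuration: copies the (hypothetical) target setting.
    static void configureOutputJobProperties(Properties tableProps,
                                             Map<String, String> jobProperties) {
        copyIfPresent(tableProps, "example.target", jobProperties);
    }

    // Retained for older callers; simply applies both replacements.
    @Deprecated
    static void configureTableJobProperties(Properties tableProps,
                                            Map<String, String> jobProperties) {
        configureInputJobProperties(tableProps, jobProperties);
        configureOutputJobProperties(tableProps, jobProperties);
    }

    // Copies one property into the target map when it is set.
    private static void copyIfPresent(Properties from, String key,
                                      Map<String, String> to) {
        String value = from.getProperty(key);
        if (value != null) {
            to.put(key, value);
        }
    }

    public static void main(String[] args) {
        Properties tableProps = new Properties();
        tableProps.setProperty("example.source", "in_table");
        tableProps.setProperty("example.target", "out_table");

        Map<String, String> jobProperties = new HashMap<>();
        configureTableJobProperties(tableProps, jobProperties);
        System.out.println(jobProperties.size()); // both keys were copied
    }
}
```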

configureJobConf

void configureJobConf(TableDesc tableDesc,
                      org.apache.hadoop.mapred.JobConf jobConf)
Called just before the MapReduce job is submitted.

Parameters:
tableDesc - descriptor for the table being accessed
jobConf - JobConf for the MapReduce job


Copyright © 2014 The Apache Software Foundation. All rights reserved.