org.apache.hadoop.hive.ql.stats
Interface StatsPublisher

All Known Implementing Classes:
CounterStatsPublisher, FSStatsPublisher, JDBCStatsPublisher

public interface StatsPublisher

An interface for implementations that publish statistics.


Method Summary
 boolean closeConnection()
          This method closes the connection to the temporary storage.
 boolean connect(org.apache.hadoop.conf.Configuration hconf)
          This method connects to the intermediate statistics database.
 boolean init(org.apache.hadoop.conf.Configuration hconf)
          This method performs the necessary one-time initialization, possibly creating the database and tables if they do not exist.
 boolean publishStat(String fileID, Map<String,String> stats)
          This method publishes the given statistics to persistent storage, possibly HBase or MySQL.
 

Method Detail

init

boolean init(org.apache.hadoop.conf.Configuration hconf)
This method performs the necessary one-time initialization, possibly creating the database and tables if they do not exist. It is usually called on the Hive client side rather than by the mappers/reducers, so that initialization happens only once.

Parameters:
hconf - HiveConf that contains the configuration parameters used to connect to the intermediate statistics database.
Returns:
true if initialization is successful, false otherwise.

connect

boolean connect(org.apache.hadoop.conf.Configuration hconf)
This method connects to the intermediate statistics database.

Parameters:
hconf - HiveConf that contains the connection parameters.
Returns:
true if connection is successful, false otherwise.

publishStat

boolean publishStat(String fileID,
                    Map<String,String> stats)
This method publishes the given statistics to persistent storage, possibly HBase or MySQL.

Parameters:
fileID - a string identifying the statistics to be published by all mappers/reducers and then gathered. The ID is unique per output partition per task, e.g., the output directory name (unique per FileSinkOperator) + the partition specs (only for dynamic partitions) + taskID (last component of the task file).
stats - a map of key-value pairs, where each key is a string naming the statistic to be published and each value is a string holding the value of that statistic.
Returns:
true if successful, false otherwise
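
The following is a minimal sketch of how the two publishStat arguments might be assembled from the components listed above. The helper names buildFileId and buildStats, the "/" separator, and the sample paths are assumptions made for illustration; the interface itself does not prescribe how the ID is joined. The keys "numRows" and "rawDataSize" are used only as example statistic names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class StatIdSketch {

    // Hypothetical helper: joins the components named in the fileID description.
    // The "/" separator is an assumption, not something the interface specifies.
    static String buildFileId(String outputDir, String partitionSpec, String taskId) {
        StringBuilder id = new StringBuilder(outputDir);
        // Partition specs are only present for dynamic partitions.
        if (partitionSpec != null && !partitionSpec.isEmpty()) {
            id.append('/').append(partitionSpec);
        }
        id.append('/').append(taskId);
        return id.toString();
    }

    // Hypothetical helper: both keys and values are strings, as the stats parameter requires.
    static Map<String, String> buildStats(long numRows, long rawDataSize) {
        Map<String, String> stats = new LinkedHashMap<>();
        stats.put("numRows", Long.toString(numRows));
        stats.put("rawDataSize", Long.toString(rawDataSize));
        return stats;
    }

    public static void main(String[] args) {
        String fileId = buildFileId("/tmp/hive/out", "ds=2014-01-01", "000000_0");
        Map<String, String> stats = buildStats(42L, 1024L);
        System.out.println(fileId + " -> " + stats);
    }
}
```

The resulting string and map would then be passed to publishStat(fileID, stats) on a connected publisher.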

closeConnection

boolean closeConnection()
This method closes the connection to the temporary storage.
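
The call order implied by the four methods (init once on the client, then connect, publishStat, and closeConnection in each task) can be sketched as follows. To keep the example free of Hadoop dependencies, it uses a stand-in interface, SimplePublisher, that mirrors the StatsPublisher methods but takes java.util.Properties in place of org.apache.hadoop.conf.Configuration. SimplePublisher, InMemoryPublisher, publishOnce, and lookup are all hypothetical names; a real implementation would implement StatsPublisher directly and write to external storage.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class StatsLifecycleSketch {

    // Stand-in mirroring the four StatsPublisher methods (assumption: Properties
    // replaces Hadoop's Configuration so the sketch compiles on its own).
    interface SimplePublisher {
        boolean init(Properties conf);
        boolean connect(Properties conf);
        boolean publishStat(String fileID, Map<String, String> stats);
        boolean closeConnection();
    }

    // Hypothetical in-memory implementation used only for illustration.
    static final class InMemoryPublisher implements SimplePublisher {
        private final Map<String, Map<String, String>> store = new HashMap<>();
        private boolean connected;

        @Override
        public boolean init(Properties conf) {
            store.clear(); // one-time setup: start from an empty store
            return true;
        }

        @Override
        public boolean connect(Properties conf) {
            connected = true;
            return true;
        }

        @Override
        public boolean publishStat(String fileID, Map<String, String> stats) {
            // Report failure through the return value, as the interface does.
            if (!connected || fileID == null || stats == null) {
                return false;
            }
            store.put(fileID, new HashMap<>(stats));
            return true;
        }

        @Override
        public boolean closeConnection() {
            connected = false;
            return true;
        }

        Map<String, String> lookup(String fileID) {
            return store.get(fileID);
        }
    }

    // Task-side sequence: connect, publish, and always close the connection.
    static boolean publishOnce(SimplePublisher publisher, Properties conf,
                               String fileID, Map<String, String> stats) {
        if (!publisher.connect(conf)) {
            return false;
        }
        try {
            return publisher.publishStat(fileID, stats);
        } finally {
            publisher.closeConnection();
        }
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        InMemoryPublisher publisher = new InMemoryPublisher();

        // Client side: one-time initialization.
        if (!publisher.init(conf)) {
            throw new IllegalStateException("init failed");
        }

        // Task side: publish one set of statistics.
        Map<String, String> stats = new HashMap<>();
        stats.put("numRows", "42");
        boolean ok = publishOnce(publisher, conf, "/tmp/hive/out/000000_0", stats);
        System.out.println("published=" + ok);
    }
}
```

Because every method signals failure by returning false rather than throwing, callers in this sketch check each return value before moving to the next step.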



Copyright © 2014 The Apache Software Foundation. All rights reserved.