org.apache.hadoop.hive.ql.udf.generic
Class GenericUDAFCovariance.GenericUDAFCovarianceEvaluator
java.lang.Object
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCovariance.GenericUDAFCovarianceEvaluator
- All Implemented Interfaces:
- Closeable
- Direct Known Subclasses:
- GenericUDAFCovarianceSample.GenericUDAFCovarianceSampleEvaluator
- Enclosing class:
- GenericUDAFCovariance
public static class GenericUDAFCovariance.GenericUDAFCovarianceEvaluator
- extends GenericUDAFEvaluator
Evaluates the covariance using the algorithm described in
http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance,
presumably by Pébay, Philippe (2008), in "Formulas for Robust,
One-Pass Parallel Computation of Covariances and Arbitrary-Order
Statistical Moments", Technical Report SAND2008-6212,
Sandia National Laboratories,
http://infoserve.sandia.gov/sand_doc/2008/086212.pdf
Incremental:
n : count of (x, y) pairs
mx_n = mx_(n-1) + [x_n - mx_(n-1)]/n : running mean of x
my_n = my_(n-1) + [y_n - my_(n-1)]/n : running mean of y
c_n = c_(n-1) + (x_n - mx_(n-1))*(y_n - my_n) : covariance * n
Merge:
c_X = c_A + c_B + (mx_A - mx_B)*(my_A - my_B)*n_A*n_B/n_X, where n_X = n_A + n_B
This one-pass algorithm is numerically stable.
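The incremental and merge formulas above can be sketched in plain Java as follows. This is a minimal illustration, not Hive's implementation: the class CovarianceSketch and its members (add, merge, populationCovariance) are hypothetical names, and dividing c by n to obtain a population covariance is an assumption about how the running co-moment is finalized.

```java
// Hypothetical stand-alone accumulator; it does not use any Hive classes.
final class CovarianceSketch {
    long n;      // number of (x, y) pairs seen so far
    double mx;   // running mean of x
    double my;   // running mean of y
    double c;    // co-moment: n times the population covariance

    // Incremental step: fold one (x, y) pair into the state.
    void add(double x, double y) {
        n++;
        double dx = x - mx;   // x_n - mx_(n-1): uses the mean of x BEFORE the update
        mx += dx / n;         // mx_n
        my += (y - my) / n;   // my_n
        c += dx * (y - my);   // (x_n - mx_(n-1)) * (y_n - my_n)
    }

    // Merge step: fold partial state b into this state.
    void merge(CovarianceSketch b) {
        if (b.n == 0) {
            return;           // nothing to merge
        }
        if (n == 0) {         // this side is empty: copy b
            n = b.n; mx = b.mx; my = b.my; c = b.c;
            return;
        }
        long total = n + b.n; // n_X = n_A + n_B
        double dmx = mx - b.mx;
        double dmy = my - b.my;
        c += b.c + dmx * dmy * ((double) n * b.n / total);
        mx = (n * mx + b.n * b.mx) / total;
        my = (n * my + b.n * b.my) / total;
        n = total;
    }

    // Assumed finalization: population covariance; yields NaN when n == 0.
    double populationCovariance() {
        return c / n;
    }

    public static void main(String[] args) {
        CovarianceSketch s = new CovarianceSketch();
        s.add(1, 2);
        s.add(2, 4);
        s.add(3, 6);
        s.add(4, 9);
        System.out.println(s.populationCovariance());
    }
}
```

Splitting the input into two accumulators and merging them should give the same co-moment as a single pass, which is what lets partial aggregations be combined.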
GenericUDAFCovariance.GenericUDAFCovarianceEvaluator
public GenericUDAFCovariance.GenericUDAFCovarianceEvaluator()
init
public ObjectInspector init(GenericUDAFEvaluator.Mode m,
ObjectInspector[] parameters)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Initialize the evaluator.
- Overrides:
init
in class GenericUDAFEvaluator
- Parameters:
m
- The mode of aggregation.
parameters
- The ObjectInspectors for the parameters. In PARTIAL1 and COMPLETE
mode, the parameters are original data; in PARTIAL2 and FINAL
mode, the parameters are partial aggregations (in that case,
the array always has a single element).
- Returns:
- The ObjectInspector for the return value. In PARTIAL1 and PARTIAL2
mode, this is the ObjectInspector for the return value of
terminatePartial(); in FINAL and COMPLETE mode, it is the
ObjectInspector for the return value of terminate().
NOTE: We need ObjectInspector[] (in addition to the TypeInfo[] in
GenericUDAFResolver) for 2 reasons: 1. ObjectInspector contains
more information than TypeInfo. 2. We call
GenericUDAFResolver.getEvaluator at compilation time, and
GenericUDAFEvaluator.init at execution time.
- Throws:
HiveException
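The relationship between the aggregation mode, the kind of input, and the call whose return value the ObjectInspector describes can be summarized in a small sketch. The enum ModeSketch below is a hypothetical stand-in written only for illustration; it is not Hive's GenericUDAFEvaluator.Mode, and it encodes only what the init description states.

```java
// Hypothetical mirror of the four modes described for init(); not a Hive class.
enum ModeSketch {
    PARTIAL1(true, true),    // original data in, partial aggregation out
    PARTIAL2(false, true),   // partial aggregation in, partial aggregation out
    FINAL(false, false),     // partial aggregation in, final result out
    COMPLETE(true, false);   // original data in, final result out

    final boolean takesOriginalData;
    final boolean returnsPartial;

    ModeSketch(boolean takesOriginalData, boolean returnsPartial) {
        this.takesOriginalData = takesOriginalData;
        this.returnsPartial = returnsPartial;
    }

    // What the parameters array passed to init() describes in this mode.
    String inputKind() {
        return takesOriginalData
                ? "original data"
                : "partial aggregation (single element)";
    }

    // Which call's return value the returned ObjectInspector describes.
    String returnSource() {
        return returnsPartial ? "terminatePartial()" : "terminate()";
    }

    public static void main(String[] args) {
        for (ModeSketch m : values()) {
            System.out.println(m + ": " + m.inputKind() + " -> " + m.returnSource());
        }
    }
}
```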
getNewAggregationBuffer
public GenericUDAFEvaluator.AggregationBuffer getNewAggregationBuffer()
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Get a new aggregation object.
- Specified by:
getNewAggregationBuffer
in class GenericUDAFEvaluator
- Throws:
HiveException
reset
public void reset(GenericUDAFEvaluator.AggregationBuffer agg)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Reset the aggregation. This is useful if we want to reuse the same
aggregation.
- Specified by:
reset
in class GenericUDAFEvaluator
- Throws:
HiveException
iterate
public void iterate(GenericUDAFEvaluator.AggregationBuffer agg,
Object[] parameters)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Iterate through original data.
- Specified by:
iterate
in class GenericUDAFEvaluator
- Parameters:
parameters
- The objects of parameters.
- Throws:
HiveException
terminatePartial
public Object terminatePartial(GenericUDAFEvaluator.AggregationBuffer agg)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Get partial aggregation result.
- Specified by:
terminatePartial
in class GenericUDAFEvaluator
- Returns:
- partial aggregation result.
- Throws:
HiveException
merge
public void merge(GenericUDAFEvaluator.AggregationBuffer agg,
Object partial)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Merge with partial aggregation result. NOTE: null might be passed in case
there is no input data.
- Specified by:
merge
in class GenericUDAFEvaluator
- Parameters:
partial
- The partial aggregation result.
- Throws:
HiveException
terminate
public Object terminate(GenericUDAFEvaluator.AggregationBuffer agg)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Get final aggregation result.
- Specified by:
terminate
in class GenericUDAFEvaluator
- Returns:
- final aggregation result.
- Throws:
HiveException
setResult
public void setResult(DoubleWritable result)
getResult
public DoubleWritable getResult()
Copyright © 2014 The Apache Software Foundation. All rights reserved.