org.apache.hadoop.hive.ql.udf.generic
Class GenericUDAFCorrelation.GenericUDAFCorrelationEvaluator
java.lang.Object
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFCorrelation.GenericUDAFCorrelationEvaluator
- All Implemented Interfaces:
- Closeable
- Enclosing class:
- GenericUDAFCorrelation
public static class GenericUDAFCorrelation.GenericUDAFCorrelationEvaluator
- extends GenericUDAFEvaluator
Evaluate the Pearson correlation coefficient using a stable one-pass
algorithm, based on work by Philippe Pébay and Donald Knuth.
Incremental:
n : count
mx_n = mx_(n-1) + [x_n - mx_(n-1)]/n : mean of x
my_n = my_(n-1) + [y_n - my_(n-1)]/n : mean of y
c_n = c_(n-1) + (x_n - mx_(n-1))*(y_n - my_n) : covariance * n
vx_n = vx_(n-1) + (x_n - mx_n)*(x_n - mx_(n-1)) : variance of x * n
vy_n = vy_(n-1) + (y_n - my_n)*(y_n - my_(n-1)) : variance of y * n
Merge:
c_(A,B) = c_A + c_B + (mx_A - mx_B)*(my_A - my_B)*n_A*n_B/(n_A+n_B)
vx_(A,B) = vx_A + vx_B + (mx_A - mx_B)*(mx_A - mx_B)*n_A*n_B/(n_A+n_B)
vy_(A,B) = vy_A + vy_B + (my_A - my_B)*(my_A - my_B)*n_A*n_B/(n_A+n_B)
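The incremental recurrences above can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the Hive implementation: the class name CorrelationSketch, its fields, and the NaN convention for undefined results are hypothetical. The last step assumes the coefficient is c / sqrt(vx * vy), which follows from the Pearson definition because the factors of n cancel.

```java
/** Hypothetical stand-alone sketch of the one-pass update; not Hive's buffer class. */
final class CorrelationSketch {
    long n;        // number of (x, y) pairs seen so far
    double mx, my; // running means of x and y
    double c;      // co-moment, i.e. covariance * n
    double vx, vy; // second moments, i.e. variance * n for x and y

    /** Applies the incremental recurrences for one (x, y) pair. */
    void add(double x, double y) {
        n++;
        double dx = x - mx;   // x_n - mx_(n-1)
        double dy = y - my;   // y_n - my_(n-1)
        mx += dx / n;         // mx_n
        my += dy / n;         // my_n
        c += dx * (y - my);   // old mean of x, new mean of y
        vx += (x - mx) * dx;  // new mean times old-mean deviation
        vy += (y - my) * dy;
    }

    /** Pearson coefficient; NaN is this sketch's choice for undefined cases. */
    double correlation() {
        if (n < 2) {
            return Double.NaN;
        }
        return c / Math.sqrt(vx * vy); // 0/0 yields NaN when a variance is zero
    }

    public static void main(String[] args) {
        CorrelationSketch s = new CorrelationSketch();
        s.add(1, 2);
        s.add(2, 4);
        s.add(3, 6);
        System.out.println(s.correlation());
    }
}
```

Because each update touches only six scalars, the state stays constant in size regardless of how many rows are processed.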
GenericUDAFCorrelation.GenericUDAFCorrelationEvaluator
public GenericUDAFCorrelation.GenericUDAFCorrelationEvaluator()
init
public ObjectInspector init(GenericUDAFEvaluator.Mode m,
ObjectInspector[] parameters)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Initialize the evaluator.
- Overrides:
init
in class GenericUDAFEvaluator
- Parameters:
m
- The mode of aggregation.
parameters
- The ObjectInspector for the parameters: in PARTIAL1 and COMPLETE
mode, the parameters are original data; in PARTIAL2 and FINAL
mode, the parameters are just partial aggregations (in that case,
the array will always have a single element).
- Returns:
- The ObjectInspector for the return value. In PARTIAL1 and PARTIAL2
mode, the ObjectInspector for the return value of
terminatePartial() call; In FINAL and COMPLETE mode, the
ObjectInspector for the return value of terminate() call.
NOTE: We need ObjectInspector[] (in addition to the TypeInfo[] in
GenericUDAFResolver) for 2 reasons: 1. ObjectInspector contains
more information than TypeInfo; and 2. We call
GenericUDAFResolver.getEvaluator at compilation time, and
GenericUDAFEvaluator.init at execution time.
- Throws:
HiveException
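The mode rules described for init can be summarized in a small stand-alone sketch. The enum ModeSketch and its two methods are hypothetical names introduced only for illustration. They mirror the four GenericUDAFEvaluator.Mode values named above but are not part of Hive.

```java
/** Hypothetical stand-in mirroring the four aggregation modes named in init. */
enum ModeSketch {
    PARTIAL1, PARTIAL2, FINAL, COMPLETE;

    /** True when init receives inspectors for the original data columns. */
    boolean inputIsOriginalData() {
        return this == PARTIAL1 || this == COMPLETE;
    }

    /** True when init returns the inspector for terminatePartial's result. */
    boolean returnsPartialInspector() {
        return this == PARTIAL1 || this == PARTIAL2;
    }

    public static void main(String[] args) {
        for (ModeSketch m : values()) {
            System.out.println(m
                + " input=" + (m.inputIsOriginalData() ? "original data" : "partial aggregation")
                + " output=" + (m.returnsPartialInspector() ? "terminatePartial" : "terminate"));
        }
    }
}
```

An evaluator's init would typically branch on these two questions: which inspectors to keep for reading input, and which inspector to hand back for the output.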
getNewAggregationBuffer
public GenericUDAFEvaluator.AggregationBuffer getNewAggregationBuffer()
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Get a new aggregation object.
- Specified by:
getNewAggregationBuffer
in class GenericUDAFEvaluator
- Throws:
HiveException
reset
public void reset(GenericUDAFEvaluator.AggregationBuffer agg)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Reset the aggregation. This is useful if we want to reuse the same
aggregation.
- Specified by:
reset
in class GenericUDAFEvaluator
- Throws:
HiveException
iterate
public void iterate(GenericUDAFEvaluator.AggregationBuffer agg,
Object[] parameters)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Iterate through original data.
- Specified by:
iterate
in class GenericUDAFEvaluator
- Parameters:
parameters
- The objects of parameters.
- Throws:
HiveException
terminatePartial
public Object terminatePartial(GenericUDAFEvaluator.AggregationBuffer agg)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Get partial aggregation result.
- Specified by:
terminatePartial
in class GenericUDAFEvaluator
- Returns:
- partial aggregation result.
- Throws:
HiveException
merge
public void merge(GenericUDAFEvaluator.AggregationBuffer agg,
Object partial)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Merge with partial aggregation result. NOTE: null might be passed in case
there is no input data.
- Specified by:
merge
in class GenericUDAFEvaluator
- Parameters:
partial
- The partial aggregation result.
- Throws:
HiveException
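The merge formulas from the class description, together with the null case noted above, might be combined as in the following sketch. It is an illustration under stated assumptions, not Hive's code: the class PartialSketch and its fields are hypothetical, and the merged means are computed as count-weighted averages, which the formulas in the class description do not spell out.

```java
/** Hypothetical stand-alone partial state; not Hive's actual buffer or struct. */
final class PartialSketch {
    long n;        // number of pairs in this partial
    double mx, my; // means of x and y
    double c;      // covariance * n
    double vx, vy; // variance * n for x and y

    PartialSketch(long n, double mx, double my, double c, double vx, double vy) {
        this.n = n; this.mx = mx; this.my = my;
        this.c = c; this.vx = vx; this.vy = vy;
    }

    /** Folds partial b into this state; a null or empty partial is ignored. */
    void merge(PartialSketch b) {
        if (b == null || b.n == 0) {
            return; // no input data on that side
        }
        if (n == 0) { // this side is empty: adopt b as-is
            n = b.n; mx = b.mx; my = b.my; c = b.c; vx = b.vx; vy = b.vy;
            return;
        }
        long total = n + b.n;
        double dx = mx - b.mx;              // mx_A - mx_B
        double dy = my - b.my;              // my_A - my_B
        double w = (double) n * b.n / total; // n_A*n_B/(n_A+n_B)
        c += b.c + dx * dy * w;
        vx += b.vx + dx * dx * w;
        vy += b.vy + dy * dy * w;
        mx = (mx * n + b.mx * b.n) / total;  // assumption: count-weighted mean
        my = (my * n + b.my * b.n) / total;
        n = total;
    }
}
```

For example, merging the partial for the pairs (1,2) and (2,4) with the partial for (3,6) gives c = 4, vx = 2 and vy = 8, the same values a single pass over all three pairs produces.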
terminate
public Object terminate(GenericUDAFEvaluator.AggregationBuffer agg)
throws HiveException
- Description copied from class:
GenericUDAFEvaluator
- Get final aggregation result.
- Specified by:
terminate
in class GenericUDAFEvaluator
- Returns:
- final aggregation result.
- Throws:
HiveException
setResult
public void setResult(DoubleWritable result)
getResult
public DoubleWritable getResult()
Copyright © 2014 The Apache Software Foundation. All rights reserved.