3.1.9. Uber Jobs (Technical Preview)

Note

This feature is a technical preview and is considered to be under development. Do not use this feature in your production systems. If you have questions regarding this feature, contact Support by logging a case on the Hortonworks Support Portal at https://support.hortonworks.com.

An Uber Job is a job whose mappers and reducers are combined to run in a single Container. Four core settings in mapred-site.xml control Uber Jobs; they are described in the following table.

Configuration options for Uber Jobs

Property Description
mapreduce.job.ubertask.enable

Whether to enable the small-jobs "ubertask" optimization, which runs "sufficiently small" jobs sequentially within a single JVM. "Small" is defined by the following maxmaps, maxreduces, and maxbytes settings. Users can override this value.

Default = false

mapreduce.job.ubertask.maxmaps

The threshold for the number of maps beyond which a job is considered too large for the ubertasking optimization. Users can override this value, but only downward.

Default = 9

mapreduce.job.ubertask.maxreduces

The threshold for the number of reduces beyond which a job is considered too large for the ubertasking optimization. Currently, the code cannot support more than one reduce and ignores larger values (zero is a valid maximum value, however). Users can override this value, but only downward.

Default = 1

mapreduce.job.ubertask.maxbytes

The threshold for the number of input bytes beyond which a job is considered too large for the ubertasking optimization. If no value is specified, dfs.block.size is used as the default. Be sure to specify a default value in mapred-site.xml if the underlying file system is not HDFS. Users can override this value, but only downward.

Default = HDFS Block Size
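
As an illustration of how the four properties fit together, the following mapred-site.xml fragment is a minimal sketch that enables Uber Jobs. The values shown are assumptions chosen for the example, not recommendations: maxmaps is lowered to 4, maxreduces is left at its default of 1, and maxbytes is set explicitly to 134217728 (128 MB), as would be needed when the underlying file system is not HDFS. Adjust the values to suit your cluster.

```xml
<!-- Illustrative values only; place inside the <configuration> element of mapred-site.xml -->
<property>
  <name>mapreduce.job.ubertask.enable</name>
  <!-- Default is false; set to true to turn on the ubertask optimization -->
  <value>true</value>
</property>
<property>
  <name>mapreduce.job.ubertask.maxmaps</name>
  <!-- Example value; jobs with more maps than this are not run as Uber Jobs -->
  <value>4</value>
</property>
<property>
  <name>mapreduce.job.ubertask.maxreduces</name>
  <!-- Values larger than 1 are ignored; 0 is also valid -->
  <value>1</value>
</property>
<property>
  <name>mapreduce.job.ubertask.maxbytes</name>
  <!-- Example value: 128 MB of input, stated explicitly instead of relying on the block size -->
  <value>134217728</value>
</property>
```

With settings like these, a job is treated as "small" only if it stays within all three thresholds; a job that exceeds any one of them runs as a regular job.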