Configuration changes between MRv1 and MRv2
Reviewing the differences between MapReduce version 1 (MRv1) and YARN/MapReduce version 2 (MRv2) helps you understand which configuration parameters have replaced the deprecated ones.
JobTracker Properties and ResourceManager Equivalents
MRv1 | YARN / MRv2 |
---|---|
mapred.jobtracker.taskScheduler | yarn.resourcemanager.scheduler.class |
mapred.jobtracker.completeuserjobs.maximum | yarn.resourcemanager.max-completed-applications |
mapred.jobtracker.restart.recover | yarn.resourcemanager.recovery.enabled |
mapred.job.tracker | yarn.resourcemanager.hostname or all of the following: yarn.resourcemanager.address, yarn.resourcemanager.scheduler.address, yarn.resourcemanager.resource-tracker.address, yarn.resourcemanager.admin.address |
mapred.job.tracker.http.address | yarn.resourcemanager.webapp.address or yarn.resourcemanager.hostname |
mapred.job.tracker.handler.count | yarn.resourcemanager.resource-tracker.client.thread-count |
mapred.hosts | yarn.resourcemanager.nodes.include-path |
mapred.hosts.exclude | yarn.resourcemanager.nodes.exclude-path |
mapred.cluster.max.map.memory.mb | yarn.scheduler.maximum-allocation-mb |
mapred.cluster.max.reduce.memory.mb | yarn.scheduler.maximum-allocation-mb |
mapred.acls.enabled | yarn.acl.enable |
mapreduce.cluster.acls.enabled | yarn.acl.enable |
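As a rough illustration of the mapred.job.tracker row, the sketch below derives the four explicit ResourceManager addresses from an MRv1 host:port value. The function name and the example hostname are hypothetical, and the port numbers are the stock Apache YARN defaults (8032, 8030, 8031, 8033); check them against your distribution before relying on them. The sketch also assumes the ResourceManager runs on the same host as the old JobTracker, which may not hold for your cluster.

```python
# Stock Apache YARN default ports; an assumption that may differ per distribution.
RM_DEFAULT_PORTS = {
    "yarn.resourcemanager.address": 8032,
    "yarn.resourcemanager.scheduler.address": 8030,
    "yarn.resourcemanager.resource-tracker.address": 8031,
    "yarn.resourcemanager.admin.address": 8033,
}


def rm_addresses_from_jobtracker(job_tracker):
    """Build explicit ResourceManager addresses from a mapred.job.tracker value.

    Hypothetical helper: keeps the host part of "host:port" and pairs it
    with the default port of each ResourceManager endpoint.
    """
    host = job_tracker.rsplit(":", 1)[0]  # drop the JobTracker port, if any
    return {name: f"{host}:{port}" for name, port in RM_DEFAULT_PORTS.items()}


if __name__ == "__main__":
    for name, address in rm_addresses_from_jobtracker("jt.example.com:8021").items():
        print(f"{name} = {address}")
```

Setting yarn.resourcemanager.hostname alone is usually the shorter route; the explicit form is mainly useful when individual endpoints need non-default ports.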
JobTracker Properties and JobHistoryServer Equivalents
MRv1 | YARN / MRv2 | Comment |
---|---|---|
mapred.job.tracker.retiredjobs.cache.size | mapreduce.jobhistory.joblist.cache.size | |
mapred.job.tracker.jobhistory.lru.cache.size | mapreduce.jobhistory.loadedjobs.cache.size | |
mapred.job.tracker.history.completed.location | mapreduce.jobhistory.done-dir | Local file system in MRv1; stored in HDFS in MRv2 |
hadoop.job.history.user.location | mapreduce.jobhistory.done-dir | |
hadoop.job.history.location | mapreduce.jobhistory.done-dir | |
JobTracker Properties and MapReduce ApplicationMaster Equivalents
MRv1 | YARN / MRv2 | Comment |
---|---|---|
mapreduce.jobtracker.staging.root.dir | yarn.app.mapreduce.am.staging-dir | Now configurable per job |
TaskTracker Properties and NodeManager Equivalents
MRv1 | YARN / MRv2 |
---|---|
mapred.tasktracker.map.tasks.maximum | yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores |
mapred.tasktracker.reduce.tasks.maximum | yarn.nodemanager.resource.memory-mb and yarn.nodemanager.resource.cpu-vcores |
mapred.tasktracker.expiry.interval | yarn.nm.liveliness-monitor.expiry-interval-ms |
mapred.tasktracker.resourcecalculatorplugin | yarn.nodemanager.container-monitor.resource-calculator.class |
mapred.tasktracker.taskmemorymanager.monitoring-interval | yarn.nodemanager.container-monitor.interval-ms |
mapred.tasktracker.tasks.sleeptime-before-sigkill | yarn.nodemanager.sleep-delay-before-sigkill.ms |
mapred.task.tracker.task-controller | yarn.nodemanager.container-executor.class |
mapred.local.dir | yarn.nodemanager.local-dirs |
mapreduce.cluster.local.dir | yarn.nodemanager.local-dirs |
mapred.disk.healthChecker.interval | yarn.nodemanager.disk-health-checker.interval-ms |
mapred.healthChecker.script.path | yarn.nodemanager.health-checker.script.path |
mapred.healthChecker.interval | yarn.nodemanager.health-checker.interval-ms |
mapred.healthChecker.script.timeout | yarn.nodemanager.health-checker.script.timeout-ms |
mapred.healthChecker.script.args | yarn.nodemanager.health-checker.script.opts |
local.cache.size | yarn.nodemanager.localizer.cache.target-size-mb |
mapreduce.tasktracker.cache.local.size | yarn.nodemanager.localizer.cache.target-size-mb |
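The two slot-count rows have no one-to-one replacement, because MRv2 schedules on memory and vcores rather than on slots. The sketch below shows one possible way to turn old slot counts into starting values for the two NodeManager settings. The function name, the one-vcore-per-slot rule, and the per-task memory figures are all assumptions for illustration, not a sizing recommendation from this document.

```python
def nodemanager_resources(map_slots, reduce_slots, map_memory_mb, reduce_memory_mb):
    """Estimate NodeManager resources from MRv1 slot counts.

    Hypothetical heuristic: memory is the sum each slot type could consume
    at once, and every former slot is counted as one vcore.
    """
    memory_mb = map_slots * map_memory_mb + reduce_slots * reduce_memory_mb
    vcores = map_slots + reduce_slots
    return {
        "yarn.nodemanager.resource.memory-mb": memory_mb,
        "yarn.nodemanager.resource.cpu-vcores": vcores,
    }


if __name__ == "__main__":
    # Example input values are made up for the illustration.
    print(nodemanager_resources(map_slots=8, reduce_slots=4,
                                map_memory_mb=1024, reduce_memory_mb=2048))
```

Treat the result as a first guess only; the memory and cores actually available on the node should bound whatever values you set.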
TaskTracker Properties and Shuffle Service Equivalents
The table that follows shows TaskTracker properties and their equivalents in the auxiliary shuffle service that runs inside NodeManagers.
MRv1 | YARN / MRv2 |
---|---|
tasktracker.http.threads | mapreduce.shuffle.max.threads |
mapred.task.tracker.http.address | mapreduce.shuffle.port |
mapred.tasktracker.indexcache.mb | mapred.tasktracker.indexcache.mb |
Per-Job Configuration Properties
Many of these properties have new names in MRv2, but the MRv1 names will work for all properties except mapred.job.restart.recover.
MRv1 | YARN / MRv2 | Comment |
---|---|---|
io.sort.mb | mapreduce.task.io.sort.mb | MRv1 name still works |
io.sort.factor | mapreduce.task.io.sort.factor | MRv1 name still works |
io.sort.spill.percent | mapreduce.task.io.sort.spill.percent | MRv1 name still works |
mapred.map.tasks | mapreduce.job.maps | MRv1 name still works |
mapred.reduce.tasks | mapreduce.job.reduces | MRv1 name still works |
mapred.job.map.memory.mb | mapreduce.map.memory.mb | MRv1 name still works |
mapred.job.reduce.memory.mb | mapreduce.reduce.memory.mb | MRv1 name still works |
mapred.map.child.log.level | mapreduce.map.log.level | MRv1 name still works |
mapred.reduce.child.log.level | mapreduce.reduce.log.level | MRv1 name still works |
mapred.inmem.merge.threshold | mapreduce.reduce.shuffle.merge.inmem.threshold | MRv1 name still works |
mapred.job.shuffle.merge.percent | mapreduce.reduce.shuffle.merge.percent | MRv1 name still works |
mapred.job.shuffle.input.buffer.percent | mapreduce.reduce.shuffle.input.buffer.percent | MRv1 name still works |
mapred.job.reduce.input.buffer.percent | mapreduce.reduce.input.buffer.percent | MRv1 name still works |
mapred.map.tasks.speculative.execution | mapreduce.map.speculative | MRv1 name still works |
mapred.reduce.tasks.speculative.execution | mapreduce.reduce.speculative | MRv1 name still works |
mapred.min.split.size | mapreduce.input.fileinputformat.split.minsize | MRv1 name still works |
keep.failed.task.files | mapreduce.task.files.preserve.failedtasks | MRv1 name still works |
mapred.output.compress | mapreduce.output.fileoutputformat.compress | MRv1 name still works |
mapred.map.output.compression.codec | mapreduce.map.output.compress.codec | MRv1 name still works |
mapred.compress.map.output | mapreduce.map.output.compress | MRv1 name still works |
mapred.output.compression.type | mapreduce.output.fileoutputformat.compress.type | MRv1 name still works |
mapred.userlog.limit.kb | mapreduce.task.userlog.limit.kb | MRv1 name still works |
jobclient.output.filter | mapreduce.client.output.filter | MRv1 name still works |
jobclient.completion.poll.interval | mapreduce.client.completion.pollinterval | MRv1 name still works |
jobclient.progress.monitor.poll.interval | mapreduce.client.progressmonitor.pollinterval | MRv1 name still works |
mapred.task.profile | mapreduce.task.profile | MRv1 name still works |
mapred.task.profile.maps | mapreduce.task.profile.maps | MRv1 name still works |
mapred.task.profile.reduces | mapreduce.task.profile.reduces | MRv1 name still works |
mapred.line.input.format.linespermap | mapreduce.input.lineinputformat.linespermap | MRv1 name still works |
mapred.skip.attempts.to.start.skipping | mapreduce.task.skip.start.attempts | MRv1 name still works |
mapred.skip.map.auto.incr.proc.count | mapreduce.map.skip.proc.count.autoincr | MRv1 name still works |
mapred.skip.reduce.auto.incr.proc.count | mapreduce.reduce.skip.proc.count.autoincr | MRv1 name still works |
mapred.skip.out.dir | mapreduce.job.skip.outdir | MRv1 name still works |
mapred.skip.map.max.skip.records | mapreduce.map.skip.maxrecords | MRv1 name still works |
mapred.skip.reduce.max.skip.groups | mapreduce.reduce.skip.maxgroups | MRv1 name still works |
job.end.retry.attempts | mapreduce.job.end-notification.retry.attempts | MRv1 name still works |
job.end.retry.interval | mapreduce.job.end-notification.retry.interval | MRv1 name still works |
job.end.notification.url | mapreduce.job.end-notification.url | MRv1 name still works |
mapred.merge.recordsBeforeProgress | mapreduce.task.merge.progress.records | MRv1 name still works |
mapred.job.queue.name | mapreduce.job.queuename | MRv1 name still works |
mapred.reduce.slowstart.completed.maps | mapreduce.job.reduce.slowstart.completedmaps | MRv1 name still works |
mapred.map.max.attempts | mapreduce.map.maxattempts | MRv1 name still works |
mapred.reduce.max.attempts | mapreduce.reduce.maxattempts | MRv1 name still works |
mapred.reduce.parallel.copies | mapreduce.reduce.shuffle.parallelcopies | MRv1 name still works |
mapred.task.timeout | mapreduce.task.timeout | MRv1 name still works |
mapred.max.tracker.failures | mapreduce.job.maxtaskfailures.per.tracker | MRv1 name still works |
mapred.job.restart.recover | mapreduce.am.max-attempts | |
mapred.combine.recordsBeforeProgress | mapreduce.task.combine.progress.records | MRv1 name should still work; see MAPREDUCE-5130 |
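Because most MRv1 per-job names are still honored, renaming them is optional, but a job configuration can be normalized to the MRv2 names with a simple lookup. The sketch below uses a small subset of the table above; the function name and the dict-based configuration are hypothetical, and mapred.job.restart.recover is reported for manual review rather than renamed, since it is the one name the text says does not carry over.

```python
# Subset of the per-job table; extend with further rows as needed.
RENAMED = {
    "io.sort.mb": "mapreduce.task.io.sort.mb",
    "mapred.map.tasks": "mapreduce.job.maps",
    "mapred.reduce.tasks": "mapreduce.job.reduces",
    "mapred.job.map.memory.mb": "mapreduce.map.memory.mb",
    "mapred.job.queue.name": "mapreduce.job.queuename",
    "mapred.task.timeout": "mapreduce.task.timeout",
}

# MRv1 names that are not honored in MRv2, with the related MRv2 property.
NOT_HONORED = {"mapred.job.restart.recover": "mapreduce.am.max-attempts"}


def migrate_job_conf(conf):
    """Return (migrated_conf, names_needing_review) for a dict of job properties."""
    migrated, needs_review = {}, []
    for key, value in conf.items():
        if key in NOT_HONORED:
            needs_review.append(key)  # semantics differ; do not copy the value
        else:
            migrated[RENAMED.get(key, key)] = value  # unknown keys pass through
    return migrated, needs_review


if __name__ == "__main__":
    old = {"io.sort.mb": "256", "mapred.job.restart.recover": "true"}
    print(migrate_job_conf(old))
```

Keys that are absent from the lookup are passed through unchanged, so the helper is safe to run on configurations that already use MRv2 names.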
Security Properties
MRv1 | MRv2 |
---|---|
security.task.umbilical.protocol.acl | security.job.task.protocol.acl |
security.inter.tracker.protocol.acl | security.resourcetracker.protocol.acl |
security.job.submission.protocol.acl | security.applicationclient.protocol.acl |
security.admin.operations.protocol.acl | security.resourcemanager-administration.protocol.acl |
High Availability Properties
MRv1 | YARN / MRv2 |
---|---|
mapred.jobtrackers.name | yarn.resourcemanager.ha.rm-ids |
mapred.ha.jobtracker.id | yarn.resourcemanager.ha.id |
mapred.jobtracker.rpc-address.name.id | (See the Configure YARN ResourceManager High Availability documentation.) |
mapred.ha.jobtracker.rpc-address.name.id | yarn.resourcemanager.ha.admin.address |
mapred.ha.fencing.methods | yarn.resourcemanager.ha.fencer |
mapred.client.failover.* | None |
None | yarn.resourcemanager.ha.enabled |
mapred.jobtracker.restart.recover | yarn.resourcemanager.recovery.enabled |
None | yarn.resourcemanager.store.class |
mapred.ha.automatic-failover.enabled | yarn.resourcemanager.ha.automatic-failover.enabled |
mapred.ha.zkfc.port | yarn.resourcemanager.ha.automatic-failover.port |
mapred.job.tracker | yarn.resourcemanager.cluster.id |
Miscellaneous Properties
MRv1 | YARN / MRv2 |
---|---|
mapred.heartbeats.in.second | yarn.resourcemanager.nodemanagers.heartbeat-interval-ms |
mapred.userlog.retain.hours | yarn.log-aggregation.retain-seconds |
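The property names in this table indicate a change of unit, so values should not be copied across verbatim. As a minimal sketch, assuming you simply want to carry an MRv1 log retention period over to the YARN log-aggregation setting, the conversion is hours to seconds. The function name is hypothetical. The heartbeat row is not covered here, because the two properties are not related by a plain unit change.

```python
SECONDS_PER_HOUR = 3600


def retain_hours_to_seconds(retain_hours):
    """Convert a mapred.userlog.retain.hours value to seconds.

    Hypothetical helper for seeding yarn.log-aggregation.retain-seconds.
    """
    return int(retain_hours) * SECONDS_PER_HOUR


if __name__ == "__main__":
    print("yarn.log-aggregation.retain-seconds =", retain_hours_to_seconds("24"))
```

The helper accepts either a string or an integer, since values read from configuration files typically arrive as strings.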
MRv1 Properties that have no MRv2 Equivalents
MRv1 | Comment |
---|---|
mapreduce.tasktracker.group | |
mapred.child.ulimit | |
mapred.tasktracker.dns.interface | |
mapred.tasktracker.dns.nameserver | |
mapred.tasktracker.instrumentation | NodeManager does not accept instrumentation |
mapred.job.reuse.jvm.num.tasks | JVM reuse no longer supported |
mapreduce.job.jvm.numtasks | JVM reuse no longer supported |
mapred.task.tracker.report.address | No need for this, as containers do not use IPC with NodeManagers, and ApplicationMaster ports are chosen at runtime |
mapreduce.task.tmp.dir | No longer configurable. Now always tmp/ (under container's local dir) |
mapred.child.tmp | No longer configurable. Now always tmp/ (under container's local dir) |
mapred.temp.dir | |
mapred.jobtracker.instrumentation | ResourceManager does not accept instrumentation |
mapred.jobtracker.plugins | ResourceManager does not accept plugins |
mapred.task.cache.level | |
mapred.queue.names | These go in the scheduler-specific configuration files |
mapred.system.dir | |
mapreduce.tasktracker.cache.local.numberdirectories | |
mapreduce.reduce.input.limit | |
io.sort.record.percent | Tuned automatically (MAPREDUCE-64) |
mapred.cluster.map.memory.mb | Not necessary; MRv2 uses resources instead of slots |
mapred.cluster.reduce.memory.mb | Not necessary; MRv2 uses resources instead of slots |
mapred.max.tracker.blacklists | |
mapred.jobtracker.maxtasks.per.job | Related configurations go in scheduler-specific configuration files |
mapred.jobtracker.taskScheduler.maxRunningTasksPerJob | Related configurations go in scheduler-specific configuration files |
io.map.index.skip | |
mapred.user.jobconf.limit | |
mapred.local.dir.minspacestart | |
mapred.local.dir.minspacekill | |
hadoop.rpc.socket.factory.class.JobSubmissionProtocol | |
mapreduce.tasktracker.outofband.heartbeat | Always on |
mapred.jobtracker.job.history.block.size | |
security.applicationmaster.protocol.acl | |
security.containermanagement.protocol.acl | |
security.resourcelocalizer.protocol.acl | |
security.job.client.protocol.acl | |
yarn.resourcemanager.ha.enabled | |