Configuration changes between MRv1 and MRv2

Reviewing the differences between MapReduce version 1 (MRv1) and YARN/MapReduce version 2 (MRv2) helps you to understand the changes to the configuration parameters that have replaced the deprecated ones.

JobTracker Properties and ResourceManager Equivalents

MRv1 YARN / MRv2
mapred.jobtracker.taskScheduler
yarn.resourcemanager.scheduler.class
mapred.jobtracker.completeuserjobs.maximum
yarn.resourcemanager.max-completed-applications
mapred.jobtracker.restart.recover
yarn.resourcemanager.recovery.enabled
mapred.job.tracker
yarn.resourcemanager.hostname
or all of the following:
yarn.resourcemanager.address 
yarn.resourcemanager.scheduler.address 
yarn.resourcemanager.resource-tracker.address
yarn.resourcemanager.admin.address
mapred.job.tracker.http.address
yarn.resourcemanager.webapp.address
or
yarn.resourcemanager.hostname
mapred.job.tracker.handler.count
yarn.resourcemanager.resource-tracker.client.thread-count
mapred.hosts
yarn.resourcemanager.nodes.include-path
mapred.hosts.exclude
yarn.resourcemanager.nodes.exclude-path
mapred.cluster.max.map.memory.mb
yarn.scheduler.maximum-allocation-mb
mapred.cluster.max.reduce.memory.mb
yarn.scheduler.maximum-allocation-mb
mapred.acls.enabled
yarn.acl.enable
mapreduce.cluster.acls.enabled
yarn.acl.enable

JobTracker Properties and JobHistoryServer Equivalents

MRv1 YARN / MRv2 Comment
mapred.job.tracker.retiredjobs.cache.size
mapreduce.jobhistory.joblist.cache.size
mapred.job.tracker.jobhistory.lru.cache.size
mapreduce.jobhistory.loadedjobs.cache.size
mapred.job.tracker.history.completed.location
mapreduce.jobhistory.done-dir
Local FS in MR1; stored in HDFS in MR2
hadoop.job.history.user.location
mapreduce.jobhistory.done-dir
hadoop.job.history.location
mapreduce.jobhistory.done-dir

JobTracker Properties and MapReduce ApplicationMaster Equivalents

MRv1 YARN / MRv2 Comment
mapreduce.jobtracker.staging.root.dir
yarn.app.mapreduce.am.staging-dir
Now configurable per job

TaskTracker Properties and NodeManager Equivalents

MRv1 YARN / MRv2
mapred.tasktracker.map.tasks.maximum
yarn.nodemanager.resource.memory-mb
and
yarn.nodemanager.resource.cpu-vcores
mapred.tasktracker.reduce.tasks.maximum
yarn.nodemanager.resource.memory-mb
and
yarn.nodemanager.resource.cpu-vcores
mapred.tasktracker.expiry.interval
yarn.nm.liveliness-monitor.expiry-interval-ms
mapred.tasktracker.resourcecalculatorplugin
yarn.nodemanager.container-monitor.resource-calculator.class
mapred.tasktracker.taskmemorymanager.monitoring-interval
yarn.nodemanager.container-monitor.interval-ms
mapred.tasktracker.tasks.sleeptime-before-sigkill
yarn.nodemanager.sleep-delay-before-sigkill.ms
mapred.task.tracker.task-controller
yarn.nodemanager.container-executor.class
mapred.local.dir
yarn.nodemanager.local-dirs
mapreduce.cluster.local.dir
yarn.nodemanager.local-dirs
mapred.disk.healthChecker.interval
yarn.nodemanager.disk-health-checker.interval-ms
mapred.healthChecker.script.path
yarn.nodemanager.health-checker.script.path
mapred.healthChecker.interval
yarn.nodemanager.health-checker.interval-ms
mapred.healthChecker.script.timeout
yarn.nodemanager.health-checker.script.timeout-ms
mapred.healthChecker.script.args
yarn.nodemanager.health-checker.script.opts
local.cache.size
yarn.nodemanager.localizer.cache.target-size-mb
mapreduce.tasktracker.cache.local.size
yarn.nodemanager.localizer.cache.target-size-mb

TaskTracker Properties and Shuffle Service Equivalents

The table that follows shows TaskTracker properties and their equivalents in the auxiliary shuffle service that runs inside NodeManagers.

MRv1 YARN / MRv2
tasktracker.http.threads
mapreduce.shuffle.max.threads
mapred.task.tracker.http.address
mapreduce.shuffle.port
mapred.tasktracker.indexcache.mb
mapred.tasktracker.indexcache.mb

Per-Job Configuration Properties

Many of these properties have new names in MRv2, but the MRv1 names will work for all properties except mapred.job.restart.recover.

MRv1 YARN / MRv2 Comment
io.sort.mb
mapreduce.task.io.sort.mb
MRv1 name still works
io.sort.factor
mapreduce.task.io.sort.factor
MRv1 name still works
io.sort.spill.percent
mapreduce.task.io.sort.spill.percent
MRv1 name still works
mapred.map.tasks
mapreduce.job.maps
MRv1 name still works
mapred.reduce.tasks
mapreduce.job.reduces
MRv1 name still works
mapred.job.map.memory.mb
mapreduce.map.memory.mb
MRv1 name still works
mapred.job.reduce.memory.mb
mapreduce.reduce.memory.mb
MRv1 name still works
mapred.map.child.log.level
mapreduce.map.log.level
MRv1 name still works
mapred.reduce.child.log.level
mapreduce.reduce.log.level
MRv1 name still works
mapred.inmem.merge.threshold
mapreduce.reduce.shuffle.merge.inmem.threshold
MRv1 name still works
mapred.job.shuffle.merge.percent
mapreduce.reduce.shuffle.merge.percent
MRv1 name still works
mapred.job.shuffle.input.buffer.percent
mapreduce.reduce.shuffle.input.buffer.percent
MRv1 name still works
mapred.job.reduce.input.buffer.percent
mapreduce.reduce.input.buffer.percent
MRv1 name still works
mapred.map.tasks.speculative.execution
mapreduce.map.speculative
Old one still works
mapred.reduce.tasks.speculative.execution
mapreduce.reduce.speculative
MRv1 name still works
mapred.min.split.size
mapreduce.input.fileinputformat.split.minsize
MRv1 name still works
keep.failed.task.files
mapreduce.task.files.preserve.failedtasks
MRv1 name still works
mapred.output.compress
mapreduce.output.fileoutputformat.compress
MRv1 name still works
mapred.map.output.compression.codec
mapreduce.map.output.compress.codec
MRv1 name still works
mapred.compress.map.output
mapreduce.map.output.compress
MRv1 name still works
mapred.output.compression.type
mapreduce.output.fileoutputformat.compress.type
MRv1 name still works
mapred.userlog.limit.kb
mapreduce.task.userlog.limit.kb
MRv1 name still works
jobclient.output.filter
mapreduce.client.output.filter
MRv1 name still works
jobclient.completion.poll.interval
mapreduce.client.completion.pollinterval
MRv1 name still works
jobclient.progress.monitor.poll.interval
mapreduce.client.progressmonitor.pollinterval
MRv1 name still works
mapred.task.profile
mapreduce.task.profile
MRv1 name still works
mapred.task.profile.maps
mapreduce.task.profile.maps
MRv1 name still works
mapred.task.profile.reduces
mapreduce.task.profile.reduces
MRv1 name still works
mapred.line.input.format.linespermap
mapreduce.input.lineinputformat.linespermap
MRv1 name still works
mapred.skip.attempts.to.start.skipping
mapreduce.task.skip.start.attempts
MRv1 name still works
mapred.skip.map.auto.incr.proc.count
mapreduce.map.skip.proc.count.autoincr
MRv1 name still works
mapred.skip.reduce.auto.incr.proc.count
mapreduce.reduce.skip.proc.count.autoincr
MRv1 name still works
mapred.skip.out.dir
mapreduce.job.skip.outdir
MRv1 name still works
mapred.skip.map.max.skip.records
mapreduce.map.skip.maxrecords
MRv1 name still works
mapred.skip.reduce.max.skip.groups
mapreduce.reduce.skip.maxgroups
MRv1 name still works
job.end.retry.attempts
mapreduce.job.end-notification.retry.attempts
MRv1 name still works
job.end.retry.interval
mapreduce.job.end-notification.retry.interval
MRv1 name still works
job.end.notification.url
mapreduce.job.end-notification.url
MRv1 name still works
mapred.merge.recordsBeforeProgress
mapreduce.task.merge.progress.records
MRv1 name still works
mapred.job.queue.name
mapreduce.job.queuename
MRv1 name still works
mapred.reduce.slowstart.completed.maps
mapreduce.job.reduce.slowstart.completedmaps
MRv1 name still works
mapred.map.max.attempts
mapreduce.map.maxattempts
MRv1 name still works
mapred.reduce.max.attempts
mapreduce.reduce.maxattempts
MRv1 name still works
mapred.reduce.parallel.copies
mapreduce.reduce.shuffle.parallelcopies
MRv1 name still works
mapred.task.timeout
mapreduce.task.timeout
MRv1 name still works
mapred.max.tracker.failures
mapreduce.job.maxtaskfailures.per.tracker
MRv1 name still works
mapred.job.restart.recover
mapreduce.am.max-attempts
mapred.combine.recordsBeforeProgress
mapreduce.task.combine.progress.records
MRv1 name should still work - see MAPREDUCE-5130

Security Properties

MRv1 MRv2
security.task.umbilical.protocol.acl security.job.task.protocol.acl
security.inter.tracker.protocol.acl security.resourcetracker.protocol.acl
security.job.submission.protocol.acl security.applicationclient.protocol.acl
security.admin.operations.protocol.acl security.resourcemanager-administration.protocol.acl

High Availability Properties

MRv1 YARN / MRv2
mapred.jobtrackers.name
yarn.resourcemanager.ha.rm-ids
mapred.ha.jobtracker.id
yarn.resourcemanager.ha.id
mapred.jobtracker.rpc-address.name.id
(See Configure YARN ResourceManager High Availability documentation.)
mapred.ha.jobtracker.rpc-address.name.id
yarn.resourcemanager.ha.admin.address
mapred.ha.fencing.methods
yarn.resourcemanager.ha.fencer
mapred.client.failover.*
None
yarn.resourcemanager.ha.enabled
mapred.jobtracker.restart.recover
yarn.resourcemanager.recovery.enabled
yarn.resourcemanager.store.class
mapred.ha.automatic-failover.enabled
yarn.resourcemanager.ha.automatic-failover.enabled
mapred.ha.zkfc.port
yarn.resourcemanager.ha.automatic-failover.port
mapred.job.tracker
yarn.resourcemanager.cluster.id

Miscellaneous Properties

MRv1 YARN / MRv2
mapred.heartbeats.in.second
yarn.resourcemanager.nodemanagers.heartbeat-interval-ms
mapred.userlog.retain.hours
yarn.log-aggregation.retain-seconds

MRv1 Properties that have no MRv2 Equivalents

MRv1 Comment
mapreduce.tasktracker.group
mapred.child.ulimit
mapred.tasktracker.dns.interface
mapred.tasktracker.dns.nameserver
mapred.tasktracker.instrumentation
NodeManager does not accept instrumentation
mapred.job.reuse.jvm.num.tasks
JVM reuse no longer supported
mapreduce.job.jvm.numtasks
JVM reuse no longer supported
mapred.task.tracker.report.address
No need for this, as containers do not use IPC with NodeManagers, and ApplicationMaster ports are chosen at runtime
mapreduce.task.tmp.dir
No longer configurable. Now always tmp/ (under container's local dir)
mapred.child.tmp
No longer configurable. Now always tmp/ (under container's local dir)
mapred.temp.dir
mapred.jobtracker.instrumentation
ResourceManager does not accept instrumentation
mapred.jobtracker.plugins
ResourceManager does not accept plugins
mapred.task.cache.level
mapred.queue.names
These go in the scheduler-specific configuration files
mapred.system.dir
mapreduce.tasktracker.cache.local.numberdirectories
mapreduce.reduce.input.limit
io.sort.record.percent
Tuned automatically (MAPREDUCE-64)
mapred.cluster.map.memory.mb
Not necessary; MRv2 uses resources instead of slots
mapred.cluster.reduce.memory.mb
Not necessary; MRv2 uses resources instead of slots
mapred.max.tracker.blacklists
mapred.jobtracker.maxtasks.per.job
Related configurations go in scheduler-specific configuration files
mapred.jobtracker.taskScheduler.maxRunningTasksPerJob
Related configurations go in scheduler-specific configuration files
io.map.index.skip
mapred.user.jobconf.limit
mapred.local.dir.minspacestart
mapred.local.dir.minspacekill
hadoop.rpc.socket.factory.class.JobSubmissionProtocol
mapreduce.tasktracker.outofband.heartbeat
Always on
mapred.jobtracker.job.history.block.size
security.applicationmaster.protocol.acl
security.containermanagement.protocol.acl
security.resourcelocalizer.protocol.acl
security.job.client.protocol.acl
yarn.resourcemanager.ha.enabled