The MapReduce Service
For an overview of computation frameworks, their usage and restrictions, and common tasks, see Managing MapReduce and YARN.
Configuring the MapReduce Scheduler
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
The MapReduce service is configured by default to use the Fair Scheduler. You can change the scheduler type to FIFO or Capacity Scheduler, and you can modify the Fair Scheduler and Capacity Scheduler configuration. For more information on schedulers, see Schedulers.
Configuring the Task Scheduler Type
- Go to the MapReduce service.
- Click the Configuration tab.
- Expand the JobTracker Default Group category and click the Classes subcategory.
- Click the Value field of the Task Scheduler row and select a scheduler.
- Click Save Changes to commit the changes.
- Restart the JobTracker to apply the new configuration:
  - Click the Instances tab.
  - Click the JobTracker role.
  - Select the action that restarts the role.
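For orientation, the scheduler chosen in the Task Scheduler row corresponds to a JobTracker class setting. The sketch below shows how that setting might look as a configuration fragment. The property name `mapred.jobtracker.taskScheduler` and the class names are not stated in this document; they are the MRv1 names as commonly documented and should be verified against your release. The helper `scheduler_property` is hypothetical, and Cloudera Manager normally writes this setting for you.

```python
# Assumed MRv1 class names for each scheduler type (verify for your release).
SCHEDULER_CLASSES = {
    "FIFO": "org.apache.hadoop.mapred.JobQueueTaskScheduler",
    "Fair Scheduler": "org.apache.hadoop.mapred.FairScheduler",
    "Capacity Scheduler": "org.apache.hadoop.mapred.CapacityTaskScheduler",
}


def scheduler_property(scheduler):
    """Render a <property> fragment for the given scheduler label.

    Hypothetical helper for illustration; raises KeyError for unknown labels.
    """
    cls = SCHEDULER_CLASSES[scheduler]
    return (
        "<property>\n"
        "  <name>mapred.jobtracker.taskScheduler</name>\n"
        f"  <value>{cls}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    print(scheduler_property("Capacity Scheduler"))
```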
Modifying the Scheduler Configuration
- Go to the MapReduce service.
- Click the Configuration tab.
- Click the Jobs subcategory of the JobTracker Default Group category.
- Click a property and modify the configuration.
- Click Save Changes to commit the changes.
- Restart the JobTracker to apply the new configuration:
  - Click the Instances tab.
  - Click the JobTracker role.
  - Select the action that restarts the role.
Configuring the MapReduce Service to Save Job History
Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)
Normally, job history is saved on the host where the JobTracker is running. You can configure the JobTracker to write information about every completed job to a specified HDFS location. By default, the information is retained for 7 days.
Enabling MapReduce Job History to Be Saved to HDFS
- Create a folder in HDFS to contain the history information. Set the folder's owner and group to mapred:hadoop and its permissions to 775.
- Go to the MapReduce service.
- Click the Configuration tab.
- Expand the JobTracker Default Group category and click the Paths subcategory.
- Set the Completed Job History Location property to the location that you created in the first step.
- Click Save Changes.
- Restart the MapReduce service.
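The folder creation in the first step can be sketched as a short sequence of `hadoop fs` commands. The example below only builds and prints the command lines rather than running them. The path `/user/history/done` and the helper `history_dir_commands` are hypothetical, chosen for illustration. Changing ownership in HDFS typically requires running as a user with superuser rights.

```python
import shlex


def history_dir_commands(path, owner="mapred", group="hadoop", mode="775"):
    """Return the argv lists that create the folder and set owner and mode.

    Hypothetical helper: it does not execute anything, it only describes
    the commands an administrator might run.
    """
    return [
        ["hadoop", "fs", "-mkdir", "-p", path],
        ["hadoop", "fs", "-chown", f"{owner}:{group}", path],
        ["hadoop", "fs", "-chmod", mode, path],
    ]


if __name__ == "__main__":
    # "/user/history/done" is an example path, not one given by this document.
    for argv in history_dir_commands("/user/history/done"):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before running them by hand on a host with an HDFS client.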
Setting the Job History Retention Duration
- Select the JobTracker Default Group category.
- Set the Job History Files Maximum Age property (mapreduce.jobhistory.max-age-ms) to the length of time (in milliseconds, seconds, minutes, or hours) that you want job history files to be kept.
- Restart the MapReduce service.
- Select the JobTracker Default Group category.
- Set the Job History Files Cleaner Interval property (mapreduce.jobhistory.cleaner.interval) to the desired frequency (in milliseconds, seconds, minutes, or hours).
- Restart the MapReduce service.
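Because the maximum-age property name ends in `-ms`, it can help to check what a chosen duration amounts to in milliseconds. The following is a minimal sketch; the helper `to_millis` is hypothetical, and the 7-day figure is the default retention mentioned earlier.

```python
from datetime import timedelta


def to_millis(duration: timedelta) -> int:
    """Convert a timedelta to whole milliseconds."""
    return int(duration / timedelta(milliseconds=1))


if __name__ == "__main__":
    # Default retention of 7 days expressed in milliseconds.
    print(to_millis(timedelta(days=7)))  # 604800000
```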
Configuring Client Overrides
A configuration property qualified with (Client Override) is a server-side setting that takes precedence over any value a client tries to set for that property. It serves the same purpose as its unqualified counterpart, but it is applied to the service with the setting <final>true</final>, so clients cannot change it.
For example, if you set the Map task heap property to 1 GB in the job configuration code, but the service's heap property qualified with (Client Override) is set to 500 MB, then 500 MB is applied.
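The effect of `<final>true</final>` can be illustrated with a small simulation. This is a simplified model of the behavior described above, not Hadoop's own code. The property names `example.map.task.heap.mb` and `example.reduce.task.heap.mb` and both helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical server-side configuration: the map heap is marked final.
SERVER_XML = """
<configuration>
  <property>
    <name>example.map.task.heap.mb</name>
    <value>500</value>
    <final>true</final>
  </property>
  <property>
    <name>example.reduce.task.heap.mb</name>
    <value>500</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Parse a configuration document into {name: (value, is_final)}."""
    props = {}
    for prop in ET.fromstring(xml_text.strip()).findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        is_final = (prop.findtext("final") or "").strip().lower() == "true"
        props[name] = (value, is_final)
    return props


def effective_config(server_props, client_overrides):
    """Apply client values, skipping any property the server marked final."""
    result = {name: value for name, (value, _) in server_props.items()}
    for name, value in client_overrides.items():
        _, is_final = server_props.get(name, (None, False))
        if not is_final:
            result[name] = value
    return result


if __name__ == "__main__":
    server = load_properties(SERVER_XML)
    client = {"example.map.task.heap.mb": "1024",
              "example.reduce.task.heap.mb": "1024"}
    print(effective_config(server, client))
```

In this model the client's 1024 MB request for the map heap is ignored because the server value is final, while the reduce heap, which is not final, takes the client's value.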