YARN (MR2 Included) Properties in CDH 4.1.0

gateway

Advanced

Display Name Description Related Name Default Value API Name Required
Deploy Directory The directory where the client configs will be deployed. /etc/hadoop client_config_root_dir true
Gateway Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
MapReduce Client Advanced Configuration Snippet (Safety Valve) for mapred-site.xml For advanced use only, a string to be inserted into the client configuration for mapred-site.xml. mapreduce_client_config_safety_valve false
Gateway Client Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh For advanced use only, key-value pairs (one on each line) to be inserted into the client configuration for hadoop-env.sh. mapreduce_client_env_safety_valve false
Client Java Configuration Options These are Java command-line arguments. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. -Djava.net.preferIPv4Stack=true mapreduce_client_java_opts false
YARN Client Advanced Configuration Snippet (Safety Valve) for yarn-site.xml For advanced use only, a string to be inserted into the client configuration for yarn-site.xml. yarn_client_config_safety_valve false

Compression

Display Name Description Related Name Default Value API Name Required
Compression Level of Codecs Compression level for the codec used to compress MapReduce outputs. Default compression is a balance between speed and compression ratio. zlib.compress.level DEFAULT_COMPRESSION zlib_compress_level false

Logs

Display Name Description Related Name Default Value API Name Required
Gateway Logging Threshold The minimum log level for Gateway logs. INFO log_threshold false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Log Event Capture When set, each role identifies important log events and forwards them to Cloudera Manager. true catch_events false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false

Other

Display Name Description Related Name Default Value API Name Required
Alternatives Priority The priority level that the client configuration will have in the Alternatives system on the hosts. Higher priority levels will cause Alternatives to prefer this configuration over any others. 91 client_config_priority true
Running Job History Location Location to store the job history files of running jobs. This is a path on the host where the JobTracker is running. hadoop.job.history.location /var/log/hadoop-mapreduce/history hadoop_job_history_dir false
SequenceFile I/O Buffer Size Size of buffer for read and write operations of SequenceFiles. io.file.buffer.size 64 KiB io_file_buffer_size false
I/O Sort Factor The number of streams to merge at the same time while sorting files. That is, the number of sort heads to use during the merge sort on the reducer side. This determines the number of open file handles. Merging more files in parallel reduces merge sort iterations and improves run time by eliminating disk I/O. Note that merging more files in parallel uses more memory. If 'io.sort.factor' is set too high or the maximum JVM heap is set too low, excessive garbage collection will occur. The Hadoop default is 10, but Cloudera recommends a higher value. Will be part of generated client configuration. mapreduce.task.io.sort.factor 64 io_sort_factor false
I/O Sort Memory Buffer (MiB) The total amount of memory buffer, in megabytes, to use while sorting files. Note that this memory comes out of the user JVM heap size (meaning total user JVM heap - this amount of memory = total user usable heap space). Note that Cloudera's default differs from Hadoop's default; Cloudera uses a bigger buffer by default because modern machines often have more RAM. The smallest value across all TaskTrackers will be part of generated client configuration. mapreduce.task.io.sort.mb 256 MiB io_sort_mb false
I/O Sort Spill Percent The soft limit in either the buffer or record collection buffers. When this limit is reached, a thread will begin to spill the contents to disk in the background. Note that this does not imply any chunking of data to the spill. A value less than 0.5 is not recommended. Specify the value as a decimal fraction; the default of 80% is written as 0.8. Will be part of generated client configuration. mapreduce.map.sort.spill.percent 0.8 io_sort_spill_percent false
Use Compression on Map Outputs If enabled, uses compression on the map outputs before they are sent across the network. Will be part of generated client configuration. mapreduce.map.output.compress true mapred_compress_map_output false
Compression Codec of MapReduce Map Output For MapReduce map outputs that are compressed, specify the compression codec to use. Will be part of generated client configuration. mapreduce.map.output.compress.codec org.apache.hadoop.io.compress.SnappyCodec mapred_map_output_compression_codec false
Map Tasks Speculative Execution If enabled, multiple instances of some map tasks may be executed in parallel. mapreduce.map.speculative false mapred_map_tasks_speculative_execution false
Compress MapReduce Job Output Compress the output of MapReduce jobs. Will be part of generated client configuration. mapreduce.output.fileoutputformat.compress false mapred_output_compress false
Compression Codec of MapReduce Job Output For MapReduce job outputs that are compressed, specify the compression codec to use. Will be part of generated client configuration. mapreduce.output.fileoutputformat.compress.codec org.apache.hadoop.io.compress.DefaultCodec mapred_output_compression_codec false
Compression Type of MapReduce Job Output For MapReduce job outputs that are compressed as SequenceFiles, you can select one of these compression type options: NONE, RECORD or BLOCK. Cloudera recommends BLOCK. Will be part of generated client configuration. mapreduce.output.fileoutputformat.compress.type BLOCK mapred_output_compression_type false
Default Number of Parallel Transfers During Shuffle The default number of parallel transfers run by reduce during the copy (shuffle) phase. This number is calculated by the following formula: min(number_of_nodes, n * min(number_of_cores_per_node, number_of_spindles_per_node)) where the n represents how many streams you want to run per core/spindle. A value of 10 for n is appropriate in most cases. Will be part of generated client configuration. mapreduce.reduce.shuffle.parallelcopies 10 mapred_reduce_parallel_copies false
Number of Map Tasks to Complete Before Reduce Tasks Fraction of the number of map tasks in the job which should be completed before reduce tasks are scheduled for the job. mapreduce.job.reduce.slowstart.completedmaps 0.8 mapred_reduce_slowstart_completed_maps false
Default Number of Reduce Tasks per Job The default number of reduce tasks per job. Will be part of generated client configuration. mapreduce.job.reduces 1 mapred_reduce_tasks false
Reduce Tasks Speculative Execution If enabled, multiple instances of some reduce tasks may be executed in parallel. mapreduce.reduce.speculative false mapred_reduce_tasks_speculative_execution false
Mapreduce Submit Replication The replication level for submitted job files. mapreduce.client.submit.file.replication 10 mapred_submit_replication false
Mapreduce Task Timeout The number of milliseconds before a task will be terminated if it neither reads an input, writes an output, nor updates its status string. mapreduce.task.timeout 10 minute(s) mapred_task_timeout false
Maximum Number of Attempts for MapReduce Jobs The maximum number of application attempts for MapReduce jobs. The value of this parameter overrides ApplicationMaster Maximum Attempts for MapReduce jobs. mapreduce.am.max-attempts 2 mapreduce_am_max_attempts false
Application Framework The application framework to run jobs with. If not set, jobs will be run with the local job runner. mapreduce.framework.name yarn mapreduce_framework_name false
JobTracker MetaInfo Maxsize The maximum permissible size of the split metainfo file. The JobTracker won't attempt to read split metainfo files bigger than the configured value. No limits if set to -1. mapreduce.job.split.metainfo.maxsize 10000000 mapreduce_jobtracker_split_metainfo_maxsize false
Map Task Java Opts Base Java opts for the map processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc -Xloggc:/tmp/@taskid@.gc". The configuration variable 'Map Task Memory' can be used to control the maximum memory of the map processes. mapreduce.map.java.opts -Djava.net.preferIPv4Stack=true mapreduce_map_java_opts false
Reduce Task Java Opts Base Java opts for the reduce processes. The following symbol, if present, will be interpolated: @taskid@ is replaced by current TaskID. Any other occurrences of '@' will go unchanged. For example, to enable verbose gc logging to a file named for the taskid in /tmp pass a value of: "-verbose:gc -Xloggc:/tmp/@taskid@.gc". The configuration variable 'Reduce Task Memory' can be used to control the maximum memory of the reduce processes. mapreduce.reduce.java.opts -Djava.net.preferIPv4Stack=true mapreduce_reduce_java_opts false
ApplicationMaster Java Opts Base Java command line arguments passed to the MapReduce ApplicationMaster. yarn.app.mapreduce.am.command-opts -Djava.net.preferIPv4Stack=true yarn_app_mapreduce_am_command_opts false
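
The arithmetic described for the shuffle and sort settings above can be sketched in a few lines of Python. This is only an illustration of the formulas quoted in the descriptions, not how Cloudera Manager or Hadoop computes anything internally. The function names and the example cluster (100 nodes, 16 cores and 6 disks per node) are hypothetical; the heap size, sort buffer, and spill percent are the defaults listed in this document.

```python
MIB = 1024 * 1024


def shuffle_parallel_copies(number_of_nodes, cores_per_node, spindles_per_node, n=10):
    """Formula from the description:
    min(number_of_nodes, n * min(cores_per_node, spindles_per_node))."""
    return min(number_of_nodes, n * min(cores_per_node, spindles_per_node))


def usable_heap_bytes(task_heap_bytes, io_sort_mb):
    """Heap left for user code once the sort buffer is taken out of the task heap."""
    return task_heap_bytes - io_sort_mb * MIB


def spill_threshold_mib(io_sort_mb, spill_percent):
    """Buffer fill level, in MiB, at which the background spill begins."""
    return io_sort_mb * spill_percent


if __name__ == "__main__":
    # Hypothetical cluster: 100 nodes, 16 cores and 6 disks per node, n = 10.
    print(shuffle_parallel_copies(100, 16, 6))
    # Defaults from this document: 825955249 B task heap, 256 MiB sort buffer.
    print(usable_heap_bytes(825955249, 256))
    # Default spill percent of 0.8 applied to the 256 MiB buffer.
    print(spill_threshold_mib(256, 0.8))
```

With the listed defaults, roughly a third of the task heap goes to the sort buffer, which is why the description warns that raising the buffer without raising the heap leaves less room for user code.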

Performance

Display Name Description Related Name Default Value API Name Required
Job Counter Groups Limit Limit on the number of counter groups allowed per job. mapreduce.job.counters.groups.max 50 mapreduce_job_counter_groups_limit false
Job Counters Limit Limit on the number of counters allowed per job. mapreduce.job.counters.max 120 mapreduce_job_counters_limit false
Enable Ubertask Optimization Whether to enable ubertask optimization, which runs "sufficiently small" jobs sequentially within a single JVM. "Small" is defined by the mapreduce.job.ubertask.maxmaps, mapreduce.job.ubertask.maxreduces, and mapreduce.job.ubertask.maxbytes settings. mapreduce.job.ubertask.enable false mapreduce_job_ubertask_enabled false
Ubertask Maximum Job Size Threshold for number of input bytes, beyond which a job is considered too big for ubertask optimization. If no value is specified, dfs.block.size is used as a default. mapreduce.job.ubertask.maxbytes mapreduce_job_ubertask_maxbytes false
Ubertask Maximum Maps Threshold for number of maps, beyond which a job is considered too big for ubertask optimization. mapreduce.job.ubertask.maxmaps 9 mapreduce_job_ubertask_maxmaps false
Ubertask Maximum Reduces Threshold for number of reduces, beyond which a job is considered too big for ubertask optimization. Note: As of CDH 5, MR2 does not support more than one reduce in an ubertask. (Zero is valid.) mapreduce.job.ubertask.maxreduces 1 mapreduce_job_ubertask_maxreduces false
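
As a rough illustration of how the three ubertask thresholds combine, the following Python sketch checks whether a job is "sufficiently small". It is a simplification under stated assumptions: the function name is hypothetical, the comparisons are assumed to be inclusive, the 128 MiB fallback stands in for whatever dfs.block.size is on a given cluster, and any other conditions Hadoop may apply are not modeled.

```python
DEFAULT_MAX_MAPS = 9        # mapreduce.job.ubertask.maxmaps default listed above
DEFAULT_MAX_REDUCES = 1     # mapreduce.job.ubertask.maxreduces default listed above
ASSUMED_BLOCK_SIZE = 128 * 1024 * 1024  # hypothetical dfs.block.size fallback


def is_ubertask_candidate(num_maps, num_reduces, input_bytes,
                          max_maps=DEFAULT_MAX_MAPS,
                          max_reduces=DEFAULT_MAX_REDUCES,
                          max_bytes=ASSUMED_BLOCK_SIZE):
    """Return True when the job stays within all three size thresholds."""
    return (num_maps <= max_maps
            and num_reduces <= max_reduces
            and input_bytes <= max_bytes)


if __name__ == "__main__":
    print(is_ubertask_candidate(4, 1, 10 * 1024 * 1024))   # small job
    print(is_ubertask_candidate(20, 1, 10 * 1024 * 1024))  # too many maps
```

Note that a job passing this check still runs as an ubertask only if mapreduce.job.ubertask.enable is set to true, which is not the default.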

Resource Management

Display Name Description Related Name Default Value API Name Required
Client Java Heap Size in Bytes Maximum size in bytes for the Java process heap memory. Passed to Java -Xmx. 825955249 B mapreduce_client_java_heapsize false
Map Task CPU Virtual Cores The number of virtual CPU cores allocated for each map task of a job. This parameter has no effect prior to CDH 4.4. mapreduce.map.cpu.vcores 1 mapreduce_map_cpu_vcores false
Map Task Maximum Heap Size The maximum Java heap size, in bytes, of the map processes. This number will be formatted and concatenated with 'Map Task Java Opts Base' to pass to Hadoop. mapreduce.map.java.opts.max.heap 825955249 B mapreduce_map_java_opts_max_heap false
Map Task Memory The amount of physical memory, in MiB, allocated for each map task of a job. For versions before CDH 5.5, if not specified, by default it is set to 1024. For CDH 5.5 and higher, a value less than 128 is not supported but if it is specified as 0, the amount of physical memory to request is inferred from Map Task Maximum Heap Size and Heap to Container Size Ratio. If Map Task Maximum Heap Size is not specified, by default the amount of physical memory to request is set to 1024. mapreduce.map.memory.mb 1 GiB mapreduce_map_memory_mb false
Reduce Task CPU Virtual Cores The number of virtual CPU cores for each reduce task of a job. mapreduce.reduce.cpu.vcores 1 mapreduce_reduce_cpu_vcores false
Reduce Task Maximum Heap Size The maximum Java heap size, in bytes, of the reduce processes. This number will be formatted and concatenated with 'Reduce Task Java Opts Base' to pass to Hadoop. mapreduce.reduce.java.opts.max.heap 825955249 B mapreduce_reduce_java_opts_max_heap false
Reduce Task Memory The amount of physical memory, in MiB, allocated for each reduce task of a job. For versions before CDH 5.5, if not specified, by default it is set to 1024. For CDH 5.5 and higher, a value less than 128 is not supported but if it is specified as 0, the amount of physical memory to request is inferred from Reduce Task Maximum Heap Size and Heap to Container Size Ratio. If Reduce Task Maximum Heap Size is not specified, by default the amount of physical memory to request is set to 1024. This parameter has no effect prior to CDH 4.4. mapreduce.reduce.memory.mb 1 GiB mapreduce_reduce_memory_mb false
ApplicationMaster Java Maximum Heap Size The maximum heap size, in bytes, of the Java MapReduce ApplicationMaster. This number will be formatted and concatenated with 'ApplicationMaster Java Opts Base' to pass to Hadoop. 825955249 B yarn_app_mapreduce_am_max_heap false
ApplicationMaster Virtual CPU Cores The virtual CPU cores requirement for the ApplicationMaster. This parameter has no effect prior to CDH 4.4. yarn.app.mapreduce.am.resource.cpu-vcores 1 yarn_app_mapreduce_am_resource_cpu_vcores false
ApplicationMaster Memory The physical memory requirement, in MiB, for the ApplicationMaster. yarn.app.mapreduce.am.resource.mb 1 GiB yarn_app_mapreduce_am_resource_mb false
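
The relationship between the "Maximum Heap Size" and "Memory" settings above can be sketched as follows. The descriptions say the heap size is formatted and concatenated with the corresponding Java opts base; the exact format used here (a plain -Xmx value in bytes appended after a space) is an assumption, as are the function names. The sketch also checks that the heap fits inside the container memory, since the heap is only one part of the physical memory a task process uses.

```python
MIB = 1024 * 1024


def build_java_opts(base_opts, max_heap_bytes):
    """Append an -Xmx flag to the base options (assumed format)."""
    return f"{base_opts} -Xmx{max_heap_bytes}".strip()


def heap_fits_container(max_heap_bytes, container_mib):
    """True when the JVM heap is no larger than the container's physical memory."""
    return max_heap_bytes <= container_mib * MIB


if __name__ == "__main__":
    # Defaults listed in this document: 825955249 B heap, 1 GiB (1024 MiB) container.
    print(build_java_opts("-Djava.net.preferIPv4Stack=true", 825955249))
    print(heap_fits_container(825955249, 1024))
    # Fraction of the container taken by the heap with the defaults.
    print(round(825955249 / (1024 * MIB), 3))
```

With the defaults, the heap takes roughly 77% of the container, leaving the rest for non-heap JVM memory.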

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Deploy Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Deploy Directory parameter. false role_config_suppression_client_config_root_dir true
Suppress Parameter Validation: Running Job History Location Whether to suppress configuration warnings produced by the built-in parameter validation for the Running Job History Location parameter. false role_config_suppression_hadoop_job_history_dir true
Suppress Parameter Validation: I/O Sort Factor Whether to suppress configuration warnings produced by the built-in parameter validation for the I/O Sort Factor parameter. false role_config_suppression_io_sort_factor true
Suppress Parameter Validation: Gateway Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Gateway Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Compression Codec of MapReduce Map Output Whether to suppress configuration warnings produced by the built-in parameter validation for the Compression Codec of MapReduce Map Output parameter. false role_config_suppression_mapred_map_output_compression_codec true
Suppress Parameter Validation: Compression Codec of MapReduce Job Output Whether to suppress configuration warnings produced by the built-in parameter validation for the Compression Codec of MapReduce Job Output parameter. false role_config_suppression_mapred_output_compression_codec true
Suppress Parameter Validation: Maximum Number of Attempts for MapReduce Jobs Whether to suppress configuration warnings produced by the built-in parameter validation for the Maximum Number of Attempts for MapReduce Jobs parameter. false role_config_suppression_mapreduce_am_max_attempts true
Suppress Parameter Validation: MapReduce Client Advanced Configuration Snippet (Safety Valve) for mapred-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the MapReduce Client Advanced Configuration Snippet (Safety Valve) for mapred-site.xml parameter. false role_config_suppression_mapreduce_client_config_safety_valve true
Suppress Parameter Validation: Gateway Client Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh Whether to suppress configuration warnings produced by the built-in parameter validation for the Gateway Client Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh parameter. false role_config_suppression_mapreduce_client_env_safety_valve true
Suppress Parameter Validation: Client Java Configuration Options Whether to suppress configuration warnings produced by the built-in parameter validation for the Client Java Configuration Options parameter. false role_config_suppression_mapreduce_client_java_opts true
Suppress Parameter Validation: Map Task Java Opts Base Whether to suppress configuration warnings produced by the built-in parameter validation for the Map Task Java Opts Base parameter. false role_config_suppression_mapreduce_map_java_opts true
Suppress Configuration Validator: Map Task Maximum Heap Size Validator Whether to suppress configuration warnings produced by the Map Task Maximum Heap Size Validator configuration validator. false role_config_suppression_mapreduce_map_java_opts_max_heap_mapreduce_map_memory_mb_validator true
Suppress Parameter Validation: Reduce Task Java Opts Base Whether to suppress configuration warnings produced by the built-in parameter validation for the Reduce Task Java Opts Base parameter. false role_config_suppression_mapreduce_reduce_java_opts true
Suppress Configuration Validator: Reduce Task Maximum Heap Size Validator Whether to suppress configuration warnings produced by the Reduce Task Maximum Heap Size Validator configuration validator. false role_config_suppression_mapreduce_reduce_java_opts_max_heap_mapreduce_reduce_memory_mb_validator true
Suppress Configuration Validator: Job Submit Replication Validator Whether to suppress configuration warnings produced by the Job Submit Replication Validator configuration validator. false role_config_suppression_mapreduce_replication_validator true
Suppress Parameter Validation: ApplicationMaster Java Opts Base Whether to suppress configuration warnings produced by the built-in parameter validation for the ApplicationMaster Java Opts Base parameter. false role_config_suppression_yarn_app_mapreduce_am_command_opts true
Suppress Configuration Validator: ApplicationMaster Java Maximum Heap Size Validator Whether to suppress configuration warnings produced by the ApplicationMaster Java Maximum Heap Size Validator configuration validator. false role_config_suppression_yarn_app_mapreduce_am_max_heap_yarn_app_mapreduce_am_resource_mb_validator true
Suppress Parameter Validation: YARN Client Advanced Configuration Snippet (Safety Valve) for yarn-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN Client Advanced Configuration Snippet (Safety Valve) for yarn-site.xml parameter. false role_config_suppression_yarn_client_config_safety_valve true

jobhistoryserver

Advanced

Display Name Description Related Name Default Value API Name Required
Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics2. Properties will be inserted into hadoop-metrics2.properties. hadoop_metrics2_safety_valve false
System Group The group that the JobHistory Server process should run as. hadoop history_process_groupname true
System User The user that the JobHistory Server process should run as. mapred history_process_username true
JobHistory Server Advanced Configuration Snippet (Safety Valve) for yarn-site.xml For advanced use only. A string to be inserted into yarn-site.xml for this role only. jobhistory_config_safety_valve false
JobHistory Server Advanced Configuration Snippet (Safety Valve) for mapred-site.xml For advanced use only. A string to be inserted into mapred-site.xml for this role only. jobhistory_mapred_safety_valve false
JobHistory Server Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. JOBHISTORY_role_env_safety_valve false
JobHistory Server Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Java Configuration Options for JobHistory Server These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled mr2_jobhistory_java_opts false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. false process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process. true process_should_monitor true

Logs

Display Name Description Related Name Default Value API Name Required
JobHistory Server Logging Threshold The minimum log level for JobHistory Server logs. INFO log_threshold false
JobHistory Server Maximum Log File Backups The maximum number of rolled log files to keep for JobHistory Server logs. Typically used by log4j or logback. 10 max_log_backup_index false
JobHistory Server Max Log Size The maximum size, in megabytes, per log file for JobHistory Server logs. Typically used by log4j or logback. 200 MiB max_log_size false
JobHistory Server Log Directory Directory where JobHistory Server will place its log files. hadoop.log.dir /var/log/hadoop-mapreduce mr2_jobhistory_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold. true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % jobhistory_fd_thresholds false
Garbage Collection Duration Thresholds The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 jobhistory_gc_duration_thresholds false
Garbage Collection Duration Monitoring Period The period to review when computing the moving average of garbage collection time. 5 minute(s) jobhistory_gc_duration_window false
JobHistory Server Host Health Test When computing the overall JobHistory Server health, consider the host's health. true jobhistory_host_health_enabled false
JobHistory Server Process Health Test Enables the health test that the JobHistory Server's process state is consistent with the role configuration. true jobhistory_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true jobhistory_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never jobhistory_web_metric_collection_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. Warning: Any, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
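
The first-match behavior described under "Rules to Extract Events from Log Files" can be illustrated with a small Python sketch. It is not the log4j appender's actual logic: the function names, the severity ordering list, and the use of an unanchored regular expression search are assumptions made for illustration, and rate limiting (rate, periodminutes) is not modeled. It reuses the two-rule example from the description.

```python
import json
import re

# Assumed log4j severity ordering, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, level, message, exception_type=None):
    """Check one rule's threshold, content, and exceptiontype fields."""
    if "threshold" in rule and LEVELS.index(level) < LEVELS.index(rule["threshold"]):
        return False
    if "content" in rule and not re.search(rule["content"], message):
        return False
    if "exceptiontype" in rule:
        # The rule applies only to messages that carry an exception.
        if exception_type is None or not re.search(rule["exceptiontype"], exception_type):
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Return the first rule that matches, or None if no rule matches."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


RULES = json.loads(
    '[{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},'
    ' {"alert": true, "rate": 1, "periodminutes": 1, "threshold": "ERROR"}]'
)

if __name__ == "__main__":
    # ERROR message with an exception: the first rule wins, so no alert.
    print(first_matching_rule(RULES, "ERROR", "disk failed", "java.io.IOException"))
    # ERROR message without an exception: only the second rule matches.
    print(first_matching_rule(RULES, "ERROR", "disk failed"))
```

Because rule order decides the outcome, placing a broad rule such as exceptiontype ".*" first can mask a more specific alerting rule that follows it.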

Other

Display Name Description Related Name Default Value API Name Required
Job History Files Cleaner Interval Time interval for history cleaner to check for files to delete. Files are only deleted if they are older than mapreduce.jobhistory.max-age-ms. mapreduce.jobhistory.cleaner.interval 1 day(s) mapreduce_jobhistory_cleaner_interval false
Job History Files Maximum Age Job history files older than this time duration will be deleted when the history cleaner runs. mapreduce.jobhistory.max-age-ms 7 day(s) mapreduce_jobhistory_max_age_ms false
Max Shuffle Connections Maximum allowed connections for the shuffle. Set to 0 (zero) to indicate no limit on the number of connections. mapreduce.jobhistory.loadedjob.tasks.max -1 mapreduce_shuffle_max_connections false
MapReduce ApplicationMaster Staging Root Directory The root HDFS directory of the staging area for users' MR2 jobs; for example /user. The staging directories are always named after the user. yarn.app.mapreduce.am.staging-dir /user yarn_app_mapreduce_am_staging_dir false
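
To illustrate how the cleaner interval and the maximum age interact, the sketch below selects the history files that one cleaner run would delete. It is an illustration only, not the JobHistory Server implementation: the function name files_to_delete and the (name, finish time) tuples are hypothetical, and the 7-day default is taken from the table above.

```python
from datetime import datetime, timedelta

def files_to_delete(files, now, max_age=timedelta(days=7)):
    """Return names of history files older than max_age at the time the cleaner runs.

    files is a list of (name, finished_at) pairs; this shape is an assumption
    made for the example.
    """
    return [name for name, finished_at in files if now - finished_at > max_age]

now = datetime(2024, 1, 10, 12, 0, 0)
history = [
    ("job_1.jhist", datetime(2024, 1, 1, 12, 0, 0)),   # 9 days old
    ("job_2.jhist", datetime(2024, 1, 5, 12, 0, 0)),   # 5 days old
]
expired = files_to_delete(history, now)
```

Because the cleaner only runs once per interval (1 day by default), a file can outlive the maximum age by up to one interval before it is removed.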

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
MapReduce JobHistory Server Port The port of the MapReduce JobHistory Server. Together with the hostname of the JobHistory role, forms the address. mapreduce.jobhistory.address 10020 mapreduce_jobhistory_address false
MapReduce JobHistory Server Admin Interface Port The port of the MapReduce JobHistory Server administrative interface. Together with the host name of the JobHistory role forms the address. mapreduce.jobhistory.admin.address 10033 mapreduce_jobhistory_admin_address false
MapReduce JobHistory Web Application HTTP Port The HTTP port of the MapReduce JobHistory Server web application. Together with the host name of the JobHistory role forms the address. mapreduce.jobhistory.webapp.address 19888 mapreduce_jobhistory_webapp_address false
MapReduce JobHistory Web Application HTTPS Port (TLS/SSL) The HTTPS port of the MapReduce JobHistory Server web application. Together with the host name of the JobHistory role forms the address. mapreduce.jobhistory.webapp.https.address 19890 mapreduce_jobhistory_webapp_https_address false
Bind JobHistory Server to Wildcard Address If enabled, the JobHistory Server binds to the wildcard address ("0.0.0.0") on all of its ports. false yarn_jobhistory_bind_wildcard false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of JobHistory Server in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB mr2_jobhistory_java_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
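
As a rough illustration of how CPU shares translate into CPU time under contention, the sketch below divides the host's CPU among contending roles in proportion to their shares and checks the documented value ranges. It assumes all listed roles are simultaneously CPU-bound sibling cgroups; the role names and helper functions are hypothetical.

```python
def cpu_fraction(shares_by_role):
    """Approximate fraction of CPU each role receives when all roles contend."""
    total = sum(shares_by_role.values())
    return {role: shares / total for role, shares in shares_by_role.items()}

def valid_cpu_shares(value):
    """Documented range for Cgroup CPU Shares."""
    return 2 <= value <= 262144

def valid_io_weight(value):
    """Documented range for Cgroup I/O Weight."""
    return 100 <= value <= 1000

# A role at the default of 1024 shares competing with one given 3072 shares.
fractions = cpu_fraction({"jobhistory": 1024, "other_role": 3072})
```

When the host is not under CPU contention, shares have no effect and a role can use idle CPU freely.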

Security

Display Name Description Related Name Default Value API Name Required
Role-Specific Kerberos Principal Kerberos principal used by the JobHistory Server roles. mapred kerberos_role_princ_name true

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_hadoop_metrics2_safety_valve true
Suppress Parameter Validation: System Group Whether to suppress configuration warnings produced by the built-in parameter validation for the System Group parameter. false role_config_suppression_history_process_groupname true
Suppress Parameter Validation: System User Whether to suppress configuration warnings produced by the built-in parameter validation for the System User parameter. false role_config_suppression_history_process_username true
Suppress Parameter Validation: JobHistory Server Advanced Configuration Snippet (Safety Valve) for yarn-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the JobHistory Server Advanced Configuration Snippet (Safety Valve) for yarn-site.xml parameter. false role_config_suppression_jobhistory_config_safety_valve true
Suppress Parameter Validation: JobHistory Server Advanced Configuration Snippet (Safety Valve) for mapred-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the JobHistory Server Advanced Configuration Snippet (Safety Valve) for mapred-site.xml parameter. false role_config_suppression_jobhistory_mapred_safety_valve true
Suppress Parameter Validation: JobHistory Server Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the JobHistory Server Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_jobhistory_role_env_safety_valve true
Suppress Parameter Validation: Role-Specific Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Role-Specific Kerberos Principal parameter. false role_config_suppression_kerberos_role_princ_name true
Suppress Parameter Validation: JobHistory Server Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the JobHistory Server Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Java Configuration Options for JobHistory Server Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for JobHistory Server parameter. false role_config_suppression_mr2_jobhistory_java_opts true
Suppress Parameter Validation: JobHistory Server Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the JobHistory Server Log Directory parameter. false role_config_suppression_mr2_jobhistory_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Parameter Validation: MapReduce ApplicationMaster Staging Root Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the MapReduce ApplicationMaster Staging Root Directory parameter. false role_config_suppression_yarn_app_mapreduce_am_staging_dir true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_file_descriptor true
Suppress Health Test: GC Duration Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_gc_duration true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_log_directory_free_space true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_jobhistory_web_metric_collection true

nodemanager

Advanced

Display Name Description Related Name Default Value API Name Required
Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics2. Properties will be inserted into hadoop-metrics2.properties. hadoop_metrics2_safety_valve false
CGroups Hierarchy Path (rooted in the cgroups hierarchy on the machine) where to place YARN-managed cgroups. yarn.nodemanager.linux-container-executor.cgroups.hierarchy /hadoop-yarn linux_container_executor_cgroups_hierarchy false
NodeManager Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Healthchecker Script Arguments Comma-separated list of arguments which are to be passed to node health script when it is being launched. yarn.nodemanager.health-checker.script.opts mapred_healthchecker_script_args false
Healthchecker Script Path Absolute path to the script which is periodically run by the node health monitoring service to determine if the node is healthy or not. If the value of this key is empty or the file does not exist in the location configured here, the node health monitoring service is not started. yarn.nodemanager.health-checker.script.path mapred_healthchecker_script_path false
Java Configuration Options for NodeManager These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled node_manager_java_opts false
NodeManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml For advanced use only. A string to be inserted into yarn-site.xml for this role only. nodemanager_config_safety_valve false
NodeManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml For advanced use only. A string to be inserted into mapred-site.xml for this role only. nodemanager_mapred_safety_valve false
NodeManager Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. NODEMANAGER_role_env_safety_valve false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, role user must have write access to this directory. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate more metrics than the Service Monitor is able to process. true process_should_monitor true
Localized Dir Deletion Delay Number of seconds after an application finishes before the NodeManager's DeletionService will delete the application's localized file and log directory. To diagnose YARN application problems, set this property's value large enough (for example, to 600 = 10 minutes) to permit examination of these directories. yarn.nodemanager.delete.debug-delay-sec 0 yarn_nodemanager_delete_debug_delay_sec false
Disk Health Checker Frequency Frequency, in milliseconds, of running disk health checker. yarn.nodemanager.disk-health-checker.interval-ms 2 minute(s) yarn_nodemanager_disk_health_checker_interval_ms false
Disk Health Checker Minimum Health Disks Fraction The minimum fraction of disks that must be healthy for the NodeManager to launch new containers. This applies to both local directories and log directories; that is, if the fraction of healthy local directories (or log directories) falls below this value, new containers will not be launched on this node. yarn.nodemanager.disk-health-checker.min-healthy-disks 0.25 yarn_nodemanager_disk_health_checker_min_healthy_disks false
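
The minimum healthy disks fraction can be read as a simple check applied separately to the local directories and the log directories. The sketch below shows that reading; it is not the NodeManager's code, the function name can_launch_containers is hypothetical, and treating a fraction exactly equal to the minimum as healthy is an assumption.

```python
def can_launch_containers(healthy_local, total_local, healthy_log, total_log,
                          min_fraction=0.25):
    """Return True if both directory sets meet the minimum healthy fraction."""
    def enough(healthy, total):
        # Assumption: a set with no configured directories is never healthy.
        return total > 0 and healthy / total >= min_fraction

    return enough(healthy_local, total_local) and enough(healthy_log, total_log)

# With the default of 0.25, one healthy local directory out of four is still sufficient.
ok = can_launch_containers(healthy_local=1, total_local=4, healthy_log=1, total_log=1)
```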

Logs

Display Name Description Related Name Default Value API Name Required
NodeManager Logging Threshold The minimum log level for NodeManager logs. INFO log_threshold false
NodeManager Maximum Log File Backups The maximum number of rolled log files to keep for NodeManager logs. Typically used by log4j or logback. 10 max_log_backup_index false
NodeManager Max Log Size The maximum size, in megabytes, per log file for NodeManager logs. Typically used by log4j or logback. 200 MiB max_log_size false
NodeManager Log Directory Directory where NodeManager will place its log files. hadoop.log.dir /var/log/hadoop-yarn node_manager_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
    This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"}
    In this example, an event is not promoted to an alert if the ERROR log message contains an exception, because the first rule, with alert = false, matches first.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
NodeManager Connectivity Health Check Enables the health check that verifies the NodeManager is connected to the ResourceManager. true nodemanager_connectivity_health_enabled false
NodeManager Connectivity Tolerance at Startup The amount of time to wait for the NodeManager to fully start up and connect to the ResourceManager before enforcing the connectivity check. 3 minute(s) nodemanager_connectivity_tolerance_seconds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % nodemanager_fd_thresholds false
Garbage Collection Duration Thresholds The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 nodemanager_gc_duration_thresholds false
Garbage Collection Duration Monitoring Period The period to review when computing the moving average of garbage collection time. 5 minute(s) nodemanager_gc_duration_window false
NodeManager Health Checker Health Check Enables the health check that verifies the NodeManager is seen as healthy by the ResourceManager. true nodemanager_health_checker_health_enabled false
NodeManager Host Health Test When computing the overall NodeManager health, consider the host's health. true nodemanager_host_health_enabled false
NodeManager Local Directories Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's NodeManager Local Directories. Warning: 10 GiB, Critical: 5 GiB nodemanager_local_data_directories_free_space_absolute_thresholds false
NodeManager Local Directories Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's NodeManager Local Directories. Specified as a percentage of the capacity on that filesystem. This setting is not used if a NodeManager Local Directories Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never nodemanager_local_data_directories_free_space_percentage_thresholds false
NodeManager Container Log Directories Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's NodeManager Container Log Directories. Warning: 10 GiB, Critical: 5 GiB nodemanager_log_directories_free_space_absolute_thresholds false
NodeManager Container Log Directories Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's NodeManager Container Log Directories. Specified as a percentage of the capacity on that filesystem. This setting is not used if a NodeManager Container Log Directories Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never nodemanager_log_directories_free_space_percentage_thresholds false
NodeManager Recovery Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's NodeManager Recovery Directory. Warning: 10 GiB, Critical: 5 GiB nodemanager_recovery_directory_free_space_absolute_thresholds false
NodeManager Recovery Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's NodeManager Recovery Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a NodeManager Recovery Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never nodemanager_recovery_directory_free_space_percentage_thresholds false
NodeManager Process Health Test Enables the health test that the NodeManager's process state is consistent with the role configuration. true nodemanager_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true nodemanager_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never nodemanager_web_metric_collection_thresholds false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. Warning: Any, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
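
The first-match behavior described for Rules to Extract Events from Log Files can be sketched as follows. This is an illustration under stated assumptions, not the log4j appender's implementation: the severity ordering, the use of re.search for regular-expression matching, and the function names are all assumptions, and rate limiting is omitted.

```python
import re

# Assumed log4j severity ordering, lowest to highest.
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4, "FATAL": 5}

def rule_matches(rule, level, message, exception_type=None):
    """Check one rule's threshold, content, and exceptiontype fields (if present)."""
    if "threshold" in rule and LEVELS[level] < LEVELS[rule["threshold"]]:
        return False
    if "content" in rule and not re.search(rule["content"], message):
        return False
    if "exceptiontype" in rule:
        # A message without an exception cannot match an exceptiontype rule.
        if exception_type is None or not re.search(rule["exceptiontype"], exception_type):
            return False
    return True

def first_matching_rule(rules, level, message, exception_type=None):
    """Return the first rule that matches, or None if no rule matches."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None

# The two-rule example from the description above.
rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]
with_exception = first_matching_rule(rules, "ERROR", "disk failed", "java.io.IOException")
without_exception = first_matching_rule(rules, "ERROR", "disk failed")
```

Here the ERROR message that carries an exception is claimed by the first rule and is not promoted to an alert, while the same message without an exception falls through to the second rule and is.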

Other

Display Name Description Related Name Default Value API Name Required
Enable Shuffle Auxiliary Service If enabled, adds 'org.apache.hadoop.mapred.ShuffleHandler' to the NodeManager auxiliary services. This is required for MapReduce applications. true mapreduce_aux_service false
Max Shuffle Connections Maximum allowed connections for the shuffle. Set to 0 (zero) to indicate no limit on the number of connections. mapreduce.shuffle.max.connections 0 mapreduce_shuffle_max_connections false
Containers Environment Variable Environment variables that should be forwarded from the NodeManager's environment to the container's. yarn.nodemanager.admin-env MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX yarn_nodemanager_admin_env false
Container Manager Thread Count Number of threads container manager uses. yarn.nodemanager.container-manager.thread-count 20 yarn_nodemanager_container_manager_thread_count false
Cleanup Thread Count Number of threads used in cleanup. yarn.nodemanager.delete.thread-count 4 yarn_nodemanager_delete_thread_count false
Containers Environment Variables Whitelist Environment variables that containers may override rather than use NodeManager's default. yarn.nodemanager.env-whitelist JAVA_HOME, HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, HADOOP_CONF_DIR, YARN_HOME yarn_nodemanager_env_whitelist false
Heartbeat Interval Heartbeat interval to the ResourceManager. yarn.nodemanager.heartbeat.interval-ms 1 second(s) yarn_nodemanager_heartbeat_interval_ms false
NodeManager Local Directories List of directories on the local filesystem where a NodeManager stores intermediate data files. yarn.nodemanager.local-dirs yarn_nodemanager_local_dirs true
Localizer Cache Cleanup Interval Interval between localizer cache cleanups. yarn.nodemanager.localizer.cache.cleanup.interval-ms 10 minute(s) yarn_nodemanager_localizer_cache_cleanup_interval_ms false
Localizer Cache Target Size Target size of localizer cache in MB, per local directory. yarn.nodemanager.localizer.cache.target-size-mb 10 GiB yarn_nodemanager_localizer_cache_target_size_mb false
Localizer Client Thread Count Number of threads to handle localization requests. yarn.nodemanager.localizer.client.thread-count 5 yarn_nodemanager_localizer_client_thread_count false
Localizer Fetch Thread Count Number of threads to use for localization fetching. yarn.nodemanager.localizer.fetch.thread-count 4 yarn_nodemanager_localizer_fetch_thread_count false
NodeManager Container Log Directories List of directories on the local filesystem where a NodeManager stores container log files. yarn.nodemanager.log-dirs /var/log/hadoop-yarn/container yarn_nodemanager_log_dirs true
Log Retain Duration Time in seconds to retain user logs. Only applicable if log aggregation is disabled. yarn.nodemanager.log.retain-seconds 3 hour(s) yarn_nodemanager_log_retain_seconds false
Remote App Log Directory HDFS directory where application logs are stored when an application completes. yarn.nodemanager.remote-app-log-dir /tmp/logs yarn_nodemanager_remote_app_log_dir false
Remote App Log Directory Suffix The remote log dir will be created at {yarn.nodemanager.remote-app-log-dir}/${user}/{thisParam} yarn.nodemanager.remote-app-log-dir-suffix logs yarn_nodemanager_remote_app_log_dir_suffix false
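As a rough illustration of the Remote App Log Directory and Remote App Log Directory Suffix rows above, the following Python sketch composes the per-user aggregated log path from the two settings. The function name and the sample user are hypothetical; only the default values come from the table.

```python
import posixpath

def remote_app_log_dir(user, base="/tmp/logs", suffix="logs"):
    """Return {remote-app-log-dir}/{user}/{suffix} as an HDFS-style path.

    base and suffix default to the table's default values; the helper
    itself is illustrative and not part of any Hadoop API.
    """
    return posixpath.join(base, user, suffix)

print(remote_app_log_dir("alice"))
```

With the defaults, logs for a user named alice would land under /tmp/logs/alice/logs.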

Performance

Display Name Description Related Name Default Value API Name Required
Max Shuffle Threads Maximum allowed threads for serving shuffle connections. Set to zero to indicate the default of 2 times the number of available processors. mapreduce.shuffle.max.threads 80 mapreduce_shuffle_max_threads false
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false
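The zero-value rule described for Max Shuffle Threads can be sketched in Python as follows. This is a minimal illustration, not the NodeManager's actual code; the function name is hypothetical and negative values are not handled.

```python
import os

def effective_shuffle_threads(configured, processors=None):
    """Resolve the thread count implied by mapreduce.shuffle.max.threads.

    A configured value of 0 means "2 times the number of available
    processors"; any other value is used as given.
    """
    if processors is None:
        # Fall back to the local machine's processor count.
        processors = os.cpu_count() or 1
    if configured == 0:
        return 2 * processors
    return configured

print(effective_shuffle_threads(0, processors=8))
print(effective_shuffle_threads(80, processors=8))
```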

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
NodeManager Web Application HTTPS Port (TLS/SSL) The HTTPS port of the NodeManager web application. yarn.nodemanager.webapp.https.address 8044 nodemanager_webserver_https_port false
NodeManager Web Application HTTP Port The HTTP Port of the NodeManager web application. yarn.nodemanager.webapp.address 8042 nodemanager_webserver_port false
NodeManager IPC Address The address of the NodeManager IPC. yarn.nodemanager.address 8041 yarn_nodemanager_address false
Localizer Port Address where the localizer IPC is. yarn.nodemanager.localizer.address 8040 yarn_nodemanager_localizer_address false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of NodeManager in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB node_manager_java_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
Container Virtual CPU Cores Number of virtual CPU cores that can be allocated for containers. This parameter has no effect prior to CDH 4.4. yarn.nodemanager.resource.cpu-vcores 8 yarn_nodemanager_resource_cpu_vcores true
Container Memory Amount of physical memory, in MiB, that can be allocated for containers. yarn.nodemanager.resource.memory-mb 8 GiB yarn_nodemanager_resource_memory_mb true
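To get a feel for how Container Memory and Container Virtual CPU Cores bound the number of containers on a host, consider the following back-of-the-envelope Python sketch. The per-container request sizes are hypothetical, and the calculation ignores scheduler rounding and any other limits; the vcores bound is optional because the table notes that the setting has no effect prior to CDH 4.4.

```python
def max_containers(node_memory_mb=8192, container_memory_mb=1024,
                   node_vcores=None, container_vcores=1):
    """Estimate how many equally sized containers fit on one NodeManager.

    node_memory_mb defaults to the table's 8 GiB. Pass node_vcores only
    when the vcores setting is actually enforced.
    """
    by_memory = node_memory_mb // container_memory_mb
    if node_vcores is None:
        return by_memory
    return min(by_memory, node_vcores // container_vcores)

print(max_containers())
print(max_containers(container_memory_mb=512, node_vcores=8))
```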

Security

Display Name Description Related Name Default Value API Name Required
Banned System Users List of users banned from running containers. banned.users hdfs yarn mapred bin container_executor_banned_users false
Container Executor Group The system group that owns the container-executor binary. This does not need to be changed unless the ownership of the binary is explicitly changed. yarn.nodemanager.linux-container-executor.group yarn container_executor_group false
Minimum User ID The minimum Linux user ID allowed. Used to prevent other super users. min.user.id 1000 container_executor_min_user_id false
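The combined effect of Banned System Users and Minimum User ID can be sketched in Python as below. This is a simplified model of the policy the table describes, assuming a user is rejected when listed as banned or when the user ID is below the minimum; the function name and sample users are hypothetical.

```python
# Defaults taken from the table above.
DEFAULT_BANNED = frozenset({"hdfs", "yarn", "mapred", "bin"})
DEFAULT_MIN_UID = 1000

def may_run_containers(user, uid, banned=DEFAULT_BANNED, min_uid=DEFAULT_MIN_UID):
    """Return True if the user passes both the banned-user and min-UID checks."""
    return user not in banned and uid >= min_uid

print(may_run_containers("alice", 1001))
print(may_run_containers("hdfs", 1500))
print(may_run_containers("bob", 500))
```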

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Banned System Users Whether to suppress configuration warnings produced by the built-in parameter validation for the Banned System Users parameter. false role_config_suppression_container_executor_banned_users true
Suppress Parameter Validation: Container Executor Group Whether to suppress configuration warnings produced by the built-in parameter validation for the Container Executor Group parameter. false role_config_suppression_container_executor_group true
Suppress Parameter Validation: Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_hadoop_metrics2_safety_valve true
Suppress Parameter Validation: CGroups Hierarchy Whether to suppress configuration warnings produced by the built-in parameter validation for the CGroups Hierarchy parameter. false role_config_suppression_linux_container_executor_cgroups_hierarchy true
Suppress Parameter Validation: NodeManager Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the NodeManager Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Healthchecker Script Arguments Whether to suppress configuration warnings produced by the built-in parameter validation for the Healthchecker Script Arguments parameter. false role_config_suppression_mapred_healthchecker_script_args true
Suppress Parameter Validation: Healthchecker Script Path Whether to suppress configuration warnings produced by the built-in parameter validation for the Healthchecker Script Path parameter. false role_config_suppression_mapred_healthchecker_script_path true
Suppress Parameter Validation: Java Configuration Options for NodeManager Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for NodeManager parameter. false role_config_suppression_node_manager_java_opts true
Suppress Parameter Validation: NodeManager Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the NodeManager Log Directory parameter. false role_config_suppression_node_manager_log_dir true
Suppress Parameter Validation: NodeManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the NodeManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml parameter. false role_config_suppression_nodemanager_config_safety_valve true
Suppress Parameter Validation: NodeManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the NodeManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml parameter. false role_config_suppression_nodemanager_mapred_safety_valve true
Suppress Parameter Validation: NodeManager Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the NodeManager Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_nodemanager_role_env_safety_valve true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Configuration Validator: Single User Mode Overrides Validator Whether to suppress configuration warnings produced by the Single User Mode Overrides Validator configuration validator. false role_config_suppression_single_user_mode_override_validator true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Parameter Validation: Containers Environment Variable Whether to suppress configuration warnings produced by the built-in parameter validation for the Containers Environment Variable parameter. false role_config_suppression_yarn_nodemanager_admin_env true
Suppress Parameter Validation: Containers Environment Variables Whitelist Whether to suppress configuration warnings produced by the built-in parameter validation for the Containers Environment Variables Whitelist parameter. false role_config_suppression_yarn_nodemanager_env_whitelist true
Suppress Parameter Validation: NodeManager Local Directories Whether to suppress configuration warnings produced by the built-in parameter validation for the NodeManager Local Directories parameter. false role_config_suppression_yarn_nodemanager_local_dirs true
Suppress Parameter Validation: NodeManager Container Log Directories Whether to suppress configuration warnings produced by the built-in parameter validation for the NodeManager Container Log Directories parameter. false role_config_suppression_yarn_nodemanager_log_dirs true
Suppress Parameter Validation: Remote App Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Remote App Log Directory parameter. false role_config_suppression_yarn_nodemanager_remote_app_log_dir true
Suppress Parameter Validation: Remote App Log Directory Suffix Whether to suppress configuration warnings produced by the built-in parameter validation for the Remote App Log Directory Suffix parameter. false role_config_suppression_yarn_nodemanager_remote_app_log_dir_suffix true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_audit_health true
Suppress Health Test: ResourceManager Connectivity Whether to suppress the results of the ResourceManager Connectivity health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_connectivity true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_file_descriptor true
Suppress Health Test: GC Duration Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_gc_duration true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_log_directory_free_space true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_node_manager_web_metric_collection true
Suppress Health Test: NodeManager Local Directories Free Space Whether to suppress the results of the NodeManager Local Directories Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_nodemanager_local_data_directories_free_space true
Suppress Health Test: NodeManager Container Log Directories Free Space Whether to suppress the results of the NodeManager Container Log Directories Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_nodemanager_log_directories_free_space true

resourcemanager

Advanced

Display Name Description Related Name Default Value API Name Required
Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics2. Properties will be inserted into hadoop-metrics2.properties. hadoop_metrics2_safety_valve false
ResourceManager Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Heap Dump Directory Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. false process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process. true process_should_monitor true
Java Configuration Options for ResourceManager These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled resource_manager_java_opts false
ResourceManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml For advanced use only. A string to be inserted into yarn-site.xml for this role only. resourcemanager_config_safety_valve false
ResourceManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml For advanced use only. A string to be inserted into mapred-site.xml for this role only. resourcemanager_mapred_safety_valve false
ResourceManager Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. RESOURCEMANAGER_role_env_safety_valve false
ResourceManager Advanced Configuration Snippet (Safety Valve) for nodes_allow.txt For advanced use only. A string to be inserted into nodes_allow.txt for this role only. rm_hosts_allow_safety_valve false
ResourceManager Advanced Configuration Snippet (Safety Valve) for nodes_exclude.txt For advanced use only. A string to be inserted into nodes_exclude.txt for this role only. rm_hosts_exclude_safety_valve false

Logs

Display Name Description Related Name Default Value API Name Required
ResourceManager Logging Threshold The minimum log level for ResourceManager logs INFO log_threshold false
ResourceManager Maximum Log File Backups The maximum number of rolled log files to keep for ResourceManager logs. Typically used by log4j or logback. 10 max_log_backup_index false
ResourceManager Max Log Size The maximum size, in megabytes, per log file for ResourceManager logs. Typically used by log4j or logback. 200 MiB max_log_size false
ResourceManager Log Directory Directory where ResourceManager will place its log files. hadoop.log.dir /var/log/hadoop-yarn resource_manager_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. Warning: Any, Critical: Never process_swap_memory_thresholds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % resourcemanager_fd_thresholds false
Garbage Collection Duration Thresholds The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 resourcemanager_gc_duration_thresholds false
Garbage Collection Duration Monitoring Period The period to review when computing the moving average of garbage collection time. 5 minute(s) resourcemanager_gc_duration_window false
ResourceManager Host Health Test When computing the overall ResourceManager health, consider the host's health. true resourcemanager_host_health_enabled false
ResourceManager Process Health Test Enables the health test that the ResourceManager's process state is consistent with the role configuration true resourcemanager_scm_health_enabled false
Health Test Startup Tolerance The amount of time allowed after this role is started that failures of health tests that rely on communication with this role will be tolerated. 5 minute(s) resourcemanager_startup_tolerance_minutes false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true resourcemanager_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never resourcemanager_web_metric_collection_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
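The first-match behavior described under Rules to Extract Events from Log Files can be illustrated with a short Python sketch. It is a simplified model, not the appender's actual implementation: rate limiting is omitted, regular expressions are applied with re.search (an assumption), and the function names and severity ordering list are hypothetical.

```python
import re

# Assumed log4j severity order, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]

def rule_matches(rule, level, content, exception_type=None):
    """Check one rule's threshold, content, and exceptiontype fields."""
    if "threshold" in rule and LEVELS.index(level) < LEVELS.index(rule["threshold"]):
        return False
    if "content" in rule and not re.search(rule["content"], content):
        return False
    if "exceptiontype" in rule:
        # A rule with exceptiontype only applies to messages carrying an exception.
        if exception_type is None or not re.search(rule["exceptiontype"], exception_type):
            return False
    return True

def first_matching_rule(rules, level, content, exception_type=None):
    """Return the first rule that matches the message, or None."""
    for rule in rules:
        if rule_matches(rule, level, content, exception_type):
            return rule
    return None

# The two-rule example from the property description.
rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]
chosen = first_matching_rule(rules, "ERROR", "boom", "java.io.IOException")
print(chosen["alert"])  # the first rule wins, so the event is not promoted
```

An ERROR message without an exception skips the first rule and falls through to the second, which does promote the event to an alert.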

Other

Display Name Description Related Name Default Value API Name Required
Capacity Scheduler Configuration Advanced Configuration Snippet (Safety Valve) Enter an XML string that represents the Capacity Scheduler configuration. <configuration> <property> <name>yarn.scheduler.capacity.root.queues</name> <value>default</value> </property> <property> <name>yarn.scheduler.capacity.root.capacity</name> <value>100</value> </property> <property> <name>yarn.scheduler.capacity.root.default.capacity</name> <value>100</value> </property> </configuration> resourcemanager_capacity_scheduler_configuration true
Fair Scheduler Assign Multiple Tasks Enables multiple Fair Scheduler container assignments in one heartbeat, which improves cluster throughput when there are many small tasks to run. yarn.scheduler.fair.assignmultiple true resourcemanager_fair_scheduler_assign_multiple true
Fair Scheduler XML Advanced Configuration Snippet (Safety Valve) An XML string that will be inserted verbatim into the Fair Scheduler allocations file. For CDH5, overrides the configuration set using the Pools configuration UI. For CDH4, this is the only way to configure the Fair Scheduler for YARN. resourcemanager_fair_scheduler_configuration false
Enable Fair Scheduler Preemption When enabled, if a pool's minimum share is not met for some period of time, the Fair Scheduler preempts applications in other pools. Preemption guarantees that production applications are not starved while also allowing the cluster to be used for experimental and research applications. To minimize wasted computation, the Fair Scheduler preempts the most recently launched applications. yarn.scheduler.fair.preemption false resourcemanager_fair_scheduler_preemption true
Fair Scheduler Size-Based Weight When enabled, the Fair Scheduler will assign shares to individual apps based on their size, rather than providing an equal share to all apps regardless of size. yarn.scheduler.fair.sizebasedweight false resourcemanager_fair_scheduler_size_based_weight true
Fair Scheduler User As Default Queue When set to true, the Fair Scheduler uses the username as the default pool name, in the event that a pool name is not specified. When set to false, all applications are run in a shared pool, called default. yarn.scheduler.fair.user-as-default-queue true resourcemanager_fair_scheduler_user_as_default_queue true
ApplicationMaster Monitor Expiry The expiry interval to wait until an ApplicationMaster is considered dead. yarn.am.liveness-monitor.expiry-interval-ms 10 minute(s) yarn_am_liveness_monitor_expiry_interval_ms false
NodeManager Monitor Expiry The expiry interval to wait until a NodeManager is considered dead. yarn.nm.liveness-monitor.expiry-interval-ms 10 minute(s) yarn_nm_liveness_monitor_expiry_interval_ms false
Admin Client Thread Count Number of threads used to handle the ResourceManager admin interface. yarn.resourcemanager.admin.client.thread-count 1 yarn_resourcemanager_admin_client_thread_count false
ApplicationMaster Maximum Attempts The maximum number of application attempts. This is a global setting for all ApplicationMasters. Each ApplicationMaster can specify its individual maximum through the API, but if the individual maximum is more than the global maximum, the ResourceManager overrides it. yarn.resourcemanager.am.max-retries 2 yarn_resourcemanager_am_max_retries false
ApplicationMaster Monitor Interval The interval at which the ResourceManager checks whether ApplicationMasters are still alive. yarn.resourcemanager.amliveliness-monitor.interval-ms 1 second(s) yarn_resourcemanager_amliveliness_monitor_interval_ms false
Client Thread Count The number of threads used to handle applications manager requests. yarn.resourcemanager.client.thread-count 50 yarn_resourcemanager_client_thread_count false
Container Monitor Interval The interval at which the ResourceManager checks whether containers are still alive. yarn.resourcemanager.container.liveness-monitor.interval-ms 10 minute(s) yarn_resourcemanager_container_liveness_monitor_interval_ms false
Max Completed Applications The maximum number of completed applications that the ResourceManager keeps. yarn.resourcemanager.max-completed-applications 10000 yarn_resourcemanager_max_completed_applications false
NodeManager Monitor Interval The interval at which the ResourceManager checks whether NodeManagers are still alive. yarn.resourcemanager.nm.liveness-monitor.interval-ms 1 second(s) yarn_resourcemanager_nm_liveness_monitor_interval_ms false
Enable ResourceManager Recovery When enabled, any applications that were running on the cluster when the ResourceManager died will be recovered when the ResourceManager next starts. Note: If RM-HA is enabled, then this configuration is always enabled. yarn.resourcemanager.recovery.enabled false yarn_resourcemanager_recovery_enabled false
Resource Tracker Thread Count Number of threads to handle resource tracker calls. yarn.resourcemanager.resource-tracker.client.thread-count 50 yarn_resourcemanager_resource_tracker_client_thread_count false
Scheduler Class The class to use as the resource scheduler. FairScheduler is only supported in CDH 4.2.1 and later. yarn.resourcemanager.scheduler.class org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler yarn_resourcemanager_scheduler_class false
Scheduler Thread Count The number of threads used to handle requests through the scheduler interface. yarn.resourcemanager.scheduler.client.thread-count 50 yarn_resourcemanager_scheduler_client_thread_count false
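
The Capacity Scheduler Configuration property in the table above takes an XML string. As a minimal sketch, the default value shown for that property could be generated with Python's standard library as follows. The helper name build_capacity_scheduler_xml and the dictionary are illustrative assumptions, not part of Cloudera Manager.

```python
import xml.etree.ElementTree as ET


def build_capacity_scheduler_xml(properties):
    """Render a dict of property names and values as a Hadoop-style
    <configuration> document (hypothetical helper for illustration)."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


# The same three properties that appear in the default value above.
default_capacity_config = {
    "yarn.scheduler.capacity.root.queues": "default",
    "yarn.scheduler.capacity.root.capacity": 100,
    "yarn.scheduler.capacity.root.default.capacity": 100,
}

capacity_xml = build_capacity_scheduler_xml(default_capacity_config)
print(capacity_xml)
```

The resulting string is what would be pasted into the safety valve field. Additional queues would need their own capacity properties, which this sketch does not validate.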

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
ResourceManager Web Application HTTPS Port (TLS/SSL) The HTTPS port of the ResourceManager web application. yarn.resourcemanager.webapp.https.address 8090 resourcemanager_webserver_https_port false
ResourceManager Web Application HTTP Port The HTTP port of the ResourceManager web application. yarn.resourcemanager.webapp.address 8088 resourcemanager_webserver_port false
ResourceManager Address The address of the applications manager interface in the ResourceManager. yarn.resourcemanager.address 8032 yarn_resourcemanager_address false
Administration Address The address of the admin interface in the ResourceManager. yarn.resourcemanager.admin.address 8033 yarn_resourcemanager_admin_address false
Resource Tracker Address The address of the resource tracker interface in the ResourceManager. yarn.resourcemanager.resource-tracker.address 8031 yarn_resourcemanager_resource_tracker_address false
Scheduler Address The address of the scheduler interface in the ResourceManager. yarn.resourcemanager.scheduler.address 8030 yarn_resourcemanager_scheduler_address false
Bind ResourceManager to Wildcard Address If enabled, the ResourceManager binds to the wildcard address ("0.0.0.0") on all of its ports. false yarn_rm_bind_wildcard false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of ResourceManager in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB resource_manager_java_heapsize false
Fair Scheduler Node Locality Threshold For applications that request containers on particular nodes, the number of scheduling opportunities since the last container assignment to wait before accepting a placement on another node. Expressed as a float between 0 and 1, which, as a fraction of the cluster size, is the number of scheduling opportunities to pass up. If not set, this means don't pass up any scheduling opportunities. Requires Fair Scheduler continuous scheduling to be disabled. If continuous scheduling is enabled, yarn.scheduler.fair.locality-delay-node-ms should be used instead. yarn.scheduler.fair.locality.threshold.node resourcemanager_fair_scheduler_locality_threshold_node false
Fair Scheduler Rack Locality Threshold For applications that request containers on particular racks, the number of scheduling opportunities since the last container assignment to wait before accepting a placement on another rack. Expressed as a float between 0 and 1, which, as a fraction of the cluster size, is the number of scheduling opportunities to pass up. If not set, this means don't pass up any scheduling opportunities. Requires Fair Scheduler continuous scheduling to be disabled. If continuous scheduling is enabled, yarn.scheduler.fair.locality-delay-rack-ms should be used instead. yarn.scheduler.fair.locality.threshold.rack resourcemanager_fair_scheduler_locality_threshold_rack false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
Enable Fair Scheduler Continuous Scheduling Enable continuous scheduling in the Fair Scheduler. When enabled, scheduling decisions are decoupled from NodeManager heartbeats, leading to faster resource allocations. yarn.scheduler.fair.continuous-scheduling-enabled false yarn_scheduler_fair_continuous_scheduling_enabled false
Fair Scheduler Node Locality Delay For applications that request containers on particular nodes, the minimum time in milliseconds the Fair Scheduler waits before accepting a placement on another node. Requires Fair Scheduler continuous scheduling to be enabled. If continuous scheduling is disabled, yarn.scheduler.fair.locality.threshold.node should be used instead. yarn.scheduler.fair.locality-delay-node-ms 2 second(s) yarn_scheduler_fair_locality_delay_node_ms false
Fair Scheduler Rack Locality Delay For applications that request containers on particular racks, the minimum time in milliseconds the Fair Scheduler waits before accepting a placement on another rack. Requires Fair Scheduler continuous scheduling to be enabled. If continuous scheduling is disabled, yarn.scheduler.fair.locality.threshold.rack should be used instead. yarn.scheduler.fair.locality-delay-rack-ms 4 second(s) yarn_scheduler_fair_locality_delay_rack_ms false
Container Memory Increment If using the Fair Scheduler, memory requests will be rounded up to the nearest multiple of this number. This parameter has no effect prior to CDH 5. yarn.scheduler.increment-allocation-mb 512 MiB yarn_scheduler_increment_allocation_mb true
Container Virtual CPU Cores Increment If using the Fair Scheduler, virtual core requests will be rounded up to the nearest multiple of this number. This parameter has no effect prior to CDH 5. yarn.scheduler.increment-allocation-vcores 1 yarn_scheduler_increment_allocation_vcores true
Container Memory Maximum The largest amount of physical memory, in MiB, that can be requested for a container. yarn.scheduler.maximum-allocation-mb 64 GiB yarn_scheduler_maximum_allocation_mb true
Container Virtual CPU Cores Maximum The largest number of virtual CPU cores that can be requested for a container. This parameter has no effect prior to CDH 4.4. yarn.scheduler.maximum-allocation-vcores 32 yarn_scheduler_maximum_allocation_vcores true
Container Memory Minimum The smallest amount of physical memory, in MiB, that can be requested for a container. If using the Capacity or FIFO scheduler (or any scheduler, prior to CDH 5), memory requests will be rounded up to the nearest multiple of this number. yarn.scheduler.minimum-allocation-mb 1 GiB yarn_scheduler_minimum_allocation_mb true
Container Virtual CPU Cores Minimum The smallest number of virtual CPU cores that can be requested for a container. If using the Capacity or FIFO scheduler (or any scheduler, prior to CDH 5), virtual core requests will be rounded up to the nearest multiple of this number. This parameter has no effect prior to CDH 4.4. yarn.scheduler.minimum-allocation-vcores 1 yarn_scheduler_minimum_allocation_vcores true
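
The Container Memory Increment and Container Memory Minimum rows above both describe rounding a request up to the nearest multiple of a configured number. The arithmetic can be sketched as below. The function name is hypothetical, the values are in MiB, and the sketch deliberately ignores the minimum and maximum bounds that the scheduler also applies.

```python
def round_up_to_multiple(requested, step):
    """Round a resource request up to the nearest multiple of step.

    Illustrative only: models the rounding described in the table,
    not the scheduler's full request normalization.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    # Integer ceiling division, then scale back up by the step.
    return ((requested + step - 1) // step) * step


# Fair Scheduler on CDH 5: round to yarn.scheduler.increment-allocation-mb (default 512 MiB).
fair_rounded = round_up_to_multiple(1300, 512)

# Capacity or FIFO scheduler (or any scheduler prior to CDH 5):
# round to yarn.scheduler.minimum-allocation-mb (default 1 GiB = 1024 MiB).
capacity_rounded = round_up_to_multiple(1300, 1024)

print(fair_rounded, capacity_rounded)  # prints "1536 2048"
```

With the defaults listed above, a 1300 MiB request therefore occupies 1536 MiB under the Fair Scheduler on CDH 5 and 2048 MiB otherwise. The same rounding applies to virtual core requests with the corresponding vcores properties.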

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_hadoop_metrics2_safety_valve true
Suppress Parameter Validation: ResourceManager Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the ResourceManager Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Java Configuration Options for ResourceManager Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for ResourceManager parameter. false role_config_suppression_resource_manager_java_opts true
Suppress Parameter Validation: ResourceManager Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the ResourceManager Log Directory parameter. false role_config_suppression_resource_manager_log_dir true
Suppress Parameter Validation: Capacity Scheduler Configuration Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Capacity Scheduler Configuration Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_resourcemanager_capacity_scheduler_configuration true
Suppress Parameter Validation: ResourceManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the ResourceManager Advanced Configuration Snippet (Safety Valve) for yarn-site.xml parameter. false role_config_suppression_resourcemanager_config_safety_valve true
Suppress Parameter Validation: Fair Scheduler XML Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Fair Scheduler XML Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_resourcemanager_fair_scheduler_configuration true
Suppress Parameter Validation: ResourceManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the ResourceManager Advanced Configuration Snippet (Safety Valve) for mapred-site.xml parameter. false role_config_suppression_resourcemanager_mapred_safety_valve true
Suppress Parameter Validation: ResourceManager Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the ResourceManager Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_resourcemanager_role_env_safety_valve true
Suppress Parameter Validation: ResourceManager Advanced Configuration Snippet (Safety Valve) for nodes_allow.txt Whether to suppress configuration warnings produced by the built-in parameter validation for the ResourceManager Advanced Configuration Snippet (Safety Valve) for nodes_allow.txt parameter. false role_config_suppression_rm_hosts_allow_safety_valve true
Suppress Parameter Validation: ResourceManager Advanced Configuration Snippet (Safety Valve) for nodes_exclude.txt Whether to suppress configuration warnings produced by the built-in parameter validation for the ResourceManager Advanced Configuration Snippet (Safety Valve) for nodes_exclude.txt parameter. false role_config_suppression_rm_hosts_exclude_safety_valve true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Parameter Validation: ApplicationMaster Maximum Attempts Whether to suppress configuration warnings produced by the built-in parameter validation for the ApplicationMaster Maximum Attempts parameter. false role_config_suppression_yarn_resourcemanager_am_max_retries true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_file_descriptor true
Suppress Health Test: GC Duration Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_gc_duration true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_log_directory_free_space true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_resource_manager_web_metric_collection true

service_wide

Advanced

Display Name Description Related Name Default Value API Name Required
System User's Home Directory The home directory of the system user on the local filesystem. This setting must reflect the system's configured value - only changing it here will not change the actual home directory. /var/lib/hadoop-yarn hdfs_user_home_dir true
HDFS Replication Advanced Configuration Snippet (Safety Valve) for mapred-site.xml For advanced use only, a string to be inserted into mapred-site.xml. Applies to all HDFS Replication jobs. mapreduce_service_replication_config_safety_valve false
System Group The group that this service's processes should run as. (Except the Job History Server, which has its own group.) hadoop process_groupname true
System User The user that this service's processes should run as. (Except the Job History Server, which has its own user.) yarn process_username true
YARN Application Classpath Entries to add to the classpaths of YARN applications. yarn.application.classpath $HADOOP_CONF_DIR $HADOOP_COMMON_HOME/* $HADOOP_COMMON_HOME/lib/* $HADOOP_HDFS_HOME/* $HADOOP_HDFS_HOME/lib/* $HADOOP_MAPRED_HOME/* $HADOOP_MAPRED_HOME/lib/* $YARN_HOME/* $YARN_HOME/lib/* yarn_application_classpath false
YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml For advanced use only, a string to be inserted into core-site.xml. Applies to configurations of all roles in this service except client configuration. yarn_core_site_safety_valve false
YARN Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml For advanced use only, a string to be inserted into hadoop-policy.xml. Applies to configurations of all roles in this service except client configuration. yarn_hadoop_policy_config_safety_valve false
YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml For advanced use only, a string to be inserted into yarn-site.xml. Applies to configurations of all roles in this service except client configuration. yarn_service_config_safety_valve false
YARN (MR2 Included) Service Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of all roles in this service except client configuration. yarn_service_env_safety_valve false
YARN Service MapReduce Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into mapred-site.xml. Applies to configurations of all roles in this service except client configuration. yarn_service_mapred_safety_valve false
HDFS Replication Advanced Configuration Snippet (Safety Valve) for yarn-site.xml For advanced use only, a string to be inserted into yarn-site.xml. Applies to all HDFS Replication jobs. yarn_service_replication_config_safety_valve false

Monitoring

Display Name Description Related Name Default Value API Name Required
Admin Users Applications List Visibility Settings Controls which applications an admin user can see in the applications list view ALL admin_application_list_settings true
Enable Log Event Capture When set, each role identifies important log events and forwards them to Cloudera Manager. true catch_events false
Enable Service Level Health Alerts When set, Cloudera Manager will send alerts when the health of this service reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Log Event Retry Frequency The frequency, in seconds, at which the log4j event publication appender retries sending undelivered log events to the Event Server. 30 log_event_retry_frequency false
Service Triggers The configured triggers for this service. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific service.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger fires if there are more than 10 DataNodes with more than 500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:bad", "streamThreshold": 10, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] service_triggers true
Service Monitor Client Config Overrides For advanced use only, a list of configuration properties that will be used by the Service Monitor instead of the current client configuration for the service. <property> <name>mapreduce.output.fileoutputformat.compress</name> <value>false</value> </property> <property> <name>mapreduce.output.fileoutputformat.compress.codec</name> <value>org.apache.hadoop.io.compress.DefaultCodec</value> </property> <property> <name>io.compression.codecs</name> <value>org.apache.hadoop.io.compress.DefaultCodec, org.apache.hadoop.io.compress.GzipCodec, org.apache.hadoop.io.compress.BZip2Codec, org.apache.hadoop.io.compress.DeflateCodec, org.apache.hadoop.io.compress.SnappyCodec, org.apache.hadoop.io.compress.Lz4Codec</value> </property> smon_client_config_overrides false
Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) For advanced use only, a list of derived configuration properties that will be used by the Service Monitor instead of the default ones. smon_derived_configs_safety_valve false
Non-Admin Users Applications List Visibility Settings Controls which applications non-admin users can see in the applications list view ALL user_application_list_settings true
YARN Application Aggregates Controls the aggregate metrics generated for YARN applications. The structure is a JSON list of the attributes to aggregate and the entities to aggregate to. For example, if the attributeName is 'maps_completed' and the aggregationTargets is ['USER'] then the Service Monitor will create the metric 'yarn_application_maps_completed_rate' and, every ten minutes, will record the total maps completed for each user across all their YARN applications. By default it will also record the number of applications submitted ('apps_submitted_rate') for both users and pool. For a full list of the supported attributes see the YARN search page. Note that the valid aggregation targets are USER, YARN_POOL, and YARN (the service), and that these aggregate metrics can be viewed on both the reports and charts search pages. [ attributeName: maps_total, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: reduces_total, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: cpu_milliseconds, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: mb_millis_maps, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: mb_millis_reduces, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: vcores_millis_maps, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: vcores_millis_reduces, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: file_bytes_read, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: file_bytes_written, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: hdfs_bytes_read, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: hdfs_bytes_written, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , 
attributeName: cm_cpu_milliseconds, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] , attributeName: application_duration, aggregationTargets: [USER, YARN_POOL_USER, YARN_POOL, YARN, CLUSTER] ] yarn_application_aggregates false
JobHistory Server Role Health Test When computing the overall YARN health, consider JobHistory Server's health true yarn_jobhistoryserver_health_enabled false
Healthy NodeManager Monitoring Thresholds The health test thresholds of the overall NodeManager health. The check returns "Concerning" health if the percentage of "Healthy" NodeManagers falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" NodeManagers falls below the critical threshold. Warning: 95.0 %, Critical: 90.0 % yarn_nodemanagers_healthy_thresholds false
ResourceManager Role Health Test When computing the overall YARN health, consider ResourceManager's health true yarn_resourcemanager_health_enabled false
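
The Service Triggers property in the table above expects a JSON-formatted list in which triggerName and triggerExpression are mandatory and trigger names are unique within the service. A minimal sketch of assembling and checking such a list before pasting it into the field might look like this. The helper names make_trigger and serialize_triggers are hypothetical, and the checks cover only the rules stated in the property description, not tsquery syntax.

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def make_trigger(name, expression, stream_threshold=0, enabled=True):
    """Build one trigger entry using the fields described above."""
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The documented sample encodes this flag as a string.
        "enabled": "true" if enabled else "false",
    }


def serialize_triggers(triggers):
    """Check mandatory fields and name uniqueness, then return JSON text."""
    for trigger in triggers:
        for field in MANDATORY_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"trigger is missing mandatory field {field!r}")
    names = [trigger["triggerName"] for trigger in triggers]
    if len(set(names)) != len(names):
        raise ValueError("trigger names must be unique within the service")
    return json.dumps(triggers)


# The sample trigger from the property description.
sample = make_trigger(
    "sample-trigger",
    "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:bad",
    stream_threshold=10,
)
triggers_json = serialize_triggers([sample])
print(triggers_json)
```

Because the description notes that the JSON format may change between releases, a check like this is best treated as a convenience rather than a guarantee that Cloudera Manager will accept the value.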

Other

Display Name Description Related Name Default Value API Name Required
HDFS Service Name of the HDFS service that this YARN service instance depends on hdfs_service true
Enable ResourceManager ACLs Whether users and groups specified in Admin ACL should be checked for authorization to perform admin operations. yarn.acl.enable true yarn_acl_enable false
Admin ACL ACL that determines which users and groups can submit and kill applications in any pool, and can issue commands on ResourceManager roles. yarn.admin.acl * yarn_admin_acl false
Enable Log Aggregation Whether to enable log aggregation yarn.log-aggregation-enable true yarn_log_aggregation_enable false
Log Aggregation Retention Period How long to keep aggregation logs before deleting them. yarn.log-aggregation.retain-seconds 7 day(s) yarn_log_aggregation_retain_seconds false

Resource Management

Display Name Description Related Name Default Value API Name Required
UNIX User for Nonsecure Mode with Linux Container Executor UNIX user that containers run as when Linux-container-executor is used in nonsecure mode. yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user nobody yarn_nodemanager_linux_container_executor_nonsecure_mode_local_user false
Use CGroups for Resource Management Whether YARN creates a cgroup per container, thereby isolating the CPU usage of containers. When set, yarn.nodemanager.linux-container-executor.resources-handler.class is configured to org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler. The host (in Cloudera Manager) must have cgroups enabled. The number of shares allocated to all YARN containers is configured by adjusting the CPU shares value of the Node Manager in the Resource Management configuration group. yarn.nodemanager.linux-container-executor.resources-handler.class false yarn_service_cgroups false
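
The effect of the two settings above on the NodeManager configuration can be sketched as a small Python function. The property names and the handler class are taken from the table. The function name is hypothetical, and leaving the handler property unset when cgroups are disabled is an assumption, since the description only says what happens when the option is set.

```python
CGROUPS_HANDLER_KEY = (
    "yarn.nodemanager.linux-container-executor.resources-handler.class"
)
CGROUPS_HANDLER_CLASS = (
    "org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler"
)
NONSECURE_USER_KEY = (
    "yarn.nodemanager.linux-container-executor.nonsecure-mode.local-user"
)


def resource_management_properties(use_cgroups=False, nonsecure_user="nobody"):
    """Hypothetical helper: derive Hadoop properties from the two settings."""
    props = {NONSECURE_USER_KEY: nonsecure_user}
    if use_cgroups:
        # Stated in the table: enabling cgroups selects this handler class.
        props[CGROUPS_HANDLER_KEY] = CGROUPS_HANDLER_CLASS
    # Assumption: when cgroups are disabled, the handler key is omitted.
    return props


if __name__ == "__main__":
    for key, value in resource_management_properties(use_cgroups=True).items():
        print(f"{key} = {value}")
```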

Security

Display Name Description Related Name Default Value API Name Required
Enable Kerberos Authentication for HTTP Web-Consoles Enables Kerberos authentication for Hadoop HTTP web consoles for all roles of this service using the SPNEGO protocol. Note: This is effective only if Kerberos is enabled for the HDFS service. false hadoop_secure_web_ui false
Kerberos Principal Kerberos principal short name used by all roles of this service. yarn kerberos_princ_name true
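
To show where the principal short name fits, here is a minimal Python sketch that composes a full service principal in the conventional Kerberos primary/instance@REALM form. The function name, the host name, and the realm are hypothetical placeholders. How Cloudera Manager actually generates principals for each role is not described in this table.

```python
def service_principal(short_name, host, realm):
    """Hypothetical helper: build primary/instance@REALM from its parts."""
    for part in (short_name, host, realm):
        if not part or "/" in part or "@" in part:
            raise ValueError(f"invalid principal component: {part!r}")
    return f"{short_name}/{host}@{realm}"


if __name__ == "__main__":
    # "yarn" is the default short name from the table; the host and realm
    # below are made-up example values.
    print(service_principal("yarn", "nm01.example.com", "EXAMPLE.COM"))
```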

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: Gateway Count Validator Whether to suppress configuration warnings produced by the Gateway Count Validator configuration validator. false service_config_suppression_gateway_count_validator true
Suppress Configuration Validator: Secure Web UI Validator Whether to suppress configuration warnings produced by the Secure Web UI Validator configuration validator. false service_config_suppression_hadoop_secure_web_ui true
Suppress Parameter Validation: System User's Home Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the System User's Home Directory parameter. false service_config_suppression_hdfs_user_home_dir true
Suppress Configuration Validator: JobHistory Server Count Validator Whether to suppress configuration warnings produced by the JobHistory Server Count Validator configuration validator. false service_config_suppression_jobhistory_count_validator true
Suppress Parameter Validation: Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Kerberos Principal parameter. false service_config_suppression_kerberos_princ_name true
Suppress Parameter Validation: HDFS Replication Advanced Configuration Snippet (Safety Valve) for mapred-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Replication Advanced Configuration Snippet (Safety Valve) for mapred-site.xml parameter. false service_config_suppression_mapreduce_service_replication_config_safety_valve true
Suppress Configuration Validator: NodeManager Count Validator Whether to suppress configuration warnings produced by the NodeManager Count Validator configuration validator. false service_config_suppression_nodemanager_count_validator true
Suppress Parameter Validation: System Group Whether to suppress configuration warnings produced by the built-in parameter validation for the System Group parameter. false service_config_suppression_process_groupname true
Suppress Parameter Validation: System User Whether to suppress configuration warnings produced by the built-in parameter validation for the System User parameter. false service_config_suppression_process_username true
Suppress Configuration Validator: ResourceManager Count Validator Whether to suppress configuration warnings produced by the ResourceManager Count Validator configuration validator. false service_config_suppression_resourcemanager_count_validator true
Suppress Parameter Validation: Service Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Triggers parameter. false service_config_suppression_service_triggers true
Suppress Parameter Validation: Service Monitor Client Config Overrides Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Client Config Overrides parameter. false service_config_suppression_smon_client_config_overrides true
Suppress Parameter Validation: Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) parameter. false service_config_suppression_smon_derived_configs_safety_valve true
Suppress Parameter Validation: Admin ACL Whether to suppress configuration warnings produced by the built-in parameter validation for the Admin ACL parameter. false service_config_suppression_yarn_admin_acl true
Suppress Parameter Validation: YARN Application Aggregates Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN Application Aggregates parameter. false service_config_suppression_yarn_application_aggregates true
Suppress Parameter Validation: YARN Application Classpath Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN Application Classpath parameter. false service_config_suppression_yarn_application_classpath true
Suppress Parameter Validation: YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN Service Advanced Configuration Snippet (Safety Valve) for core-site.xml parameter. false service_config_suppression_yarn_core_site_safety_valve true
Suppress Parameter Validation: YARN Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml parameter. false service_config_suppression_yarn_hadoop_policy_config_safety_valve true
Suppress Parameter Validation: UNIX User for Nonsecure Mode with Linux Container Executor Whether to suppress configuration warnings produced by the built-in parameter validation for the UNIX User for Nonsecure Mode with Linux Container Executor parameter. false service_config_suppression_yarn_nodemanager_linux_container_executor_nonsecure_mode_local_user true
Suppress Parameter Validation: YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN Service Advanced Configuration Snippet (Safety Valve) for yarn-site.xml parameter. false service_config_suppression_yarn_service_config_safety_valve true
Suppress Parameter Validation: YARN (MR2 Included) Service Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN (MR2 Included) Service Environment Advanced Configuration Snippet (Safety Valve) parameter. false service_config_suppression_yarn_service_env_safety_valve true
Suppress Parameter Validation: YARN Service MapReduce Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN Service MapReduce Advanced Configuration Snippet (Safety Valve) parameter. false service_config_suppression_yarn_service_mapred_safety_valve true
Suppress Parameter Validation: HDFS Replication Advanced Configuration Snippet (Safety Valve) for yarn-site.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Replication Advanced Configuration Snippet (Safety Valve) for yarn-site.xml parameter. false service_config_suppression_yarn_service_replication_config_safety_valve true
Suppress Health Test: JobHistory Server Health Whether to suppress the results of the JobHistory Server Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_yarn_jobhistory_health true
Suppress Health Test: NodeManager Health Whether to suppress the results of the NodeManager Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_yarn_node_managers_healthy true
Suppress Health Test: ResourceManager Health Whether to suppress the results of the ResourceManager Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_yarn_resource_manager_health true
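
The suppression API names above follow two prefixes, service_config_suppression_ and service_health_suppression_. The Python sketch below builds a JSON body that toggles several of them at once. It assumes the Cloudera Manager API accepts service configuration updates as an object with an items list of name/value pairs. Treat that payload shape, and the helper name, as assumptions to verify against the API documentation for your version.

```python
import json

SUPPRESSION_PREFIXES = (
    "service_config_suppression_",
    "service_health_suppression_",
)


def suppression_payload(api_names, suppress=True):
    """Hypothetical helper: build a config-update body for suppressions."""
    for name in api_names:
        if not name.startswith(SUPPRESSION_PREFIXES):
            raise ValueError(f"not a suppression API name: {name}")
    value = "true" if suppress else "false"
    # Assumed payload shape: {"items": [{"name": ..., "value": ...}, ...]}
    body = {"items": [{"name": name, "value": value} for name in api_names]}
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    print(suppression_payload([
        "service_health_suppression_yarn_jobhistory_health",
        "service_config_suppression_yarn_admin_acl",
    ]))
```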