Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
Related Name
Default Value
API Name
balancer_config_safety_valve
Required
false
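The snippet value is plain text that ends up inside hdfs-site.xml, and Hadoop site files are made of `<property>` elements with `<name>` and `<value>` children. The following is a minimal sketch of rendering one such element in Python. The helper name `render_property` is hypothetical, and the property shown is only an illustration taken from elsewhere on this page, with the value given in bytes.

```python
import xml.etree.ElementTree as ET


def render_property(name: str, value: str) -> str:
    """Render one Hadoop-style <property> element as a string (hypothetical helper)."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    # encoding="unicode" returns a str instead of bytes
    return ET.tostring(prop, encoding="unicode")


# Illustrative only: 10 GiB expressed in bytes
snippet = render_property("dfs.balancer.max-size-to-move", str(10 * 1024**3))
print(snippet)
```

The printed string is what would be pasted into the safety valve field. Building it with an XML library rather than by hand avoids escaping mistakes in values that contain `<` or `&`.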
Java Configuration Options for Balancer🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
Balancer Logging Advanced Configuration Snippet (Safety Valve)🔗
Description
For advanced use only. A string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Logs🔗
Balancer Log Directory🔗
Description
Directory where Balancer will place its log files.
Related Name
Default Value
/var/log/hadoop-hdfs
API Name
balancer_log_dir
Required
false
Balancer Logging Threshold🔗
Description
The minimum log level for Balancer logs
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
Balancer Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for Balancer logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
Balancer Max Log Size🔗
Description
The maximum size, in megabytes, per log file for Balancer logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
Monitoring🔗
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Rules to Extract Events from Log Files🔗
Description
This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
content - match only those messages whose contents match this regular expression.
exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"}
In this example, an event generated for an ERROR log message that contains an exception may not be promoted to an alert, because the first rule, with alert = false, will match.
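The first-match behavior described above can be sketched in Python. This is an illustration only, not the appender's actual implementation: the function names, the severity ordering table, and the use of `re.search` for the regular-expression fields are all assumptions.

```python
import re
from typing import Optional

# Assumed log4j severity ordering, lowest to highest
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4, "FATAL": 5}


def rule_matches(rule: dict, level: str, message: str,
                 exception_type: Optional[str]) -> bool:
    """Return True if every condition present in the rule holds for the message."""
    if "threshold" in rule and LEVELS[level] < LEVELS[rule["threshold"]]:
        return False
    if "content" in rule and not re.search(rule["content"], message):
        return False
    if "exceptiontype" in rule:
        if exception_type is None or not re.search(rule["exceptiontype"], exception_type):
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Walk the rules in order and return the first one that matches, or None."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]
with_exc = first_matching_rule(rules, "ERROR", "disk failure", "java.io.IOException")
without_exc = first_matching_rule(rules, "ERROR", "disk failure")
print(with_exc["alert"], without_exc["alert"])
```

With these two rules, an ERROR message that carries an exception is caught by the first rule and is not promoted, while an ERROR message without an exception falls through to the second rule and is promoted. Placing the more specific rule first is what controls the outcome.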
Maximum amount of data to move per node in each iteration of the balancer.
Related Name
dfs.balancer.max-size-to-move
Default Value
10 GiB
API Name
dfs_balancer_max_size_to_move
Required
false
Mover Threads🔗
Description
Thread pool size for executing block moves.
Related Name
dfs.balancer.moverThreads
Default Value
1000
API Name
dfs_balancer_mover_threads
Required
false
Excluded Hosts🔗
Description
Hosts to exclude from the balancing process.
Related Name
Default Value
API Name
rebalancer_exclude_hosts
Required
false
Included Hosts🔗
Description
Hosts to include in the balancing process (uses all, if none specified).
Related Name
Default Value
API Name
rebalancer_include_hosts
Required
false
Source Hosts🔗
Description
Manual override to specify which DataNodes should be used to off-load data to less full nodes.
Related Name
Default Value
API Name
rebalancer_source_hosts
Required
false
Rebalancing Threshold🔗
Description
The percentage deviation from average utilization after which a node will be rebalanced (for example, '10.0' for 10%).
Related Name
Default Value
10.0 %
API Name
rebalancer_threshold
Required
false
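As a rough illustration of how a threshold of this kind separates nodes, the sketch below classifies DataNodes by how far their utilization sits from the average. It is a simplification under stated assumptions: the function name is hypothetical, utilization is given as a percentage per node, and the average is a plain mean, which only equals cluster-wide utilization when all nodes have the same capacity.

```python
from typing import Dict


def classify_nodes(utilization: Dict[str, float], threshold: float = 10.0) -> Dict[str, str]:
    """Label each node relative to the mean utilization (hypothetical helper)."""
    average = sum(utilization.values()) / len(utilization)
    result = {}
    for node, used in utilization.items():
        if used > average + threshold:
            result[node] = "over-utilized"
        elif used < average - threshold:
            result[node] = "under-utilized"
        else:
            result[node] = "within threshold"
    return result


# Three equally sized nodes; the mean utilization is 60%
nodes = {"dn1": 85.0, "dn2": 60.0, "dn3": 35.0}
labels = classify_nodes(nodes, threshold=10.0)
print(labels)
```

A smaller threshold marks more nodes as out of balance and so leads to more data movement, while a larger one lets the balancer finish sooner with a less even result.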
Rebalancing Policy🔗
Description
The policy that should be used to rebalance HDFS storage. The default DataNode policy balances the storage at the DataNode level. This is similar to the balancing policy from prior releases. The BlockPool policy balances the storage at the block pool level as well as at the DataNode level. The BlockPool policy is relevant only to a Federated HDFS service.
Related Name
Default Value
DataNode
API Name
rebalancing_policy
Required
false
Resource Management🔗
Java Heap Size of Balancer in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Balancer Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
Suppress Parameter Validation: Rules to Extract Events from Log Files🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log_event_whitelist
Required
true
Suppress Parameter Validation: Excluded Hosts🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Excluded Hosts parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rebalancer_exclude_hosts
Required
true
Suppress Parameter Validation: Included Hosts🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Included Hosts parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rebalancer_include_hosts
Required
true
Suppress Parameter Validation: Source Hosts🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Source Hosts parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rebalancer_source_hosts
Required
true
DataNode🔗
Advanced🔗
DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
Related Name
Default Value
API Name
datanode_config_safety_valve
Required
false
Java Configuration Options for DataNode🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.
Related Name
Default Value
API Name
DATANODE_role_env_safety_valve
Required
false
Available Space Policy Balanced Preference🔗
Description
Only used when the DataNode Volume Choosing Policy is set to Available Space. Controls what percentage of new block allocations will be sent to volumes with more available disk space than others. This setting should be in the range 0.0 - 1.0, though in practice 0.5 - 1.0, since there should be no reason to prefer that volumes with less available disk space receive more block allocations.
Only used when the DataNode Volume Choosing Policy is set to Available Space. Controls how much DataNode volumes are allowed to differ in terms of bytes of free disk space before they are considered imbalanced. If the free space of all the volumes is within this range of each other, the volumes will be considered balanced and block assignments will be done on a pure round robin basis.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Enable auto refresh for metric configurations🔗
Description
When true, Enable Metric Collection and Metric Filter parameters will be set automatically if they're changed. Otherwise, a refresh by hand is required.
Related Name
Default Value
false
API Name
metric_config_auto_refresh
Required
false
Heap Dump Directory🔗
Description
Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, it will be owned by the current role user with 1777 permissions. Sharing the same directory among multiple roles will cause an ownership race. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.
Related Name
oom_heap_dump_dir
Default Value
/tmp
API Name
oom_heap_dump_dir
Required
false
Dump Heap When Out of Memory🔗
Description
When set, generates a heap dump file when an out-of-memory error occurs.
Related Name
Default Value
true
API Name
oom_heap_dump_enabled
Required
true
Kill When Out of Memory🔗
Description
When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.
Related Name
Default Value
true
API Name
oom_sigkill_enabled
Required
true
Automatically Restart Process🔗
Description
When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. This configuration applies in the time after the Start Wait Timeout period.
Related Name
Default Value
true
API Name
process_auto_restart
Required
true
Enable Metric Collection🔗
Description
The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process.
Related Name
Default Value
true
API Name
process_should_monitor
Required
true
Process Start Retry Attempts🔗
Description
Number of times to try starting a role's process when the process exits before the Start Wait Timeout period. After a process is running beyond the Start Wait Timeout, the retry count is reset. Setting this configuration to zero will prevent restart of the process during the Start Wait Timeout period.
Related Name
Default Value
3
API Name
process_start_retries
Required
false
Process Start Wait Timeout🔗
Description
The time in seconds to wait for a role's process to start successfully on a host. Processes which exit/crash before this time will be restarted until reaching the limit specified by the Start Retry Attempts count parameter. Setting this configuration to zero will turn off this feature.
Related Name
Default Value
20
API Name
process_start_secs
Required
false
Erasure Coding🔗
DataNode Striped Read Reconstruction Threads🔗
Description
The number of threads that a DataNode can use during background data reconstruction.
Related Name
dfs.datanode.ec.reconstruction.threads
Default Value
20
API Name
erasure_coding_reconstruction_threads
Required
false
DataNode Striped Read Reconstruction Timeout🔗
Description
The timeout for striped reads during background data reconstruction.
The relative weight of resources used by EC for data recovery. The number of blocks that must be read is based on the EC policy used. For example, RS-6-3-1024k requires six blocks to be read. Replication only requires one block to be read. Higher values result in fewer reconstruction tasks being able to run concurrently. The number of blocks required to be read to recover data is multiplied by this weight to determine the total weight of the recovery task. The total weight of the recovery task counts against the limit set with the dfs.namenode.replication.max-streams property.
Related Name
dfs.datanode.ec.reconstruction.xmits.weight
Default Value
0.5
API Name
erasure_coding_reconstruction_xmits_weight
Required
false
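The weight arithmetic in the description above can be worked through in a few lines of Python. This sketch only models the multiplication stated in the description; the function name is hypothetical, and how the resulting weight is scheduled against dfs.namenode.replication.max-streams is not modeled here.

```python
def recovery_task_weight(blocks_to_read: int, xmits_weight: float = 0.5) -> float:
    """Total weight of one recovery task: blocks read times the configured weight."""
    return blocks_to_read * xmits_weight


# RS-6-3-1024k requires six blocks to be read
default_weight = recovery_task_weight(6)                    # uses the default of 0.5
heavier_weight = recovery_task_weight(6, xmits_weight=1.0)  # doubled weight
print(default_weight, heavier_weight)
```

With the default of 0.5, a task under RS-6-3-1024k counts as 3.0 against the limit. Doubling the weight doubles that figure, which is why higher values leave room for fewer concurrent reconstruction tasks.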
Logs🔗
DataNode Log Directory🔗
Description
Directory where DataNode will place its log files.
Related Name
hadoop.log.dir
Default Value
/var/log/hadoop-hdfs
API Name
datanode_log_dir
Required
false
DataNode Logging Threshold🔗
Description
The minimum log level for DataNode logs
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
DataNode Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for DataNode logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
DataNode Max Log Size🔗
Description
The maximum size, in megabytes, per log file for DataNode logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
Monitoring🔗
DataNode Block Count Thresholds🔗
Description
The health test thresholds of the number of blocks on a DataNode
Related Name
Default Value
Warning: 1000000.0, Critical: Never
API Name
datanode_block_count_thresholds
Required
false
DataNode Connectivity Health Test🔗
Description
Enables the health test that verifies the DataNode is connected to the NameNode
Related Name
Default Value
true
API Name
datanode_connectivity_health_enabled
Required
false
DataNode Connectivity Tolerance at Startup🔗
Description
The amount of time to wait for the DataNode to fully start up and connect to the NameNode before enforcing the connectivity check.
Related Name
Default Value
3 minute(s)
API Name
datanode_connectivity_tolerance
Required
false
DataNode Data Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's DataNode Data Directory.
DataNode Data Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's DataNode Data Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a DataNode Data Directory Free Space Monitoring Absolute Thresholds setting is configured.
The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.
Related Name
Default Value
Warning: 50.0 %, Critical: 70.0 %
API Name
datanode_fd_thresholds
Required
false
DataNode Free Space Monitoring Thresholds🔗
Description
The health test thresholds of free space in a DataNode. Specified as a percentage of the capacity on the DataNode.
Related Name
Default Value
Warning: 20.0 %, Critical: 10.0 %
API Name
datanode_free_space_thresholds
Required
false
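A warning and critical pair like the one above can be read as a two-step comparison. The sketch below shows one plausible reading for a free-space percentage. The function name, the health labels, and the use of a strict less-than comparison are assumptions, and `None` stands in for a threshold set to Never.

```python
from typing import Optional


def free_space_health(free_pct: float,
                      warning: Optional[float] = 20.0,
                      critical: Optional[float] = 10.0) -> str:
    """Map a free-space percentage to a health label (hypothetical helper)."""
    # Check the more severe threshold first
    if critical is not None and free_pct < critical:
        return "BAD"
    if warning is not None and free_pct < warning:
        return "CONCERNING"
    return "GOOD"


for pct in (25.0, 15.0, 5.0):
    print(pct, free_space_health(pct))
```

For thresholds that grow with the measured value, such as file descriptor usage, the comparisons would point the other way.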
DataNode Host Health Test🔗
Description
When computing the overall DataNode health, consider the host's health.
Related Name
Default Value
true
API Name
datanode_host_health_enabled
Required
false
Pause Duration Thresholds🔗
Description
The health test thresholds for the weighted average extra time the pause monitor spent paused. Specified as a percentage of elapsed wall clock time.
Related Name
Default Value
Warning: 30.0, Critical: 60.0
API Name
datanode_pause_duration_thresholds
Required
false
Pause Duration Monitoring Period🔗
Description
The period to review when computing the moving average of extra time the pause monitor spent paused.
Related Name
Default Value
5 minute(s)
API Name
datanode_pause_duration_window
Required
false
DataNode Process Health Test🔗
Description
Enables the health test that the DataNode's process state is consistent with the role configuration
Related Name
Default Value
true
API Name
datanode_scm_health_enabled
Required
false
DataNode Transceivers Usage Thresholds🔗
Description
The health test thresholds of transceivers usage in a DataNode. Specified as a percentage of the total configured number of transceivers.
Related Name
Default Value
Warning: 75.0 %, Critical: 95.0 %
API Name
datanode_transceivers_usage_thresholds
Required
false
DataNode Volume Failures Thresholds🔗
Description
The health test thresholds of failed volumes in a DataNode.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
datanode_volume_failures_thresholds
Required
false
Web Metric Collection🔗
Description
Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server.
Related Name
Default Value
true
API Name
datanode_web_metric_collection_enabled
Required
false
Web Metric Collection Duration🔗
Description
The health test thresholds on the duration of the metrics request to the web server.
Related Name
Default Value
Warning: 10 second(s), Critical: Never
API Name
datanode_web_metric_collection_thresholds
Required
false
Enable Health Alerts for this Role🔗
Description
When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold
Related Name
Default Value
false
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Heap Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.
Heap Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
Log Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.
Related Name
Default Value
Warning: 10 GiB, Critical: 5 GiB
API Name
log_directory_free_space_absolute_thresholds
Required
false
Log Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
log_directory_free_space_percentage_thresholds
Required
false
Rules to Extract Events from Log Files🔗
Description
This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
content - match only those messages whose contents match this regular expression.
exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"}
In this example, an event generated for an ERROR log message that contains an exception may not be promoted to an alert, because the first rule, with alert = false, will match.
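Before pasting a rule list into this field, it can help to check it locally. The following is a minimal sketch of such a check, assuming only what the field list above states: the value is a JSON list, rate is mandatory, and content and exceptiontype are regular expressions. The function name is hypothetical, and Python's regex dialect is not identical to Java's, so a pattern that passes here is not guaranteed to behave the same way in the appender.

```python
import json
import re
from typing import List


def validate_rules(text: str) -> List[str]:
    """Return a list of problems found in a JSON rule list (empty if none)."""
    rules = json.loads(text)
    if not isinstance(rules, list):
        return ["top-level value must be a list of rules"]
    problems = []
    for i, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"rule {i}: must be a JSON object")
            continue
        if "rate" not in rule:
            problems.append(f"rule {i}: missing mandatory field 'rate'")
        for field in ("content", "exceptiontype"):
            if field in rule:
                try:
                    re.compile(rule[field])
                except re.error as exc:
                    problems.append(f"rule {i}: bad regex in '{field}': {exc}")
    return problems


good = '[{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}]'
bad = '[{"alert": true, "threshold": "ERROR"}]'
print(validate_rules(good))
print(validate_rules(bad))
```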
The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audits data that is left to be sent to audit server.
Related Name
mgmt.navigator.failure.thresholds
Default Value
Warning: Never, Critical: Any
API Name
mgmt_navigator_failure_thresholds
Required
false
Monitoring Period For Audit Failures🔗
Description
The period to review when checking if audits are blocked and not getting processed.
Related Name
mgmt.navigator.failure.window
Default Value
20 minute(s)
API Name
mgmt_navigator_failure_window
Required
false
Navigator Audit Pipeline Health Check🔗
Description
Enable test of audit events processing pipeline. This will test if audit events are not getting processed by Audit Server for a role that generates audit.
Related Name
mgmt.navigator.status.check.enabled
Default Value
true
API Name
mgmt_navigator_status_check_enabled
Required
false
Metric Filter🔗
Description
Defines a Metric Filter for this role. Cloudera Manager Agents will not send filtered metrics to the Service Monitor. Define the following fields:
Health Test Metric Set - Select this parameter to collect only metrics required for health tests.
Default Dashboard Metric Set - Select this parameter to collect only metrics required for the default dashboards. For user-defined charts, you must add the metrics you require for the chart using the Custom Metrics parameter.
Include/Exclude Custom Metrics - Select Include to specify metrics that should be collected. Select Exclude to specify metrics that should not be collected. Enter the metric names to be included or excluded using the Metric Name parameter.
Metric Name - The name of a metric that will be included or excluded during metric collection.
If you do not select Health Test Metric Set or Default Dashboard Metric Set, or specify metrics by name, metric filtering will be turned off (this is the default behavior). For example, the following configuration enables the collection of metrics required for Health Tests and the jvm_heap_used_mb metric:
Include only Health Test Metric Set: Selected.
Include/Exclude Custom Metrics: Set to Include.
Metric Name: jvm_heap_used_mb
You can also view the JSON representation for this parameter by clicking View as JSON. In this example, the JSON looks like this:
{
  "includeHealthTestMetricSet": true,
  "filterType": "whitelist",
  "metrics": ["jvm_heap_used_mb"]
}
Related Name
Default Value
API Name
monitoring_metric_filter
Required
false
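The JSON form shown in the description can also be produced programmatically, for example when the value is prepared by a script. This sketch covers only the Include case from the example, because the filterType value used for Exclude is not shown on this page. The function name is hypothetical.

```python
import json
from typing import Iterable


def build_include_filter(metric_names: Iterable[str],
                         include_health_test_set: bool = True) -> str:
    """Build the JSON for an Include-style metric filter (hypothetical helper)."""
    config = {
        "includeHealthTestMetricSet": include_health_test_set,
        "filterType": "whitelist",  # value taken from the example on this page
        "metrics": list(metric_names),
    }
    return json.dumps(config, indent=2)


metric_filter = build_include_filter(["jvm_heap_used_mb"])
print(metric_filter)
```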
Swap Memory Usage Rate Thresholds🔗
Description
The health test thresholds on the swap memory usage rate of the process. Specified as the change of the used swap memory during the predefined period.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
process_swap_memory_rate_thresholds
Required
false
Swap Memory Usage Rate Window🔗
Description
The period to review when computing unexpected swap memory usage change of the process.
Related Name
common.process.swap_memory_rate_window
Default Value
5 minute(s)
API Name
process_swap_memory_rate_window
Required
false
Process Swap Memory Thresholds🔗
Description
The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold.
Related Name
Default Value
Warning: 200 B, Critical: Never
API Name
process_swap_memory_thresholds
Required
false
Role Triggers🔗
Description
The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
triggerExpression (mandatory) - A tsquery expression representing the trigger.
streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
"streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
Related Name
Default Value
[]
API Name
role_triggers
Required
true
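A trigger list in the format described above can be assembled and checked in Python before it is saved. This is a sketch under stated assumptions: the helper names are hypothetical, only the mandatory-field and unique-name rules from the description are enforced, and the tsquery expression itself is not parsed.

```python
import json


def make_trigger(name: str, expression: str,
                 stream_threshold: int = 0, enabled: bool = True) -> dict:
    """Build one trigger with the fields listed in the description."""
    if not name or not expression:
        raise ValueError("triggerName and triggerExpression are mandatory")
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The example on this page writes the flag as a string, so this does too
        "enabled": "true" if enabled else "false",
    }


def to_role_triggers(triggers: list) -> str:
    """Serialize a trigger list, rejecting duplicate trigger names."""
    names = [t["triggerName"] for t in triggers]
    if len(names) != len(set(names)):
        raise ValueError("triggerName values must be unique for the role")
    return json.dumps(triggers)


sample = make_trigger(
    "sample-trigger",
    "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
)
role_triggers = to_role_triggers([sample])
print(role_triggers)
```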
Unexpected Exits Thresholds🔗
Description
The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
unexpected_exits_thresholds
Required
false
Unexpected Exits Monitoring Period🔗
Description
The period to review when computing unexpected exits.
Related Name
Default Value
5 minute(s)
API Name
unexpected_exits_window
Required
false
Other🔗
DataNode Data Directory🔗
Description
Comma-delimited list of directories on the local file system where the DataNode stores HDFS block data. Typical values are /data/N/dfs/dn for N = 1, 2, 3.... In CDH 5.7 and higher, these directories can be optionally tagged with their storage types, for example, [SSD]/data/1/dfs/dn. HDFS supports the following storage types: [DISK], [SSD], [ARCHIVE], [RAM_DISK]. The default storage type of a directory will be [DISK] if it does not have a storage type tagged explicitly. These directories should be mounted using the noatime option, and the disks should be configured using JBOD. RAID is not recommended. Warning: Be very careful when modifying this property. Removing or changing entries can result in data loss. To hot swap drives in CDH 5.4 and higher, override the value of this property for the specific DataNode role instance that has the drive to be hot-swapped; do not modify the property value in the role group. See Configuring Hot Swap for DataNodes for more information.
Related Name
dfs.datanode.data.dir
Default Value
API Name
dfs_data_dir_list
Required
true
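The value format described above (a comma-delimited list with optional storage type tags) can be illustrated with a small parser. This is a sketch for checking a value by hand, not the parser HDFS uses; the function name and the regular expression are assumptions.

```python
import re
from typing import List, Tuple

STORAGE_TYPES = {"DISK", "SSD", "ARCHIVE", "RAM_DISK"}
# An optional leading [TYPE] tag followed by the directory path
_TAGGED = re.compile(r"^\[(?P<type>[A-Z_]+)\](?P<path>.+)$")


def parse_data_dirs(value: str) -> List[Tuple[str, str]]:
    """Split a data directory list into (storage_type, path) pairs."""
    result = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        match = _TAGGED.match(entry)
        if match:
            storage_type, path = match.group("type"), match.group("path")
            if storage_type not in STORAGE_TYPES:
                raise ValueError(f"unknown storage type: {storage_type}")
        else:
            # Untagged directories default to DISK
            storage_type, path = "DISK", entry
        result.append((storage_type, path))
    return result


dirs = parse_data_dirs("[SSD]/data/1/dfs/dn,/data/2/dfs/dn")
print(dirs)
```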
Reserved Space for Non DFS Use🔗
Description
Reserved space in bytes per volume for non Distributed File System (DFS) use.
Related Name
dfs.datanode.du.reserved
Default Value
10 GiB
API Name
dfs_datanode_du_reserved
Required
false
DataNode Failed Volumes Tolerated🔗
Description
The number of volumes that are allowed to fail before a DataNode stops offering service. By default, any volume failure will cause a DataNode to shut down.
Related Name
dfs.datanode.failed.volumes.tolerated
Default Value
0
API Name
dfs_datanode_failed_volumes_tolerated
Required
false
Performance🔗
DataNode Balancing Bandwidth🔗
Description
Maximum amount of bandwidth that each DataNode can use for balancing. Specified in bytes per second.
Related Name
dfs.datanode.balance.bandwidthPerSec
Default Value
10 MiB
API Name
dfs_balance_bandwidthPerSec
Required
false
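Because the bandwidth is specified in bytes per second, a lower bound on transfer time follows from simple division. The sketch below works that out for 10 GiB, the default listed on this page for dfs.balancer.max-size-to-move, at the default balancing bandwidth of 10 MiB per second. The function name is hypothetical, and real runs take longer because of scheduling and protocol overhead.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def min_transfer_seconds(bytes_to_move: int, bandwidth_bytes_per_sec: int) -> float:
    """Lower bound on the time to move the given bytes at the given bandwidth."""
    return bytes_to_move / bandwidth_bytes_per_sec


seconds = min_transfer_seconds(10 * GIB, 10 * MIB)
print(seconds, "seconds, about", round(seconds / 60), "minutes")
```

At the defaults, moving 10 GiB through one DataNode takes at least 1024 seconds, roughly 17 minutes. Raising the bandwidth shortens balancing at the cost of more network and disk load on the DataNode.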
Enable purging cache after reads🔗
Description
In some workloads, the data read from HDFS is known to be large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is delivered to the client. This may improve performance for some workloads by freeing buffer cache space for more cacheable data. This behavior will always be disabled for workloads that read only short sections of a block (e.g., HBase random-IO workloads). This property is supported in CDH3u3 or later deployments.
Related Name
dfs.datanode.drop.cache.behind.reads
Default Value
false
API Name
dfs_datanode_drop_cache_behind_reads
Required
false
Enable purging cache after writes🔗
Description
In some workloads, the data written to HDFS is known to be large enough that it is unlikely to be useful to cache it in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is written to disk. This may improve performance for some workloads by freeing buffer cache space for more cacheable data. This property is supported in CDH3u3 or later deployments.
Related Name
dfs.datanode.drop.cache.behind.writes
Default Value
false
API Name
dfs_datanode_drop_cache_behind_writes
Required
false
Handler Count🔗
Description
The number of server threads for the DataNode.
Related Name
dfs.datanode.handler.count
Default Value
3
API Name
dfs_datanode_handler_count
Required
false
Maximum Number of Transfer Threads🔗
Description
Specifies the maximum number of threads to use for transferring data in and out of the DataNode.
Related Name
dfs.datanode.max.transfer.threads
Default Value
4096
API Name
dfs_datanode_max_xcievers
Required
false
Number of read ahead bytes🔗
Description
While reading block files, the DataNode can use the posix_fadvise system call to explicitly page data into the operating system buffer cache ahead of the current reader's position. This can improve performance especially when disks are highly contended. This configuration specifies the number of bytes ahead of the current read position which the DataNode will attempt to read ahead. A value of 0 disables this feature. This property is supported in CDH3u3 or later deployments.
Related Name
dfs.datanode.readahead.bytes
Default Value
4 MiB
API Name
dfs_datanode_readahead_bytes
Required
false
Enable immediate enqueuing of data to disk after writes🔗
Description
If this configuration is enabled, the DataNode will instruct the operating system to enqueue all written data to the disk immediately after it is written. This differs from the usual OS policy which may wait for up to 30 seconds before triggering writeback. This may improve performance for some workloads by smoothing the IO profile for data written to disk. This property is supported in CDH3u3 or later deployments.
Related Name
dfs.datanode.sync.behind.writes
Default Value
false
API Name
dfs_datanode_sync_behind_writes
Required
false
HDFS Thrift Server Max Threadcount🔗
Description
Maximum number of running threads for the HDFS Thrift server running on each DataNode.
Related Name
dfs.thrift.threads.max
Default Value
20
API Name
dfs_thrift_threads_max
Required
false
HDFS Thrift Server Min Threadcount🔗
Description
Minimum number of running threads for the HDFS Thrift server running on each DataNode.
Related Name
dfs.thrift.threads.min
Default Value
10
API Name
dfs_thrift_threads_min
Required
false
HDFS Thrift Server Timeout🔗
Description
Timeout in seconds for the HDFS Thrift server running on each DataNode.
Related Name
dfs.thrift.timeout
Default Value
60
API Name
dfs_thrift_timeout
Required
false
Maximum Process File Descriptors🔗
Description
If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.
Related Name
Default Value
API Name
rlimit_fds
Required
false
Ports and Addresses🔗
Bind DataNode to Wildcard Address🔗
Description
If enabled, the DataNode binds to the wildcard address ("0.0.0.0") on all of its ports.
Related Name
Default Value
false
API Name
dfs_datanode_bind_wildcard
Required
false
DataNode HTTP Web UI Port🔗
Description
Port for the DataNode HTTP web UI. Combined with the DataNode's hostname to build its HTTP address.
Related Name
dfs.datanode.http.address
Default Value
9864
API Name
dfs_datanode_http_port
Required
false
Secure DataNode Web UI Port (TLS/SSL)🔗
Description
The base port where the secure DataNode web UI listens. Combined with the DataNode's hostname to build its secure web UI address.
Related Name
dfs.datanode.https.address
Default Value
9865
API Name
dfs_datanode_https_port
Required
false
DataNode Protocol Port🔗
Description
Port for the various DataNode Protocols. Combined with the DataNode's hostname to build its IPC port address.
Related Name
dfs.datanode.ipc.address
Default Value
9867
API Name
dfs_datanode_ipc_port
Required
false
DataNode Transceiver Port🔗
Description
Port for DataNode's XCeiver Protocol. Combined with the DataNode's hostname to build its address.
Related Name
dfs.datanode.address
Default Value
9866
API Name
dfs_datanode_port
Required
false
Use DataNode Hostname🔗
Description
Whether DataNodes should use DataNode hostnames when connecting to other DataNodes for data transfer. This property is supported in CDH3u4 or later deployments.
Related Name
dfs.datanode.use.datanode.hostname
Default Value
false
API Name
dfs_datanode_use_datanode_hostname
Required
false
Resource Management🔗
Java Heap Size of DataNode in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Related Name
Default Value
4 GiB
API Name
datanode_java_heapsize
Required
false
Maximum Memory Used for Caching🔗
Description
The maximum amount of memory a DataNode may use to cache data blocks in memory. Setting it to zero will disable caching.
Related Name
dfs.datanode.max.locked.memory
Default Value
4 GiB
API Name
dfs_datanode_max_locked_memory
Required
false
Cgroup CPU Shares🔗
Description
Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.
Related Name
cpu.shares
Default Value
1024
API Name
rm_cpu_shares
Required
true
Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Custom control group resources to assign to this role, which will be enforced by the Linux kernel. These resources should exist on the target hosts; otherwise, an error will occur when the process starts. Use the same format as used for arguments to the cgexec command: resource1,resource2:path1 or resource3:path2. For example: 'cpu,memory:my/path blkio:my2/path2'. ***These settings override other cgroup settings.***
Related Name
custom.cgroups
Default Value
API Name
rm_custom_resources
Required
false
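The cgexec-style format described above (comma-separated controllers, a colon, then a path, with entries separated by spaces) can be sketched as a small parser. `parse_custom_cgroups` is a hypothetical helper for illustration; it is not part of Cloudera Manager.

```python
def parse_custom_cgroups(spec):
    """Split 'cpu,memory:my/path blkio:my2/path2' into (controllers, path) pairs."""
    entries = []
    for token in spec.split():
        controllers, sep, path = token.partition(":")
        if not sep or not controllers or not path:
            raise ValueError(f"expected controllers:path, got {token!r}")
        entries.append((controllers.split(","), path))
    return entries


# The example string from the description above.
print(parse_custom_cgroups("cpu,memory:my/path blkio:my2/path2"))
# prints [(['cpu', 'memory'], 'my/path'), (['blkio'], 'my2/path2')]
```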
Cgroup I/O Weight🔗
Description
Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.
Related Name
blkio.weight
Default Value
500
API Name
rm_io_weight
Required
true
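The ranges stated above for Cgroup CPU Shares (2 to 262144) and Cgroup I/O Weight (100 to 1000) can be checked before a value is submitted. This is a minimal sketch that assumes both bounds are inclusive; `check_range` is a hypothetical helper.

```python
CPU_SHARES_RANGE = (2, 262144)  # from the Cgroup CPU Shares description
IO_WEIGHT_RANGE = (100, 1000)   # from the Cgroup I/O Weight description


def check_range(name, value, bounds):
    """Return value unchanged if it lies within bounds, else raise ValueError."""
    low, high = bounds
    if not low <= value <= high:
        raise ValueError(f"{name}={value} is outside {low}..{high}")
    return value


# The listed defaults fall inside their ranges.
check_range("rm_cpu_shares", 1024, CPU_SHARES_RANGE)
check_range("rm_io_weight", 500, IO_WEIGHT_RANGE)
```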
Cgroup Memory Hard Limit🔗
Description
Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_hard_limit
Required
true
Cgroup Memory Soft Limit🔗
Description
Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.soft_limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_soft_limit
Required
true
Security🔗
DataNode Data Directory Permissions🔗
Description
Permissions for the directories on the local file system where the DataNode stores its blocks. The permissions must be octal. 755 and 700 are typical values.
Related Name
dfs.datanode.data.dir.perm
Default Value
700
API Name
dfs_datanode_data_dir_perm
Required
false
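Because the permission value above must be octal, a string such as 700 has to be read in base 8 rather than base 10. The following sketch shows that interpretation; `parse_dir_perm` is a hypothetical helper, and restricting the range to 000..777 is an assumption of this example.

```python
import stat


def parse_dir_perm(text):
    """Interpret a string such as '700' as an octal permission mode."""
    mode = int(text, 8)  # raises ValueError for non-octal digits such as '8'
    if not 0 <= mode <= 0o777:
        raise ValueError(f"{text!r} is outside the octal range 000..777")
    return mode


# The two typical values named in the description.
for perm in ("700", "755"):
    mode = parse_dir_perm(perm)
    print(perm, stat.filemode(stat.S_IFDIR | mode))
# prints:
# 700 drwx------
# 755 drwxr-xr-x
```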
Stacks Collection🔗
Stacks Collection Data Retention🔗
Description
The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.
Related Name
stacks_collection_data_retention
Default Value
100 MiB
API Name
stacks_collection_data_retention
Required
false
Stacks Collection Directory🔗
Description
The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. If this directory already exists, it will be owned by the current role user with 755 permissions. Sharing the same directory among multiple roles will cause an ownership race.
Related Name
stacks_collection_directory
Default Value
API Name
stacks_collection_directory
Required
false
Stacks Collection Enabled🔗
Description
Whether or not periodic stacks collection is enabled.
Related Name
stacks_collection_enabled
Default Value
false
API Name
stacks_collection_enabled
Required
true
Stacks Collection Frequency🔗
Description
The frequency with which stacks are collected.
Related Name
stacks_collection_frequency
Default Value
5.0 second(s)
API Name
stacks_collection_frequency
Required
false
Stacks Collection Method🔗
Description
The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.
Related Name
stacks_collection_method
Default Value
jstack
API Name
stacks_collection_method
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Whether to suppress configuration warnings produced by the built-in parameter validation for the DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the DataNode Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the DataNode Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
Suppress Parameter Validation: Rules to Extract Events from Log Files🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.
Related Name
Default Value
false
API Name
role_config_suppression_oom_heap_dump_dir
Required
true
Suppress Parameter Validation: Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Control Group Resources (overrides Cgroup settings) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rm_custom_resources
Required
true
Suppress Parameter Validation: Role Triggers🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.
Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_audit_health
Required
true
Suppress Health Test: Block Count🔗
Description
Whether to suppress the results of the Block Count health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_block_count
Required
true
Suppress Health Test: File Descriptors🔗
Description
Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_file_descriptor
Required
true
Suppress Health Test: Free Space🔗
Description
Whether to suppress the results of the Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the NameNode Connectivity health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_ha_connectivity
Required
true
Suppress Health Test: Heap Dump Directory Free Space🔗
Description
Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_host_health
Required
true
Suppress Health Test: Log Directory Free Space🔗
Description
Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Pause Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_pause_duration
Required
true
Suppress Health Test: Process Status🔗
Description
Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_scm_health
Required
true
Suppress Health Test: Swap Memory Usage🔗
Description
Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Swap Memory Usage Rate Beta🔗
Description
Whether to suppress the results of the Swap Memory Usage Rate Beta health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Transceiver Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Data Directory Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_data_node_volume_failures
Required
true
Suppress Health Test: Web Server Status🔗
Description
Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: DataNode Data Directory Free Space🔗
Description
Whether to suppress the results of the DataNode Data Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Java Configuration Options for Failover Controller🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.
Related Name
Default Value
API Name
FAILOVERCONTROLLER_role_env_safety_valve
Required
false
Failover Controller Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Enable auto refresh for metric configurations🔗
Description
When true, Enable Metric Collection and Metric Filter parameters will be set automatically if they're changed. Otherwise, a refresh by hand is required.
Related Name
Default Value
false
API Name
metric_config_auto_refresh
Required
false
Heap Dump Directory🔗
Description
Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, it will be owned by the current role user with 1777 permissions. Sharing the same directory among multiple roles will cause an ownership race. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.
Related Name
oom_heap_dump_dir
Default Value
/tmp
API Name
oom_heap_dump_dir
Required
false
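The free-space guidance above (free space in the heap dump directory should exceed the maximum Java heap size) can be checked with a short sketch. `has_room_for_heap_dump` is a hypothetical helper, the 256 MiB heap size is an illustrative value, and the system temporary directory stands in for the configured heap dump directory.

```python
import shutil
import tempfile


def has_room_for_heap_dump(directory, max_heap_bytes):
    """True if the filesystem holding `directory` has more free space than the heap size."""
    return shutil.disk_usage(directory).free > max_heap_bytes


heap_bytes = 256 * 1024 * 1024  # illustrative maximum heap size in bytes
print(has_room_for_heap_dump(tempfile.gettempdir(), heap_bytes))
```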
Dump Heap When Out of Memory🔗
Description
When set, generates a heap dump file when an out-of-memory error occurs.
Related Name
Default Value
true
API Name
oom_heap_dump_enabled
Required
true
Kill When Out of Memory🔗
Description
When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.
Related Name
Default Value
true
API Name
oom_sigkill_enabled
Required
true
Automatically Restart Process🔗
Description
When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. This configuration applies in the time after the Start Wait Timeout period.
Related Name
Default Value
false
API Name
process_auto_restart
Required
true
Enable Metric Collection🔗
Description
The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process.
Related Name
Default Value
true
API Name
process_should_monitor
Required
true
Process Start Retry Attempts🔗
Description
Number of times to try starting a role's process when the process exits before the Start Wait Timeout period. After a process is running beyond the Start Wait Timeout, the retry count is reset. Setting this configuration to zero will prevent restart of the process during the Start Wait Timeout period.
Related Name
Default Value
3
API Name
process_start_retries
Required
false
Process Start Wait Timeout🔗
Description
The time in seconds to wait for a role's process to start successfully on a host. Processes which exit/crash before this time will be restarted until reaching the limit specified by the Start Retry Attempts count parameter. Setting this configuration to zero will turn off this feature.
Related Name
Default Value
20
API Name
process_start_secs
Required
false
Logs🔗
Failover Controller Log Directory🔗
Description
Directory where Failover Controller will place its log files.
Related Name
hadoop.log.dir
Default Value
/var/log/hadoop-hdfs
API Name
failover_controller_log_dir
Required
false
Failover Controller Logging Threshold🔗
Description
The minimum log level for Failover Controller logs.
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
Failover Controller Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for Failover Controller logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
Failover Controller Max Log Size🔗
Description
The maximum size, in megabytes, per log file for Failover Controller logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
Monitoring🔗
Enable Health Alerts for this Role🔗
Description
When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold.
Related Name
Default Value
true
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
File Descriptor Monitoring Thresholds🔗
Description
The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.
Related Name
Default Value
Warning: 50.0 %, Critical: 70.0 %
API Name
failovercontroller_fd_thresholds
Required
false
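Since the thresholds above are expressed as a percentage of the file descriptor limit, the evaluation can be sketched as follows. The function name, the status labels, and the choice to treat a value equal to a threshold as crossing it are all assumptions of this example, not documented behavior.

```python
def fd_health(open_fds, fd_limit, warning_pct=50.0, critical_pct=70.0):
    """Classify file descriptor usage against percentage thresholds."""
    used_pct = 100.0 * open_fds / fd_limit
    if used_pct >= critical_pct:
        return "critical"
    if used_pct >= warning_pct:
        return "warning"
    return "good"


# With a limit of 1000 descriptors and the default thresholds listed above:
print(fd_health(100, 1000))  # prints good
print(fd_health(600, 1000))  # prints warning
print(fd_health(800, 1000))  # prints critical
```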
Failover Controller Host Health Test🔗
Description
When computing the overall Failover Controller health, consider the host's health.
Related Name
Default Value
true
API Name
failovercontroller_host_health_enabled
Required
false
Failover Controller Process Health Test🔗
Description
Enables the health test that the Failover Controller's process state is consistent with the role configuration.
Related Name
Default Value
true
API Name
failovercontroller_scm_health_enabled
Required
false
Heap Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.
Heap Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
Log Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.
Related Name
Default Value
Warning: 10 GiB, Critical: 5 GiB
API Name
log_directory_free_space_absolute_thresholds
Required
false
Log Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
log_directory_free_space_percentage_thresholds
Required
false
Rules to Extract Events from Log Files🔗
Description
This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
content - match only those messages for which contents match this regular expression.
exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"}
In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
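The first-match behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the appender's actual code: the message dictionary shape (`level`, `text`, `exception`) is hypothetical, regular expressions are applied with `re.search`, and rate limiting (`rate`, `periodminutes`) is not modeled.

```python
import re

# log4j severity levels, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, message):
    """Check the threshold, content, and exceptiontype fields of one rule."""
    if "threshold" in rule:
        if LEVELS.index(message["level"]) < LEVELS.index(rule["threshold"]):
            return False
    if "content" in rule and not re.search(rule["content"], message["text"]):
        return False
    if "exceptiontype" in rule:
        exc = message.get("exception")
        if exc is None or not re.search(rule["exceptiontype"], exc):
            return False
    return True


def first_matching_rule(rules, message):
    """Return the first rule that matches, or None if no rule matches."""
    for rule in rules:
        if rule_matches(rule, message):
            return rule
    return None


# The two-rule example from the text above.
rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]
with_exception = {"level": "ERROR", "text": "write failed", "exception": "java.io.IOException"}
without_exception = {"level": "ERROR", "text": "write failed", "exception": None}

print(first_matching_rule(rules, with_exception)["alert"])     # prints False
print(first_matching_rule(rules, without_exception)["alert"])  # prints True
```

The ERROR message that carries an exception is caught by the first rule and is not promoted to an alert, while the same message without an exception falls through to the second rule.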
The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audits data that is left to be sent to audit server.
Related Name
mgmt.navigator.failure.thresholds
Default Value
Warning: Never, Critical: Any
API Name
mgmt_navigator_failure_thresholds
Required
false
Monitoring Period For Audit Failures🔗
Description
The period to review when checking if audits are blocked and not getting processed.
Related Name
mgmt.navigator.failure.window
Default Value
20 minute(s)
API Name
mgmt_navigator_failure_window
Required
false
Navigator Audit Pipeline Health Check🔗
Description
Enable test of audit events processing pipeline. This will test if audit events are not getting processed by Audit Server for a role that generates audit.
Related Name
mgmt.navigator.status.check.enabled
Default Value
true
API Name
mgmt_navigator_status_check_enabled
Required
false
Metric Filter🔗
Description
Defines a Metric Filter for this role. Cloudera Manager Agents will not send filtered metrics to the Service Monitor. Define the following fields:
Health Test Metric Set - Select this parameter to collect only metrics required for health tests.
Default Dashboard Metric Set - Select this parameter to collect only metrics required for the default dashboards. For user-defined charts, you must add the metrics you require for the chart using the Custom Metrics parameter.
Include/Exclude Custom Metrics - Select Include to specify metrics that should be collected. Select Exclude to specify metrics that should not be collected. Enter the metric names to be included or excluded using the Metric Name parameter.
Metric Name - The name of a metric that will be included or excluded during metric collection.
If you do not select Health Test Metric Set or Default Dashboard Metric Set, or specify metrics by name, metric filtering will be turned off (this is the default behavior). For example, the following configuration enables the collection of metrics required for Health Tests and the jvm_heap_used_mb metric:
Include only Health Test Metric Set: Selected.
Include/Exclude Custom Metrics: Set to Include.
Metric Name: jvm_heap_used_mb
You can also view the JSON representation for this parameter by clicking View as JSON. In this example, the JSON looks like this:
{
"includeHealthTestMetricSet": true,
"filterType": "whitelist",
"metrics": ["jvm_heap_used_mb"]
}
Related Name
Default Value
API Name
monitoring_metric_filter
Required
false
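The Metric Filter JSON shown above can also be produced programmatically. This sketch covers only the Include case from the example, because that is the only JSON form the description shows; `include_filter` is a hypothetical helper.

```python
import json


def include_filter(metric_names, include_health_test_metric_set=True):
    """Build the Include-style filter shown in the Metric Filter example."""
    return {
        "includeHealthTestMetricSet": include_health_test_metric_set,
        "filterType": "whitelist",
        "metrics": list(metric_names),
    }


metric_filter = include_filter(["jvm_heap_used_mb"])
print(json.dumps(metric_filter, indent=2))
```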
Swap Memory Usage Rate Thresholds🔗
Description
The health test thresholds on the swap memory usage rate of the process. Specified as the change of the used swap memory during the predefined period.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
process_swap_memory_rate_thresholds
Required
false
Swap Memory Usage Rate Window🔗
Description
The period to review when computing unexpected swap memory usage change of the process.
Related Name
common.process.swap_memory_rate_window
Default Value
5 minute(s)
API Name
process_swap_memory_rate_window
Required
false
Process Swap Memory Thresholds🔗
Description
The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold.
Related Name
Default Value
Warning: 200 B, Critical: Never
API Name
process_swap_memory_thresholds
Required
false
Role Triggers🔗
Description
The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
triggerExpression (mandatory) - A tsquery expression representing the trigger.
streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
"streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
Related Name
Default Value
[]
API Name
role_triggers
Required
true
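A trigger list in the format described above can be assembled and checked before it is pasted into the Role Triggers field. This is a minimal sketch: `make_trigger` and `to_role_triggers` are hypothetical helpers, and the checks cover only the mandatory fields and the name-uniqueness rule stated in the description.

```python
import json


def make_trigger(name, expression, stream_threshold=0, enabled=True):
    """Build one trigger, applying the documented defaults for optional fields."""
    if not name or not expression:
        raise ValueError("triggerName and triggerExpression are mandatory")
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        "enabled": "true" if enabled else "false",
    }


def to_role_triggers(triggers):
    """Serialize triggers to a JSON list, rejecting duplicate names."""
    names = [t["triggerName"] for t in triggers]
    if len(names) != len(set(names)):
        raise ValueError("trigger names must be unique within a role")
    return json.dumps(triggers)


# The sample trigger from the description above.
sample = make_trigger(
    "sample-trigger",
    "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
)
role_triggers = to_role_triggers([sample])
print(role_triggers)
```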
Unexpected Exits Thresholds🔗
Description
The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
unexpected_exits_thresholds
Required
false
Unexpected Exits Monitoring Period🔗
Description
The period to review when computing unexpected exits.
Related Name
Default Value
5 minute(s)
API Name
unexpected_exits_window
Required
false
Performance🔗
Maximum Process File Descriptors🔗
Description
If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.
Related Name
Default Value
API Name
rlimit_fds
Required
false
Resource Management🔗
Java Heap Size of Failover Controller in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Related Name
Default Value
256 MiB
API Name
failover_controller_java_heapsize
Required
false
Cgroup CPU Shares🔗
Description
Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.
Related Name
cpu.shares
Default Value
1024
API Name
rm_cpu_shares
Required
true
Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Custom control group resources to assign to this role, which will be enforced by the Linux kernel. These resources should exist on the target hosts; otherwise, an error will occur when the process starts. Use the same format as used for arguments to the cgexec command: resource1,resource2:path1 or resource3:path2. For example: 'cpu,memory:my/path blkio:my2/path2'. ***These settings override other cgroup settings.***
Related Name
custom.cgroups
Default Value
API Name
rm_custom_resources
Required
false
Cgroup I/O Weight🔗
Description
Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.
Related Name
blkio.weight
Default Value
500
API Name
rm_io_weight
Required
true
Cgroup Memory Hard Limit🔗
Description
Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_hard_limit
Required
true
Cgroup Memory Soft Limit🔗
Description
Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.soft_limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_soft_limit
Required
true
Stacks Collection🔗
Stacks Collection Data Retention🔗
Description
The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.
Related Name
stacks_collection_data_retention
Default Value
100 MiB
API Name
stacks_collection_data_retention
Required
false
Stacks Collection Directory🔗
Description
The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. If this directory already exists, it will be owned by the current role user with 755 permissions. Sharing the same directory among multiple roles will cause an ownership race.
Related Name
stacks_collection_directory
Default Value
API Name
stacks_collection_directory
Required
false
Stacks Collection Enabled🔗
Description
Whether or not periodic stacks collection is enabled.
Related Name
stacks_collection_enabled
Default Value
false
API Name
stacks_collection_enabled
Required
true
Stacks Collection Frequency🔗
Description
The frequency with which stacks are collected.
Related Name
stacks_collection_frequency
Default Value
5.0 second(s)
API Name
stacks_collection_frequency
Required
false
Stacks Collection Method🔗
Description
The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.
Related Name
stacks_collection_method
Default Value
jstack
API Name
stacks_collection_method
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Related Name
Default Value
false
API Name
role_config_suppression_cdh_version_validator
Required
true
Suppress Parameter Validation: Java Configuration Options for Failover Controller🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Failover Controller parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Failover Controller Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Failover Controller Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Failover Controller Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
Suppress Parameter Validation: Rules to Extract Events from Log Files🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.
Related Name
Default Value
false
API Name
role_config_suppression_oom_heap_dump_dir
Required
true
Suppress Parameter Validation: Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Control Group Resources (overrides Cgroup settings) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rm_custom_resources
Required
true
Suppress Parameter Validation: Role Triggers🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.
Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Heap Dump Directory Free Space🔗
Description
Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Swap Memory Usage Rate Beta🔗
Description
Whether to suppress the results of the Swap Memory Usage Rate Beta health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Logs🔗
Gateway Logging Threshold🔗
Description
The minimum log level for Gateway logs.
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
Monitoring🔗
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Other🔗
Alternatives Priority🔗
Description
The priority level that the client configuration will have in the Alternatives system on the hosts. Higher priority levels will cause Alternatives to prefer this configuration over any others.
Related Name
Default Value
90
API Name
client_config_priority
Required
true
Use Trash🔗
Description
Move deleted files to the trash so that they can be recovered if necessary. This client side configuration takes effect only if the HDFS service-wide trash is disabled (NameNode Filesystem Trash Interval set to 0) and is ignored otherwise. The trash is not automatically emptied when enabled with this configuration.
Related Name
Default Value
false
API Name
dfs_client_use_trash
Required
false
Performance🔗
Enable HDFS Short-Circuit Read🔗
Description
Enable HDFS short-circuit read. This allows a client colocated with the DataNode to read HDFS file blocks directly. This gives a performance boost to distributed clients that are aware of locality.
Related Name
dfs.client.read.shortcircuit
Default Value
true
API Name
dfs_client_read_shortcircuit
Required
false
Resource Management🔗
Client Java Heap Size in Bytes🔗
Description
Maximum size in bytes for the Java process heap memory. Passed to Java -Xmx.
Related Name
Default Value
256 MiB
API Name
hdfs_client_java_heapsize
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Related Name
Default Value
false
API Name
role_config_suppression_cdh_version_validator
Required
true
Suppress Parameter Validation: Deploy Directory🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Deploy Directory parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Client Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Gateway Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
HttpFS🔗
Advanced🔗
HttpFS Advanced Configuration Snippet (Safety Valve) for httpfs-site.xml🔗
Description
For advanced use only. A string to be inserted into httpfs-site.xml for this role only.
Related Name
Default Value
API Name
httpfs_config_safety_valve
Required
false
HttpFS Advanced Configuration Snippet (Safety Valve) for core-site.xml🔗
Description
For advanced use only. A string to be inserted into core-site.xml for this role only.
Related Name
Default Value
API Name
httpfs_core_site_safety_valve
Required
false
HttpFS Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
Related Name
Default Value
API Name
httpfs_hdfs_site_safety_valve
Required
false
Java Configuration Options for HttpFS🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
Related Name
Default Value
API Name
httpfs_java_opts
Required
false
System Group🔗
Description
The group that the HttpFS server process should run as.
Related Name
Default Value
httpfs
API Name
httpfs_process_groupname
Required
true
System User🔗
Description
The user that the HttpFS server process should run as.
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Enable auto refresh for metric configurations🔗
Description
When true, Enable Metric Collection and Metric Filter parameters will be set automatically if they're changed. Otherwise, a refresh by hand is required.
Related Name
Default Value
false
API Name
metric_config_auto_refresh
Required
false
Heap Dump Directory🔗
Description
Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, it will be owned by the current role user with 1777 permissions. Sharing the same directory among multiple roles will cause an ownership race. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.
Related Name
oom_heap_dump_dir
Default Value
/tmp
API Name
oom_heap_dump_dir
Required
false
Dump Heap When Out of Memory🔗
Description
When set, generates a heap dump file when an out-of-memory error occurs.
Related Name
Default Value
true
API Name
oom_heap_dump_enabled
Required
true
Kill When Out of Memory🔗
Description
When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.
Related Name
Default Value
true
API Name
oom_sigkill_enabled
Required
true
Automatically Restart Process🔗
Description
When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. This configuration applies in the time after the Start Wait Timeout period.
Related Name
Default Value
false
API Name
process_auto_restart
Required
true
Enable Metric Collection🔗
Description
The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process.
Related Name
Default Value
true
API Name
process_should_monitor
Required
true
Process Start Retry Attempts🔗
Description
Number of times to try starting a role's process when the process exits before the Start Wait Timeout period. After a process is running beyond the Start Wait Timeout, the retry count is reset. Setting this configuration to zero will prevent restart of the process during the Start Wait Timeout period.
Related Name
Default Value
3
API Name
process_start_retries
Required
false
Process Start Wait Timeout🔗
Description
The time in seconds to wait for a role's process to start successfully on a host. Processes which exit/crash before this time will be restarted until reaching the limit specified by the Start Retry Attempts count parameter. Setting this configuration to zero will turn off this feature.
Related Name
Default Value
20
API Name
process_start_secs
Required
false
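The interaction between Process Start Retry Attempts and Process Start Wait Timeout can be modeled with a small simulation. This is an illustrative reading of the two descriptions above, not the Cloudera Manager agent's actual supervisor logic; the function simulate_starts and its inputs are hypothetical.

```python
def simulate_starts(uptimes, start_secs=20, start_retries=3):
    """Count start attempts for a sequence of process uptimes (in seconds).

    Each entry in `uptimes` is how long one attempt ran before exiting.
    Defaults mirror process_start_secs=20 and process_start_retries=3.
    ASSUMPTION: this is a simplified model for illustration only.
    """
    retries_used = 0
    starts = 0
    for uptime in uptimes:
        starts += 1
        if start_secs == 0:
            break  # a zero timeout turns the start-window feature off
        if uptime >= start_secs:
            # The process outlived the wait timeout, so the start succeeded
            # and the retry count would be reset. Later exits are outside
            # this model.
            break
        if retries_used >= start_retries:
            break  # retry budget exhausted, no further restart
        retries_used += 1
    return starts


if __name__ == "__main__":
    # Five early exits in a row: one initial start plus three retries.
    print(simulate_starts([3, 5, 2, 4, 1]))
```

With the defaults, a process that keeps exiting early is started at most four times in this model: the initial start plus three retries.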
Logs🔗
HttpFS Log Directory🔗
Description
Directory where HttpFS will place its log files.
Related Name
hadoop.log.dir
Default Value
/var/log/hadoop-httpfs
API Name
httpfs_log_dir
Required
false
HttpFS Logging Threshold🔗
Description
The minimum log level for HttpFS logs.
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
HttpFS Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for HttpFS logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
HttpFS Max Log Size🔗
Description
The maximum size, in megabytes, per log file for HttpFS logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
Monitoring🔗
Enable Health Alerts for this Role🔗
Description
When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold.
Related Name
Default Value
true
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Heap Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.
Heap Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.
Related Name
Default Value
Warning: 50.0 %, Critical: 70.0 %
API Name
httpfs_fd_thresholds
Required
false
HttpFS Host Health Test🔗
Description
When computing the overall HttpFS health, consider the host's health.
Related Name
Default Value
true
API Name
httpfs_host_health_enabled
Required
false
HttpFS Process Health Test🔗
Description
Enables the health test that checks that the HttpFS process state is consistent with the role configuration.
Related Name
Default Value
true
API Name
httpfs_scm_health_enabled
Required
false
Log Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.
Related Name
Default Value
Warning: 10 GiB, Critical: 5 GiB
API Name
log_directory_free_space_absolute_thresholds
Required
false
Log Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
log_directory_free_space_percentage_thresholds
Required
false
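The relationship between the absolute and percentage free-space thresholds can be sketched as follows. This is a simplified model based only on the two descriptions above: the function name free_space_health, the result labels, and the strict "less than" comparison are assumptions, and None stands in for a threshold set to Never.

```python
GIB = 1024 ** 3


def free_space_health(free_bytes, capacity_bytes,
                      absolute=(10 * GIB, 5 * GIB),
                      percentage=(None, None)):
    """Classify free space using (warning, critical) threshold pairs.

    Defaults mirror the documented values: absolute 10 GiB / 5 GiB and
    percentage Never / Never. ASSUMPTION: hypothetical helper; the
    comparison operator and the labels are illustrative.
    """
    if any(limit is not None for limit in absolute):
        # Percentage thresholds are not used when absolute ones are set.
        warning, critical = absolute
        value = free_bytes
    else:
        warning, critical = percentage
        value = 100.0 * free_bytes / capacity_bytes
    if critical is not None and value < critical:
        return "BAD"
    if warning is not None and value < warning:
        return "CONCERNING"
    return "GOOD"


if __name__ == "__main__":
    print(free_space_health(8 * GIB, 100 * GIB))
```

To make the percentage thresholds take effect in this model, both absolute thresholds have to be cleared, for example `absolute=(None, None), percentage=(20.0, 10.0)`.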
Navigator Audit Failure Thresholds🔗
Description
The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audit data that is left to be sent to the audit server.
Related Name
mgmt.navigator.failure.thresholds
Default Value
Warning: Never, Critical: Any
API Name
mgmt_navigator_failure_thresholds
Required
false
Monitoring Period For Audit Failures🔗
Description
The period to review when checking if audits are blocked and not getting processed.
Related Name
mgmt.navigator.failure.window
Default Value
20 minute(s)
API Name
mgmt_navigator_failure_window
Required
false
Navigator Audit Pipeline Health Check🔗
Description
Enable test of the audit events processing pipeline. This will test if audit events are not getting processed by Audit Server for a role that generates audit events.
Related Name
mgmt.navigator.status.check.enabled
Default Value
true
API Name
mgmt_navigator_status_check_enabled
Required
false
Metric Filter🔗
Description
Defines a Metric Filter for this role. Cloudera Manager Agents will not send filtered metrics to the Service Monitor. Define the following fields:
Health Test Metric Set - Select this parameter to collect only metrics required for health tests.
Default Dashboard Metric Set - Select this parameter to collect only metrics required for the default dashboards. For user-defined charts, you must add the metrics you require for the chart using the Custom Metrics parameter.
Include/Exclude Custom Metrics - Select Include to specify metrics that should be collected. Select Exclude to specify metrics that should not be collected. Enter the metric names to be included or excluded using the Metric Name parameter.
Metric Name - The name of a metric that will be included or excluded during metric collection.
If you do not select Health Test Metric Set or Default Dashboard Metric Set, or specify metrics by name, metric filtering will be turned off (this is the default behavior). For example, the following configuration enables the collection of metrics required for Health Tests and the jvm_heap_used_mb metric:
Include only Health Test Metric Set: Selected.
Include/Exclude Custom Metrics: Set to Include.
Metric Name: jvm_heap_used_mb
You can also view the JSON representation for this parameter by clicking View as JSON. In this example, the JSON looks like this:
{
"includeHealthTestMetricSet": true,
"filterType": "whitelist",
"metrics": ["jvm_heap_used_mb"]
}
Related Name
Default Value
API Name
monitoring_metric_filter
Required
false
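The JSON representation shown in the description can also be produced programmatically. The sketch below builds the same include-style filter and wraps it for the monitoring_metric_filter API name. The wrapper shape ({"items": [{"name": ..., "value": ...}]}) is an assumption about how a configuration update body might look and should be checked against the Cloudera Manager API documentation; both function names are hypothetical.

```python
import json


def build_metric_filter(metric_names, include_health_tests=True):
    """Build an include-style metric filter matching the documented JSON."""
    return {
        "includeHealthTestMetricSet": include_health_tests,
        "filterType": "whitelist",  # "Include" in the example above
        "metrics": list(metric_names),
    }


def as_config_update(metric_filter):
    """Wrap the filter as a config update body.

    ASSUMPTION: the {"items": [...]} shape is illustrative and not taken
    from this page; the filter itself is stored as a JSON string.
    """
    return {
        "items": [
            {"name": "monitoring_metric_filter",
             "value": json.dumps(metric_filter)}
        ]
    }


if __name__ == "__main__":
    body = as_config_update(build_metric_filter(["jvm_heap_used_mb"]))
    print(json.dumps(body, indent=2))
```

Only the Include case is covered, because the description shows the JSON value for that case alone.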
Swap Memory Usage Rate Thresholds🔗
Description
The health test thresholds on the swap memory usage rate of the process. Specified as the change of the used swap memory during the predefined period.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
process_swap_memory_rate_thresholds
Required
false
Swap Memory Usage Rate Window🔗
Description
The period to review when computing unexpected swap memory usage change of the process.
Related Name
common.process.swap_memory_rate_window
Default Value
5 minute(s)
API Name
process_swap_memory_rate_window
Required
false
Process Swap Memory Thresholds🔗
Description
The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold.
Related Name
Default Value
Warning: 200 B, Critical: Never
API Name
process_swap_memory_thresholds
Required
false
Role Triggers🔗
Description
The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
triggerExpression (mandatory) - A tsquery expression representing the trigger.
streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
"streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
Related Name
Default Value
[]
API Name
role_triggers
Required
true
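The field rules listed for Role Triggers (two mandatory fields, unique names, documented defaults) can be checked before a value is saved. The following is a minimal sketch of such a check and not the validator Cloudera Manager itself uses; the function name normalize_triggers is hypothetical, and the tsquery expression is not parsed.

```python
import json


def normalize_triggers(raw):
    """Parse a role_triggers JSON string and apply the documented defaults.

    Hypothetical helper: enforces the mandatory fields and name uniqueness
    described above, then fills in streamThreshold=0 and enabled='true'
    where they are absent.
    """
    triggers = json.loads(raw)
    if not isinstance(triggers, list):
        raise ValueError("role_triggers must be a JSON list")
    seen = set()
    normalized = []
    for trigger in triggers:
        for field in ("triggerName", "triggerExpression"):
            if not trigger.get(field):
                raise ValueError("missing mandatory field: " + field)
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError("duplicate triggerName: " + name)
        seen.add(name)
        merged = {"streamThreshold": 0, "enabled": "true"}
        merged.update(trigger)
        normalized.append(merged)
    return normalized


if __name__ == "__main__":
    sample = ('[{"triggerName": "sample-trigger", "triggerExpression": '
              '"IF (SELECT fd_open WHERE roleName=$ROLENAME and '
              'last(fd_open) > 1500) DO health:bad"}]')
    print(normalize_triggers(sample))
```

The default value `[]` passes through unchanged as an empty list.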
Unexpected Exits Thresholds🔗
Description
The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
unexpected_exits_thresholds
Required
false
Unexpected Exits Monitoring Period🔗
Description
The period to review when computing unexpected exits.
Related Name
Default Value
5 minute(s)
API Name
unexpected_exits_window
Required
false
Other🔗
HttpFS Load Balancer🔗
Description
Address of the load balancer used for HttpFS roles. Should be specified in host:port format. Note: Changing this property will regenerate Kerberos keytabs for all HttpFS roles.
Related Name
Default Value
API Name
httpfs_load_balancer
Required
false
Performance🔗
Maximum Process File Descriptors🔗
Description
If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.
Related Name
Default Value
API Name
rlimit_fds
Required
false
Ports and Addresses🔗
Administration Port🔗
Description
The port for the administration interface.
Related Name
hdfs.httpfs.admin.port
Default Value
14001
API Name
hdfs_httpfs_admin_port
Required
false
REST Port🔗
Description
The port where the REST interface to HDFS is available. The REST interface is served over HTTPS if TLS/SSL is enabled for HttpFS, or over HTTP otherwise.
Related Name
hdfs.httpfs.http.port
Default Value
14000
API Name
hdfs_httpfs_http_port
Required
false
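A client request to the REST interface combines the REST Port with the scheme implied by the TLS/SSL setting. The sketch below assembles such a URL using the WebHDFS-style /webhdfs/v1 path prefix. The host name, the helper name httpfs_url, and the example path are hypothetical.

```python
from urllib.parse import quote, urlencode


def httpfs_url(host, hdfs_path, op, use_ssl=False, port=14000, **params):
    """Build an HttpFS REST URL for an HDFS path and operation.

    `port` defaults to the documented REST Port (14000); `use_ssl` mirrors
    whether TLS/SSL is enabled for HttpFS. Extra query parameters can be
    passed as keyword arguments.
    """
    scheme = "https" if use_ssl else "http"
    query = urlencode(dict(op=op, **params))
    return "%s://%s:%d/webhdfs/v1%s?%s" % (
        scheme, host, port, quote(hdfs_path), query)


if __name__ == "__main__":
    # httpfs.example.com is a placeholder host.
    print(httpfs_url("httpfs.example.com", "/user/alice", "LISTSTATUS"))
```

If the REST Port is changed from its default, pass the new value as `port` so that clients and the role configuration stay in step.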
Resource Management🔗
Java Heap Size of HttpFS in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Related Name
Default Value
256 MiB
API Name
httpfs_java_heapsize
Required
false
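Because the heap size is stored in bytes but displayed in MiB, a short conversion may help when setting it through the API. This sketch converts a MiB value to bytes and formats it as a -Xmx argument. How Cloudera Manager renders the flag internally is not stated here, so the output format is an assumption, and the helper name xmx_flag is hypothetical.

```python
MIB = 1024 * 1024


def xmx_flag(heap_mib):
    """Return a -Xmx argument for a heap size given in MiB.

    ASSUMPTION: the byte-count form of -Xmx is used for illustration.
    """
    heap_bytes = heap_mib * MIB
    return "-Xmx%d" % heap_bytes


if __name__ == "__main__":
    # The documented default of 256 MiB.
    print(xmx_flag(256))
```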
Cgroup CPU Shares🔗
Description
Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.
Related Name
cpu.shares
Default Value
1024
API Name
rm_cpu_shares
Required
true
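CPU shares are relative weights, so the effect of a value depends on what else is competing on the host. The following sketch estimates the fraction of CPU each role would receive when all of them are busy at once. The role names and the helper cpu_fractions are hypothetical, and the model ignores cgroup hierarchy and idle roles.

```python
def cpu_fractions(shares_by_role):
    """Return each role's fraction of CPU time under full contention.

    Validates the documented range (2 to 262144) and divides each role's
    shares by the total. Simplified model for illustration only.
    """
    for role, shares in shares_by_role.items():
        if not 2 <= shares <= 262144:
            raise ValueError("%s: shares must be between 2 and 262144" % role)
    total = sum(shares_by_role.values())
    return {role: shares / total for role, shares in shares_by_role.items()}


if __name__ == "__main__":
    # One role at the default of 1024 next to a role given three times as much.
    print(cpu_fractions({"httpfs": 1024, "datanode": 3072}))
```

In this model, leaving every role at the default of 1024 gives each an equal fraction, and the shares only matter when the host is actually under CPU contention.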
Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Custom control group resources to assign to this role, which will be enforced by the Linux kernel. These resources should exist on the target hosts; otherwise, an error will occur when the process starts. Use the same format as for arguments to the cgexec command: resource1,resource2:path1 or resource3:path2. For example: 'cpu,memory:my/path blkio:my2/path2'. ***These settings override other cgroup settings.***
Related Name
custom.cgroups
Default Value
API Name
rm_custom_resources
Required
false
Cgroup I/O Weight🔗
Description
Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.
Related Name
blkio.weight
Default Value
500
API Name
rm_io_weight
Required
true
Cgroup Memory Hard Limit🔗
Description
Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_hard_limit
Required
true
Cgroup Memory Soft Limit🔗
Description
Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.soft_limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_soft_limit
Required
true
Security🔗
Signature Secret🔗
Description
The secret to use for signing client authentication tokens.
Related Name
hdfs.httpfs.signature.secret
Default Value
******
API Name
hdfs_httpfs_signature_secret
Required
true
HttpFS TLS/SSL Server Keystore File Location🔗
Description
The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when HttpFS is acting as a TLS/SSL server. The keystore must be in the format specified in Administration > Settings > Java Keystore Type.
Related Name
Default Value
API Name
httpfs_https_keystore_file
Required
false
HttpFS TLS/SSL Server Keystore File Password🔗
Description
The password for the HttpFS keystore file.
Related Name
Default Value
API Name
httpfs_https_keystore_password
Required
false
HttpFS TLS/SSL Trust Store File🔗
Description
The location on disk of the trust store, in .jks format, used to confirm the authenticity of TLS/SSL servers that HttpFS might connect to. This trust store must contain the certificate(s) used to sign the service(s) connected to. If this parameter is not provided, the default list of well-known certificate authorities is used instead.
Related Name
Default Value
API Name
httpfs_https_truststore_file
Required
false
HttpFS TLS/SSL Trust Store Password🔗
Description
The password for the HttpFS TLS/SSL Trust Store File. This password is not required to access the trust store; this field can be left blank. This password provides optional integrity checking of the file. The contents of trust stores are certificates, and certificates are public information.
Related Name
Default Value
API Name
httpfs_https_truststore_password
Required
false
Enable TLS/SSL for HttpFS🔗
Description
Encrypt communication between clients and HttpFS using Transport Layer Security (TLS), formerly known as Secure Sockets Layer (SSL).
Related Name
Default Value
false
API Name
httpfs_use_ssl
Required
false
Role-Specific Kerberos Principal🔗
Description
Kerberos principal used by the HttpFS roles.
Related Name
Default Value
httpfs
API Name
kerberos_role_princ_name
Required
true
Stacks Collection🔗
Stacks Collection Data Retention🔗
Description
The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.
Related Name
stacks_collection_data_retention
Default Value
100 MiB
API Name
stacks_collection_data_retention
Required
false
Stacks Collection Directory🔗
Description
The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. If this directory already exists, it will be owned by the current role user with 755 permissions. Sharing the same directory among multiple roles will cause an ownership race.
Related Name
stacks_collection_directory
Default Value
API Name
stacks_collection_directory
Required
false
Stacks Collection Enabled🔗
Description
Whether or not periodic stacks collection is enabled.
Related Name
stacks_collection_enabled
Default Value
false
API Name
stacks_collection_enabled
Required
true
Stacks Collection Frequency🔗
Description
The frequency with which stacks are collected.
Related Name
stacks_collection_frequency
Default Value
5.0 second(s)
API Name
stacks_collection_frequency
Required
false
Stacks Collection Method🔗
Description
The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.
Related Name
stacks_collection_method
Default Value
jstack
API Name
stacks_collection_method
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HttpFS Advanced Configuration Snippet (Safety Valve) for httpfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HttpFS Advanced Configuration Snippet (Safety Valve) for core-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HttpFS Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Suppress Parameter Validation: HttpFS TLS/SSL Server Keystore File Location🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HttpFS TLS/SSL Server Keystore File Location parameter.
Suppress Parameter Validation: HttpFS TLS/SSL Server Keystore File Password🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HttpFS TLS/SSL Server Keystore File Password parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HttpFS Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HttpFS Logging Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.
Related Name
Default Value
false
API Name
role_config_suppression_oom_heap_dump_dir
Required
true
Suppress Parameter Validation: Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Control Group Resources (overrides Cgroup settings) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rm_custom_resources
Required
true
Suppress Parameter Validation: Role Triggers🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.
Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_httpfs_audit_health
Required
true
Suppress Health Test: File Descriptors🔗
Description
Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_httpfs_file_descriptor
Required
true
Suppress Health Test: Heap Dump Directory Free Space🔗
Description
Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_httpfs_host_health
Required
true
Suppress Health Test: Log Directory Free Space🔗
Description
Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_httpfs_scm_health
Required
true
Suppress Health Test: Swap Memory Usage🔗
Description
Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_httpfs_swap_memory_usage
Required
true
Suppress Health Test: Swap Memory Usage Rate Beta🔗
Description
Whether to suppress the results of the Swap Memory Usage Rate Beta health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_httpfs_unexpected_exits
Required
true
JournalNode🔗
Advanced🔗
Enable JournalNode Syncer🔗
Description
When enabled, a JournalNode will periodically sync edit logs with other JournalNodes.
Related Name
dfs.journalnode.enable.sync
Default Value
true
API Name
dfs_journalnode_enable_sync
Required
false
JournalNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
Related Name
Default Value
API Name
jn_config_safety_valve
Required
false
Java Configuration Options for JournalNode🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Enable auto refresh for metric configurations🔗
Description
When true, the Enable Metric Collection and Metric Filter parameters are applied automatically whenever they are changed. Otherwise, a manual refresh is required.
Related Name
Default Value
false
API Name
metric_config_auto_refresh
Required
false
Heap Dump Directory🔗
Description
Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, it will be owned by the current role user with 1777 permissions. Sharing the same directory among multiple roles will cause an ownership race. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.
Related Name
oom_heap_dump_dir
Default Value
/tmp
API Name
oom_heap_dump_dir
Required
false
Dump Heap When Out of Memory🔗
Description
When set, generates a heap dump file when an out-of-memory error occurs.
Related Name
Default Value
true
API Name
oom_heap_dump_enabled
Required
true
Kill When Out of Memory🔗
Description
When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.
Related Name
Default Value
true
API Name
oom_sigkill_enabled
Required
true
Automatically Restart Process🔗
Description
When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. This configuration applies in the time after the Start Wait Timeout period.
Related Name
Default Value
true
API Name
process_auto_restart
Required
true
Enable Metric Collection🔗
Description
The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process.
Related Name
Default Value
true
API Name
process_should_monitor
Required
true
Process Start Retry Attempts🔗
Description
Number of times to try starting a role's process when the process exits before the Start Wait Timeout period. After a process is running beyond the Start Wait Timeout, the retry count is reset. Setting this configuration to zero will prevent restart of the process during the Start Wait Timeout period.
Related Name
Default Value
3
API Name
process_start_retries
Required
false
Process Start Wait Timeout🔗
Description
The time in seconds to wait for a role's process to start successfully on a host. Processes which exit/crash before this time will be restarted until reaching the limit specified by the Start Retry Attempts count parameter. Setting this configuration to zero will turn off this feature.
Related Name
Default Value
20
API Name
process_start_secs
Required
false
Logs🔗
JournalNode Log Directory🔗
Description
Directory where JournalNode will place its log files.
Related Name
hadoop.log.dir
Default Value
/var/log/hadoop-hdfs
API Name
journalnode_log_dir
Required
false
JournalNode Logging Threshold🔗
Description
The minimum log level for JournalNode logs.
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
JournalNode Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for JournalNode logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
JournalNode Max Log Size🔗
Description
The maximum size, in megabytes, per log file for JournalNode logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
Monitoring🔗
Enable Health Alerts for this Role🔗
Description
When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold.
Related Name
Default Value
true
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Heap Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.
Heap Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
JournalNode Edits Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's JournalNode Edits Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a JournalNode Edits Directory Free Space Monitoring Absolute Thresholds setting is configured.
The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.
Related Name
Default Value
Warning: 50.0 %, Critical: 70.0 %
API Name
journalnode_fd_thresholds
Required
false
JournalNode Fsync Latency Thresholds🔗
Description
The health test thresholds for JournalNode fsync latency.
Related Name
Default Value
Warning: 1 second(s), Critical: 3 second(s)
API Name
journalnode_fsync_latency_thresholds
Required
false
Garbage Collection Duration Thresholds🔗
Description
The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time.
Related Name
Default Value
Warning: 30.0, Critical: 60.0
API Name
journalnode_gc_duration_thresholds
Required
false
Garbage Collection Duration Monitoring Period🔗
Description
The period to review when computing the moving average of garbage collection time.
Related Name
Default Value
5 minute(s)
API Name
journalnode_gc_duration_window
Required
false
JournalNode Host Health Test🔗
Description
When computing the overall JournalNode health, consider the host's health.
Related Name
Default Value
true
API Name
journalnode_host_health_enabled
Required
false
JournalNode Process Health Test🔗
Description
Enables the health test that the JournalNode's process state is consistent with the role configuration.
Related Name
Default Value
true
API Name
journalnode_scm_health_enabled
Required
false
Active NameNode Sync Status Health Check🔗
Description
Enables the health check that verifies the active NameNode's sync status to the JournalNode.
Related Name
Default Value
true
API Name
journalnode_sync_status_enabled
Required
false
Active NameNode Sync Status Startup Tolerance🔗
Description
The amount of time at JournalNode startup allowed for the active NameNode to get in sync with the JournalNode.
Related Name
Default Value
3 minute(s)
API Name
journalnode_sync_status_startup_tolerance
Required
false
Web Metric Collection🔗
Description
Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server.
Related Name
Default Value
true
API Name
journalnode_web_metric_collection_enabled
Required
false
Web Metric Collection Duration🔗
Description
The health test thresholds on the duration of the metrics request to the web server.
Related Name
Default Value
Warning: 10 second(s), Critical: Never
API Name
journalnode_web_metric_collection_thresholds
Required
false
Log Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.
Related Name
Default Value
Warning: 10 GiB, Critical: 5 GiB
API Name
log_directory_free_space_absolute_thresholds
Required
false
Log Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
log_directory_free_space_percentage_thresholds
Required
false
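The precedence between the absolute and percentage free space thresholds can be sketched as follows. This is a simplified model, not Cloudera Manager's implementation: the function name `free_space_health`, the result labels, the use of `None` for "Never", and the assumption that a threshold is breached when free space falls below it are all illustrative.

```python
GIB = 1024 ** 3


def free_space_health(free_bytes, capacity_bytes, absolute=None, percentage=None):
    """Return "critical", "warning", or "ok" for a filesystem's free space.

    Each thresholds argument is a (warning, critical) tuple; None stands for "Never".
    Assumption: a threshold is breached when free space drops below its value.
    """
    if absolute is not None and any(t is not None for t in absolute):
        # Absolute thresholds are configured, so percentage thresholds are not used.
        measured = free_bytes
        warning, critical = absolute
    elif percentage is not None:
        measured = 100.0 * free_bytes / capacity_bytes
        warning, critical = percentage
    else:
        return "ok"
    if critical is not None and measured < critical:
        return "critical"
    if warning is not None and measured < warning:
        return "warning"
    return "ok"


# Defaults from this page: absolute Warning 10 GiB / Critical 5 GiB.
absolute_defaults = (10 * GIB, 5 * GIB)
print(free_space_health(7 * GIB, 100 * GIB, absolute=absolute_defaults))
```

With the default absolute thresholds in place, the percentage thresholds are ignored even if they are also set.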
Rules to Extract Events from Log Files🔗
Description
This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
rate(mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
content - match only those messages for which contents match this regular expression.
exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"}
In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
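The first-match behavior described above can be illustrated with a small sketch. It is a simplified model of the documented semantics, not the appender's actual code: the function names, the `LEVELS` table, and the use of `re.search` (rather than a full match) are assumptions, and rate limiting is left out.

```python
import re

# Numeric order of log4j levels, lowest to highest severity.
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4, "FATAL": 5}


def rule_matches(rule, level, message, exception_type=None):
    """Return True if a log message satisfies every condition present in the rule."""
    if "threshold" in rule and LEVELS[level] < LEVELS[rule["threshold"]]:
        return False
    if "content" in rule and not re.search(rule["content"], message):
        return False
    if "exceptiontype" in rule:
        # Rules with exceptiontype only apply to messages that carry an exception.
        if exception_type is None or not re.search(rule["exceptiontype"], exception_type):
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Return the first rule that matches, or None; later matches are ignored."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]

# An ERROR message with an exception hits the first rule, so no alert is raised.
with_exception = first_matching_rule(rules, "ERROR", "disk failure", "java.io.IOException")
# An ERROR message without an exception skips the first rule and hits the second.
without_exception = first_matching_rule(rules, "ERROR", "disk failure")
print(with_exception["alert"], without_exception["alert"])
```

Placing the more specific alerting rule before the catch-all exception rule would let ERROR messages with exceptions be promoted to alerts.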
The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audits data that is left to be sent to audit server.
Related Name
mgmt.navigator.failure.thresholds
Default Value
Warning: Never, Critical: Any
API Name
mgmt_navigator_failure_thresholds
Required
false
Monitoring Period For Audit Failures🔗
Description
The period to review when checking if audits are blocked and not getting processed.
Related Name
mgmt.navigator.failure.window
Default Value
20 minute(s)
API Name
mgmt_navigator_failure_window
Required
false
Navigator Audit Pipeline Health Check🔗
Description
Enable test of audit events processing pipeline. This will test if audit events are not getting processed by Audit Server for a role that generates audit.
Related Name
mgmt.navigator.status.check.enabled
Default Value
true
API Name
mgmt_navigator_status_check_enabled
Required
false
Metric Filter🔗
Description
Defines a Metric Filter for this role. Cloudera Manager Agents will not send filtered metrics to the Service Monitor. Define the following fields:
Health Test Metric Set - Select this parameter to collect only metrics required for health tests.
Default Dashboard Metric Set - Select this parameter to collect only metrics required for the default dashboards. For user-defined charts, you must add the metrics you require for the chart using the Custom Metrics parameter.
Include/Exclude Custom Metrics - Select Include to specify metrics that should be collected. Select Exclude to specify metrics that should not be collected. Enter the metric names to be included or excluded using the Metric Name parameter.
Metric Name - The name of a metric that will be included or excluded during metric collection.
If you do not select Health Test Metric Set or Default Dashboard Metric Set, or specify metrics by name, metric filtering will be turned off (this is the default behavior).
For example, the following configuration enables the collection of metrics required for Health Tests and the jvm_heap_used_mb metric:
Include only Health Test Metric Set: Selected.
Include/Exclude Custom Metrics: Set to Include.
Metric Name: jvm_heap_used_mb
You can also view the JSON representation for this parameter by clicking View as JSON. In this example, the JSON looks like this:
{
"includeHealthTestMetricSet": true,
"filterType": "whitelist",
"metrics": ["jvm_heap_used_mb"]
}
Related Name
Default Value
API Name
monitoring_metric_filter
Required
false
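The Metric Filter JSON shown in the description above can also be generated programmatically. The following is a minimal sketch: the helper name `build_metric_filter` is hypothetical, the three field names come from the example JSON, and the "blacklist" value for Exclude is an assumption that does not appear on this page.

```python
import json


def build_metric_filter(metric_names, include=True, health_test_set=False):
    """Build a Metric Filter dict using the field names from the JSON example."""
    return {
        "includeHealthTestMetricSet": health_test_set,
        # "whitelist" appears in the example for Include; "blacklist" for
        # Exclude is an assumption, not taken from the documentation.
        "filterType": "whitelist" if include else "blacklist",
        "metrics": sorted(set(metric_names)),
    }


metric_filter = build_metric_filter(["jvm_heap_used_mb"], include=True, health_test_set=True)
print(json.dumps(metric_filter, indent=2))
```

The resulting string is the kind of value the monitoring_metric_filter parameter holds when viewed as JSON.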
Swap Memory Usage Rate Thresholds🔗
Description
The health test thresholds on the swap memory usage rate of the process. Specified as the change of the used swap memory during the predefined period.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
process_swap_memory_rate_thresholds
Required
false
Swap Memory Usage Rate Window🔗
Description
The period to review when computing unexpected swap memory usage change of the process.
Related Name
common.process.swap_memory_rate_window
Default Value
5 minute(s)
API Name
process_swap_memory_rate_window
Required
false
Process Swap Memory Thresholds🔗
Description
The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold.
Related Name
Default Value
Warning: 200 B, Critical: Never
API Name
process_swap_memory_thresholds
Required
false
Role Triggers🔗
Description
The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName(mandatory) - The name of the trigger. This value must be unique for the specific role.
triggerExpression(mandatory) - A tsquery expression representing the trigger.
streamThreshold(optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
"streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
Related Name
Default Value
[]
API Name
role_triggers
Required
true
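Before pasting a trigger list into the Role Triggers parameter, it can help to check the two mandatory fields and the uniqueness of trigger names. The sketch below assumes a hypothetical helper, `validate_triggers`; it does not parse or validate the tsquery expression itself.

```python
import json

REQUIRED_FIELDS = ("triggerName", "triggerExpression")


def validate_triggers(triggers):
    """Check mandatory fields and name uniqueness; return a list of problems."""
    problems = []
    seen = set()
    for index, trigger in enumerate(triggers):
        for field in REQUIRED_FIELDS:
            if not trigger.get(field):
                problems.append(f"trigger {index}: missing {field}")
        name = trigger.get("triggerName")
        if name in seen:
            problems.append(f"trigger {index}: duplicate triggerName {name!r}")
        if name:
            seen.add(name)
    return problems


triggers = [
    {
        "triggerName": "sample-trigger",
        "triggerExpression": (
            "IF (SELECT fd_open WHERE roleName=$ROLENAME "
            "and last(fd_open) > 1500) DO health:bad"
        ),
        "streamThreshold": 0,
        "enabled": "true",
    }
]

problems = validate_triggers(triggers)
# Serialize to the JSON-formatted list that the parameter expects.
role_triggers_value = json.dumps(triggers)
print(problems)
```

An empty problem list only means the structure is plausible; the health system still decides whether the expression itself is valid.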
Unexpected Exits Thresholds🔗
Description
The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
unexpected_exits_thresholds
Required
false
Unexpected Exits Monitoring Period🔗
Description
The period to review when computing unexpected exits.
Related Name
Default Value
5 minute(s)
API Name
unexpected_exits_window
Required
false
Other🔗
JournalNode Edits Directory🔗
Description
Directory on the local file system where NameNode edits are written.
Related Name
dfs.journalnode.edits.dir
Default Value
API Name
dfs_journalnode_edits_dir
Required
true
Performance🔗
Maximum Process File Descriptors🔗
Description
If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.
Related Name
Default Value
API Name
rlimit_fds
Required
false
Ports and Addresses🔗
JournalNode HTTP Port🔗
Description
Port for the JournalNode HTTP web UI. Combined with the JournalNode hostname to build its HTTP address.
Related Name
dfs.journalnode.http-address
Default Value
8480
API Name
dfs_journalnode_http_port
Required
false
Secure JournalNode Web UI Port (TLS/SSL)🔗
Description
The base port where the secure JournalNode web UI listens. Combined with the JournalNode's hostname to build its secure web UI address.
Related Name
dfs.journalnode.https-address
Default Value
8481
API Name
dfs_journalnode_https_port
Required
false
JournalNode RPC Port🔗
Description
Port for the JournalNode's RPC. Combined with the JournalNode's hostname to build its RPC address.
Related Name
dfs.journalnode.rpc-address
Default Value
8485
API Name
dfs_journalnode_rpc_port
Required
false
Bind JournalNode to Wildcard Address🔗
Description
If enabled, the JournalNode binds to the wildcard address ("0.0.0.0") on all of its ports.
Related Name
Default Value
false
API Name
journalnode_bind_wildcard
Required
false
Resource Management🔗
Java Heap Size of JournalNode in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Related Name
Default Value
512 MiB
API Name
journalNode_java_heapsize
Required
false
Cgroup CPU Shares🔗
Description
Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.
Related Name
cpu.shares
Default Value
1024
API Name
rm_cpu_shares
Required
true
Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Custom control group resources to assign to this role, which will be enforced by the Linux kernel. These resources should exist on the target hosts, otherwise an error will occur when the process starts. Use the same format as used for arguments to the cgexec command: resource1,resource2:path1 or resource3:path2. For example: 'cpu,memory:my/path blkio:my2/path2'. These settings override other cgroup settings.
Related Name
custom.cgroups
Default Value
API Name
rm_custom_resources
Required
false
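The cgexec-style value used by Custom Control Group Resources can be split into a controller-to-path mapping before it is applied, which makes typos easier to spot. This is an illustrative sketch: `parse_custom_cgroups` is a hypothetical helper, and it assumes entries are separated by whitespace as in the example value above.

```python
def parse_custom_cgroups(value):
    """Map each controller name to its cgroup path.

    Accepts whitespace-separated entries of the form controllers:path,
    where controllers is a comma-separated list.
    """
    mapping = {}
    for entry in value.split():
        controllers, sep, path = entry.partition(":")
        if not sep or not controllers or not path:
            raise ValueError(f"expected controllers:path, got {entry!r}")
        for controller in controllers.split(","):
            if not controller:
                raise ValueError(f"empty controller name in {entry!r}")
            mapping[controller] = path
    return mapping


parsed = parse_custom_cgroups("cpu,memory:my/path blkio:my2/path2")
print(parsed)
```

The sketch only checks the syntax of the value; whether the named cgroups exist on the target hosts still has to be verified separately.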
Cgroup I/O Weight🔗
Description
Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.
Related Name
blkio.weight
Default Value
500
API Name
rm_io_weight
Required
true
Cgroup Memory Hard Limit🔗
Description
Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_hard_limit
Required
true
Cgroup Memory Soft Limit🔗
Description
Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.soft_limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_soft_limit
Required
true
Stacks Collection🔗
Stacks Collection Data Retention🔗
Description
The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.
Related Name
stacks_collection_data_retention
Default Value
100 MiB
API Name
stacks_collection_data_retention
Required
false
Stacks Collection Directory🔗
Description
The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. If this directory already exists, it will be owned by the current role user with 755 permissions. Sharing the same directory among multiple roles will cause an ownership race.
Related Name
stacks_collection_directory
Default Value
API Name
stacks_collection_directory
Required
false
Stacks Collection Enabled🔗
Description
Whether or not periodic stacks collection is enabled.
Related Name
stacks_collection_enabled
Default Value
false
API Name
stacks_collection_enabled
Required
true
Stacks Collection Frequency🔗
Description
The frequency with which stacks are collected.
Related Name
stacks_collection_frequency
Default Value
5.0 second(s)
API Name
stacks_collection_frequency
Required
false
Stacks Collection Method🔗
Description
The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.
Related Name
stacks_collection_method
Default Value
jstack
API Name
stacks_collection_method
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Whether to suppress configuration warnings produced by the built-in parameter validation for the JournalNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Related Name
Default Value
false
API Name
role_config_suppression_jn_config_safety_valve
Required
true
Suppress Parameter Validation: Java Configuration Options for JournalNode🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for JournalNode parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the JournalNode Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the JournalNode Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
Suppress Parameter Validation: Rules to Extract Events from Log Files🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.
Related Name
Default Value
false
API Name
role_config_suppression_oom_heap_dump_dir
Required
true
Suppress Parameter Validation: Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Control Group Resources (overrides Cgroup settings) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rm_custom_resources
Required
true
Suppress Parameter Validation: Role Triggers🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.
Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_journal_node_audit_health
Required
true
Suppress Health Test: JournalNode Edits Directory Free Space🔗
Description
Whether to suppress the results of the JournalNode Edits Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Fsync Latency health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_journal_node_gc_duration
Required
true
Suppress Health Test: Heap Dump Directory Free Space🔗
Description
Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_journal_node_host_health
Required
true
Suppress Health Test: Log Directory Free Space🔗
Description
Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_journal_node_scm_health
Required
true
Suppress Health Test: Swap Memory Usage🔗
Description
Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Swap Memory Usage Rate Beta🔗
Description
Whether to suppress the results of the Swap Memory Usage Rate Beta health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Sync Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_journal_node_sync_status
Required
true
Suppress Health Test: Unexpected Exits🔗
Description
Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Enable Automatic Failover to maintain High Availability. Requires a ZooKeeper service and a High Availability NameNode partner.
Related Name
dfs.ha.automatic-failover.enabled
Default Value
false
API Name
autofailover_enabled
Required
false
NameNode Nameservice🔗
Description
Nameservice of this NameNode. The Nameservice represents the interface to this NameNode and its High Availability partner. The Nameservice also represents the namespace associated with a federated NameNode.
Related Name
Default Value
API Name
dfs_federation_namenode_nameservice
Required
false
Enable Async Audit Log🔗
Description
When enabled, HDFS NameNode will append audit log asynchronously when using HDFS default audit logger. Enabling this should improve NameNode throughput under heavy load.
Related Name
dfs.namenode.audit.log.async
Default Value
true
API Name
dfs_namenode_audit_log_async
Required
false
Avoid Reading Stale DataNode🔗
Description
Indicate whether or not to avoid reading from stale DataNodes for which heartbeat messages have not been received by the NameNode for more than Stale DataNode Time Interval. Stale DataNodes are moved to the end of the node list returned for reading. See dfs.namenode.avoid.write.stale.datanode for a similar setting for writes.
Related Name
dfs.namenode.avoid.read.stale.datanode
Default Value
true
API Name
dfs_namenode_avoid_read_stale_datanode
Required
false
Avoid Writing Stale DataNode🔗
Description
Indicate whether or not to avoid writing to stale DataNodes for which heartbeat messages have not been received by the NameNode for more than Stale DataNode Time Interval. Writes avoid using stale DataNodes unless more than a configured ratio (dfs.namenode.write.stale.datanode.ratio) of DataNodes are marked as stale. See dfs.namenode.avoid.read.stale.datanode for a similar setting for reads.
Related Name
dfs.namenode.avoid.write.stale.datanode
Default Value
true
API Name
dfs_namenode_avoid_write_stale_datanode
Required
false
Invalidate Work Percentage Per Iteration🔗
Description
This determines the percentage amount of block invalidations (deletes) to do over a single DataNode heartbeat deletion command. The final deletion count is determined by applying this percentage to the number of live nodes in the system. The resultant number is the number of blocks from the deletion list chosen for proper invalidation over a single heartbeat of a single DataNode.
Related Name
dfs.namenode.invalidate.work.pct.per.iteration
Default Value
0.32
API Name
dfs_namenode_invalidate_work_pct_per_iteration
Required
false
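As a rough illustration of the arithmetic described for Invalidate Work Percentage Per Iteration, the sketch below applies the percentage to a live-node count. The function name is hypothetical, and rounding up is an assumption: the description does not say how fractional results are handled.

```python
import math

def invalidation_work_limit(live_nodes: int, pct_per_iteration: float = 0.32) -> int:
    """Hypothetical helper: apply the configured percentage to the live-node count."""
    # Rounding up is an assumption made for this sketch only.
    return math.ceil(live_nodes * pct_per_iteration)

print(invalidation_work_limit(100))
```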
Quorum-based Storage Journal name🔗
Description
Name of the journal located on each JournalNode filesystem.
Related Name
Default Value
API Name
dfs_namenode_quorum_journal_name
Required
false
Maximum Number of Replication Threads on a DataNode🔗
Description
The maximum number of outgoing replication threads a node can have at one time. This limit is waived for the highest priority replications. Configure dfs.namenode.replication.max-streams-hard-limit to set the absolute limit, including the highest-priority replications.
Related Name
dfs.namenode.replication.max-streams
Default Value
20
API Name
dfs_namenode_replication_max_streams
Required
false
Hard Limit on the Number of Replication Threads on a Datanode🔗
Description
The absolute maximum number of outgoing replication threads a given node can have at one time. The regular limit (dfs.namenode.replication.max-streams) is waived for highest-priority block replications. Highest replication priority is for blocks that are at a very high risk of loss if the disk or server on which they remain fails. These are usually blocks with only one copy, or blocks with zero live copies but a copy in a node being decommissioned. dfs.namenode.replication.max-streams-hard-limit provides a limit on the total number of outgoing replication threads, including threads of all priorities.
Related Name
dfs.namenode.replication.max-streams-hard-limit
Default Value
40
API Name
dfs_namenode_replication_max_streams_hard_limit
Required
false
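The relationship between the regular limit and the hard limit can be sketched as follows. This is a simplified model of the behavior described above, not the NameNode's scheduling code; the function and parameter names are hypothetical.

```python
def can_schedule_replication(active_streams: int, highest_priority: bool,
                             max_streams: int = 20, hard_limit: int = 40) -> bool:
    """Hypothetical check mirroring the two limits described above."""
    if active_streams >= hard_limit:
        return False  # the hard limit applies to threads of all priorities
    if highest_priority:
        return True   # the regular limit is waived for highest-priority blocks
    return active_streams < max_streams
```

With the defaults shown, a node already running 20 outgoing streams accepts only highest-priority work until it reaches 40.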
Replication Work Multiplier Per Iteration🔗
Description
This determines the total number of block transfers to begin in parallel at a DataNode for replication, when such a command list is being sent over a DataNode heartbeat by the NameNode. The actual number is obtained by multiplying this value by the total number of live nodes in the cluster. The resulting number is the number of blocks to transfer immediately, per DataNode heartbeat.
When enabled, HDFS snapshots will capture point-in-time copies of open files.
Related Name
dfs.namenode.snapshot.capture.openfiles
Default Value
true
API Name
dfs_namenode_snapshot_capture_openfiles
Required
false
Stale DataNode Time Interval🔗
Description
Default time interval for marking a DataNode as "stale". If the NameNode has not received heartbeat messages from a DataNode for more than this time interval, the DataNode is marked and treated as "stale" by default.
Related Name
dfs.namenode.stale.datanode.interval
Default Value
30 second(s)
API Name
dfs_namenode_stale_datanode_interval
Required
false
Write Stale DataNode Ratio🔗
Description
When the ratio of stale DataNodes to total DataNodes is greater than this value, writing to stale nodes is permitted to avoid creating hotspots.
Related Name
dfs.namenode.write.stale.datanode.ratio
Default Value
0.5
API Name
dfs_namenode_write_stale_datanode_ratio
Required
false
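The interaction between Avoid Writing Stale DataNode and Write Stale DataNode Ratio can be sketched as a single comparison. This is a simplified model under the assumption that the ratio check is a strict "greater than"; the function name is hypothetical.

```python
def avoid_stale_for_writes(stale_nodes: int, total_nodes: int, ratio: float = 0.5) -> bool:
    """Return True while writes should keep avoiding stale DataNodes."""
    if total_nodes == 0:
        return False  # nothing to choose from; assumption for this sketch
    # Once the stale fraction exceeds the ratio, stale nodes are allowed again.
    return stale_nodes / total_nodes <= ratio
```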
JournalNode Accept Recovery Timeout🔗
Description
Timeout when accepting recovery of an edit segment from JournalNodes. This only applies when NameNode high availability is enabled.
Related Name
dfs.qjournal.accept-recovery.timeout.ms
Default Value
2 minute(s)
API Name
dfs_qjournal_accept_recovery_timeout_ms
Required
false
JournalNode Finalize Segment Timeout🔗
Description
Timeout when finalizing current edit segment with JournalNodes. This only applies when NameNode high availability is enabled.
Related Name
dfs.qjournal.finalize-segment.timeout.ms
Default Value
2 minute(s)
API Name
dfs_qjournal_finalize_segment_timeout_ms
Required
false
JournalNode Get State Timeout🔗
Description
Timeout when getting current states from JournalNodes. This only applies when NameNode high availability is enabled.
Related Name
dfs.qjournal.get-journal-state.timeout.ms
Default Value
2 minute(s)
API Name
dfs_qjournal_get_journal_state_timeout_ms
Required
false
JournalNode New Epoch Timeout🔗
Description
Timeout when creating new epoch number with JournalNodes. This only applies when NameNode high availability is enabled.
Related Name
dfs.qjournal.new-epoch.timeout.ms
Default Value
2 minute(s)
API Name
dfs_qjournal_new_epoch_timeout_ms
Required
false
JournalNode Prepare Recovery Timeout🔗
Description
Timeout when preparing recovery of an edit segment with JournalNodes. This only applies when NameNode high availability is enabled.
Related Name
dfs.qjournal.prepare-recovery.timeout.ms
Default Value
2 minute(s)
API Name
dfs_qjournal_prepare_recovery_timeout_ms
Required
false
JournalNode Select Input Streams Timeout🔗
Description
Timeout when selecting input streams on JournalNodes. This only applies when NameNode high availability is enabled.
Related Name
dfs.qjournal.select-input-streams.timeout.ms
Default Value
20 second(s)
API Name
dfs_qjournal_select_input_streams_timeout_ms
Required
false
JournalNode Start Segment Timeout🔗
Description
Timeout when starting a new edit segment with JournalNodes. This only applies when NameNode high availability is enabled.
Related Name
dfs.qjournal.start-segment.timeout.ms
Default Value
20 second(s)
API Name
dfs_qjournal_start_segment_timeout_ms
Required
false
JournalNode Write Transactions Timeout🔗
Description
Timeout when writing edits to a JournalNode. This only applies when NameNode high availability is enabled.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Enable auto refresh for metric configurations🔗
Description
When true, Enable Metric Collection and Metric Filter parameters will be set automatically if they're changed. Otherwise, a refresh by hand is required.
Related Name
Default Value
false
API Name
metric_config_auto_refresh
Required
false
NameNode Advanced Configuration Snippet (Safety Valve) for dfs_all_hosts.txt🔗
Description
For advanced use only. A string to be inserted into dfs_all_hosts.txt for this role only.
Related Name
Default Value
API Name
namenode_all_hosts_safety_valve
Required
false
NameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
Related Name
Default Value
API Name
namenode_config_safety_valve
Required
false
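A safety valve for hdfs-site.xml takes standard Hadoop property elements. The sketch below generates one such element; the helper name is hypothetical, and the property and value are placeholders chosen from this page, not a recommendation.

```python
import xml.etree.ElementTree as ET

def safety_valve_property(name: str, value: str) -> str:
    """Build one <property> element in the hdfs-site.xml name/value layout."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")

# Placeholder property and value, for illustration only.
print(safety_valve_property("dfs.namenode.write.stale.datanode.ratio", "0.5"))
```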
NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_allow.txt🔗
Description
For advanced use only. A string to be inserted into dfs_hosts_allow.txt for this role only.
Related Name
Default Value
API Name
namenode_hosts_allow_safety_valve
Required
false
NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_exclude.txt🔗
Description
For advanced use only. A string to be inserted into dfs_hosts_exclude.txt for this role only.
Related Name
Default Value
API Name
namenode_hosts_exclude_safety_valve
Required
false
Java Configuration Options for NameNode🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.
Related Name
Default Value
API Name
NAMENODE_role_env_safety_valve
Required
false
Mount Points🔗
Description
Mount points that are mapped to this NameNode's nameservice.
Related Name
Default Value
/
API Name
nameservice_mountpoints
Required
false
Heap Dump Directory🔗
Description
Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, it will be owned by the current role user with 1777 permissions. Sharing the same directory among multiple roles will cause an ownership race. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.
Related Name
oom_heap_dump_dir
Default Value
/tmp
API Name
oom_heap_dump_dir
Required
false
Dump Heap When Out of Memory🔗
Description
When set, generates a heap dump file when an out-of-memory error occurs.
Related Name
Default Value
true
API Name
oom_heap_dump_enabled
Required
true
Kill When Out of Memory🔗
Description
When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.
Related Name
Default Value
true
API Name
oom_sigkill_enabled
Required
true
Automatically Restart Process🔗
Description
When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. This configuration applies in the time after the Start Wait Timeout period.
Related Name
Default Value
false
API Name
process_auto_restart
Required
true
Enable Metric Collection🔗
Description
The Cloudera Manager Agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager Agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process.
Related Name
Default Value
true
API Name
process_should_monitor
Required
true
Process Start Retry Attempts🔗
Description
Number of times to try starting a role's process when the process exits before the Start Wait Timeout period. After a process is running beyond the Start Wait Timeout, the retry count is reset. Setting this configuration to zero will prevent restart of the process during the Start Wait Timeout period.
Related Name
Default Value
3
API Name
process_start_retries
Required
false
Process Start Wait Timeout🔗
Description
The time in seconds to wait for a role's process to start successfully on a host. Processes which exit/crash before this time will be restarted until reaching the limit specified by the Start Retry Attempts count parameter. Setting this configuration to zero will turn off this feature.
Related Name
Default Value
20
API Name
process_start_secs
Required
false
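How Process Start Retry Attempts and Process Start Wait Timeout combine can be sketched as below. This is a simplified reading of the two descriptions, not the Cloudera Manager Agent's logic; the function and parameter names are hypothetical.

```python
def should_retry_start(uptime_secs: float, retries_used: int,
                       start_retries: int = 3, start_wait_secs: int = 20) -> bool:
    """Hypothetical decision: retry only for exits inside the start-wait window."""
    if uptime_secs >= start_wait_secs:
        # Past the window (or the window is zero): start retries do not apply.
        return False
    return retries_used < start_retries
```

A process that exits after 5 seconds with no retries used would be restarted; one that has already used all three retries would not.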
NameNode Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml🔗
Description
For advanced use only. A string to be inserted into ranger-hdfs-security.xml for this role only.
Related Name
Default Value
API Name
ranger_security_role_safety_valve
Required
false
Checkpointing🔗
Filesystem Checkpoint Period🔗
Description
The time between two periodic file system checkpoints.
Related Name
dfs.namenode.checkpoint.period
Default Value
1 hour(s)
API Name
fs_checkpoint_period
Required
false
Filesystem Checkpoint Transaction Threshold🔗
Description
The number of transactions after which the NameNode or SecondaryNameNode will create a checkpoint of the namespace, regardless of whether the checkpoint period has expired.
Related Name
dfs.namenode.checkpoint.txns
Default Value
1000000
API Name
fs_checkpoint_txns
Required
false
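Taken together, Filesystem Checkpoint Period and Filesystem Checkpoint Transaction Threshold mean a checkpoint is due when either condition is met. A minimal sketch, with a hypothetical function name and the defaults listed above:

```python
def checkpoint_due(secs_since_last: int, txns_since_last: int,
                   period_secs: int = 3600, txn_threshold: int = 1_000_000) -> bool:
    """True when the period has elapsed or the transaction threshold is reached."""
    return secs_since_last >= period_secs or txns_since_last >= txn_threshold
```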
Erasure Coding🔗
Fallback Erasure Coding Policy🔗
Description
The fallback Erasure Coding policy that HDFS uses if no policy is specified when you run the -setPolicy command.
Related Name
dfs.namenode.ec.system.default.policy
Default Value
RS-6-3-1024k
API Name
erasure_coding_default_policy
Required
false
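The default policy name RS-6-3-1024k encodes the codec, the number of data units, the number of parity units, and the cell size. The sketch below splits such a name and derives the raw storage overhead; the type and function names are hypothetical.

```python
from typing import NamedTuple

class EcPolicy(NamedTuple):
    codec: str
    data_units: int
    parity_units: int
    cell_size: str

def parse_ec_policy(name: str) -> EcPolicy:
    """Split a policy name of the form codec-data-parity-cellsize."""
    # rsplit keeps codec names that contain a hyphen intact.
    codec, data, parity, cell = name.rsplit("-", 3)
    return EcPolicy(codec, int(data), int(parity), cell)

policy = parse_ec_policy("RS-6-3-1024k")
# Raw bytes stored per byte of user data: (6 + 3) / 6.
overhead = (policy.data_units + policy.parity_units) / policy.data_units
print(policy, overhead)
```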
Logs🔗
NameNode Logging Threshold🔗
Description
The minimum log level for NameNode logs
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
NameNode Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for NameNode logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
NameNode Max Log Size🔗
Description
The maximum size, in megabytes, per log file for NameNode logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
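NameNode Maximum Log File Backups and NameNode Max Log Size together bound the disk space the role's main log can occupy. The estimate below assumes one active file plus every rolled file at the maximum size; the function name is hypothetical, and other logs in the same directory are not counted.

```python
def max_log_footprint_mib(max_log_size_mib: int = 200, max_backups: int = 10) -> int:
    """Upper bound: the active log file plus all rolled backups, each at full size."""
    return max_log_size_mib * (max_backups + 1)

print(max_log_footprint_mib())
```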
NameNode Block State Change Logging Threshold🔗
Description
The minimum log level for NameNode block state change log messages. Setting this to WARN or higher greatly reduces the amount of log output related to block state changes.
Related Name
log4j.logger.BlockStateChange
Default Value
INFO
API Name
namenode_blockstatechange_log_threshold
Required
false
NameNode Log Directory🔗
Description
Directory where NameNode will place its log files.
Related Name
hadoop.log.dir
Default Value
/var/log/hadoop-hdfs
API Name
namenode_log_dir
Required
false
Monitoring🔗
Enable Health Alerts for this Role🔗
Description
When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold
Related Name
Default Value
true
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Heap Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.
Heap Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
Log Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.
Related Name
Default Value
Warning: 10 GiB, Critical: 5 GiB
API Name
log_directory_free_space_absolute_thresholds
Required
false
Log Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
log_directory_free_space_percentage_thresholds
Required
false
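The precedence between the absolute and percentage free-space thresholds can be sketched as below. This is an illustrative model only: the function name, the use of None for "Never", and the strict "less than" comparison are assumptions.

```python
from typing import Optional, Tuple

GIB = 1024 ** 3

# (warning, critical); None stands in for "Never".
Thresholds = Tuple[Optional[float], Optional[float]]

def free_space_status(free_bytes: int, capacity_bytes: int,
                      absolute: Optional[Thresholds] = (10 * GIB, 5 * GIB),
                      percentage: Optional[Thresholds] = None) -> str:
    """Hypothetical evaluation; absolute thresholds win when configured."""
    if absolute is not None:
        value, (warning, critical) = free_bytes, absolute
    elif percentage is not None:
        value, (warning, critical) = 100.0 * free_bytes / capacity_bytes, percentage
    else:
        return "ok"
    if critical is not None and value < critical:
        return "critical"
    if warning is not None and value < warning:
        return "warning"
    return "ok"
```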
Rules to Extract Events from Log Files🔗
Description
This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
content - match only those messages for which contents match this regular expression.
exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold": "ERROR"}
In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
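The first-match behavior in the second example can be sketched as follows. This is a simplified model, not the appender's implementation: rate limiting is omitted, the level ordering and the use of a regex search (as opposed to a full match) are assumptions, and all function names are hypothetical.

```python
import re
from typing import Optional

# Assumed log4j severity order, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]

def rule_matches(rule: dict, level: str, message: str,
                 exception_type: Optional[str] = None) -> bool:
    """Check the optional threshold, content and exceptiontype fields of one rule."""
    if "threshold" in rule and LEVELS.index(level) < LEVELS.index(rule["threshold"]):
        return False
    if "content" in rule and not re.search(rule["content"], message):
        return False
    if "exceptiontype" in rule:
        if exception_type is None or not re.search(rule["exceptiontype"], exception_type):
            return False
    return True

def first_matching_rule(rules: list, level: str, message: str,
                        exception_type: Optional[str] = None) -> Optional[dict]:
    """Return the first rule that matches, or None when no rule applies."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None

rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]
```

An ERROR message that carries an exception matches the first rule and is not promoted to an alert; an ERROR message without an exception falls through to the second rule.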
The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audit data that is left to be sent to the audit server.
Related Name
mgmt.navigator.failure.thresholds
Default Value
Warning: Never, Critical: Any
API Name
mgmt_navigator_failure_thresholds
Required
false
Monitoring Period For Audit Failures🔗
Description
The period to review when checking if audits are blocked and not getting processed.
Related Name
mgmt.navigator.failure.window
Default Value
20 minute(s)
API Name
mgmt_navigator_failure_window
Required
false
Navigator Audit Pipeline Health Check🔗
Description
Enable the test of the audit event processing pipeline. This tests whether audit events are failing to be processed by the Audit Server for a role that generates audit events.
Related Name
mgmt.navigator.status.check.enabled
Default Value
true
API Name
mgmt_navigator_status_check_enabled
Required
false
Metric Filter🔗
Description
Defines a Metric Filter for this role. Cloudera Manager Agents will not send filtered metrics to the Service Monitor. Define the following fields:
Health Test Metric Set - Select this parameter to collect only metrics required for health tests.
Default Dashboard Metric Set - Select this parameter to collect only metrics required for the default dashboards. For user-defined charts, you must add the metrics you require for the chart using the Custom Metrics parameter.
Include/Exclude Custom Metrics - Select Include to specify metrics that should be collected. Select Exclude to specify metrics that should not be collected. Enter the metric names to be included or excluded using the Metric Name parameter.
Metric Name - The name of a metric that will be included or excluded during metric collection.
If you do not select Health Test Metric Set or Default Dashboard Metric Set, or specify metrics by name, metric filtering will be turned off (this is the default behavior).
For example, the following configuration enables the collection of metrics required for Health Tests and the jvm_heap_used_mb metric:
Include only Health Test Metric Set: Selected.
Include/Exclude Custom Metrics: Set to Include.
Metric Name: jvm_heap_used_mb
You can also view the JSON representation for this parameter by clicking View as JSON. In this example, the JSON looks like this:
{
"includeHealthTestMetricSet": true,
"filterType": "whitelist",
"metrics": ["jvm_heap_used_mb"]
}
Related Name
Default Value
API Name
monitoring_metric_filter
Required
false
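The JSON representation shown for the Metric Filter example can be produced programmatically. The sketch below only covers the Include case from the example; the helper name is hypothetical, and the field names are taken from the JSON above.

```python
import json

def include_filter(metric_names, health_test_set: bool = True) -> dict:
    """Build an Include-style filter matching the example JSON above."""
    return {
        "includeHealthTestMetricSet": health_test_set,
        "filterType": "whitelist",
        "metrics": list(metric_names),
    }

print(json.dumps(include_filter(["jvm_heap_used_mb"]), indent=2))
```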
Filesystem Checkpoint Age Monitoring Thresholds🔗
Description
The health test thresholds of the age of the HDFS namespace checkpoint. Specified as a percentage of the configured checkpoint interval.
The health test thresholds of the number of transactions since the last HDFS namespace checkpoint. Specified as a percentage of the configured checkpointing transaction limit.
Related Name
Default Value
Warning: 200.0 %, Critical: 400.0 %
API Name
namenode_checkpoint_transactions_thresholds
Required
false
NameNode Data Directories Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's NameNode Data Directories.
NameNode Data Directories Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's NameNode Data Directories. Specified as a percentage of the capacity on that filesystem. This setting is not used if a NameNode Data Directories Free Space Monitoring Absolute Thresholds setting is configured.
The health test thresholds of failed status directories in a NameNode.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
namenode_directory_failures_thresholds
Required
false
File Descriptor Monitoring Thresholds🔗
Description
The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.
Related Name
Default Value
Warning: 50.0 %, Critical: 70.0 %
API Name
namenode_fd_thresholds
Required
false
NameNode Host Health Test🔗
Description
When computing the overall NameNode health, consider the host's health.
Related Name
Default Value
true
API Name
namenode_host_health_enabled
Required
false
NameNode Out-Of-Sync JournalNodes Thresholds🔗
Description
The health check thresholds for the number of out-of-sync JournalNodes for this NameNode.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
namenode_out_of_sync_journal_nodes_thresholds
Required
false
Pause Duration Thresholds🔗
Description
The health test thresholds for the weighted average extra time the pause monitor spent paused. Specified as a percentage of elapsed wall clock time.
Related Name
Default Value
Warning: 30.0, Critical: 60.0
API Name
namenode_pause_duration_thresholds
Required
false
Pause Duration Monitoring Period🔗
Description
The period to review when computing the moving average of extra time the pause monitor spent paused.
Related Name
Default Value
5 minute(s)
API Name
namenode_pause_duration_window
Required
false
HDFS Rolling Metadata Upgrade Status Health Test🔗
Description
Enables the health test of the rolling metadata upgrade status of the NameNode. This covers rolling metadata upgrades. Nonrolling metadata upgrades are covered in a separate health test.
Related Name
Default Value
true
API Name
namenode_rolling_upgrade_status_enabled
Required
false
NameNode RPC Latency Thresholds🔗
Description
The health check thresholds of the NameNode's RPC latency.
Related Name
Default Value
Warning: 1 second(s), Critical: 5 second(s)
API Name
namenode_rpc_latency_thresholds
Required
false
NameNode RPC Latency Monitoring Window🔗
Description
The period to review when computing the moving average of the NameNode's RPC latency.
Related Name
Default Value
5 minute(s)
API Name
namenode_rpc_latency_window
Required
false
NameNode Safemode Health Test🔗
Description
Enables the health test that the NameNode is not in safemode
Related Name
Default Value
true
API Name
namenode_safe_mode_enabled
Required
false
NameNode Process Health Test🔗
Description
Enables the health test that the NameNode's process state is consistent with the role configuration
Related Name
Default Value
true
API Name
namenode_scm_health_enabled
Required
false
Health Test Startup Tolerance🔗
Description
The amount of time after this role is started during which failures of health tests that rely on communication with this role are tolerated.
Related Name
Default Value
5 minute(s)
API Name
namenode_startup_tolerance
Required
false
HDFS Metadata Upgrade Status Health Test🔗
Description
Enables the health test of the metadata upgrade status of the NameNode. This covers nonrolling metadata upgrades. Rolling metadata upgrades are covered in a separate health test.
Related Name
Default Value
true
API Name
namenode_upgrade_status_enabled
Required
false
Web Metric Collection🔗
Description
Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server.
Related Name
Default Value
true
API Name
namenode_web_metric_collection_enabled
Required
false
Web Metric Collection Duration🔗
Description
The health test thresholds on the duration of the metrics request to the web server.
Related Name
Default Value
Warning: 10 second(s), Critical: Never
API Name
namenode_web_metric_collection_thresholds
Required
false
Swap Memory Usage Rate Thresholds🔗
Description
The health test thresholds on the swap memory usage rate of the process. Specified as the change of the used swap memory during the predefined period.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
process_swap_memory_rate_thresholds
Required
false
Swap Memory Usage Rate Window🔗
Description
The period to review when computing unexpected swap memory usage change of the process.
Related Name
common.process.swap_memory_rate_window
Default Value
5 minute(s)
API Name
process_swap_memory_rate_window
Required
false
Process Swap Memory Thresholds🔗
Description
The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold.
Related Name
Default Value
Warning: 200 B, Critical: Never
API Name
process_swap_memory_thresholds
Required
false
Role Triggers🔗
Description
The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
triggerExpression (mandatory) - A tsquery expression representing the trigger.
streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
"streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
Related Name
Default Value
[]
API Name
role_triggers
Required
true
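The trigger fields described above can be assembled and checked before pasting the value into the role_triggers property. The following is a minimal sketch, not part of Cloudera Manager: the helper names build_trigger and validate_triggers are hypothetical, and only the mandatory-field and unique-name rules stated in the description are checked.

```python
import json

# Fields the description marks as mandatory.
REQUIRED_FIELDS = ("triggerName", "triggerExpression")


def build_trigger(name, expression, stream_threshold=0, enabled=True):
    """Return one trigger as a dict using the field names listed above."""
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The documented example stores the flag as a string.
        "enabled": "true" if enabled else "false",
    }


def validate_triggers(triggers):
    """Check mandatory fields and that trigger names are unique for the role."""
    seen = set()
    for trigger in triggers:
        for field in REQUIRED_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"missing mandatory field: {field}")
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError(f"duplicate triggerName: {name}")
        seen.add(name)
    return triggers


triggers = validate_triggers([
    build_trigger(
        "sample-trigger",
        "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) "
        "DO health:bad",
    )
])
# JSON text suitable for the role_triggers value.
role_triggers_value = json.dumps(triggers)
```

This sketch does not parse the tsquery expression itself; that validation is left to Cloudera Manager.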
Unexpected Exits Thresholds🔗
Description
The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
unexpected_exits_thresholds
Required
false
Unexpected Exits Monitoring Period🔗
Description
The period to review when computing unexpected exits.
Related Name
Default Value
5 minute(s)
API Name
unexpected_exits_window
Required
false
Other🔗
Backup and Disaster Log Retention🔗
Description
Maximum age of log files related to backup and disaster recovery.
Related Name
Default Value
90 day(s)
API Name
bdr_log_expiration_days
Required
false
Access Time Precision🔗
Description
The access time for an HDFS file is precise up to this value. Setting a value of 0 disables access times for HDFS. When using the NFS Gateway role, make sure this property is enabled.
Related Name
dfs.access.time.precision
Default Value
1 hour(s)
API Name
dfs_access_time_precision
Required
false
Decommissioning blocks per interval🔗
Description
The approximate number of blocks to process per decommission interval, as defined in dfs.namenode.decommission.interval.
Related Name
dfs.namenode.decommission.blocks.per.interval
Default Value
500000
API Name
dfs_decommission_blocks_per_interval
Required
false
Decommissioning max tracked nodes🔗
Description
The maximum number of decommission-in-progress DataNodes that will be tracked at one time by the NameNode. Tracking a decommission-in-progress DataNode consumes additional NameNode memory proportional to the number of blocks on the DataNode. A value of 0 means no limit will be enforced.
Determines where on the local file system the NameNode should store the name table (fsimage). For redundancy, enter a comma-delimited list of directories to replicate the name table in all of the directories. Typical values are /data/N/dfs/nn where N=1..3.
Related Name
dfs.namenode.name.dir
Default Value
API Name
dfs_name_dir_list
Required
true
Restore NameNode Directories at Checkpoint Time🔗
Description
If set to false and one of the replicas of the NameNode storage fails, for example because of a temporary NFS failure, that directory is not used until the NameNode restarts. If enabled, failed storage is re-checked on every checkpoint and, if it becomes valid, the NameNode will try to restore the edits and fsimage.
Related Name
dfs.namenode.name.dir.restore
Default Value
false
API Name
dfs_name_dir_restore
Required
false
NameNode Edits Directories🔗
Description
Directories on the local file system to store the NameNode edits. If not set, the edits are stored in the NameNode's Data Directories. The value of this configuration is automatically generated to be the Quorum-based Storage URI if there are JournalNodes and this NameNode is not Highly Available.
Related Name
dfs.namenode.edits.dir
Default Value
API Name
dfs_namenode_edits_dir
Required
false
Shared Edits Directory🔗
Description
Directory on a shared storage device, such as a Quorum-based Storage URI or a local directory that is an NFS mount from a NAS, to store the NameNode edits. The value of this configuration is automatically generated to be the Quorum Journal URI if there are JournalNodes and this NameNode is Highly Available.
Related Name
dfs.namenode.shared.edits.dir
Default Value
API Name
dfs_namenode_shared_edits_dir
Required
false
Safemode Extension🔗
Description
Determines extension of safemode in milliseconds after the threshold level is reached.
Related Name
dfs.namenode.safemode.extension
Default Value
30 second(s)
API Name
dfs_safemode_extension
Required
false
Safemode Minimum DataNodes🔗
Description
Specifies the number of DataNodes that must be live before the NameNode exits safemode. Enter a value less than or equal to 0 to not take the number of live DataNodes into account when deciding whether to remain in safemode during startup. Values greater than the number of DataNodes in the cluster will make safemode permanent.
Related Name
dfs.namenode.safemode.min.datanodes
Default Value
1
API Name
dfs_safemode_min_datanodes
Required
false
Filesystem Trash Checkpoint Interval🔗
Description
Number of minutes between trash checkpoints. After a .Trash directory checkpoint is created, the Filesystem Trash Interval will define the time until permanent deletion. If set to 0, the value will be considered equal to the Filesystem Trash Interval value, which can cause the permanent deletion of entries in Trash to take over twice as long. The value for this must not exceed the Filesystem Trash Interval value.
Related Name
fs.trash.checkpoint.interval
Default Value
1 hour(s)
API Name
fs_trash_checkpoint_interval
Required
false
Filesystem Trash Interval🔗
Description
Controls the number of minutes after which a trash checkpoint directory is deleted permanently. To disable the trash feature, enter 0. The checkpointing frequency of .Trash directory contents is separately controlled by Filesystem Trash Checkpoint Interval.
Related Name
fs.trash.interval
Default Value
1 day(s)
API Name
fs_trash_interval
Required
false
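The interaction between Filesystem Trash Checkpoint Interval and Filesystem Trash Interval can be sketched as follows. This is an illustration of the rules stated in the two descriptions above, not HDFS code: the function names are hypothetical, values are in minutes, and the upper bound is an approximation that assumes an entry waits at most one checkpoint interval before it is checkpointed and then one full trash interval before it is deleted.

```python
def effective_checkpoint_interval(trash_interval_min, checkpoint_interval_min):
    """Return the checkpoint interval actually in effect, or None if trash is off."""
    if trash_interval_min < 0 or checkpoint_interval_min < 0:
        raise ValueError("intervals must be non-negative")
    if trash_interval_min == 0:
        # A trash interval of 0 disables the trash feature.
        return None
    if checkpoint_interval_min == 0:
        # 0 means "use the Filesystem Trash Interval value".
        return trash_interval_min
    if checkpoint_interval_min > trash_interval_min:
        raise ValueError("checkpoint interval must not exceed trash interval")
    return checkpoint_interval_min


def approx_max_minutes_in_trash(trash_interval_min, checkpoint_interval_min):
    """Rough upper bound on how long a deleted entry can remain in .Trash."""
    effective = effective_checkpoint_interval(
        trash_interval_min, checkpoint_interval_min
    )
    if effective is None:
        return 0
    return effective + trash_interval_min


# Defaults from this page: 1 day (1440 minutes) and 1 hour (60 minutes).
default_bound = approx_max_minutes_in_trash(1440, 60)
# Checkpoint interval 0: the bound grows to about twice the trash interval.
zero_checkpoint_bound = approx_max_minutes_in_trash(1440, 0)
```

Under these assumptions, a shorter checkpoint interval keeps the actual retention closer to the configured trash interval.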
Topology Script File Name🔗
Description
Full path to a custom topology script on the host file system. The topology script is used to determine the rack location of nodes. If left blank, a topology script will be provided that uses your hosts' rack information, visible in the "Hosts" page.
Related Name
net.topology.script.file.name
Default Value
API Name
topology_script_file_name
Required
false
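A custom topology script receives one or more host names or IP addresses as command-line arguments and prints one rack path per argument. The following is a minimal sketch of such a script; the RACK_BY_HOST table and its addresses are hypothetical placeholders, and a real script would typically read the mapping from a file or an inventory system.

```python
#!/usr/bin/env python3
import sys

# Hypothetical host-to-rack table; replace with your own inventory.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

# Rack reported for hosts that are not in the table.
DEFAULT_RACK = "/default-rack"


def resolve(hosts):
    """Return one rack path per input host, in the same order."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts to resolve as arguments and reads the
    # whitespace-separated rack paths from standard output.
    print(" ".join(resolve(sys.argv[1:])))
```

The script must be executable by the user that runs the NameNode, and it should always print exactly one rack per argument so that the output stays aligned with the input.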
Performance🔗
NameNode Handler Count🔗
Description
The number of server threads for the NameNode.
Related Name
dfs.namenode.handler.count
Default Value
30
API Name
dfs_namenode_handler_count
Required
false
NameNode Service Handler Count🔗
Description
The number of server threads for the NameNode used for service calls. Only used when NameNode Service RPC Port is configured.
Related Name
dfs.namenode.service.handler.count
Default Value
30
API Name
dfs_namenode_service_handler_count
Required
false
HDFS Thrift Server Max Threadcount🔗
Description
Maximum number of running threads for the HDFS Thrift server running on the NameNode
Related Name
dfs.thrift.threads.max
Default Value
20
API Name
dfs_thrift_threads_max
Required
false
HDFS Thrift Server Min Threadcount🔗
Description
Minimum number of running threads for the HDFS Thrift server running on the NameNode
Related Name
dfs.thrift.threads.min
Default Value
10
API Name
dfs_thrift_threads_min
Required
false
HDFS Thrift Server Timeout🔗
Description
Timeout in seconds for the HDFS Thrift server running on the NameNode
Related Name
dfs.thrift.timeout
Default Value
60
API Name
dfs_thrift_timeout
Required
false
Maximum Process File Descriptors🔗
Description
If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.
Related Name
Default Value
API Name
rlimit_fds
Required
false
Ports and Addresses🔗
NameNode Web UI Port🔗
Description
The base port where the DFS NameNode web UI listens. If the port number is 0, then the server starts on a free port. Combined with the NameNode's hostname to build its HTTP address.
Related Name
dfs.namenode.http-address
Default Value
9870
API Name
dfs_http_port
Required
false
Secure NameNode Web UI Port (TLS/SSL)🔗
Description
The base port where the secure NameNode web UI listens.
Related Name
dfs.https.port
Default Value
9871
API Name
dfs_https_port
Required
false
NameNode Service RPC Port🔗
Description
Optional port for the service-rpc address which can be used by HDFS daemons instead of sharing the RPC address used by the clients.
Related Name
dfs.namenode.servicerpc-address
Default Value
API Name
dfs_namenode_servicerpc_address
Required
false
Bind NameNode to Wildcard Address🔗
Description
If enabled, the NameNode binds to the wildcard address ("0.0.0.0") on all of its ports.
Related Name
Default Value
false
API Name
namenode_bind_wildcard
Required
false
NameNode Port🔗
Description
The port where the NameNode runs the HDFS protocol. Combined with the NameNode's hostname to build its address.
Related Name
fs.defaultFS
Default Value
8020
API Name
namenode_port
Required
false
Replication🔗
Safemode Threshold Percentage🔗
Description
Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.replication.min. Enter a value less than or equal to 0 to not wait for any particular percentage of blocks before exiting safemode. Values greater than 1 will make safemode permanent.
Related Name
dfs.namenode.safemode.threshold-pct
Default Value
0.999
API Name
dfs_safemode_threshold_pct
Required
false
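The effect of Safemode Threshold Percentage can be illustrated with a small check. This is a sketch of the rule as described above, not the NameNode's implementation: the function name is hypothetical, and the exact rounding the NameNode applies when it converts the fraction into a block count is not modeled here.

```python
def block_condition_met(safe_blocks, total_blocks, threshold_pct=0.999):
    """Return True if the block-replication condition for leaving safemode holds.

    safe_blocks is the number of blocks that currently satisfy the minimal
    replication requirement; total_blocks is the number of blocks known to
    the NameNode.
    """
    if threshold_pct <= 0:
        # Do not wait for any particular percentage of blocks.
        return True
    # For thresholds above 1 this can never hold while the file system
    # contains blocks, so safemode stays on.
    return safe_blocks >= total_blocks * threshold_pct


all_reported = block_condition_met(1000, 1000)        # every block is safe
mostly_reported = block_condition_met(990, 1000)      # only 99% of blocks are safe
no_wait = block_condition_met(0, 1000, 0)             # threshold disabled
permanent = block_condition_met(1000, 1000, 1.5)      # threshold above 1
```

Safemode Minimum DataNodes and Safemode Extension are evaluated in addition to this condition, so meeting the block threshold alone does not end safemode immediately.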
Resource Management🔗
Java Heap Size of NameNode in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Related Name
Default Value
4 GiB
API Name
namenode_java_heapsize
Required
false
Cgroup CPU Shares🔗
Description
Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.
Related Name
cpu.shares
Default Value
1024
API Name
rm_cpu_shares
Required
true
Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Custom control group resources to assign to this role, which will be enforced by the Linux kernel. These resources should exist on the target hosts, otherwise an error will occur when the process starts. Use the same format as used for arguments to the cgexec command: resource1,resource2:path1 or resource3:path2 For example: 'cpu,memory:my/path blkio:my2/path2' ***These settings override other cgroup settings.***
Related Name
custom.cgroups
Default Value
API Name
rm_custom_resources
Required
false
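The value format for Custom Control Group Resources can be checked with a small parser before it is saved. This is a minimal sketch based only on the format shown in the description above (space-separated entries, each a comma-separated list of resources, a colon, and a path); the function name parse_custom_cgroups is hypothetical.

```python
def parse_custom_cgroups(value):
    """Split a value such as 'cpu,memory:my/path blkio:my2/path2' into entries.

    Returns a list of (resources, path) tuples, where resources is a list of
    resource names.
    """
    entries = []
    for token in value.split():
        resources, separator, path = token.partition(":")
        if not separator or not resources or not path:
            raise ValueError(f"expected 'resource[,resource]:path', got {token!r}")
        entries.append((resources.split(","), path))
    return entries


# The example value from the description above.
parsed = parse_custom_cgroups("cpu,memory:my/path blkio:my2/path2")
```

The sketch validates syntax only. Whether the named resources and paths exist on the target hosts still has to be verified separately, because a missing resource causes an error when the process starts.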
Cgroup I/O Weight🔗
Description
Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.
Related Name
blkio.weight
Default Value
500
API Name
rm_io_weight
Required
true
Cgroup Memory Hard Limit🔗
Description
Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_hard_limit
Required
true
Cgroup Memory Soft Limit🔗
Description
Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.soft_limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_soft_limit
Required
true
Security🔗
Include Caller Context in Audit Logs🔗
Description
When enabled, additional fields are written into NameNode audit log records for auditing coarse granularity operations.
Related Name
hadoop.caller.context.enabled
Default Value
true
API Name
hadoop_caller_context_enabled
Required
false
HDFS NameNode TLS/SSL Trust Store File🔗
Description
The location on disk of the trust store, in .jks format, used to confirm the authenticity of TLS/SSL servers that HDFS NameNode might connect to. This trust store must contain the certificate(s) used to sign the service(s) connected to. If this parameter is not provided, the default list of well-known certificate authorities is used instead.
Related Name
Default Value
API Name
namenode_truststore_file
Required
false
HDFS NameNode TLS/SSL Trust Store Password🔗
Description
The password for the HDFS NameNode TLS/SSL Trust Store File. This password is not required to access the trust store; this field can be left blank. This password provides optional integrity checking of the file. The contents of trust stores are certificates, and certificates are public information.
Related Name
Default Value
API Name
namenode_truststore_password
Required
false
Stacks Collection🔗
Stacks Collection Data Retention🔗
Description
The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.
Related Name
stacks_collection_data_retention
Default Value
100 MiB
API Name
stacks_collection_data_retention
Required
false
Stacks Collection Directory🔗
Description
The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. If this directory already exists, it will be owned by the current role user with 755 permissions. Sharing the same directory among multiple roles will cause an ownership race.
Related Name
stacks_collection_directory
Default Value
API Name
stacks_collection_directory
Required
false
Stacks Collection Enabled🔗
Description
Whether or not periodic stacks collection is enabled.
Related Name
stacks_collection_enabled
Default Value
false
API Name
stacks_collection_enabled
Required
true
Stacks Collection Frequency🔗
Description
The frequency with which stacks are collected.
Related Name
stacks_collection_frequency
Default Value
5.0 second(s)
API Name
stacks_collection_frequency
Required
false
Stacks Collection Method🔗
Description
The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.
Related Name
stacks_collection_method
Default Value
jstack
API Name
stacks_collection_method
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NameNode Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
Suppress Parameter Validation: Rules to Extract Events from Log Files🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NameNode Advanced Configuration Snippet (Safety Valve) for dfs_all_hosts.txt parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_allow.txt parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_exclude.txt parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NameNode Environment Advanced Configuration Snippet (Safety Valve) parameter.
Suppress Configuration Validator: Validates Nameservices do not conflict between base and compute clusters.🔗
Description
Whether to suppress configuration warnings produced by the Validates Nameservices do not conflict between base and compute clusters. configuration validator.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NameNode Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml parameter.
Suppress Parameter Validation: Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Control Group Resources (overrides Cgroup settings) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rm_custom_resources
Required
true
Suppress Parameter Validation: Role Triggers🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Topology Script File Name parameter.
Related Name
Default Value
false
API Name
role_config_suppression_topology_script_file_name
Required
true
Suppress Health Test: Audit Pipeline Test🔗
Description
Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_audit_health
Required
true
Suppress Health Test: NameNode Data Directories Free Space🔗
Description
Whether to suppress the results of the NameNode Data Directories Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Name Directory Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_file_descriptor
Required
true
Suppress Health Test: Checkpoint Status🔗
Description
Whether to suppress the results of the Checkpoint Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Heap Dump Directory Free Space🔗
Description
Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_host_health
Required
true
Suppress Health Test: JournalNode Sync Status🔗
Description
Whether to suppress the results of the JournalNode Sync Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Pause Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_pause_duration
Required
true
Suppress Health Test: Rolling Upgrade Status🔗
Description
Whether to suppress the results of the Rolling Upgrade Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the RPC Latency health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_rpc_latency
Required
true
Suppress Health Test: Safe Mode Status🔗
Description
Whether to suppress the results of the Safe Mode Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_safe_mode
Required
true
Suppress Health Test: Process Status🔗
Description
Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_scm_health
Required
true
Suppress Health Test: Swap Memory Usage🔗
Description
Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Swap Memory Usage Rate Beta🔗
Description
Whether to suppress the results of the Swap Memory Usage Rate Beta health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Upgrade Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_name_node_upgrade_status
Required
true
Suppress Health Test: Web Server Status🔗
Description
Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Enable auto refresh for metric configurations🔗
Description
When true, changes to the Enable Metric Collection and Metric Filter parameters are applied automatically. Otherwise, a manual refresh is required.
Related Name
Default Value
false
API Name
metric_config_auto_refresh
Required
false
NFS Gateway Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
Related Name
Default Value
API Name
nfsgateway_config_safety_valve
Required
false
Java Configuration Options for NFS Gateway🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.
Related Name
Default Value
API Name
NFSGATEWAY_role_env_safety_valve
Required
false
Heap Dump Directory🔗
Description
Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, it will be owned by the current role user with 1777 permissions. Sharing the same directory among multiple roles will cause an ownership race. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.
Related Name
oom_heap_dump_dir
Default Value
/tmp
API Name
oom_heap_dump_dir
Required
false
Dump Heap When Out of Memory🔗
Description
When set, generates a heap dump file when an out-of-memory error occurs.
Related Name
Default Value
true
API Name
oom_heap_dump_enabled
Required
true
Kill When Out of Memory🔗
Description
When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.
Related Name
Default Value
true
API Name
oom_sigkill_enabled
Required
true
Automatically Restart Process🔗
Description
When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. This configuration applies in the time after the Start Wait Timeout period.
Related Name
Default Value
false
API Name
process_auto_restart
Required
true
Enable Metric Collection🔗
Description
The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process.
Related Name
Default Value
true
API Name
process_should_monitor
Required
true
Process Start Retry Attempts🔗
Description
Number of times to try starting a role's process when the process exits before the Start Wait Timeout period. After a process is running beyond the Start Wait Timeout, the retry count is reset. Setting this configuration to zero will prevent restart of the process during the Start Wait Timeout period.
Related Name
Default Value
3
API Name
process_start_retries
Required
false
Process Start Wait Timeout🔗
Description
The time in seconds to wait for a role's process to start successfully on a host. Processes which exit/crash before this time will be restarted until reaching the limit specified by the Start Retry Attempts count parameter. Setting this configuration to zero will turn off this feature.
Related Name
Default Value
20
API Name
process_start_secs
Required
false
Logs🔗
NFS Gateway Logging Threshold🔗
Description
The minimum log level for NFS Gateway logs
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
NFS Gateway Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for NFS Gateway logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
NFS Gateway Max Log Size🔗
Description
The maximum size, in megabytes, per log file for NFS Gateway logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
NFS Gateway Log Directory🔗
Description
Directory where NFS Gateway will place its log files.
Related Name
hadoop.log.dir
Default Value
/var/log/hadoop-hdfs
API Name
nfsgateway_log_dir
Required
false
Monitoring🔗
Enable Health Alerts for this Role🔗
Description
When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold.
Related Name
Default Value
true
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Heap Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.
Heap Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
Log Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.
Related Name
Default Value
Warning: 10 GiB, Critical: 5 GiB
API Name
log_directory_free_space_absolute_thresholds
Required
false
Log Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
log_directory_free_space_percentage_thresholds
Required
false
Rules to Extract Events from Log Files🔗
Description
This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
rate(mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
content - match only those messages for which contents match this regular expression.
exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"}
In this example, a generated event may not be promoted to an alert if the ERROR log message contains an exception, because the first rule, with alert = false, will match.
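The first-match behavior described above can be sketched in Python. This is an illustrative model only, not the appender's actual implementation: the helper names rule_matches and first_matching_rule, the layout of the message dictionary, the severity ordering, and the use of re.search for pattern matching are all assumptions made for the example. Rate limiting is left out.

```python
import re

# Assumed log4j severity ordering, lowest to highest (illustrative only).
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, message):
    """Return True if a hypothetical log message passes every filter in the rule."""
    threshold = rule.get("threshold")
    if threshold is not None:
        if LEVELS.index(message["level"]) < LEVELS.index(threshold):
            return False
    content = rule.get("content")
    if content is not None and not re.search(content, message["text"]):
        return False
    exc_pattern = rule.get("exceptiontype")
    if exc_pattern is not None:
        exc = message.get("exception")
        if exc is None or not re.search(exc_pattern, exc):
            return False
    return True


def first_matching_rule(rules, message):
    """Check rules in order and return the first one that matches, or None."""
    for rule in rules:
        if rule_matches(rule, message):
            return rule
    return None


rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]
error_with_exception = {
    "level": "ERROR",
    "text": "Failed to process request",
    "exception": "java.io.IOException",
}
error_without_exception = {"level": "ERROR", "text": "Disk failure"}

# The first rule wins for the message carrying an exception, so no alert.
print(first_matching_rule(rules, error_with_exception)["alert"])
# Without an exception only the second rule matches, so the event is an alert.
print(first_matching_rule(rules, error_without_exception)["alert"])
```

Under these assumptions, reordering the two rules would promote every ERROR message to an alert, which is why rule order matters when rules overlap.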
The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audits data that is left to be sent to audit server.
Related Name
mgmt.navigator.failure.thresholds
Default Value
Warning: Never, Critical: Any
API Name
mgmt_navigator_failure_thresholds
Required
false
Monitoring Period For Audit Failures🔗
Description
The period to review when checking if audits are blocked and not getting processed.
Related Name
mgmt.navigator.failure.window
Default Value
20 minute(s)
API Name
mgmt_navigator_failure_window
Required
false
Navigator Audit Pipeline Health Check🔗
Description
Enable test of audit events processing pipeline. This will test if audit events are not getting processed by Audit Server for a role that generates audit.
Related Name
mgmt.navigator.status.check.enabled
Default Value
true
API Name
mgmt_navigator_status_check_enabled
Required
false
Metric Filter🔗
Description
Defines a Metric Filter for this role. Cloudera Manager Agents will not send filtered metrics to the Service Monitor. Define the following fields:
Health Test Metric Set - Select this parameter to collect only metrics required for health tests.
Default Dashboard Metric Set - Select this parameter to collect only metrics required for the default dashboards. For user-defined charts, you must add the metrics you require for the chart using the Custom Metrics parameter.
Include/Exclude Custom Metrics - Select Include to specify metrics that should be collected. Select Exclude to specify metrics that should not be collected. Enter the metric names to be included or excluded using the Metric Name parameter.
Metric Name - The name of a metric that will be included or excluded during metric collection.
If you do not select Health Test Metric Set or Default Dashboard Metric Set, or specify metrics by name, metric filtering will be turned off (this is the default behavior).
For example, the following configuration enables the collection of metrics required for Health Tests and the jvm_heap_used_mb metric:
Include only Health Test Metric Set: Selected.
Include/Exclude Custom Metrics: Set to Include.
Metric Name: jvm_heap_used_mb
You can also view the JSON representation for this parameter by clicking View as JSON. In this example, the JSON looks like this:
{
"includeHealthTestMetricSet": true,
"filterType": "whitelist",
"metrics": ["jvm_heap_used_mb"]
}
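If the filter value is generated by a script rather than typed into the UI, the JSON above could be produced along these lines. This is a minimal sketch: the function name health_plus_custom_filter is hypothetical, and only the three keys shown in the example are used, since no other keys or filterType values are documented here.

```python
import json


def health_plus_custom_filter(metric_names):
    """Build the example filter: health test metrics plus named custom metrics."""
    return {
        "includeHealthTestMetricSet": True,
        "filterType": "whitelist",
        "metrics": list(metric_names),
    }


payload = json.dumps(health_plus_custom_filter(["jvm_heap_used_mb"]), indent=2)
print(payload)
```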
Related Name
Default Value
API Name
monitoring_metric_filter
Required
false
Temporary Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's Temporary Dump Directory.
Temporary Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's Temporary Dump Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Temporary Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit.
Related Name
Default Value
Warning: 50.0 %, Critical: 70.0 %
API Name
nfsgateway_fd_thresholds
Required
false
NFS Gateway Host Health Test🔗
Description
When computing the overall NFS Gateway health, consider the host's health.
Related Name
Default Value
true
API Name
nfsgateway_host_health_enabled
Required
false
NFS Gateway Process Health Test🔗
Description
Enables the health test that the NFS Gateway's process state is consistent with the role configuration.
Related Name
Default Value
true
API Name
nfsgateway_scm_health_enabled
Required
false
Swap Memory Usage Rate Thresholds🔗
Description
The health test thresholds on the swap memory usage rate of the process. Specified as the change of the used swap memory during the predefined period.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
process_swap_memory_rate_thresholds
Required
false
Swap Memory Usage Rate Window🔗
Description
The period to review when computing unexpected swap memory usage change of the process.
Related Name
common.process.swap_memory_rate_window
Default Value
5 minute(s)
API Name
process_swap_memory_rate_window
Required
false
Process Swap Memory Thresholds🔗
Description
The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold.
Related Name
Default Value
Warning: 200 B, Critical: Never
API Name
process_swap_memory_thresholds
Required
false
Role Triggers🔗
Description
The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
triggerExpression (mandatory) - A tsquery expression representing the trigger.
streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
"streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
Related Name
Default Value
[]
API Name
role_triggers
Required
true
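Before pasting a Role Triggers value, the constraints stated in its description (mandatory fields, unique trigger names) can be checked with a short script. The sketch below is illustrative only: validate_triggers is a hypothetical helper, not part of Cloudera Manager, and it does not parse or validate the tsquery expression itself.

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def validate_triggers(triggers):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    seen = set()
    for index, trigger in enumerate(triggers):
        for field in MANDATORY_FIELDS:
            if not trigger.get(field):
                problems.append(f"trigger {index}: missing {field}")
        name = trigger.get("triggerName")
        if name in seen:
            problems.append(f"trigger {index}: duplicate name {name!r}")
        elif name:
            seen.add(name)
    return problems


sample = [{
    "triggerName": "sample-trigger",
    "triggerExpression": (
        "IF (SELECT fd_open WHERE roleName=$ROLENAME "
        "and last(fd_open) > 1500) DO health:bad"
    ),
    "streamThreshold": 0,
    "enabled": "true",
}]

print(validate_triggers(sample))           # no problems reported
print(validate_triggers(sample + sample))  # reports the duplicate name
print(json.dumps(sample))                  # JSON text for the parameter
```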
Unexpected Exits Thresholds🔗
Description
The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
unexpected_exits_thresholds
Required
false
Unexpected Exits Monitoring Period🔗
Description
The period to review when computing unexpected exits.
Related Name
Default Value
5 minute(s)
API Name
unexpected_exits_window
Required
false
Other🔗
Temporary Dump Directory🔗
Description
NFS clients often reorder writes. As a result, sequential writes can arrive at the NFS Gateway in random order. This directory is used to temporarily save out-of-order writes before writing to HDFS. For each file, the out-of-order writes are dumped to this directory once the amount accumulated in memory exceeds a certain threshold (for example, 1 MB). Make sure this directory has enough space. For example, if the application uploads 10 files of 100 MB each, it is recommended that this directory have roughly 1 GB of space in case write reordering happens (in the worst case) to every file.
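The sizing rule above is simple arithmetic: in the worst case every file being uploaded is fully reordered, so the space needed is the sum of the file sizes. A minimal sketch, in which the function names are hypothetical and the free-space check relies on the standard shutil.disk_usage call:

```python
import shutil

MB = 1024 * 1024


def worst_case_dump_bytes(file_sizes_bytes):
    """Worst case: every concurrent upload is entirely out of order."""
    return sum(file_sizes_bytes)


def has_room(path, needed_bytes):
    """Compare the free space on the filesystem holding `path` with the need."""
    return shutil.disk_usage(path).free >= needed_bytes


needed = worst_case_dump_bytes([100 * MB] * 10)
print(needed // MB)  # 1000, i.e. roughly 1 GB
```

Calling has_room with the configured dump directory and the computed figure gives a quick check before a large upload.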
Related Name
dfs.nfs3.dump.dir
Default Value
/tmp/.hdfs-nfs
API Name
dfs_nfs3_dump_dir
Required
false
Allowed Hosts and Privileges🔗
Description
By default, NFS Gateway exported directories can be mounted by any client. For better access control, update this property with a list of host names and access privileges separated by whitespace characters. Host name format can be a single host, a Java regular expression, or an IPv4 address. The access privilege uses rw to specify read-write access and ro to specify read-only access. If the access privilege is not provided, the default is read-only. Examples of host name format and access privilege: "192.168.0.0/22 rw", "host.*.example.com", "host1.test.org ro".
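The "host, then optional privilege" layout of a single entry, including the read-only default, can be illustrated as follows. This is a sketch for reasoning about the format, not the gateway's parser: parse_export_entry is a hypothetical helper, it handles one entry at a time, and it does not attempt to match clients against the host pattern.

```python
def parse_export_entry(entry):
    """Split one 'host [privilege]' entry; the privilege defaults to read-only."""
    parts = entry.split()
    if not parts or len(parts) > 2:
        raise ValueError(f"expected 'host [rw|ro]', got {entry!r}")
    host = parts[0]
    privilege = parts[1] if len(parts) == 2 else "ro"
    if privilege not in ("rw", "ro"):
        raise ValueError(f"unknown access privilege: {privilege!r}")
    return host, privilege


for example in ("192.168.0.0/22 rw", "host.*.example.com", "host1.test.org ro"):
    print(parse_export_entry(example))
```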
Related Name
dfs.nfs.exports.allowed.hosts
Default Value
* rw
API Name
dfs_nfs_exports_allowed_hosts
Required
false
NFS Gateway Export Point🔗
Description
The NFS Gateway export point(s). Full path is required. In federated clusters, multiple export points can be configured, and at most 1 export point per federated nameservice is allowed.
Related Name
nfs.export.point
Default Value
API Name
nfs_export_point
Required
false
Performance🔗
Maximum Process File Descriptors🔗
Description
If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.
Related Name
Default Value
API Name
rlimit_fds
Required
false
Ports and Addresses🔗
NFS Gateway Web UI Port🔗
Description
The base port where the NFS Gateway server web UI listens. Combined with the NFS Gateway server hostname to build its HTTP address.
Related Name
nfs.http.port
Default Value
50079
API Name
nfs3_http_port
Required
false
Secure NFS Gateway Web UI Port (TLS/SSL)🔗
Description
The base port where the secure NFS Gateway server web UI listens. Combined with the NFS Gateway server's hostname to build its secure web UI address.
Related Name
nfs.https.port
Default Value
50579
API Name
nfs3_https_port
Required
false
NFS Gateway MountD Port🔗
Description
The port number of the mount daemon implemented inside the NFS Gateway server role.
Related Name
nfs3.mountd.port
Default Value
4242
API Name
nfs3_mountd_port
Required
false
Portmap (or Rpcbind) Port🔗
Description
The port number of the system portmap or rpcbind service. This configuration is used by Cloudera Manager to verify if the system portmap or rpcbind service is running before starting NFS Gateway role. Cloudera Manager does not manage the system portmap or rpcbind service.
Related Name
Default Value
111
API Name
nfs3_portmap_port
Required
false
NFS Gateway Server Port🔗
Description
The NFS Gateway server port.
Related Name
nfs3.server.port
Default Value
2049
API Name
nfs3_server_port
Required
false
Resource Management🔗
Java Heap Size of NFS Gateway in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Related Name
Default Value
256 MiB
API Name
nfsgateway_java_heapsize
Required
false
Cgroup CPU Shares🔗
Description
Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.
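CPU shares are relative weights, so the effect of a value depends on what else is competing for the CPU. The sketch below uses a simplified model in which all listed processes are runnable at once and compete in sibling cgroups; the process names and the function name are made up for the example.

```python
def cpu_fraction_under_contention(shares_by_process):
    """Approximate each process's CPU fraction when all of them are busy."""
    total = sum(shares_by_process.values())
    return {name: shares / total for name, shares in shares_by_process.items()}


fractions = cpu_fraction_under_contention({"nfsgateway": 1024, "other": 3072})
print(fractions)  # {'nfsgateway': 0.25, 'other': 0.75}
```

When the host is not under CPU contention, shares do not cap usage; they only influence how CPU time is divided when demand exceeds supply.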
Related Name
cpu.shares
Default Value
1024
API Name
rm_cpu_shares
Required
true
Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Custom control group resources to assign to this role, which will be enforced by the Linux kernel. These resources should exist on the target hosts, otherwise an error will occur when the process starts. Use the same format as used for arguments to the cgexec command: resource1,resource2:path1 or resource3:path2 For example: 'cpu,memory:my/path blkio:my2/path2' ***These settings override other cgroup settings.***
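To make the cgexec-style format concrete, the following sketch splits a value into a controller-to-path mapping. It assumes whitespace-separated groups of the form controller1,controller2:path, as in the example above; parse_custom_resources is a hypothetical helper and does not check that the cgroups exist on the host.

```python
def parse_custom_resources(value):
    """Map each controller name to the cgroup path it is assigned to."""
    mapping = {}
    for group in value.split():
        controllers, sep, path = group.partition(":")
        if not sep or not controllers or not path:
            raise ValueError(f"malformed group: {group!r}")
        for controller in controllers.split(","):
            mapping[controller] = path
    return mapping


parsed = parse_custom_resources("cpu,memory:my/path blkio:my2/path2")
print(parsed)  # {'cpu': 'my/path', 'memory': 'my/path', 'blkio': 'my2/path2'}
```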
Related Name
custom.cgroups
Default Value
API Name
rm_custom_resources
Required
false
Cgroup I/O Weight🔗
Description
Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.
Related Name
blkio.weight
Default Value
500
API Name
rm_io_weight
Required
true
Cgroup Memory Hard Limit🔗
Description
Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage; therefore, some of the charts will show 'No Data'.
Related Name
memory.limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_hard_limit
Required
true
Cgroup Memory Soft Limit🔗
Description
Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage; therefore, some of the charts will show 'No Data'.
Related Name
memory.soft_limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_soft_limit
Required
true
Stacks Collection🔗
Stacks Collection Data Retention🔗
Description
The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.
Related Name
stacks_collection_data_retention
Default Value
100 MiB
API Name
stacks_collection_data_retention
Required
false
Stacks Collection Directory🔗
Description
The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. If this directory already exists, it will be owned by the current role user with 755 permissions. Sharing the same directory among multiple roles will cause an ownership race.
Related Name
stacks_collection_directory
Default Value
API Name
stacks_collection_directory
Required
false
Stacks Collection Enabled🔗
Description
Whether or not periodic stacks collection is enabled.
Related Name
stacks_collection_enabled
Default Value
false
API Name
stacks_collection_enabled
Required
true
Stacks Collection Frequency🔗
Description
The frequency with which stacks are collected.
Related Name
stacks_collection_frequency
Default Value
5.0 second(s)
API Name
stacks_collection_frequency
Required
false
Stacks Collection Method🔗
Description
The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.
Related Name
stacks_collection_method
Default Value
jstack
API Name
stacks_collection_method
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NFS Gateway Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
Suppress Parameter Validation: Rules to Extract Events from Log Files🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log_event_whitelist
Required
true
Suppress Parameter Validation: NFS Gateway Web UI Port🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the NFS Gateway Web UI Port parameter.
Related Name
Default Value
false
API Name
role_config_suppression_nfs3_http_port
Required
true
Suppress Parameter Validation: Secure NFS Gateway Web UI Port (TLS/SSL)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Secure NFS Gateway Web UI Port (TLS/SSL) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NFS Gateway Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the NFS Gateway Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.
Related Name
Default Value
false
API Name
role_config_suppression_oom_heap_dump_dir
Required
true
Suppress Parameter Validation: Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Control Group Resources (overrides Cgroup settings) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rm_custom_resources
Required
true
Suppress Parameter Validation: Role Triggers🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.
Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_nfsgateway_audit_health
Required
true
Suppress Health Test: Temporary Dump Directory Free Space🔗
Description
Whether to suppress the results of the Temporary Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Heap Dump Directory Free Space🔗
Description
Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_nfsgateway_host_health
Required
true
Suppress Health Test: Log Directory Free Space🔗
Description
Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
role_health_suppression_nfsgateway_scm_health
Required
true
Suppress Health Test: Swap Memory Usage🔗
Description
Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Swap Memory Usage Rate Beta🔗
Description
Whether to suppress the results of the Swap Memory Usage Rate Beta health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
For advanced use only, a string to be inserted into log4j.properties for this role only.
Related Name
Default Value
API Name
log4j_safety_valve
Required
false
Enable auto refresh for metric configurations🔗
Description
When true, Enable Metric Collection and Metric Filter parameters will be set automatically if they're changed. Otherwise, a refresh by hand is required.
Related Name
Default Value
false
API Name
metric_config_auto_refresh
Required
false
Heap Dump Directory🔗
Description
Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, it will be owned by the current role user with 1777 permissions. Sharing the same directory among multiple roles will cause an ownership race. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role.
Related Name
oom_heap_dump_dir
Default Value
/tmp
API Name
oom_heap_dump_dir
Required
false
Dump Heap When Out of Memory🔗
Description
When set, generates a heap dump file when an out-of-memory error occurs.
Related Name
Default Value
true
API Name
oom_heap_dump_enabled
Required
true
Kill When Out of Memory🔗
Description
When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown.
Related Name
Default Value
true
API Name
oom_sigkill_enabled
Required
true
Automatically Restart Process🔗
Description
When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. This configuration applies in the time after the Start Wait Timeout period.
Related Name
Default Value
false
API Name
process_auto_restart
Required
true
Enable Metric Collection🔗
Description
The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process.
Related Name
Default Value
true
API Name
process_should_monitor
Required
true
Process Start Retry Attempts🔗
Description
Number of times to try starting a role's process when the process exits before the Start Wait Timeout period. After a process is running beyond the Start Wait Timeout, the retry count is reset. Setting this configuration to zero will prevent restart of the process during the Start Wait Timeout period.
Related Name
Default Value
3
API Name
process_start_retries
Required
false
Process Start Wait Timeout🔗
Description
The time in seconds to wait for a role's process to start successfully on a host. Processes which exit/crash before this time will be restarted until reaching the limit specified by the Start Retry Attempts count parameter. Setting this configuration to zero will turn off this feature.
Related Name
Default Value
20
API Name
process_start_secs
Required
false
SecondaryNameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only. A string to be inserted into hdfs-site.xml for this role only.
Related Name
Default Value
API Name
secondarynamenode_config_safety_valve
Required
false
Java Configuration Options for Secondary NameNode🔗
Description
These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. Note: When CM version is 6.3.0 or greater, {{JAVA_GC_ARGS}} will be replaced by JVM Garbage Collection arguments based on the runtime Java JVM version.
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration.
Related Name
Default Value
API Name
SECONDARYNAMENODE_role_env_safety_valve
Required
false
Checkpointing🔗
Filesystem Checkpoint Period🔗
Description
The time between two periodic file system checkpoints.
Related Name
dfs.namenode.checkpoint.period
Default Value
1 hour(s)
API Name
fs_checkpoint_period
Required
false
Filesystem Checkpoint Transaction Threshold🔗
Description
The number of transactions after which the NameNode or SecondaryNameNode will create a checkpoint of the namespace, regardless of whether the checkpoint period has expired.
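The checkpoint period and the transaction threshold work as an either/or condition: a checkpoint becomes due when either limit is reached. A minimal sketch of that decision, using the default values listed for these two properties; the function and parameter names are hypothetical and the real scheduling logic lives inside HDFS.

```python
def checkpoint_due(seconds_since_last, txns_since_last,
                   period_seconds=3600, txn_threshold=1_000_000):
    """Return True when either the time period or the transaction count is reached."""
    return (seconds_since_last >= period_seconds
            or txns_since_last >= txn_threshold)


print(checkpoint_due(600, 1_200_000))  # True: transaction threshold exceeded early
print(checkpoint_due(3600, 50))        # True: the period has elapsed
print(checkpoint_due(600, 50))         # False: neither limit reached
```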
Related Name
dfs.namenode.checkpoint.txns
Default Value
1000000
API Name
fs_checkpoint_txns
Required
false
Logs🔗
SecondaryNameNode Logging Threshold🔗
Description
The minimum log level for SecondaryNameNode logs
Related Name
Default Value
INFO
API Name
log_threshold
Required
false
SecondaryNameNode Maximum Log File Backups🔗
Description
The maximum number of rolled log files to keep for SecondaryNameNode logs. Typically used by log4j or logback.
Related Name
Default Value
10
API Name
max_log_backup_index
Required
false
SecondaryNameNode Max Log Size🔗
Description
The maximum size, in megabytes, per log file for SecondaryNameNode logs. Typically used by log4j or logback.
Related Name
Default Value
200 MiB
API Name
max_log_size
Required
false
SecondaryNameNode Log Directory🔗
Description
Directory where SecondaryNameNode will place its log files.
Related Name
hadoop.log.dir
Default Value
/var/log/hadoop-hdfs
API Name
secondarynamenode_log_dir
Required
false
Monitoring🔗
Enable Health Alerts for this Role🔗
Description
When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold.
Related Name
Default Value
true
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Heap Dump Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory.
Heap Dump Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured.
Log Directory Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory.
Related Name
Default Value
Warning: 10 GiB, Critical: 5 GiB
API Name
log_directory_free_space_absolute_thresholds
Required
false
Log Directory Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
log_directory_free_space_percentage_thresholds
Required
false
Rules to Extract Events from Log Files🔗
Description
This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
content - match only those messages for which contents match this regular expression.
exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
{"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"}
This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"}
In this example, a generated event may not be promoted to an alert if the ERROR log message contains an exception, because the first rule, with alert = false, will match.
The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audits data that is left to be sent to audit server.
Related Name
mgmt.navigator.failure.thresholds
Default Value
Warning: Never, Critical: Any
API Name
mgmt_navigator_failure_thresholds
Required
false
Monitoring Period For Audit Failures🔗
Description
The period to review when checking if audits are blocked and not getting processed.
Related Name
mgmt.navigator.failure.window
Default Value
20 minute(s)
API Name
mgmt_navigator_failure_window
Required
false
Navigator Audit Pipeline Health Check🔗
Description
Enable test of audit events processing pipeline. This will test if audit events are not getting processed by Audit Server for a role that generates audit.
Related Name
mgmt.navigator.status.check.enabled
Default Value
true
API Name
mgmt_navigator_status_check_enabled
Required
false
Metric Filter🔗
Description
Defines a Metric Filter for this role. Cloudera Manager Agents will not send filtered metrics to the Service Monitor. Define the following fields:
Health Test Metric Set - Select this parameter to collect only metrics required for health tests.
Default Dashboard Metric Set - Select this parameter to collect only metrics required for the default dashboards. For user-defined charts, you must add the metrics you require for the chart using the Custom Metrics parameter.
Include/Exclude Custom Metrics - Select Include to specify metrics that should be collected. Select Exclude to specify metrics that should not be collected. Enter the metric names to be included or excluded using the Metric Name parameter.
Metric Name - The name of a metric that will be included or excluded during metric collection.
If you do not select Health Test Metric Set or Default Dashboard Metric Set, or specify metrics by name, metric filtering will be turned off (this is the default behavior).
For example, the following configuration enables the collection of metrics required for Health Tests and the jvm_heap_used_mb metric:
Include only Health Test Metric Set: Selected.
Include/Exclude Custom Metrics: Set to Include.
Metric Name: jvm_heap_used_mb
You can also view the JSON representation for this parameter by clicking View as JSON. In this example, the JSON looks like this:
{
"includeHealthTestMetricSet": true,
"filterType": "whitelist",
"metrics": ["jvm_heap_used_mb"]
}
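As a rough illustration, the same JSON can be produced programmatically, for example before supplying it as the value of monitoring_metric_filter. The helper name `build_metric_filter` is hypothetical. Only the three keys shown in the example above are used, and "whitelist" is taken to correspond to the Include setting, as in that example.

```python
import json


def build_metric_filter(metric_names, include_health_test_set=True):
    """Build a filter that includes the given custom metrics.

    The keys mirror the JSON example above; no other keys are assumed.
    """
    return {
        "includeHealthTestMetricSet": include_health_test_set,
        "filterType": "whitelist",
        "metrics": list(metric_names),
    }


filter_json = json.dumps(build_metric_filter(["jvm_heap_used_mb"]), indent=2)
print(filter_json)
```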
Related Name
Default Value
API Name
monitoring_metric_filter
Required
false
Swap Memory Usage Rate Thresholds🔗
Description
The health test thresholds on the swap memory usage rate of the process. Specified as the change of the used swap memory during the predefined period.
Related Name
Default Value
Warning: Never, Critical: Never
API Name
process_swap_memory_rate_thresholds
Required
false
Swap Memory Usage Rate Window🔗
Description
The period to review when computing unexpected swap memory usage change of the process.
Related Name
common.process.swap_memory_rate_window
Default Value
5 minute(s)
API Name
process_swap_memory_rate_window
Required
false
Process Swap Memory Thresholds🔗
Description
The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold.
Related Name
Default Value
Warning: 200 B, Critical: Never
API Name
process_swap_memory_thresholds
Required
false
Role Triggers🔗
Description
The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName(mandatory) - The name of the trigger. This value must be unique for the specific role.
triggerExpression(mandatory) - A tsquery expression representing the trigger.
streamThreshold(optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
"streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
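A minimal sketch of assembling and sanity-checking a trigger list before saving it as the role_triggers value. The helper `validate_triggers` is hypothetical and checks only what the field list above states (the two mandatory fields and the uniqueness of trigger names); it does not parse the tsquery expression.

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def validate_triggers(triggers):
    """Check mandatory fields and uniqueness of trigger names within one role."""
    seen = set()
    for trigger in triggers:
        for field in MANDATORY_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"trigger is missing mandatory field {field!r}")
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError(f"duplicate trigger name {name!r}")
        seen.add(name)
    return triggers


triggers = validate_triggers([
    {
        "triggerName": "sample-trigger",
        "triggerExpression": (
            "IF (SELECT fd_open WHERE roleName=$ROLENAME "
            "and last(fd_open) > 1500) DO health:bad"
        ),
        "streamThreshold": 0,
        "enabled": "true",
    }
])

# The serialized list is what would be stored in the configuration.
role_triggers_value = json.dumps(triggers)
print(role_triggers_value)
```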
Related Name
Default Value
[]
API Name
role_triggers
Required
true
HDFS Checkpoint Directories Free Space Monitoring Absolute Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's HDFS Checkpoint Directories.
HDFS Checkpoint Directories Free Space Monitoring Percentage Thresholds🔗
Description
The health test thresholds for monitoring of free space on the filesystem that contains this role's HDFS Checkpoint Directories. Specified as a percentage of the capacity on that filesystem. This setting is not used if an HDFS Checkpoint Directories Free Space Monitoring Absolute Thresholds setting is configured.
The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
unexpected_exits_thresholds
Required
false
Unexpected Exits Monitoring Period🔗
Description
The period to review when computing unexpected exits.
Related Name
Default Value
5 minute(s)
API Name
unexpected_exits_window
Required
false
Other🔗
HDFS Checkpoint Directories🔗
Description
Determines where on the local file system the HDFS SecondaryNameNode should store the temporary images to merge. For redundancy, enter a comma-delimited list of directories to replicate the image in all of the directories. Typical values are /data/N/dfs/snn for N = 1, 2, 3...
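As a small illustration, the comma-delimited value for the typical layout can be generated as follows. The helper name `checkpoint_dirs` and its `template` parameter are hypothetical; the path pattern is the typical value quoted above.

```python
def checkpoint_dirs(count, template="/data/{n}/dfs/snn"):
    """Return a comma-delimited list of checkpoint directories for N = 1..count."""
    if count < 1:
        raise ValueError("at least one directory is required")
    return ",".join(template.format(n=n) for n in range(1, count + 1))


print(checkpoint_dirs(3))  # /data/1/dfs/snn,/data/2/dfs/snn,/data/3/dfs/snn
```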
Related Name
dfs.namenode.checkpoint.dir
Default Value
API Name
fs_checkpoint_dir_list
Required
true
Performance🔗
Maximum Process File Descriptors🔗
Description
If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value.
Related Name
Default Value
API Name
rlimit_fds
Required
false
Ports and Addresses🔗
SecondaryNameNode Web UI Port🔗
Description
The SecondaryNameNode HTTP port. If the port is 0, then the server starts on a free port. Combined with the SecondaryNameNode's hostname to build its HTTP address.
Related Name
dfs.namenode.secondary.http-address
Default Value
9868
API Name
dfs_secondary_http_port
Required
false
Secure SecondaryNameNode Web UI Port (TLS/SSL)🔗
Description
The base port where the secure SecondaryNameNode web UI listens.
Related Name
dfs.secondary.https.port
Default Value
9869
API Name
dfs_secondary_https_port
Required
false
Bind SecondaryNameNode to Wildcard Address🔗
Description
If enabled, the SecondaryNameNode binds to the wildcard address ("0.0.0.0") on all of its ports.
Related Name
Default Value
false
API Name
secondary_namenode_bind_wildcard
Required
false
Resource Management🔗
Cgroup CPU Shares🔗
Description
Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager.
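To make the effect of shares concrete, the following sketch computes the fraction of CPU time each role would receive if all of them were competing for the CPU at once. It assumes the usual proportional interpretation of cpu.shares among sibling control groups; the role names and the function `cpu_fractions` are hypothetical.

```python
def cpu_fractions(shares_by_role):
    """Return each role's fraction of CPU time when every role is CPU-bound."""
    for role, shares in shares_by_role.items():
        if not 2 <= shares <= 262144:
            raise ValueError(f"{role}: shares must be between 2 and 262144")
    total = sum(shares_by_role.values())
    return {role: shares / total for role, shares in shares_by_role.items()}


fractions = cpu_fractions({"secondarynamenode": 1024, "other-role": 3072})
print(fractions)  # {'secondarynamenode': 0.25, 'other-role': 0.75}
```

When the host is not under CPU contention, shares place no cap on usage, so these fractions apply only while roles compete.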
Related Name
cpu.shares
Default Value
1024
API Name
rm_cpu_shares
Required
true
Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Custom control group resources to assign to this role, which will be enforced by the Linux kernel. These resources should exist on the target hosts; otherwise, an error will occur when the process starts. Use the same format as used for arguments to the cgexec command: resource1,resource2:path1 or resource3:path2. For example: 'cpu,memory:my/path blkio:my2/path2'. ***These settings override other cgroup settings.***
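The format can be checked before it is saved with a short parser such as the sketch below. The function `parse_custom_resources` is hypothetical and only splits the string as described above; it does not verify that the control groups exist on the host.

```python
def parse_custom_resources(value):
    """Parse 'cpu,memory:my/path blkio:my2/path2' into {controller: path}."""
    mapping = {}
    for entry in value.split():
        controllers, sep, path = entry.partition(":")
        if not sep or not controllers or not path:
            raise ValueError(f"malformed entry: {entry!r}")
        for controller in controllers.split(","):
            mapping[controller] = path
    return mapping


print(parse_custom_resources("cpu,memory:my/path blkio:my2/path2"))
# {'cpu': 'my/path', 'memory': 'my/path', 'blkio': 'my2/path2'}
```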
Related Name
custom.cgroups
Default Value
API Name
rm_custom_resources
Required
false
Cgroup I/O Weight🔗
Description
Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager.
Related Name
blkio.weight
Default Value
500
API Name
rm_io_weight
Required
true
Cgroup Memory Hard Limit🔗
Description
Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_hard_limit
Required
true
Cgroup Memory Soft Limit🔗
Description
Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. If the value is -1, Cloudera Manager will not monitor Cgroup memory usage, so some of the charts will show 'No Data'.
Related Name
memory.soft_limit_in_bytes
Default Value
-1 MiB
API Name
rm_memory_soft_limit
Required
true
Java Heap Size of Secondary NameNode in Bytes🔗
Description
Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx.
Related Name
Default Value
4 GiB
API Name
secondary_namenode_java_heapsize
Required
false
Stacks Collection🔗
Stacks Collection Data Retention🔗
Description
The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted.
Related Name
stacks_collection_data_retention
Default Value
100 MiB
API Name
stacks_collection_data_retention
Required
false
Stacks Collection Directory🔗
Description
The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. If this directory already exists, it will be owned by the current role user with 755 permissions. Sharing the same directory among multiple roles will cause an ownership race.
Related Name
stacks_collection_directory
Default Value
API Name
stacks_collection_directory
Required
false
Stacks Collection Enabled🔗
Description
Whether or not periodic stacks collection is enabled.
Related Name
stacks_collection_enabled
Default Value
false
API Name
stacks_collection_enabled
Required
true
Stacks Collection Frequency🔗
Description
The frequency with which stacks are collected.
Related Name
stacks_collection_frequency
Default Value
5.0 second(s)
API Name
stacks_collection_frequency
Required
false
Stacks Collection Method🔗
Description
The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped.
Related Name
stacks_collection_method
Default Value
jstack
API Name
stacks_collection_method
Required
false
Suppressions🔗
Suppress Configuration Validator: CDH Version Validator🔗
Description
Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator.
Related Name
Default Value
false
API Name
role_config_suppression_cdh_version_validator
Required
true
Suppress Parameter Validation: SecondaryNameNode Web UI Port🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the SecondaryNameNode Web UI Port parameter.
Related Name
Default Value
false
API Name
role_config_suppression_dfs_secondary_http_port
Required
true
Suppress Parameter Validation: Secure SecondaryNameNode Web UI Port (TLS/SSL)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Secure SecondaryNameNode Web UI Port (TLS/SSL) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the SecondaryNameNode Logging Advanced Configuration Snippet (Safety Valve) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_log4j_safety_valve
Required
true
Suppress Parameter Validation: Rules to Extract Events from Log Files🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter.
Related Name
Default Value
false
API Name
role_config_suppression_oom_heap_dump_dir
Required
true
Suppress Parameter Validation: Custom Control Group Resources (overrides Cgroup settings)🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Control Group Resources (overrides Cgroup settings) parameter.
Related Name
Default Value
false
API Name
role_config_suppression_rm_custom_resources
Required
true
Suppress Parameter Validation: Role Triggers🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the SecondaryNameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Suppress Parameter Validation: Java Configuration Options for Secondary NameNode🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Secondary NameNode parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the SecondaryNameNode Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: HDFS Checkpoint Directories Free Space🔗
Description
Whether to suppress the results of the HDFS Checkpoint Directories Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Heap Dump Directory Free Space🔗
Description
Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Swap Memory Usage Rate Beta🔗
Description
Whether to suppress the results of the Swap Memory Usage Rate Beta health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml🔗
Description
For advanced use only, a string to be inserted into core-site.xml. Applies to all roles and client configurations in this HDFS service as well as all its dependent services. Any configs added here will be overridden by their default values in HDFS (which can be found in hdfs-default.xml).
Related Name
Default Value
API Name
core_site_safety_valve
Required
false
Enable HDFS Block Metadata API🔗
Description
Enables DataNode support for the experimental DistributedFileSystem.getFileBlockStorageLocations API. Applicable to CDH 4.1 and onwards.
Related Name
dfs.datanode.hdfs-blocks-metadata.enabled
Default Value
true
API Name
dfs_datanode_hdfs_blocks_metadata_enabled
Required
false
HDFS Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml🔗
Description
For advanced use only, a string to be inserted into hadoop-policy.xml. Applies to configurations of all roles in this service except client configuration.
Related Name
Default Value
API Name
hadoop_policy_config_safety_valve
Required
false
Block Replica Placement Policy🔗
Description
The policy the NameNode will use to place block replicas: The HDFS Default policy places one replica on the node where the client process writing the block resides, one on a randomly-chosen remote rack, and one on a randomly-chosen node in the same remote rack (assuming a replication factor of 3). The Upgrade Domains policy adds an additional layer of grouping based on Upgrade Domain, and must be selected in order to use Upgrade Domains for DataNode hosts.
For advanced use only, key-value pairs (one on each line) to be inserted into the environment of HDFS replication jobs.
Related Name
Default Value
API Name
hdfs_replication_env_safety_valve
Required
false
HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh🔗
Description
For advanced use only. Key-value pairs (one on each line) to be inserted into the HDFS replication configuration for hadoop-env.sh.
Related Name
Default Value
API Name
hdfs_replication_haoop_env_sh_safety_valve
Required
false
HDFS Replication Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only, a string to be inserted into hdfs-site.xml. Applies to all HDFS Replication jobs.
Related Name
Default Value
API Name
hdfs_replication_hdfs_site_safety_valve
Required
false
HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
For advanced use only, a string to be inserted into hdfs-site.xml. Applies to configurations of all roles in this service except client configuration.
Related Name
Default Value
API Name
hdfs_service_config_safety_valve
Required
false
HDFS Service Environment Advanced Configuration Snippet (Safety Valve)🔗
Description
For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of all roles in this service except client configuration.
For advanced use only, key-value pairs (one on each line) to be inserted into the environment of HDFS snapshot shell command.
Related Name
Default Value
API Name
hdfs_shell_cmd_env_safety_valve
Required
false
HDFS Advanced Configuration Snippet (Safety Valve) for ssl-client.xml🔗
Description
For advanced use only, a string to be inserted into ssl-client.xml. Applies cluster-wide, but can be overridden by individual services.
Related Name
Default Value
API Name
hdfs_ssl_client_safety_valve
Required
false
HDFS Service Advanced Configuration Snippet (Safety Valve) for ssl-server.xml🔗
Description
For advanced use only, a string to be inserted into ssl-server.xml. Applies to configurations of all roles in this service except client configuration.
Related Name
Default Value
API Name
hdfs_ssl_server_safety_valve
Required
false
System User's Home Directory🔗
Description
The home directory of the system user on the local filesystem. This setting must reflect the system's configured value; changing it only here will not change the actual home directory.
Related Name
Default Value
/var/lib/hadoop-hdfs
API Name
hdfs_user_home_dir
Required
true
HDFS Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties🔗
Description
For advanced use only, a string to be inserted into the client configuration for navigator.client.properties.
Related Name
Default Value
API Name
navigator_client_config_safety_valve
Required
false
System Group🔗
Description
The group that this service's processes should run as (except the HttpFS server, which has its own group)
Related Name
Default Value
hdfs
API Name
process_groupname
Required
true
System User🔗
Description
The user that this service's processes should run as (except the HttpFS server, which has its own user)
Related Name
Default Value
hdfs
API Name
process_username
Required
true
HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-audit.xml🔗
Description
For advanced use only, a string to be inserted into ranger-hdfs-audit.xml. Applies to configurations of all roles in this service except client configuration.
Related Name
Default Value
API Name
ranger_audit_safety_valve
Required
false
HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-policymgr-ssl.xml🔗
Description
For advanced use only, a string to be inserted into ranger-hdfs-policymgr-ssl.xml. Applies to configurations of all roles in this service except client configuration.
Related Name
Default Value
API Name
ranger_policymgr_ssl_safety_valve
Required
false
HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml🔗
Description
For advanced use only, a string to be inserted into ranger-hdfs-security.xml. Applies to configurations of all roles in this service except client configuration.
Related Name
Default Value
API Name
ranger_security_safety_valve
Required
false
Cloudera Navigator🔗
Enable Audit Collection🔗
Description
Enable collection of audit events from the service's roles.
Related Name
navigator.audit.enabled
Default Value
true
API Name
navigator_audit_enabled
Required
false
Audit Event Filter🔗
Description
Event filters are defined in a JSON object like the following:
{
"defaultAction" : ("accept", "discard"),
"rules" : [
{
"action" : ("accept", "discard"),
"fields" : [
{
"name" : "fieldName",
"match" : "regex"
}
]
}
]
}
A filter has a default action and a list of rules, in order of precedence.
Each rule defines an action, and a list of fields to match against the
audit event.
A rule is "accepted" if all the listed field entries match the audit
event. At that point, the action declared by the rule is taken.
If no rules match the event, the default action is taken. Actions
default to "accept" if not defined in the JSON object.
The following is the list of fields that can be filtered for HDFS events:
username: the user performing the action.
ipAddress: the IP from where the request originated.
command: the HDFS operation being performed.
src: the source path for the operation.
dest: the destination path for the operation.
permissions: the permissions associated with the operation.
The default HDFS audit event filter accepts all denied access, delete,
and rename events, and discards events that affect files in any of the
staging directories (Hive, Spark, Impala), events that affect files in the /tmp
directory, events that affect files in the Cloudera Hive Canary directory,
events generated by the internal Cloudera and Hadoop users (cloudera-scm,
dr.who, hbase, hive, impala, mapred, solr, and spark), and 'ls' actions
performed by the hdfs user.
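The rule evaluation described above (rules checked in order, a rule applies only when all of its fields match, otherwise the default action is used) can be sketched as follows. This is an illustration, not the product's implementation: the filter shown is a small hypothetical one rather than the default, the function name `filter_action` is invented, and the assumption that each regular expression must match the whole field value is mine.

```python
import re

# A small hypothetical filter in the documented shape (not the default filter).
EVENT_FILTER = {
    "defaultAction": "accept",
    "rules": [
        {
            "action": "discard",
            "fields": [{"name": "src", "match": "/tmp(?:/.*)?"}],
        },
        {
            "action": "discard",
            "fields": [
                {"name": "username", "match": "hdfs"},
                {"name": "command", "match": "listStatus"},
            ],
        },
    ],
}


def filter_action(event_filter, event):
    """Return the action of the first rule whose fields all match, else the default."""
    for rule in event_filter.get("rules", []):
        fields = rule.get("fields", [])
        # Assumption: each regex must match the entire field value.
        if all(
            re.fullmatch(f["match"], str(event.get(f["name"], "")))
            for f in fields
        ):
            # Actions default to "accept" when not defined.
            return rule.get("action", "accept")
    return event_filter.get("defaultAction", "accept")


tmp_event = {"username": "alice", "command": "open", "src": "/tmp/x"}
ls_event = {"username": "hdfs", "command": "listStatus", "src": "/data"}
delete_event = {"username": "hdfs", "command": "delete", "src": "/data"}
for event in (tmp_event, ls_event, delete_event):
    print(event["command"], "->", filter_action(EVENT_FILTER, event))
```

Because rules are applied in order of precedence, a broad discard rule placed first can hide events that a later accept rule was meant to keep.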
Related Name
navigator.event.filter
Default Value
comment: [
The default HDFS audit event filter accepts all denied access, delete ,
and rename events, and discards events that affects files in any of the ,
staging directories (Hive, Spark, Impala), events that affect files in /tmp ,
directory, events that affect files in Cloudera Hive Canary directory, ,
events generated by the internal Cloudera and Hadoop users (cloudera-scm, ,
dr.who, hbase, hive, impala, mapred, solr, and spark), and \u0027ls\u0027 actions ,
performed by the hdfs user.
],
defaultAction: accept,
rules: [
action: accept,
fields: [
name: allowed,
match: (?:false)
]
,
action: discard,
fields: [
name: src,
match: (?:.*/\\.hive-staging($|.*)?|.*/\\.staging($|/.*)?|.*/\\.sparkStaging($|/.*)?|.*/_impala_insert_staging($|/.*)?|/user/history/done_intermediate(?:/.*)?|/user/spark/spark2ApplicationHistory($|/.*)|/user/spark/applicationHistory($|/.*)|/user/hue/\\.cloudera_manager_hive_metastore_canary(?:/.*)?|/user/hue/\\.Trash/Current/user/hue/\\.cloudera_manager_hive_metastore_canary(?:/.*)?|/tmp(?:/.*)?)
]
,
action: accept,
fields: [
name: operation,
match: delete|rename.*
]
,
action: discard,
fields: [
name: username,
match: (?:cloudera-scm|dr.who|hbase|hive|impala|mapred|solr|spark)(?:/.+)?
]
,
action: discard,
fields: [
name: username,
match: (?:hdfs)(?:/.+)?
,
name: operation,
match: (?:listStatus|listCachePools|listCacheDirectives|getfileinfo)
]
,
action: accept,
fields: [
name: operation,
match: (?:getfileinfo)
]
]
API Name
navigator_audit_event_filter
Required
false
Audit Queue Policy🔗
Description
Action to take when the audit event queue is full: drop the event or shut down the affected process.
Related Name
navigator.batch.queue_policy
Default Value
DROP
API Name
navigator_audit_queue_policy
Required
false
Audit Event Tracker🔗
Description
Configures the rules for event tracking and coalescing. This feature is
used to define equivalency between different audit events. When
events match, according to a set of configurable parameters, only one
entry in the audit list is generated for all the matching events.
Tracking works by keeping a reference to events when they first appear,
and comparing other incoming events against the "tracked" events according
to the rules defined here.
Event trackers are defined in a JSON object like the following:
{
"timeToLive" : [integer],
"fields" : [
{
"type" : [string],
"name" : [string]
}
]
}
Where:
timeToLive: maximum amount of time an event will be tracked, in
milliseconds. Must be provided. This defines how long an event will be
tracked after it is first seen. A value of 0 disables tracking.
fields: list of fields to compare when matching events against
tracked events.
Each field has an evaluator type associated with it. The evaluator defines
how the field data is to be compared. The following evaluators are
available:
value: uses the field value for comparison.
username: treats the field value as a user name, and ignores any
host-specific data. This is useful for environments using Kerberos,
so that only the principal name and realm are compared.
The following is the list of fields that can be used to compare HDFS events:
operation: the HDFS operation being performed.
username: the user performing the action.
ipAddress: the IP from where the request originated.
allowed: whether the operation was allowed or denied.
src: the source path for the operation.
dest: the destination path for the operation.
permissions: the permissions associated with the operation.
The default event tracker for HDFS services defines equality by comparing the
username, operation, and source path of the events.
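A minimal sketch of the coalescing behavior described above, using the default tracker settings (a 60000 ms timeToLive and the src, operation, and username fields). The class `EventTracker`, the helper `strip_host`, and the assumption that principals take the form user/host@REALM are illustrative and not taken from the product.

```python
def strip_host(principal):
    """Drop the host part of a principal: 'user/host@REALM' becomes 'user@REALM'."""
    name, at, realm = principal.partition("@")
    primary = name.split("/", 1)[0]
    return primary + at + realm


class EventTracker:
    def __init__(self, config):
        self.ttl_ms = config["timeToLive"]
        self.fields = config["fields"]
        self.first_seen = {}  # comparison key -> time first seen, in ms

    def _key(self, event):
        # The "username" evaluator ignores host data; "value" compares as-is.
        return tuple(
            strip_host(event.get(f["name"], ""))
            if f["type"] == "username"
            else event.get(f["name"])
            for f in self.fields
        )

    def should_record(self, event, now_ms):
        """Return True if the event should produce a new audit entry."""
        if self.ttl_ms == 0:
            return True  # a value of 0 disables tracking
        key = self._key(event)
        seen = self.first_seen.get(key)
        if seen is not None and now_ms - seen < self.ttl_ms:
            return False  # coalesced with an event that is still tracked
        self.first_seen[key] = now_ms
        return True


DEFAULT_TRACKER = {
    "timeToLive": 60000,
    "fields": [
        {"type": "value", "name": "src"},
        {"type": "value", "name": "operation"},
        {"type": "username", "name": "username"},
    ],
}

tracker = EventTracker(DEFAULT_TRACKER)
base = {"src": "/data/a", "operation": "open"}
first = tracker.should_record({**base, "username": "alice/host1@EXAMPLE.COM"}, 0)
repeat = tracker.should_record({**base, "username": "alice/host2@EXAMPLE.COM"}, 1_000)
later = tracker.should_record({**base, "username": "alice/host1@EXAMPLE.COM"}, 61_000)
print(first, repeat, later)  # True False True
```

The second event differs only in its host component, so the username evaluator treats it as equivalent and it is coalesced; the third arrives after the time-to-live has elapsed and is recorded again.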
Related Name
navigator_event_tracker
Default Value
comment: [
The default event tracker for HDFS services defines equality by ,
comparing the username, operation, and source path of the events.
],
timeToLive: 60000,
fields: [
type: value,
name: src
,
type: value,
name: operation
,
type: username,
name: username
]
API Name
navigator_event_tracker
Required
false
High Availability🔗
Timeout for Cloudera Manager Fencing Strategy🔗
Description
The timeout, in milliseconds, to use with the Cloudera Manager agent-based fencer.
Related Name
dfs.ha.fencing.cloudera_manager.timeout_millis
Default Value
10000
API Name
dfs_ha_fencing_cloudera_manager_timeout_millis
Required
false
HDFS High Availability Fencing Methods🔗
Description
List of fencing methods to use for service fencing. Setting this to shell(true) enables the built-in HDFS fencing mechanism, which causes the NameNode to exit if it attempts a write operation when it is not active. In almost all cases, this is the best choice. The sshfence method uses SSH. If using custom fencers (that may communicate with shared store, power units, or network switches), use the shell to invoke them.
Related Name
dfs.ha.fencing.methods
Default Value
shell(true)
API Name
dfs_ha_fencing_methods
Required
false
Timeout for SSH Fencing Strategy🔗
Description
SSH connection timeout, in milliseconds, to use with the built-in sshfence fencer.
Related Name
dfs.ha.fencing.ssh.connect-timeout
Default Value
30 second(s)
API Name
dfs_ha_fencing_ssh_connect_timeout
Required
false
Private Keys for SSH Fencing Strategy🔗
Description
The SSH private key files to use with the built-in sshfence fencer. These are to be accessible to the hdfs user on the machines running the NameNodes.
Related Name
dfs.ha.fencing.ssh.private-key-files
Default Value
API Name
dfs_ha_fencing_ssh_private_key_files
Required
false
FailoverProxyProvider Class🔗
Description
Enter a FailoverProxyProvider implementation to configure two URIs to connect to during fail-over. The first configured address is tried first, and on a fail-over event the other address is tried.
Path to the directory where audit logs will be written. The directory will be created if it doesn't exist.
Related Name
audit_event_log_dir
Default Value
/var/log/hadoop-hdfs/audit
API Name
audit_event_log_dir
Required
false
Maximum Audit Log File Size🔗
Description
Maximum size of audit log file in MB before it is rolled over.
Related Name
navigator.audit_log_max_file_size
Default Value
100 MiB
API Name
navigator_audit_log_max_file_size
Required
false
Number of Audit Logs to Retain🔗
Description
Maximum number of rolled-over audit logs to retain. The logs are not deleted if they contain audit events that have not yet been propagated to the Audit Server.
Related Name
navigator.client.max_num_audit_log
Default Value
10
API Name
navigator_client_max_num_audit_log
Required
false
Monitoring🔗
Enable Log Event Capture🔗
Description
When set, each role identifies important log events and forwards them to Cloudera Manager.
Related Name
Default Value
true
API Name
catch_events
Required
false
Enable Service Level Health Alerts🔗
Description
When set, Cloudera Manager will send alerts when the health of this service reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold.
Related Name
Default Value
true
API Name
enable_alerts
Required
false
Enable Configuration Change Alerts🔗
Description
When set, Cloudera Manager will send alerts when this entity's configuration changes.
Related Name
Default Value
false
API Name
enable_config_alerts
Required
false
Failover Controllers Healthy🔗
Description
Enables the health check that verifies that the failover controllers associated with this service are healthy and running.
Related Name
Default Value
true
API Name
failover_controllers_healthy_enabled
Required
false
HDFS Health Canary Directory🔗
Description
The service monitor will use this directory to create files to test if the hdfs service is healthy. The directory and files are created with permissions specified by 'HDFS Health Canary Directory Permissions'.
Related Name
Default Value
/tmp/.cloudera_health_monitoring_canary_files
API Name
firehose_hdfs_canary_directory
Required
false
HDFS Health Canary Directory Permissions🔗
Description
The service monitor will use these permissions to create the directory and files to test if the HDFS service is healthy. Permissions are specified using the 10-character Unix symbolic format, for example '-rwxr-xr-x'.
Related Name
Default Value
-rwxrwxrwx
API Name
firehose_hdfs_canary_directory_permissions
Required
false
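The 10-character symbolic format can be checked before it is entered. The following is a minimal sketch, not part of Cloudera Manager: the helper name symbolic_to_mode is hypothetical, and it assumes only plain r, w, x, and - characters (no setuid, setgid, or sticky markers).

```python
import stat


def symbolic_to_mode(symbolic: str) -> int:
    """Convert a 10-character string such as '-rwxr-xr-x' to permission bits."""
    if len(symbolic) != 10:
        raise ValueError("expected exactly 10 characters, e.g. '-rwxr-xr-x'")
    mode = 0
    # Skip the leading file-type character; the other nine are three rwx triplets.
    for position, char in enumerate(symbolic[1:]):
        expected = "rwx"[position % 3]
        if char == expected:
            mode |= 1 << (8 - position)
        elif char != "-":
            raise ValueError(f"unexpected character {char!r} in {symbolic!r}")
    return mode


default_mode = symbolic_to_mode("-rwxrwxrwx")
print(oct(default_mode))  # 0o777
# Round-trip through the standard library as a sanity check.
print(stat.filemode(stat.S_IFREG | default_mode))  # -rwxrwxrwx
```

The default value '-rwxrwxrwx' corresponds to octal 777, so any user can read and write the canary files.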
Active NameNode Detection Window🔗
Description
The tolerance window that will be used in HDFS service tests that depend on detection of the active NameNode.
Related Name
Default Value
3 minute(s)
API Name
hdfs_active_namenode_detection_window
Required
false
Blocks With Corrupt Replicas Monitoring Thresholds🔗
Description
The health check thresholds of the number of blocks that have at least one corrupt replica. Specified as a percentage of the total number of blocks.
Related Name
Default Value
Warning: 0.5 %, Critical: 1.0 %
API Name
hdfs_blocks_with_corrupt_replicas_thresholds
Required
false
HDFS Canary Health Check🔗
Description
Enables the health check that verifies that a client can create, read, write, and delete files.
Related Name
Default Value
true
API Name
hdfs_canary_health_enabled
Required
false
Healthy DataNode Monitoring Thresholds🔗
Description
The health test thresholds of the overall DataNode health. The check returns "Concerning" health if the percentage of "Healthy" DataNodes falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" DataNodes falls below the critical threshold.
Related Name
Default Value
Warning: 95.0 %, Critical: 90.0 %
API Name
hdfs_datanodes_healthy_thresholds
Required
false
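The two thresholds described for hdfs_datanodes_healthy_thresholds can be read as the following decision rule. This is an illustrative sketch only: the function name and the "Good" and "Bad" labels are assumptions, and the exact comparison Cloudera Manager performs at the boundary values is not stated in this reference.

```python
def datanode_health(healthy_pct: float, concerning_pct: float,
                    warning: float = 95.0, critical: float = 90.0) -> str:
    """Classify overall DataNode health from the percentage of DataNodes in each state."""
    # Critical: Healthy plus Concerning DataNodes together fall below the critical threshold.
    if healthy_pct + concerning_pct < critical:
        return "Bad"
    # Warning: Healthy DataNodes alone fall below the warning threshold.
    if healthy_pct < warning:
        return "Concerning"
    return "Good"


print(datanode_health(healthy_pct=96.0, concerning_pct=2.0))  # Good
print(datanode_health(healthy_pct=92.0, concerning_pct=5.0))  # Concerning
print(datanode_health(healthy_pct=80.0, concerning_pct=5.0))  # Bad
```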
HDFS Free Space Monitoring Thresholds🔗
Description
The health check thresholds of free space in HDFS. Specified as a percentage of total HDFS capacity.
Related Name
Default Value
Warning: 20.0 %, Critical: 10.0 %
API Name
hdfs_free_space_thresholds
Required
false
Missing Block Monitoring Thresholds🔗
Description
The health check thresholds of the number of missing blocks. Specified as a percentage of the total number of blocks.
Related Name
Default Value
Warning: Never, Critical: Any
API Name
hdfs_missing_blocks_thresholds
Required
false
NameNode Activation Startup Tolerance🔗
Description
The amount of time after NameNode(s) start that the lack of an active NameNode will be tolerated. This is intended to allow either the auto-failover daemon to make a NameNode active, or a specifically issued failover command to take effect. This is an advanced option that does not often need to be changed.
Related Name
Default Value
3 minute(s)
API Name
hdfs_namenode_activation_startup_tolerance
Required
false
Active NameNode Role Health Check🔗
Description
When computing the overall HDFS cluster health, consider the health of the active NameNode.
Related Name
Default Value
true
API Name
hdfs_namenode_health_enabled
Required
false
Standby NameNode Health Check🔗
Description
When computing the overall HDFS cluster health, consider the health of the standby NameNode.
Related Name
Default Value
true
API Name
hdfs_standby_namenodes_health_enabled
Required
false
Under-replicated Block Monitoring Thresholds🔗
Description
The health check thresholds of the number of under-replicated blocks. Specified as a percentage of the total number of blocks.
Related Name
Default Value
Warning: 10.0 %, Critical: 40.0 %
API Name
hdfs_under_replicated_blocks_thresholds
Required
false
Erasure Coding Policy Verification Health Check🔗
Description
Enables the health test for verifying if the cluster topology supports all the enabled erasure coding policies.
Related Name
Default Value
false
API Name
hdfs_verify_ec_with_topology_enabled
Required
false
Log Event Retry Frequency🔗
Description
The frequency, in seconds, at which the log4j event publication appender retries sending undelivered log events to the Event Server.
Related Name
Default Value
30
API Name
log_event_retry_frequency
Required
false
Service Triggers🔗
Description
The configured triggers for this service. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
triggerName (mandatory) - The name of the trigger. This value must be unique for the specific service.
triggerExpression (mandatory) - A tsquery expression representing the trigger.
streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON formatted trigger fires if there are more than 10 DataNodes with more than 500 file descriptors opened:
[{"triggerName": "sample-trigger",
"triggerExpression": "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:bad",
"streamThreshold": 10, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
Related Name
Default Value
[]
API Name
service_triggers
Required
true
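A service_triggers value can be assembled and checked before it is pasted into the configuration. The sketch below uses only the field names listed in the description above; the helper names build_trigger and validate_triggers are hypothetical, and the checks cover only the two mandatory fields and name uniqueness within the list, not tsquery syntax.

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def build_trigger(name, expression, stream_threshold=0, enabled=True):
    """Build one trigger entry using the documented field names."""
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The documented example stores the flag as a string.
        "enabled": "true" if enabled else "false",
    }


def validate_triggers(triggers):
    """Reject entries that lack a mandatory field or reuse a trigger name."""
    seen = set()
    for trigger in triggers:
        for field in MANDATORY_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"missing mandatory field: {field}")
        if trigger["triggerName"] in seen:
            raise ValueError(f"duplicate triggerName: {trigger['triggerName']}")
        seen.add(trigger["triggerName"])
    return triggers


triggers = validate_triggers([
    build_trigger(
        "sample-trigger",
        "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:bad",
        stream_threshold=10,
    )
])
service_triggers_value = json.dumps(triggers)
print(service_triggers_value)
```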
Service Monitor Client Config Overrides🔗
Description
For advanced use only, a list of configuration properties that will be used by the Service Monitor instead of the current client configuration for the service.
Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve)🔗
Description
For advanced use only, a list of derived configuration properties that will be used by the Service Monitor instead of the default ones.
Related Name
Default Value
API Name
smon_derived_configs_safety_valve
Required
false
Other🔗
HDFS Block Size🔗
Description
The default block size in bytes for new HDFS files. Note that this value is also used as the HBase Region Server HLog block size.
Related Name
dfs.blocksize
Default Value
128 MiB
API Name
dfs_block_size
Required
false
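As a rough illustration of what the block size means for storage layout, the number of blocks a file occupies is its size divided by the block size, rounded up. This is a sketch under the default 128 MiB value; the function name is hypothetical.

```python
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024  # 128 MiB, the default for dfs.blocksize


def block_count(file_size_bytes: int, block_size: int = DEFAULT_BLOCK_SIZE) -> int:
    """Number of blocks needed for a file; the last block may be partially filled."""
    if file_size_bytes < 0 or block_size <= 0:
        raise ValueError("sizes must be non-negative and block size positive")
    # Integer ceiling division avoids floating-point rounding on large files.
    return -(-file_size_bytes // block_size)


one_gib = 1024 ** 3
print(block_count(one_gib))      # 8
print(block_count(one_gib + 1))  # 9
```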
Check HDFS Permissions🔗
Description
If false, permission checking is turned off for files in HDFS.
Related Name
dfs.permissions
Default Value
true
API Name
dfs_permissions
Required
false
Default Umask🔗
Description
Default umask for file and directory creation, specified as an octal value (with a leading 0).
Related Name
fs.permissions.umask-mode
Default Value
022
API Name
dfs_umaskmode
Required
false
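A umask removes permission bits from the mode requested at creation time. The sketch below shows the arithmetic for the default value 022, assuming the conventional requested modes of 666 for files and 777 for directories; the function name is hypothetical.

```python
def apply_umask(requested_mode: int, umask: int = 0o022) -> int:
    """Clear every permission bit that is set in the umask."""
    return requested_mode & ~umask & 0o777


# The configured value is an octal string with a leading 0.
umask = int("022", 8)

print(oct(apply_umask(0o666, umask)))  # 0o644, i.e. rw-r--r-- for new files
print(oct(apply_umask(0o777, umask)))  # 0o755, i.e. rwxr-xr-x for new directories
```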
Enable WebHDFS🔗
Description
Enable WebHDFS interface
Related Name
dfs.webhdfs.enabled
Default Value
true
API Name
dfs_webhdfs_enabled
Required
false
Serve logs over HTTP🔗
Description
Whether to serve logs over HTTP from HDFS web servers. This includes listing the logs directory at the /logs endpoint, which may be a security concern.
Related Name
hadoop.http.logs.enabled
Default Value
true
API Name
http_logs_enabled
Required
false
Compression Codecs🔗
Description
Comma-separated list of compression codecs that can be used in job or map compression.
The Key Management Server used by HDFS. This must be set to use encryption for data at rest.
Related Name
Default Value
API Name
kms_service
Required
false
Object Store Service🔗
Description
Select an Object Store service to enable cloud storage support. Once enabled, the cloud storage can be used in Impala and Hue services, via fully-qualified URIs.
Related Name
Default Value
API Name
object_store_service
Required
false
Ranger Plugin Trusted Proxy IP Address🔗
Description
Accepts a list of IP addresses of proxy servers for trusting.
Related Name
ranger.plugin.hdfs.trusted.proxy.ipaddress
Default Value
API Name
ranger_plugin_trusted_proxy_ipaddress
Required
false
Ranger Plugin Use X-Forwarded for IP Address🔗
Description
The parameter is used for identifying the originating IP address of a user connecting to a component through proxy for audit logs.
Related Name
ranger.plugin.hdfs.use.x-forwarded-for.ipaddress
Default Value
false
API Name
ranger_plugin_use_x_forwarded_for_ipaddress
Required
false
Set Rules to Map Kerberos Principals to Lower Case Short Names🔗
Description
Adds mapping rules to map Kerberos principals to lower case short names that will be inserted before the default rule. After changing this value and restarting the service, any services depending on this one must be restarted as well.
Related Name
Default Value
false
API Name
set_auth_to_local_to_lowercase
Required
false
ZooKeeper Service🔗
Description
Name of the ZooKeeper service that this HDFS service instance depends on
Related Name
Default Value
API Name
zookeeper_service
Required
false
Performance🔗
DataNode Local Path Access Users🔗
Description
Comma-separated list of users allowed to perform short-circuit reads. A short-circuit read allows a client co-located with the data to read HDFS file blocks directly from HDFS. If empty, defaults to the user of the DataNode process.
Related Name
dfs.block.local-path-access.user
Default Value
API Name
dfs_block_local_path_access_user
Required
false
HDFS File Block Storage Location Timeout🔗
Description
Timeout in milliseconds for the parallel RPCs made in DistributedFileSystem#getFileBlockStorageLocations(). This value is only emitted for Impala.
Enable HDFS short-circuit read. This allows a client colocated with the DataNode to read HDFS file blocks directly. This gives a performance boost to distributed clients that are aware of locality.
Related Name
dfs.client.read.shortcircuit
Default Value
true
API Name
dfs_datanode_read_shortcircuit
Required
false
UNIX Domain Socket path🔗
Description
Path on the DataNode's local file system to a UNIX domain socket used for communication between the DataNode and local HDFS clients. This socket is used for Short Circuit Reads. Only the HDFS System User and "root" should have write access to the parent directory and all of its ancestors. This property is supported in CDH 4.2 or later deployments.
Related Name
dfs.domain.socket.path
Default Value
/var/run/hdfs-sockets/dn
API Name
dfs_domain_socket_path
Required
false
FsImage Transfer Bandwidth🔗
Description
Maximum bandwidth used for image transfer in bytes per second. This can help keep normal NameNode operations responsive during checkpointing. A default value of 0 indicates that throttling is disabled.
Related Name
dfs.image.transfer.bandwidthPerSec
Default Value
0 B
API Name
dfs_image_transfer_bandwidthPerSec
Required
false
FsImage Transfer Socket Timeout🔗
Description
Socket timeout for the HttpURLConnection instance used in the image transfer. This is measured in milliseconds. This timeout prevents client hangs if the connection is idle for this configured timeout, during image transfer.
Related Name
dfs.image.transfer.timeout
Default Value
1 minute(s)
API Name
dfs_image_transfer_timeout
Required
false
Ports and Addresses🔗
Use DataNode Hostname🔗
Description
Typically, HDFS clients and servers communicate by opening sockets via an IP address. In certain networking configurations, it is preferable to open sockets after doing a DNS lookup on the hostname. Enable this property to open sockets after doing a DNS lookup on the hostname. This property is supported in CDH3u4 or later deployments.
Related Name
dfs.client.use.datanode.hostname
Default Value
false
API Name
dfs_client_use_datanode_hostname
Required
false
Proxy🔗
HDFS Proxy User Groups🔗
Description
Comma-delimited list of groups to allow the HDFS user to impersonate. The default '*' allows all groups. To disable entirely, use a string that does not correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.hdfs.groups
Default Value
*
API Name
hdfs_proxy_user_groups_list
Required
false
HDFS Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the HDFS user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.hdfs.hosts
Default Value
*
API Name
hdfs_proxy_user_hosts_list
Required
false
Hive Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the Hive user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.hive.groups
Default Value
*
API Name
hive_proxy_user_groups_list
Required
false
Hive Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Hive user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.hive.hosts
Default Value
*
API Name
hive_proxy_user_hosts_list
Required
false
HTTP Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the HTTP user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. This is used by WebHCat.
Related Name
hadoop.proxyuser.HTTP.groups
Default Value
*
API Name
HTTP_proxy_user_groups_list
Required
false
HTTP Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the HTTP user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. This is used by WebHCat.
Related Name
hadoop.proxyuser.HTTP.hosts
Default Value
*
API Name
HTTP_proxy_user_hosts_list
Required
false
HttpFS Proxy User Groups🔗
Description
Comma-delimited list of groups to allow the HttpFS user to impersonate. The default '*' allows all groups. To disable entirely, use a string that does not correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.httpfs.groups
Default Value
*
API Name
httpfs_proxy_user_groups_list
Required
false
HttpFS Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you allow the HttpFS user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.httpfs.hosts
Default Value
*
API Name
httpfs_proxy_user_hosts_list
Required
false
Hue Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the Hue user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.hue.groups
Default Value
*
API Name
hue_proxy_user_groups_list
Required
false
Hue Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Hue user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.hue.hosts
Default Value
*
API Name
hue_proxy_user_hosts_list
Required
false
Impala Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the Impala user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.impala.groups
Default Value
*
API Name
impala_proxy_user_groups_list
Required
false
Impala Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Impala user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.impala.hosts
Default Value
*
API Name
impala_proxy_user_hosts_list
Required
false
Knox Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the Knox user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.knox.groups
Default Value
*
API Name
knox_proxy_user_groups_list
Required
false
Knox Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Knox user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.knox.hosts
Default Value
*
API Name
knox_proxy_user_hosts_list
Required
false
Livy Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the Livy user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.livy.groups
Default Value
*
API Name
livy_proxy_user_groups_list
Required
false
Livy Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Livy user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.livy.hosts
Default Value
*
API Name
livy_proxy_user_hosts_list
Required
false
Oozie Proxy User Groups🔗
Description
Allows the oozie superuser to impersonate any members of a comma-delimited list of groups. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.oozie.groups
Default Value
*
API Name
oozie_proxy_user_groups_list
Required
false
Oozie Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the oozie user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.oozie.hosts
Default Value
*
API Name
oozie_proxy_user_hosts_list
Required
false
Phoenix Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the Phoenix user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.phoenix.groups
Default Value
*
API Name
phoenix_proxy_user_groups_list
Required
false
Phoenix Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Phoenix user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.phoenix.hosts
Default Value
*
API Name
phoenix_proxy_user_hosts_list
Required
false
Service Monitor Proxy User Groups🔗
Description
Allows the Cloudera Service Monitor user to impersonate any members of a comma-delimited list of groups. The default '*' allows all groups. This property is used only if Service Monitor is using a different Kerberos principal than the Hue service. To disable entirely, use a string that does not correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.smon.groups
Default Value
*
API Name
smon_proxy_user_groups_list
Required
false
Service Monitor Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Cloudera Service Monitor user to impersonate other users. The default '*' allows all hosts. This property is used only if Service Monitor is using a different Kerberos principal than the Hue service. To disable entirely, use a string that does not correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.smon.hosts
Default Value
*
API Name
smon_proxy_user_hosts_list
Required
false
Telemetry Publisher Proxy User Groups🔗
Description
Allows the Cloudera Telemetry Publisher user to impersonate any members of a comma-delimited list of groups. The default '*' allows all groups. This property is used only if Telemetry Publisher is using a different Kerberos principal than the Hue service. To disable entirely, use a string that does not correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.telepub.groups
Default Value
*
API Name
telepub_proxy_user_groups_list
Required
false
Telemetry Publisher Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the Cloudera Telemetry Publisher user to impersonate other users. The default '*' allows all hosts. This property is used only if Telemetry Publisher is using a different Kerberos principal than the Hue service. To disable entirely, use a string that does not correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.telepub.hosts
Default Value
*
API Name
telepub_proxy_user_hosts_list
Required
false
YARN Proxy User Groups🔗
Description
Comma-delimited list of groups that you want to allow the YARN user to impersonate. The default '*' allows all groups. To disable entirely, use a string that does not correspond to a group name, such as '_no_group_'.
Related Name
hadoop.proxyuser.yarn.groups
Default Value
*
API Name
yarn_proxy_user_groups_list
Required
false
YARN Proxy User Hosts🔗
Description
Comma-delimited list of hosts where you want to allow the YARN user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that does not correspond to a host name, such as '_no_host'.
Related Name
hadoop.proxyuser.yarn.hosts
Default Value
*
API Name
yarn_proxy_user_hosts_list
Required
false
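All of the proxy user settings above follow the same naming pattern, hadoop.proxyuser.<user>.groups and hadoop.proxyuser.<user>.hosts. The sketch below renders such a pair in the standard Hadoop configuration XML layout; the function name and the example group and host values are hypothetical, and Cloudera Manager normally generates these entries itself.

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(user: str, groups: str = "*", hosts: str = "*") -> ET.Element:
    """Build <property> entries for one proxy user inside a <configuration> element."""
    config = ET.Element("configuration")
    for suffix, value in (("groups", groups), ("hosts", hosts)):
        prop = ET.SubElement(config, "property")
        ET.SubElement(prop, "name").text = f"hadoop.proxyuser.{user}.{suffix}"
        ET.SubElement(prop, "value").text = value
    return config


# Restrict the hive user to two groups and one gateway host (illustrative values).
element = proxyuser_properties("hive", groups="etl,analysts", hosts="gateway1.example.com")
snippet = ET.tostring(element, encoding="unicode")
print(snippet)
```

Passing '_no_group_' or '_no_host' as the value gives the "disable entirely" behavior described in the entries above.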
Replication🔗
Maintenance State Minimal Block Replication🔗
Description
The minimum number of block replicas required to enter Maintenance State. If any block has less than the minimum number of block replicas, the DataNode cannot immediately enter Maintenance State.
Related Name
dfs.namenode.maintenance.replication.min
Default Value
1
API Name
dfs_maintenance_replication_min
Required
false
Replication Factor🔗
Description
Default block replication. The number of replications to make when the file is created. The default value is used if a replication number is not specified.
Related Name
dfs.replication
Default Value
3
API Name
dfs_replication
Required
false
Maximal Block Replication🔗
Description
The maximal block replication.
Related Name
dfs.replication.max
Default Value
512
API Name
dfs_replication_max
Required
false
Minimal Block Replication🔗
Description
The minimal block replication.
Related Name
dfs.namenode.replication.min
Default Value
1
API Name
dfs_replication_min
Required
false
Security🔗
DataNode Data Transfer Protection🔗
Description
SASL protection mode for secured connections to the DataNodes when reading or writing data. Value is the type of SASL protection to be used for secured connections to the DataNode when reading or writing block data. Possible values are 'authentication', 'integrity' and 'privacy'. authentication means authentication only and no integrity or privacy; integrity implies that only authentication and integrity are enabled; and privacy implies all of authentication, integrity and privacy are enabled. If "Enable Data Transfer Encryption" is set to true, then it supersedes the setting for this parameter and enforces that all connections must use a specialized encrypted SASL handshake. This property is ignored for connections to a DataNode listening on a privileged port. In this case, it is assumed that the use of a privileged port establishes sufficient trust.
Related Name
dfs.data.transfer.protection
Default Value
API Name
dfs_data_transfer_protection
Required
false
Enable Data Transfer Encryption🔗
Description
Enable encryption of data transfer between DataNodes and clients, and among DataNodes. When enabled, block data that is read/written from/to HDFS will be encrypted on the wire. For effective data transfer protection, enable Kerberos authentication and pick privacy for "Hadoop RPC Protection".
Related Name
dfs.encrypt.data.transfer
Default Value
false
API Name
dfs_encrypt_data_transfer
Required
false
Data Transfer Encryption Algorithm🔗
Description
Algorithm to encrypt data transfer between DataNodes and clients, and among DataNodes. If 3des or rc4 is chosen, the entire communication is encrypted with that algorithm. In CDH 5.4 and higher, if AES/CTR/NoPadding is chosen, 3des is used for the initial key exchange, and then AES/CTR/NoPadding is used for the transfer of data. This is the most secure option, and is recommended for clusters running CDH 5.4 or higher. It also requires that the "openssl-devel" package be installed on all machines in the cluster. When this parameter is changed, a full, non-rolling restart of the cluster must be performed.
Related Name
dfs.encrypt.data.transfer.algorithm
Default Value
rc4
API Name
dfs_encrypt_data_transfer_algorithm
Required
false
Data Transfer Cipher Suite Key Strength🔗
Description
If AES/CTR/NoPadding is chosen for the Data Transfer Encryption Algorithm, this specifies the length (in bits) of the AES key. When this parameter is changed, a full, non-rolling restart of the cluster must be performed.
Related Name
dfs.encrypt.data.transfer.cipher.key.bitlength
Default Value
256
API Name
dfs_encrypt_data_transfer_cipher_keybits
Required
false
Enable Access Control Lists🔗
Description
ACLs (Access Control Lists) enhance the existing HDFS permission model to support controlling file access for arbitrary combinations of users and groups instead of a single owner, single group, and all other users. When ACLs are disabled, the NameNode rejects all attempts to set an ACL.
Related Name
dfs.namenode.acls.enabled
Default Value
true
API Name
dfs_namenode_acls_enabled
Required
false
Superuser Group🔗
Description
The name of the group of superusers.
Related Name
dfs.permissions.superusergroup
Default Value
supergroup
API Name
dfs_permissions_supergroup
Required
false
Enable Ranger Authorization🔗
Description
Enable fine-grained security using Ranger. There should be only one Ranger service installed in the same cluster as HDFS: this Ranger service should have the DFS dependency set to this HDFS service.
Related Name
Default Value
false
API Name
enable_ranger_authorization
Required
false
Additional Rules to Map Kerberos Principals to Short Names🔗
Description
Additional mapping rules that will be inserted before rules generated from the list of trusted realms and before the default rule. After changing this value and restarting the service, any services depending on this one must be restarted as well. The hadoop.security.auth_to_local property is configured using this information. Default rules are generated by Cloudera Manager and substituted in place of the literal {DEFAULT_RULES} if it is specified in this value.
Related Name
Default Value
DEFAULT_RULES
API Name
extra_auth_to_local_rules
Required
false
Authorized Admin Groups🔗
Description
Comma-separated list of groups authorized to perform admin operations on Hadoop. This is emitted only if authorization is enabled.
Related Name
Default Value
API Name
hadoop_authorized_admin_groups
Required
false
Authorized Admin Users🔗
Description
Comma-separated list of users authorized to perform admin operations on Hadoop. This is emitted only if authorization is enabled.
Related Name
Default Value
*
API Name
hadoop_authorized_admin_users
Required
false
Authorized Groups🔗
Description
Comma-separated list of groups authorized to use Hadoop. This is emitted only if authorization is enabled.
Related Name
Default Value
API Name
hadoop_authorized_groups
Required
false
Authorized Users🔗
Description
Comma-separated list of users authorized to use Hadoop. This is emitted only if authorization is enabled.
Related Name
Default Value
*
API Name
hadoop_authorized_users
Required
false
Hadoop User Group Mapping Search Base🔗
Description
The search base for the LDAP connection. This is a distinguished name, and will typically be the root of the LDAP directory.
Related Name
hadoop.security.group.mapping.ldap.base
Default Value
API Name
hadoop_group_mapping_ldap_base
Required
false
Hadoop User Group Mapping LDAP Bind User Password🔗
Description
The password of the bind user.
Related Name
hadoop.security.group.mapping.ldap.bind.password
Default Value
API Name
hadoop_group_mapping_ldap_bind_passwd
Required
false
Hadoop User Group Mapping LDAP Bind User Distinguished Name🔗
Description
Distinguished name of the user to bind to AD as, for user authentication search/bind and group lookup for role authorization. For OpenLDAP-based directories this should be a DN string; for Active Directory this can be just a username, combined with the "Active Directory Domain" value for login. For example, username in this field and example.com in the Active Directory domain field results in the User Principal Name (UPN) value username@example.com being used to bind. If you enter a UPN value here, do not also configure the "Active Directory Domain" field; otherwise you will end up presenting username@example.com@example.com for binds.
AD accepts either a UPN value or a DN value as a valid Bind DN.
An example of a Distinguished Name (DN): CN=cdh admin,OU=svcaccount,DC=example,DC=com
An example of a UPN value: cdhadmin@example.com
Related Name
hadoop.security.group.mapping.ldap.bind.user
Default Value
API Name
hadoop_group_mapping_ldap_bind_user
Required
false
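The double-domain mistake described above can be caught with a small pre-check before the values are saved. This is an illustrative sketch only, not the logic Cloudera Manager or Hadoop applies: the function name is hypothetical, and it assumes that a value containing '=' is a DN and a value containing '@' is a UPN.

```python
def effective_bind_name(bind_user: str, ad_domain: str = "") -> str:
    """Return the name that would be presented for the bind, or raise on a double domain."""
    if "=" in bind_user:
        # Looks like a DN such as CN=cdh admin,OU=svcaccount,DC=example,DC=com.
        return bind_user
    if "@" in bind_user:
        if ad_domain:
            raise ValueError(
                "bind user is already a UPN; leave the Active Directory domain empty"
            )
        return bind_user
    # Plain username: combine with the domain when one is configured.
    return f"{bind_user}@{ad_domain}" if ad_domain else bind_user


print(effective_bind_name("username", "example.com"))  # username@example.com
print(effective_bind_name("cdhadmin@example.com"))     # cdhadmin@example.com
```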
Hadoop User Group Mapping LDAP Group Search Filter🔗
Description
An additional filter to use when searching for groups.
Hadoop User Group Mapping LDAP TLS/SSL Truststore🔗
Description
File path to a JKS-format truststore containing the TLS/SSL certificate used to sign the LDAP server's certificate. Note that in previous releases this was erroneously referred to as a "keystore".
Related Name
hadoop.security.group.mapping.ldap.ssl.keystore
Default Value
API Name
hadoop_group_mapping_ldap_keystore
Required
false
Hadoop User Group Mapping LDAP TLS/SSL Truststore Password🔗
Hadoop User Group Mapping LDAP Group Membership Attribute🔗
Description
The attribute of the group object that identifies the users that are members of the group. The default will usually be appropriate for any LDAP installation.
The URL of the LDAP server. The URL must be prefixed with ldap:// or ldaps://. The URL can optionally specify a custom port if necessary; by default, ldap:// connects to port 389 and ldaps:// connects to port 636. Note that passwords are sent in the clear if ldap:// is used, and by fall 2020 Active Directory servers will no longer allow non-LDAPS connections to bind to AD hosts with LDAP signing enabled. See Microsoft knowledge document 935834 for more information.
Related Name
hadoop.security.group.mapping.ldap.url
Default Value
API Name
hadoop_group_mapping_ldap_url
Required
false
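The URL rules above (required ldap:// or ldaps:// prefix, default ports 389 and 636) can be expressed as a short validation helper. This is a sketch using only the Python standard library; the function name and the example host names are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"ldap": 389, "ldaps": 636}


def parse_ldap_url(url: str):
    """Return (host, port, uses_tls) for an LDAP server URL."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        raise ValueError("URL must be prefixed with ldap:// or ldaps://")
    if not parts.hostname:
        raise ValueError("URL must include a host name")
    # Fall back to the scheme's default port when none is given.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port, parts.scheme == "ldaps"


print(parse_ldap_url("ldaps://ad.example.com"))      # ('ad.example.com', 636, True)
print(parse_ldap_url("ldap://ad.example.com:3268"))  # ('ad.example.com', 3268, False)
```

A result whose third element is False indicates that bind passwords would travel unencrypted.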
Hadoop User Group Mapping LDAP TLS/SSL Enabled🔗
Description
Whether or not to use TLS/SSL when connecting to the LDAP server.
Related Name
hadoop.security.group.mapping.ldap.use.ssl
Default Value
false
API Name
hadoop_group_mapping_ldap_use_ssl
Required
false
Hadoop User Group Mapping LDAP User Search Filter🔗
Description
An additional filter to use when searching for LDAP users. The default will usually be appropriate for Active Directory installations. If connecting to a generic LDAP server, 'sAMAccountName' will likely be replaced with 'uid'. {0} is a special string used to denote where the username fits into the filter.
The domain to use for the HTTP cookie that stores the authentication token. For authentication to work correctly across the web consoles of all Hadoop nodes, the domain must be set correctly. Important: when using IP addresses, browsers ignore cookies with domain settings. For this setting to work properly, all nodes in the cluster must be configured to generate URLs with hostname.domain names.
Related Name
Default Value
API Name
hadoop_http_auth_cookie_domain
Required
false
Hadoop RPC Protection🔗
Description
Quality of protection for secured RPC connections between NameNode and HDFS clients. For effective RPC protection, enable Kerberos authentication.
Related Name
hadoop.rpc.protection
Default Value
authentication
API Name
hadoop_rpc_protection
Required
false
Enable Kerberos Authentication for HTTP Web-Consoles🔗
Description
Enables Kerberos authentication for Hadoop HTTP web consoles for all roles of this service using the SPNEGO protocol. Note: This is effective only if Kerberos is enabled for the HDFS service.
Related Name
Default Value
false
API Name
hadoop_secure_web_ui
Required
false
Hadoop Secure Authentication🔗
Description
The authentication mechanism used by Hadoop.
Related Name
hadoop.security.authentication
Default Value
simple
API Name
hadoop_security_authentication
Required
false
Hadoop Secure Authorization🔗
Description
Enable authorization
Related Name
hadoop.security.authorization
Default Value
false
API Name
hadoop_security_authorization
Required
false
Hadoop User Group Mapping Implementation🔗
Description
Class for user to group mapping (get groups for a given user).
The length, in bits, of the keys that the KeyProvider produces. Key length defines the upper bound on an algorithm's security; ideally, it coincides with the lower bound on the algorithm's security.
Related Name
hadoop.security.key.default.bitlength
Default Value
128
API Name
hdfs_encryption_key_length
Required
false
Hadoop TLS/SSL Enabled🔗
Description
Enable TLS/SSL encryption for HDFS, MapReduce, and YARN web UIs, as well as encrypted shuffle for MapReduce and YARN.
Related Name
hadoop.ssl.enabled
Default Value
false
API Name
hdfs_hadoop_ssl_enabled
Required
false
HDFS User to Impersonate🔗
Description
The user that the management services impersonate when connecting to HDFS. If no value is specified, the HDFS superuser is used.
Related Name
Default Value
API Name
hdfs_user_to_impersonate
Required
false
Hue's Kerberos Principal Short Name🔗
Description
The short name of the Hue Kerberos principal. Normally, you do not need to specify this configuration. Cloudera Manager auto-configures this property so that Hue and the Cloudera Management Service work properly.
Related Name
hue.kerberos.principal.shortname
Default Value
API Name
hue_kerberos_principal_shortname
Required
false
Kerberos Principal🔗
Description
Kerberos principal short name used by all roles of this service.
Related Name
Default Value
hdfs
API Name
kerberos_princ_name
Required
true
Ranger DFS Audit Path🔗
Description
The DFS path on which Ranger audits are written. The special placeholder '${ranger_base_audit_url}' should be used as the prefix, in order to use the centralized location defined in the Ranger service.
Related Name
xasecure.audit.destination.hdfs.dir
Default Value
$ranger_base_audit_url/hdfs
API Name
ranger_audit_hdfs_dir
Required
false
Ranger Audit DFS Spool Dir🔗
Description
Spool directory for Ranger audits being written to DFS.
The directory where Ranger security policies are cached locally.
Related Name
ranger.plugin.hdfs.policy.cache.dir
Default Value
/var/lib/ranger/hdfs/policy-cache
API Name
ranger_policy_cache_dir
Required
false
Log and Query Redaction Policy🔗
Description
Note: Do not edit this property in the classic layout. Switch to the new layout to use preconfigured redaction rules and test your rules inline. Use this property to define a list of rules to be followed for redacting sensitive information from log files and query strings. Click + to add a new redaction rule. You can choose one of the preconfigured rules or add a custom rule. When specifying a custom rule, the Search field should contain a regular expression that will be matched against the data. If a match is found, it is replaced by the contents of the Replace field. Trigger is an optional field. It can be used to specify a simple string to be searched in the data. If the string is found, the redactor attempts to find a match for the Search regex. If no trigger is specified, redaction occurs by matching the Search regular expression. Use the Trigger field to enhance performance: simple string matching is faster than regular expression matching. Test your rules by entering sample text into the Test Redaction Rules text box and clicking Test Redaction. If no rules match, the text you entered is returned unchanged.
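The interplay of the Trigger, Search, and Replace fields can be approximated with the Python sketch below. The class and function names are hypothetical, the rules are assumed to be applied in order, and Python's re module stands in for whatever regular expression engine the redactor actually uses, so treat it as a model of the described behavior rather than the real implementation.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RedactionRule:
    """One rule with the three fields named in the description."""
    search: str                    # regular expression matched against the data
    replace: str                   # text substituted for each match
    trigger: Optional[str] = None  # optional plain string checked first


def redact(text: str, rules: List[RedactionRule]) -> str:
    """Apply each rule in turn; text is returned unchanged if nothing matches."""
    for rule in rules:
        # Cheap substring test first: skip the regex when the trigger is absent.
        if rule.trigger is not None and rule.trigger not in text:
            continue
        text = re.sub(rule.search, rule.replace, text)
    return text


if __name__ == "__main__":
    rules = [
        RedactionRule(
            search=r"password=\S+",
            replace="password=REDACTED",
            trigger="password=",
        )
    ]
    print(redact("login password=hunter2 ok", rules))
    print(redact("nothing sensitive here", rules))
```

The trigger check is what provides the performance benefit mentioned above: lines that do not contain the trigger string never reach the regular expression.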
Path to the TLS/SSL client truststore file. Defines a cluster-wide default that can be overridden by individual services. This truststore must be in JKS format. The truststore contains certificates of trusted servers, or of Certificate Authorities trusted to identify servers. The contents of the truststore can be modified without restarting any roles. By default, changes to its contents are picked up within ten seconds. If not set, the default Java truststore is used to verify certificates.
Password for the TLS/SSL client truststore. Defines a cluster-wide default that can be overridden by individual services.
Related Name
ssl.client.truststore.password
Default Value
API Name
ssl_client_truststore_password
Required
false
Hadoop TLS/SSL Server Keystore Key Password🔗
Description
Password that protects the private key contained in the server keystore used for encrypted shuffle and encrypted web UIs. Applies to all configurations of daemon roles of this service.
Related Name
ssl.server.keystore.keypassword
Default Value
API Name
ssl_server_keystore_keypassword
Required
false
Hadoop TLS/SSL Server Keystore File Location🔗
Description
Path to the keystore file containing the server certificate and private key used for encrypted shuffle and encrypted web UIs. Applies to configurations of all daemon roles of this service.
Related Name
ssl.server.keystore.location
Default Value
API Name
ssl_server_keystore_location
Required
false
Hadoop TLS/SSL Server Keystore File Password🔗
Description
Password for the server keystore file used for encrypted shuffle and encrypted web UIs. Applies to configurations of all daemon roles of this service.
Related Name
ssl.server.keystore.password
Default Value
API Name
ssl_server_keystore_password
Required
false
SSL/TLS Cipher Suite🔗
Description
The SSL/TLS cipher suites to use. "Modern 2018" is a modern set of cipher suites as of 2018, according to the Mozilla server-side TLS recommendations. These cipher suites use strong cryptography and are preferred unless interaction with older clients is required. These modern cipher suites are compatible with Firefox 27, Chrome 22, Internet Explorer 11, Opera 14, Safari 7, Android 4.4, and Java 8. "Intermediate 2018" is an intermediate set of cipher suites as of 2018, according to the Mozilla server-side TLS recommendations. Select the Intermediate 2018 cipher suites if you require compatibility with a wider range of clients, legacy browsers, or older Linux tools.
Related Name
ssl.server.exclude.cipher.list
Default Value
modern2018
API Name
tls_ciphers
Required
false
Trusted Kerberos Realms🔗
Description
List of Kerberos realms that Hadoop services should trust. If empty, defaults to the default_realm property configured in the krb5.conf file. After changing this value and restarting the service, all services depending on this service must also be restarted. Adds mapping rules for each domain to the hadoop.security.auth_to_local property in core-site.xml.
Whether to suppress configuration warnings produced by the Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the Failover Controller Environment Advanced Configuration Snippet (Safety Valve) configuration validator.
Whether to suppress configuration warnings produced by the Failover Controller Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the HDFS Client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the HDFS Client Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh configuration validator.
Whether to suppress configuration warnings produced by the HttpFS Advanced Configuration Snippet (Safety Valve) for httpfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the HttpFS Advanced Configuration Snippet (Safety Valve) for core-site.xml configuration validator.
Whether to suppress configuration warnings produced by the HttpFS Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the JournalNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Related Name
Default Value
false
API Name
role_config_suppression_jn_config_safety_valve
Required
true
Suppress Configuration Validator: Java Configuration Options for JournalNode🔗
Description
Whether to suppress configuration warnings produced by the Java Configuration Options for JournalNode configuration validator.
Whether to suppress configuration warnings produced by the JournalNode Environment Advanced Configuration Snippet (Safety Valve) configuration validator.
Whether to suppress configuration warnings produced by the NameNode Advanced Configuration Snippet (Safety Valve) for dfs_all_hosts.txt configuration validator.
Whether to suppress configuration warnings produced by the NameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_allow.txt configuration validator.
Whether to suppress configuration warnings produced by the NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_exclude.txt configuration validator.
Suppress Configuration Validator: Validates Nameservices do not conflict between base and compute clusters.🔗
Description
Whether to suppress configuration warnings produced by the Validates Nameservices do not conflict between base and compute clusters. configuration validator.
Whether to suppress configuration warnings produced by the NFS Gateway Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the NFS Gateway Environment Advanced Configuration Snippet (Safety Valve) configuration validator.
Whether to suppress configuration warnings produced by the NameNode Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml configuration validator.
Whether to suppress configuration warnings produced by the SecondaryNameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml configuration validator.
Whether to suppress configuration warnings produced by the SecondaryNameNode Environment Advanced Configuration Snippet (Safety Valve) configuration validator.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Replication Factor parameter.
Related Name
Default Value
false
API Name
service_config_suppression_dfs_replication
Required
true
Suppress Parameter Validation: Additional Rules to Map Kerberos Principals to Short Names🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Additional Rules to Map Kerberos Principals to Short Names parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP Bind User Password🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP Bind User Password parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP Bind User Distinguished Name🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP Bind User Distinguished Name parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP Group Search Filter🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP Group Search Filter parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP Group Name Attribute🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP Group Name Attribute parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP TLS/SSL Truststore🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP TLS/SSL Truststore parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP TLS/SSL Truststore Password🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP TLS/SSL Truststore Password parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP Group Membership Attribute🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP Group Membership Attribute parameter.
Suppress Parameter Validation: Hadoop User Group Mapping LDAP User Search Filter🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop User Group Mapping LDAP User Search Filter parameter.
Suppress Parameter Validation: HDFS Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Replication Advanced Configuration Snippet (Safety Valve) for core-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Replication Environment Advanced Configuration Snippet (Safety Valve) for hadoop-env.sh parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Replication Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Suppress Parameter Validation: HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Service Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Snapshot Shell Command Environment Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Advanced Configuration Snippet (Safety Valve) for ssl-client.xml parameter.
Suppress Parameter Validation: HDFS Service Advanced Configuration Snippet (Safety Valve) for ssl-server.xml🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Service Advanced Configuration Snippet (Safety Valve) for ssl-server.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties parameter.
Suppress Parameter Validation: HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-audit.xml🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-audit.xml parameter.
Suppress Parameter Validation: HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-policymgr-ssl.xml🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-policymgr-ssl.xml parameter.
Suppress Parameter Validation: HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Cluster-Wide Default TLS/SSL Client Truststore Location parameter.
Whether to suppress configuration warnings produced by the built-in parameter validation for the Cluster-Wide Default TLS/SSL Client Truststore Password parameter.
Suppress Parameter Validation: Hadoop TLS/SSL Server Keystore Key Password🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop TLS/SSL Server Keystore Key Password parameter.
Suppress Parameter Validation: Hadoop TLS/SSL Server Keystore File Location🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop TLS/SSL Server Keystore File Location parameter.
Suppress Parameter Validation: Hadoop TLS/SSL Server Keystore File Password🔗
Description
Whether to suppress configuration warnings produced by the built-in parameter validation for the Hadoop TLS/SSL Server Keystore File Password parameter.
Whether to suppress the results of the Corrupt Blocks health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the HDFS Canary health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
service_health_suppression_hdfs_canary_health
Required
true
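The API names listed for these suppression properties can be set programmatically. The Python sketch below only builds a request body in the name/value list shape used by the Cloudera Manager REST API for configuration updates; the helper name is hypothetical, and the endpoint, API version, and authentication are deliberately left out because they are not covered on this page.

```python
import json
from typing import Dict, Iterable, List


def suppression_payload(api_names: Iterable[str], suppress: bool = True) -> Dict[str, List[Dict[str, str]]]:
    """Build a config-update body that sets each suppression flag.

    Hypothetical helper; values are sent as the strings "true" or "false".
    """
    value = "true" if suppress else "false"
    return {"items": [{"name": name, "value": value} for name in api_names]}


if __name__ == "__main__":
    payload = suppression_payload(
        [
            "service_health_suppression_hdfs_canary_health",
            "service_health_suppression_hdfs_missing_blocks",
        ]
    )
    # The body would be sent to the service configuration endpoint of a
    # Cloudera Manager deployment (assumption: not shown here).
    print(json.dumps(payload, indent=2))
```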
Suppress Health Test: DataNode Health🔗
Description
Whether to suppress the results of the DataNode Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Failover Controllers Health🔗
Description
Whether to suppress the results of the Failover Controllers Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the NameNode Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Whether to suppress the results of the Missing Blocks health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Related Name
Default Value
false
API Name
service_health_suppression_hdfs_missing_blocks
Required
true
Suppress Health Test: Under-Replicated Blocks🔗
Description
Whether to suppress the results of the Under-Replicated Blocks health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.
Suppress Health Test: Erasure Coding Policy Verification Test🔗
Description
Whether to suppress the results of the Erasure Coding Policy Verification Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts.