HDFS
Role groups:
Balancer Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Balancer Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml for this role only. | balancer_config_safety_valve | false | |||
Java Configuration Options for Balancer | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | balancer_java_opts | false |
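Safety valve entries such as the one above use the standard hdfs-site.xml property syntax. A minimal sketch of the expected XML fragment; the property name below is a hypothetical placeholder, not a real Hadoop key:

```xml
<!-- Pasted verbatim into this role's hdfs-site.xml by Cloudera Manager. -->
<property>
  <!-- hypothetical placeholder key, for illustration only -->
  <name>dfs.example.setting</name>
  <value>example-value</value>
</property>
```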
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
Rules to Extract Events from Log Files | This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has some or all of the following fields: alert, rate, periodminutes, threshold, content, and exceptiontype. | {"version": 0, "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.IOException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketClosedException"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.EOFException"}, {"alert": false, "rate": 0, "exceptiontype": "java.nio.channels.CancelledKeyException"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Unknown job [^ ]+ being deleted.*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Error executing shell command .+ No such process.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".*attempt to override final parameter.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "[^ ]+ is a deprecated filesystem name. Use.*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} | log_event_whitelist | false |
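For instance, a single custom rule in the same format as the defaults above; this sketch would send an alert (rather than just an event) on any FATAL message, at most once per minute:

```json
{"alert": true, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}
```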
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Rebalancing Threshold | The percentage deviation from average utilization, after which a node will be rebalanced. (for example, '10.0' for 10%) | 10.0 % | rebalancer_threshold | false | ||
Rebalancing Policy | The policy that should be used to rebalance HDFS storage. The default DataNode policy balances the storage at the DataNode level. This is similar to the balancing policy from prior releases. The BlockPool policy balances the storage at the block pool level as well as at the DataNode level. The BlockPool policy is relevant only to a Federated HDFS service. | DataNode | rebalancing_policy | false |
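Outside Cloudera Manager, these two settings correspond to flags of the stock HDFS balancer tool. A sketch, assuming a shell with the hdfs binary on the PATH:

```sh
hdfs balancer -threshold 10.0 -policy datanode
```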
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Heap Size of Balancer in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 1 GiB | balancer_java_heapsize | false |
DataNode Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml for this role only. | datanode_config_safety_valve | false | |||
Java Configuration Options for DataNode | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled | datanode_java_opts | false | ||
Available Space Policy Balanced Preference | Only used when the DataNode Volume Choosing Policy is set to Available Space. Controls what percentage of new block allocations will be sent to volumes with more available disk space than others. This setting should be in the range 0.0 - 1.0; in practice 0.5 - 1.0, since there is no reason to prefer that volumes with less available disk space receive more block allocations. | dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction | 0.75 | dfs_datanode_available_space_balanced_preference | true |
Available Space Policy Balanced Threshold | Only used when the DataNode Volume Choosing Policy is set to Available Space. Controls how much DataNode volumes are allowed to differ in terms of bytes of free disk space before they are considered imbalanced. If the free space of all the volumes is within this range of each other, the volumes are considered balanced and block assignments are done on a pure round-robin basis. | dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold | 10 GiB | dfs_datanode_available_space_balanced_threshold | true |
DataNode Volume Choosing Policy | DataNode Policy for picking which volume should get a new block. The Available Space policy is only available starting with CDH 4.3. | dfs.datanode.fsdataset.volume.choosing.policy | org.apache.hadoop.hdfs.server.datanode.fsdataset.RoundRobinVolumeChoosingPolicy | dfs_datanode_volume_choosing_policy | true | |
Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) | Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics2. Properties will be inserted into hadoop-metrics2.properties. | hadoop_metrics2_safety_valve | false | |||
DataNode Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | |||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If the directory already exists, the role user must have write access to it. If the directory is shared among multiple roles, it should have 1777 permissions. Note that the heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | /tmp | oom_heap_dump_dir | false | ||
Dump Heap When Out of Memory | When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | ||
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | ||
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | true | process_auto_restart | true |
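The three Available Space entries above map to plain hdfs-site.xml properties. A sketch of the equivalent XML, with values mirroring the defaults listed above (the 10 GiB threshold expressed in bytes; the Available Space class name is the upstream Hadoop one):

```xml
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <!-- 10 GiB in bytes -->
  <value>10737418240</value>
</property>
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```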
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Log Directory | Directory where DataNode will place its log files. | hadoop.log.dir | /var/log/hadoop-hdfs | datanode_log_dir | false | |
DataNode Logging Threshold | The minimum log level for DataNode logs | INFO | log_threshold | false | ||
DataNode Maximum Log File Backups | The maximum number of rolled log files to keep for DataNode logs. Typically used by log4j. | 10 | max_log_backup_index | false | ||
DataNode Max Log Size | The maximum size, in megabytes, per log file for DataNode logs. Typically used by log4j. | 200 MiB | max_log_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Block Count Thresholds | The health test thresholds of the number of blocks on a DataNode | Warning: 200000.0, Critical: Never | datanode_block_count_thresholds | false | ||
DataNode Connectivity Health Test | Enables the health test that verifies the DataNode is connected to the NameNode | true | datanode_connectivity_health_enabled | false | ||
DataNode Connectivity Tolerance at Startup | The amount of time to wait for the DataNode to fully start up and connect to the NameNode before enforcing the connectivity check. | 3 minute(s) | datanode_connectivity_tolerance | false | ||
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | datanode_fd_thresholds | false | ||
DataNode Free Space Monitoring Thresholds | The health test thresholds of free space in a DataNode. Specified as a percentage of the capacity on the DataNode. | Warning: 20.0 %, Critical: 10.0 % | datanode_free_space_thresholds | false | ||
Garbage Collection Duration Thresholds | The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. | Warning: 30.0, Critical: 60.0 | datanode_gc_duration_thresholds | false | ||
Garbage Collection Duration Monitoring Period | The period to review when computing the moving average of garbage collection time. | 5 minute(s) | datanode_gc_duration_window | false | ||
DataNode Host Health Test | When computing the overall DataNode health, consider the host's health. | true | datanode_host_health_enabled | false | ||
DataNode Process Health Test | Enables the health test that the DataNode's process state is consistent with the role configuration | true | datanode_scm_health_enabled | false | ||
DataNode Volume Failures Thresholds | The health test thresholds of failed volumes in a DataNode. | Warning: Never, Critical: Any | datanode_volume_failures_thresholds | false | ||
Web Metric Collection | Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. | true | datanode_web_metric_collection_enabled | false | ||
Web Metric Collection Duration | The health test thresholds on the duration of the metrics request to the web server. | Warning: 10 second(s), Critical: Never | datanode_web_metric_collection_thresholds | false | ||
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | false | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | ||
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | ||
Rules to Extract Events from Log Files | This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has some or all of the following fields: alert, rate, periodminutes, threshold, content, and exceptiontype. | {"version": 0, "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.IOException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketClosedException"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.EOFException"}, {"alert": false, "rate": 0, "exceptiontype": "java.nio.channels.CancelledKeyException"}, {"alert": false, "rate": 1, "periodminutes": 5, "content": "Datanode registration failed"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Got a command from standby NN - ignoring command:.*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Unknown job [^ ]+ being deleted.*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Error executing shell command .+ No such process.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".*attempt to override final parameter.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "[^ ]+ is a deprecated filesystem name. Use.*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} | log_event_whitelist | false | ||
Role Triggers | The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: triggerName, triggerExpression, streamThreshold, and enabled. For example: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red", "streamThreshold": 0, "enabled": "true"}] (a formatted version of this sample appears after this table). Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future, and as a result backward compatibility is not guaranteed between releases at this time. | [] | role_triggers | true | ||
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | ||
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
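Written out as a complete role_triggers value, the sample from the Role Triggers row above looks like this (the trigger name and the 1500-descriptor threshold are illustrative):

```json
[
  {
    "triggerName": "sample-trigger",
    "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red",
    "streamThreshold": 0,
    "enabled": "true"
  }
]
```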
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Data Directory | Comma-delimited list of directories on the local file system where the DataNode stores HDFS block data. Typical values are /data/N/dfs/dn for N = 1, 2, 3... These directories should be mounted using the noatime option and the disks should be configured using JBOD. RAID is not recommended. | dfs.datanode.data.dir | dfs_data_dir_list | true | ||
Reserved Space for Non DFS Use | Reserved space in bytes per volume for non Distributed File System (DFS) use. | dfs.datanode.du.reserved | 10 GiB | dfs_datanode_du_reserved | false | |
DataNode Failed Volumes Tolerated | The number of volumes that are allowed to fail before a DataNode stops offering service. By default, any volume failure will cause a DataNode to shut down. | dfs.datanode.failed.volumes.tolerated | 0 | dfs_datanode_failed_volumes_tolerated | false |
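Following the "Typical values" note in the DataNode Data Directory row above, a sketch of the corresponding hdfs-site.xml entry for three JBOD-mounted disks:

```xml
<property>
  <name>dfs.datanode.data.dir</name>
  <!-- comma-delimited list, one directory per physical disk -->
  <value>/data/1/dfs/dn,/data/2/dfs/dn,/data/3/dfs/dn</value>
</property>
```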
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Balancing Bandwidth | Maximum amount of bandwidth that each DataNode can use for balancing. Specified in bytes per second. | dfs.datanode.balance.bandwidthPerSec | 10 MiB | dfs_balance_bandwidthPerSec | false | |
Enable purging cache after reads | In some workloads, the data read from HDFS is known to be large enough that it is unlikely to be useful to cache in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is delivered to the client. This may improve performance for some workloads by freeing buffer cache space for more cacheable data. This behavior will always be disabled for workloads that read only short sections of a block (e.g., HBase random-IO workloads). This property is supported in CDH3u3 or later deployments. | dfs.datanode.drop.cache.behind.reads | false | dfs_datanode_drop_cache_behind_reads | false | |
Enable purging cache after writes | In some workloads, the data written to HDFS is known to be large enough that it is unlikely to be useful to cache in the operating system buffer cache. In this case, the DataNode may be configured to automatically purge all data from the buffer cache after it is written to disk. This may improve performance for some workloads by freeing buffer cache space for more cacheable data. This property is supported in CDH3u3 or later deployments. | dfs.datanode.drop.cache.behind.writes | false | dfs_datanode_drop_cache_behind_writes | false | |
Handler Count | The number of server threads for the DataNode. | dfs.datanode.handler.count | 3 | dfs_datanode_handler_count | false | |
Maximum Number of Transfer Threads | Specifies the maximum number of threads to use for transferring data in and out of the DataNode. | dfs.datanode.max.transfer.threads | 4096 | dfs_datanode_max_xcievers | false | |
Number of read ahead bytes | While reading block files, the DataNode can use the posix_fadvise system call to explicitly page data into the operating system buffer cache ahead of the current reader's position. This can improve performance especially when disks are highly contended. This configuration specifies the number of bytes ahead of the current read position which the DataNode will attempt to read ahead. A value of 0 disables this feature. This property is supported in CDH3u3 or later deployments. | dfs.datanode.readahead.bytes | 4 MiB | dfs_datanode_readahead_bytes | false | |
Enable immediate enqueuing of data to disk after writes | If this configuration is enabled, the DataNode will instruct the operating system to enqueue all written data to the disk immediately after it is written. This differs from the usual OS policy which may wait for up to 30 seconds before triggering writeback. This may improve performance for some workloads by smoothing the IO profile for data written to disk. This property is supported in CDH3u3 or later deployments. | dfs.datanode.sync.behind.writes | false | dfs_datanode_sync_behind_writes | false | |
Hue Thrift Server Max Threadcount | Maximum number of running threads for the Hue Thrift server running on each DataNode | dfs.thrift.threads.max | 20 | dfs_thrift_threads_max | false | |
Hue Thrift Server Min Threadcount | Minimum number of running threads for the Hue Thrift server running on each DataNode | dfs.thrift.threads.min | 10 | dfs_thrift_threads_min | false | |
Hue Thrift Server Timeout | Timeout in seconds for the Hue Thrift server running on each DataNode | dfs.thrift.timeout | 60 | dfs_thrift_timeout | false | |
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Plugins
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Plugins | Comma-separated list of DataNode plug-ins to be activated. If one plug-in cannot be loaded, all the plug-ins are ignored. | dfs.datanode.plugins | dfs_datanode_plugins_list | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Bind DataNode to Wildcard Address | If enabled, the DataNode binds to the wildcard address ("0.0.0.0") on all of its ports. | false | dfs_datanode_bind_wildcard | false | ||
DataNode HTTP Web UI Port | Port for the DataNode HTTP web UI. Combined with the DataNode's hostname to build its HTTP address. | dfs.datanode.http.address | 50075 | dfs_datanode_http_port | false | |
Secure DataNode Web UI Port (SSL) | The base port where the secure DataNode web UI listens. Combined with the DataNode's hostname to build its secure web UI address. | dfs.datanode.https.address | 50475 | dfs_datanode_https_port | false | |
DataNode Protocol Port | Port for the various DataNode Protocols. Combined with the DataNode's hostname to build its IPC port address. | dfs.datanode.ipc.address | 50020 | dfs_datanode_ipc_port | false | |
DataNode Transceiver Port | Port for DataNode's XCeiver Protocol. Combined with the DataNode's hostname to build its address. | dfs.datanode.address | 50010 | dfs_datanode_port | false | |
Use DataNode Hostname | Whether DataNodes should use DataNode hostnames when connecting to DataNodes for data transfer. This property is supported in CDH3u4 or later deployments. | dfs.datanode.use.datanode.hostname | false | dfs_datanode_use_datanode_hostname | false |
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Heap Size of DataNode in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 1 GiB | datanode_java_heapsize | false | ||
Maximum Memory Used for Caching | The maximum amount of memory a DataNode may use to cache data blocks in memory. Setting it to zero will disable caching. | dfs.datanode.max.locked.memory | 4 GiB | dfs_datanode_max_locked_memory | false | |
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true | |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true | |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
Security
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Data Directory Permissions | Permissions for the directories on the local file system where the DataNode stores its blocks. The permissions must be octal. 755 and 700 are typical values. | dfs.datanode.data.dir.perm | 700 | dfs_datanode_data_dir_perm | false |
Failover Controller Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Configuration Options for Failover Controller | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | failover_controller_java_opts | false | |||
Failover Controller Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml for this role only. | fc_config_safety_valve | false | |||
Failover Controller Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | |||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If the directory already exists, the role user must have write access to it. If the directory is shared among multiple roles, it should have 1777 permissions. Note that the heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | /tmp | oom_heap_dump_dir | false | ||
Dump Heap When Out of Memory | When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | ||
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | ||
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true |
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Failover Controller Log Directory | Directory where Failover Controller will place its log files. | /var/log/hadoop-hdfs | failover_controller_log_dir | false | ||
Failover Controller Logging Threshold | The minimum log level for Failover Controller logs | INFO | log_threshold | false | ||
Failover Controller Maximum Log File Backups | The maximum number of rolled log files to keep for Failover Controller logs. Typically used by log4j. | 10 | max_log_backup_index | false | ||
Failover Controller Max Log Size | The maximum size, in megabytes, per log file for Failover Controller logs. Typically used by log4j. | 200 MiB | max_log_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | failovercontroller_fd_thresholds | false | ||
Failover Controller Host Health Test | When computing the overall Failover Controller health, consider the host's health. | true | failovercontroller_host_health_enabled | false | ||
Failover Controller Process Health Test | Enables the health test that the Failover Controller's process state is consistent with the role configuration | true | failovercontroller_scm_health_enabled | false | ||
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | ||
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | ||
Rules to Extract Events from Log Files | This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has some or all of the following fields: alert, rate, periodminutes, threshold, content, and exceptiontype. | {"version": 0, "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} | log_event_whitelist | false | ||
Role Triggers | The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: triggerName, triggerExpression, streamThreshold, and enabled. For example: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red", "streamThreshold": 0, "enabled": "true"}]. Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future, and as a result backward compatibility is not guaranteed between releases at this time. | [] | role_triggers | true | ||
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | ||
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Heap Size of Failover Controller in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 256 MiB | failover_controller_java_heapsize | false | ||
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true | |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true | |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
Gateway Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Client Java Configuration Options | These are Java command line arguments. Commonly, garbage collection flags or extra debugging flags would be passed here. | -Djava.net.preferIPv4Stack=true | hbase_client_java_opts | false | ||
HDFS Client Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into the client configuration for hdfs-site.xml. | hdfs_client_config_safety_valve | false | |||
HDFS Client Environment Advanced Configuration Snippet for hadoop-env.sh (Safety Valve) | For advanced use only, key-value pairs (one on each line) to be inserted into the client configuration for hadoop-env.sh. | hdfs_client_env_safety_valve | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false |
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Alternatives Priority | The priority level that the client configuration will have in the Alternatives system on the hosts. Higher priority levels will cause Alternatives to prefer this configuration over any others. | 90 | client_config_priority | true | ||
Use Trash | Move deleted files to the trash so that they can be recovered if necessary. This client-side configuration takes effect only if the HDFS service-wide trash is disabled (NameNode Filesystem Trash Interval set to 0) and is ignored otherwise. The trash is not automatically emptied when enabled with this configuration. | false | dfs_client_use_trash | false |
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable HDFS Short Circuit Read | Enable HDFS short circuit read. This allows a client co-located with the DataNode to read HDFS file blocks directly. This gives a performance boost to distributed clients that are aware of locality. | dfs.client.read.shortcircuit | false | dfs_client_read_shortcircuit | false |
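A sketch of the hdfs-site.xml fragment this maps to. Note that enabling short-circuit reads on CDH also typically requires a DataNode domain socket path; the dfs.domain.socket.path property and the path below are stated as assumptions, not taken from this reference:

```xml
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
<property>
  <!-- assumption: a shared UNIX domain socket path used by client and DataNode -->
  <name>dfs.domain.socket.path</name>
  <value>/var/run/hdfs-sockets/dn</value>
</property>
```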
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Client Java Heap Size in Bytes | Maximum size for the Java process heap memory. Passed to Java -Xmx. Measured in bytes. | 256 MiB | hdfs_client_java_heapsize | false |
HttpFS Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
HttpFS Advanced Configuration Snippet (Safety Valve) for httpfs-site.xml | For advanced use only, a string to be inserted into httpfs-site.xml for this role only. | httpfs_config_safety_valve | false | |||
Java Configuration Options for HttpFS | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | httpfs_java_opts | false | |||
System Group | The group that the HttpFS server process should run as. | httpfs | httpfs_process_groupname | true | ||
System User | The user that the HttpFS server process should run as. | httpfs | httpfs_process_username | true | ||
HttpFS Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | |||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If the directory already exists, the role user must have write access to it. If the directory is shared among multiple roles, it should have 1777 permissions. Note that the heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | /tmp | oom_heap_dump_dir | false | ||
Dump Heap When Out of Memory | When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | ||
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | ||
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true |
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
HttpFS Log Directory | Directory where HttpFS will place its log files. | hadoop.log.dir | /var/log/hadoop-httpfs | httpfs_log_dir | false | |
HttpFS Logging Threshold | The minimum log level for HttpFS logs | INFO | log_threshold | false | ||
HttpFS Maximum Log File Backups | The maximum number of rolled log files to keep for HttpFS logs. Typically used by log4j. | 10 | max_log_backup_index | false | ||
HttpFS Max Log Size | The maximum size, in megabytes, per log file for HttpFS logs. Typically used by log4j. | 200 MiB | max_log_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | httpfs_fd_thresholds | false | ||
HttpFS Host Health Test | When computing the overall HttpFS health, consider the host's health. | true | httpfs_host_health_enabled | false | ||
HttpFS Process Health Test | Enables the health test that the HttpFS server's process state is consistent with the role configuration | true | httpfs_scm_health_enabled | false | ||
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | ||
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | ||
Role Triggers | The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: triggerName, triggerExpression, streamThreshold, and enabled. For example: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red", "streamThreshold": 0, "enabled": "true"}]. Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future, and as a result backward compatibility is not guaranteed between releases at this time. | [] | role_triggers | true | ||
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | ||
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Administration Port | The port for the administration interface. | hdfs.httpfs.admin.port | 14001 | hdfs_httpfs_admin_port | false | |
HTTP Port | The HTTP port where the REST interface to HDFS is available. | hdfs.httpfs.http.port | 14000 | hdfs_httpfs_http_port | false |
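Since HttpFS exposes the WebHDFS REST API on the HTTP port, a quick smoke test against the defaults above might look like the following; the hostname and user are hypothetical:

```sh
curl "http://httpfs01.example.com:14000/webhdfs/v1/tmp?op=LISTSTATUS&user.name=hdfs"
```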
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Heap Size of HttpFS in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 256 MiB | httpfs_java_heapsize | false | ||
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true | |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true | |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
Security
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Signature Secret | The secret to use for signing client authentication tokens. | hdfs.httpfs.signature.secret | ****** | hdfs_httpfs_signature_secret | true |
JournalNode Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
JournalNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml for this role only. | jn_config_safety_valve | false | |||
Java Configuration Options for JournalNode | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | journalNode_java_opts | false | |||
JournalNode Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | |||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If the directory already exists, the role user must have write access to it. If the directory is shared among multiple roles, it should have 1777 permissions. Note that the heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | /tmp | oom_heap_dump_dir | false | ||
Dump Heap When Out of Memory | When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | ||
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | ||
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | true | process_auto_restart | true |
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
JournalNode Log Directory | Directory where JournalNode will place its log files. | /var/log/hadoop-hdfs | journalnode_log_dir | false | ||
JournalNode Logging Threshold | The minimum log level for JournalNode logs | INFO | log_threshold | false | ||
JournalNode Maximum Log File Backups | The maximum number of rolled log files to keep for JournalNode logs. Typically used by log4j. | 10 | max_log_backup_index | false | ||
JournalNode Max Log Size | The maximum size, in megabytes, per log file for JournalNode logs. Typically used by log4j. | 200 MiB | max_log_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
Edits Directory Free Space Monitoring Absolute Thresholds | The health check thresholds for monitoring of free space on the filesystem that contains the JournalNode's edits directory. | Warning: 10 GiB, Critical: 5 GiB | journalnode_edits_directory_free_space_absolute_thresholds | false | ||
Edits Directory Free Space Monitoring Percentage Thresholds | The health check thresholds for monitoring of free space on the filesystem that contains the JournalNode's edits directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if an Edits Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | journalnode_edits_directory_free_space_percentage_thresholds | false | ||
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | journalnode_fd_thresholds | false | ||
Garbage Collection Duration Thresholds | The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. | Warning: 30.0, Critical: 60.0 | journalnode_gc_duration_thresholds | false | ||
Garbage Collection Duration Monitoring Period | The period to review when computing the moving average of garbage collection time. | 5 minute(s) | journalnode_gc_duration_window | false | ||
JournalNode Host Health Test | When computing the overall JournalNode health, consider the host's health. | true | journalnode_host_health_enabled | false | ||
JournalNode Process Health Test | Enables the health test that the JournalNode's process state is consistent with the role configuration | true | journalnode_scm_health_enabled | false | ||
Active NameNode Sync Status Health Check | Enables the health check that verifies the active NameNode's sync status to the JournalNode | true | journalnode_sync_status_enabled | false | ||
Active NameNode Sync Status Startup Tolerance | The amount of time at JournalNode startup allowed for the active NameNode to get in sync with the JournalNode. | 3 minute(s) | journalnode_sync_status_startup_tolerance | false | ||
Web Metric Collection | Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. | true | journalnode_web_metric_collection_enabled | false | ||
Web Metric Collection Duration | The health test thresholds on the duration of the metrics request to the web server. | Warning: 10 second(s), Critical: Never | journalnode_web_metric_collection_thresholds | false | ||
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | ||
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | ||
Rules to Extract Events from Log Files | This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has some or all of the following fields: alert, rate, periodminutes, threshold, content, and exceptiontype. | {"version": 0, "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} | log_event_whitelist | false | ||
Role Triggers | The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: triggerName, triggerExpression, streamThreshold, and enabled. For example: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red", "streamThreshold": 0, "enabled": "true"}]. Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future, and as a result backward compatibility is not guaranteed between releases at this time. | [] | role_triggers | true | ||
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | ||
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
JournalNode Edits Directory | Directory on the local file system where the NameNode's edits are written. | dfs.journalnode.edits.dir | dfs_journalnode_edits_dir | true |
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
JournalNode HTTP Port | Port for the JournalNode's HTTP web UI. Combined with the JournalNode's hostname to build its HTTP address. | dfs.journalnode.http-address | 8480 | dfs_journalnode_http_port | false | |
JournalNode RPC Port | Port for the JournalNode's RPC. Combined with the JournalNode's hostname to build its RPC address. | dfs.journalnode.rpc-address | 8485 | dfs_journalnode_rpc_port | false | |
Bind JournalNode to Wildcard Address | If enabled, the JournalNode binds to the wildcard address ("0.0.0.0") on all of its ports. | false | journalnode_bind_wildcard | false |
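For context, the RPC port above is the one embedded in the NameNode's shared edits URI when quorum-based edits storage is configured. A sketch with hypothetical hostnames and nameservice; dfs.namenode.shared.edits.dir is a NameNode-side property and is not listed in this group:

```xml
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster</value>
</property>
```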
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Heap Size of JournalNode in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 256 MiB | journalNode_java_heapsize | false | ||
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true | |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true | |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
NFS Gateway Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
NFS Gateway Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | |||
NFS Gateway Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml for this role only. | nfsgateway_config_safety_valve | false | |||
Java Configuration Options for NFS Gateway | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | nfsgateway_java_opts | false | |||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If the directory already exists, the role user must have write access to it. If the directory is shared among multiple roles, it should have 1777 permissions. Note that the heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | /tmp | oom_heap_dump_dir | false | ||
Dump Heap When Out of Memory | When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true | ||
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | ||
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true |
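The safety-valve entries in the table above accept raw hdfs-site.xml fragments, that is, one or more <property> elements. A minimal sketch, assuming you wanted to override the NFS dump directory for this role only; the value shown is purely illustrative, not a recommendation:

```xml
<property>
  <name>dfs.nfs3.dump.dir</name>
  <value>/data/1/.hdfs-nfs</value>
</property>
```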
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
NFS Gateway Logging Threshold | The minimum log level for NFS Gateway logs | INFO | log_threshold | false | ||
NFS Gateway Maximum Log File Backups | The maximum number of rolled log files to keep for NFS Gateway logs. Typically used by log4j. | 10 | max_log_backup_index | false | ||
NFS Gateway Max Log Size | The maximum size, in megabytes, per log file for NFS Gateway logs. Typically used by log4j. | 200 MiB | max_log_size | false | ||
NFS Gateway Log Directory | Directory where NFS Gateway will place its log files. | hadoop.log.dir | /var/log/hadoop-hdfs | nfsgateway_log_dir | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | ||
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | ||
Rules to Extract Events from Log Files | This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has some or all of the following fields: alert (whether matching messages should generate an alert), rate (the maximum number of matching messages to send as events per period), periodminutes (the length of that period, in minutes), threshold (apply the rule only to messages at or above this log level), content (a regular expression matched against the message text), and exceptiontype (the class name of an exception appearing in the message). A standalone example rule appears after this table. | | {"version": 0, "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} | log_event_whitelist | false |
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | nfsgateway_fd_thresholds | false | ||
NFS Gateway Host Health Test | When computing the overall NFS Gateway health, consider the host's health. | true | nfsgateway_host_health_enabled | false | ||
NFS Gateway Process Health Test | Enables the health test that the NFS Gateway's process state is consistent with the role configuration | true | nfsgateway_scm_health_enabled | false | ||
Role Triggers | The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: triggerName (the name of the trigger; must be unique for the same entity), triggerExpression (a tsquery expression representing the trigger), streamThreshold (the maximum number of streams that can satisfy the trigger's condition before the action is taken), and enabled (whether the trigger is enabled). For example: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red", "streamThreshold": 0, "enabled": "true"}] Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future, and as a result backward compatibility is not guaranteed between releases at this time. | | [] | role_triggers | true |
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | ||
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
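As an illustration of the rule format described in Rules to Extract Events from Log Files above, the following is a single example rule; the content pattern is hypothetical and the field semantics are as glossed in that entry. Under those assumptions, it would send at most one event every two minutes for WARN-or-higher messages matching the pattern:

```json
{
  "alert": false,
  "rate": 1,
  "periodminutes": 2,
  "threshold": "WARN",
  "content": ".*mount operation failed.*"
}
```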
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Temporary Dump Directory | NFS clients often reorder writes. As a result, sequential writes can arrive at the NFS Gateway in random order. This directory is used to temporarily save out-of-order writes before writing to HDFS. For each file, the out-of-order writes are dumped after they accumulate to exceed a certain threshold (for example, 1 MB) in memory. Make sure this directory has enough space. For example, if an application uploads 10 files of 100 MB each, it is recommended that this directory have roughly 1 GB of space to cover the worst case, in which the writes to every file arrive out of order. | dfs.nfs3.dump.dir | /tmp/.hdfs-nfs | dfs_nfs3_dump_dir | false |
Allowed Hosts and Privileges | By default, NFS Gateway exported directories can be mounted by any client. For better access control, update this property with a list of host names and access privileges separated by whitespace characters. The host name format can be a single host, a Java regular expression, or an IPv4 address. The access privilege uses rw to specify read-write access and ro to specify read-only access; if the access privilege is not provided, the default is read-only. Examples of host name format and access privilege: "192.168.0.0/22 rw", "host.*.example.com", "host1.test.org ro". A combined example appears after this table. | dfs.nfs.exports.allowed.hosts | * rw | dfs_nfs_exports_allowed_hosts | false |
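A sketch of an Allowed Hosts and Privileges value combining the formats listed above. The host names are placeholders, and separating multiple entries with ';' is an assumption here; verify the entry separator against your Hadoop version:

```
192.168.0.0/22 rw;host.*.example.com;host1.test.org ro
```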
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
NFS Gateway MountD Port | The port number of the mount daemon implemented inside the NFS Gateway server role. | nfs3.mountd.port | 4242 | nfs3_mountd_port | false | |
Portmap (or Rpcbind) Port | The port number of the system portmap or rpcbind service. This configuration is used by Cloudera Manager to verify that the system portmap or rpcbind service is running before starting the NFS Gateway role. Cloudera Manager does not manage the system portmap or rpcbind service. | 111 | nfs3_portmap_port | false |
NFS Gateway Server Port | The NFS Gateway server port. | nfs3.server.port | 2049 | nfs3_server_port | false |
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Heap Size of NFS Gateway in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 256 MiB | nfsgateway_java_heapsize | false | ||
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true | |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true | |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages count toward the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages count toward the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
NameNode Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Automatic Failover | Enable Automatic Failover to maintain High Availability. Requires a ZooKeeper service and a High Availability NameNode partner. | dfs.ha.automatic-failover.enabled | false | autofailover_enabled | false | |
NameNode Nameservice | Nameservice of this NameNode. The Nameservice represents the interface to this NameNode and its High Availability partner. The Nameservice also represents the namespace associated with a federated NameNode. | dfs_federation_namenode_nameservice | false | |||
Invalidate Work Percentage Per Iteration | Determines the percentage of block invalidations (deletes) to issue in a single DataNode heartbeat deletion command. The final deletion count is determined by applying this percentage to the number of live nodes in the system; the result is the number of blocks from the deletion list chosen for invalidation in a single heartbeat of a single DataNode. | dfs.namenode.invalidate.work.pct.per.iteration | 0.32 | dfs_namenode_invalidate_work_pct_per_iteration | false |
Quorum-based Storage Journal name | Name of the journal located on each JournalNode's filesystem. | dfs_namenode_quorum_journal_name | false |
Replication Work Multiplier Per Iteration | Determines the total number of block transfers to begin in parallel at a DataNode for replication, when such a command list is sent over a DataNode heartbeat by the NameNode. The actual number is obtained by multiplying this value by the total number of live nodes in the cluster; the result is the number of blocks to transfer immediately, per DataNode heartbeat. | dfs.namenode.replication.work.multiplier.per.iteration | 2 | dfs_namenode_replication_work_multiplier_per_iteration | false |
Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) | Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics2. Properties will be inserted into hadoop-metrics2.properties. | hadoop_metrics2_safety_valve | false | |||
NameNode Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | |||
NameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml for this role only. | namenode_config_safety_valve | false | |||
NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_allow.txt | For advanced use only, a string to be inserted into dfs_hosts_allow.txt for this role only. A format example appears after this table. | namenode_hosts_allow_safety_valve | false |
NameNode Advanced Configuration Snippet (Safety Valve) for dfs_hosts_exclude.txt | For advanced use only, a string to be inserted into dfs_hosts_exclude.txt for this role only. | namenode_hosts_exclude_safety_valve | false | |||
Java Configuration Options for NameNode | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled | namenode_java_opts | false | ||
Mountpoints | Mountpoints that are mapped to this NameNode's Nameservice. | / | nameservice_mountpoints | false | ||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it doesn't exist. However, if this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. Note that the heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | /tmp | oom_heap_dump_dir | false |
Dump Heap When Out of Memory | When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true |
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | ||
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true |
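The dfs_hosts_allow.txt and dfs_hosts_exclude.txt safety valves above take a plain list of DataNode hosts, one per line. A minimal sketch with placeholder host names:

```
datanode1.example.com
datanode2.example.com
```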
Checkpointing
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Filesystem Checkpoint Period | The time between two periodic file system checkpoints. An hdfs-site.xml example combining this setting with the transaction threshold appears after this table. | dfs.namenode.checkpoint.period | 1 hour(s) | fs_checkpoint_period | false |
Filesystem Checkpoint Transaction Threshold | The number of transactions after which the NameNode or SecondaryNameNode will create a checkpoint of the namespace, regardless of whether the checkpoint period has expired. | dfs.namenode.checkpoint.txns | 1000000 | fs_checkpoint_txns | false |
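Expressed as raw hdfs-site.xml (for example, via a safety valve), the two checkpoint settings above look like the following sketch; whichever limit is reached first triggers a checkpoint. The values shown are the defaults from the table, with the period converted to seconds:

```xml
<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value> <!-- 1 hour, expressed in seconds -->
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value> <!-- checkpoint after 1,000,000 transactions -->
</property>
```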
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
NameNode Logging Threshold | The minimum log level for NameNode logs | INFO | log_threshold | false | ||
NameNode Maximum Log File Backups | The maximum number of rolled log files to keep for NameNode logs. Typically used by log4j. | 10 | max_log_backup_index | false | ||
NameNode Max Log Size | The maximum size, in megabytes, per log file for NameNode logs. Typically used by log4j. | 200 MiB | max_log_size | false | ||
NameNode Log Directory | Directory where NameNode will place its log files. | hadoop.log.dir | /var/log/hadoop-hdfs | namenode_log_dir | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | ||
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | ||
Rules to Extract Events from Log Files | This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has some or all of the following fields: alert (whether matching messages should generate an alert), rate (the maximum number of matching messages to send as events per period), periodminutes (the length of that period, in minutes), threshold (apply the rule only to messages at or above this log level), content (a regular expression matched against the message text), and exceptiontype (the class name of an exception appearing in the message). | | {"version": 0, "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.IOException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketClosedException"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.EOFException"}, {"alert": false, "rate": 0, "exceptiontype": "java.nio.channels.CancelledKeyException"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Unknown job [^ ]+ being deleted.*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Error executing shell command .+ No such process.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".*attempt to override final parameter.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "[^ ]+ is a deprecated filesystem name. Use.*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}, {"alert": false, "rate": 1, "threshold": "INFO", "content": "Triggering checkpoint.*"}]} | log_event_whitelist | false |
Filesystem Checkpoint Age Monitoring Thresholds | The health test thresholds of the age of the HDFS namespace checkpoint. Specified as a percentage of the configured checkpoint interval. | Warning: 200.0 %, Critical: 400.0 % | namenode_checkpoint_age_thresholds | false | ||
Filesystem Checkpoint Transactions Monitoring Thresholds | The health test thresholds of the number of transactions since the last HDFS namespace checkpoint. Specified as a percentage of the configured checkpointing transaction limit. | Warning: 200.0 %, Critical: 400.0 % | namenode_checkpoint_transactions_thresholds | false | ||
Data Directories Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystems that contain this role's data directories. | Warning: 10 GiB, Critical: 5 GiB | namenode_data_directories_free_space_absolute_thresholds | false | ||
Data Directories Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystems that contain this role's data directories. Specified as a percentage of the capacity on the filesystem. This setting is not used if a Data Directories Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | namenode_data_directories_free_space_percentage_thresholds | false | ||
NameNode Directory Failures Thresholds | The health test thresholds of failed status directories in a NameNode. | Warning: Never, Critical: Any | namenode_directory_failures_thresholds | false | ||
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | namenode_fd_thresholds | false | ||
Garbage Collection Duration Thresholds | The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. | Warning: 30.0, Critical: 60.0 | namenode_gc_duration_thresholds | false | ||
Garbage Collection Duration Monitoring Period | The period to review when computing the moving average of garbage collection time. | 5 minute(s) | namenode_gc_duration_window | false | ||
NameNode Host Health Test | When computing the overall NameNode health, consider the host's health. | true | namenode_host_health_enabled | false | ||
NameNode Out-Of-Sync JournalNodes Thresholds | The health check thresholds for the number of out-of-sync JournalNodes for this NameNode. | Warning: Never, Critical: Any | namenode_out_of_sync_journal_nodes_thresholds | false | ||
NameNode RPC Latency Thresholds | The health check thresholds of the NameNode's RPC latency. | Warning: 1 second(s), Critical: 5 second(s) | namenode_rpc_latency_thresholds | false | ||
NameNode RPC Latency Monitoring Window | The period to review when computing the moving average of the NameNode's RPC latency. | 5 minute(s) | namenode_rpc_latency_window | false | ||
NameNode Safemode Health Test | Enables the health test that the NameNode is not in safemode | true | namenode_safe_mode_enabled | false | ||
NameNode Process Health Test | Enables the health test that the NameNode's process state is consistent with the role configuration | true | namenode_scm_health_enabled | false | ||
Health Check Startup Tolerance | The amount of time allowed after this role is started that failures of health checks that rely on communication with this role will be tolerated. | 5 minute(s) | namenode_startup_tolerance | false | ||
HDFS Upgrade Status Health Test | Enables the health test of the upgrade status of the NameNode. | true | namenode_upgrade_status_enabled | false | ||
Web Metric Collection | Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. | true | namenode_web_metric_collection_enabled | false | ||
Web Metric Collection Duration | The health test thresholds on the duration of the metrics request to the web server. | Warning: 10 second(s), Critical: Never | namenode_web_metric_collection_thresholds | false | ||
Role Triggers | The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: triggerName (the name of the trigger; must be unique for the same entity), triggerExpression (a tsquery expression representing the trigger), streamThreshold (the maximum number of streams that can satisfy the trigger's condition before the action is taken), and enabled (whether the trigger is enabled). For example: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red", "streamThreshold": 0, "enabled": "true"}] A pretty-printed version of this sample appears after this table. Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future, and as a result backward compatibility is not guaranteed between releases at this time. | | [] | role_triggers | true |
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | ||
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
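For readability, the sample trigger from the Role Triggers entry above, pretty-printed; the tsquery expression and the 1500-descriptor threshold are illustrative only:

```json
[
  {
    "triggerName": "sample-trigger",
    "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red",
    "streamThreshold": 0,
    "enabled": "true"
  }
]
```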
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Access Time Precision | The access time for an HDFS file is precise up to this value. Setting a value of 0 disables access times for HDFS. When using the NFS Gateway role, make sure this property is enabled. | dfs.access.time.precision | 1 hour(s) | dfs_access_time_precision | false |
NameNode Data Directories | Determines where on the local file system the NameNode should store the name table (fsimage). For redundancy, enter a comma-delimited list of directories to replicate the name table in all of the directories. Typical values are /data/N/dfs/nn where N=1..3 (see the example following this table). | dfs.namenode.name.dir | dfs_name_dir_list | true |
Restore NameNode Directories at Checkpoint Time | If set to false and one of the replicas of the NameNode storage fails, such as a temporary failure of NFS, this directory is not used until the NameNode restarts. If enabled, failed storage is re-checked on every checkpoint and, if it becomes valid, the NameNode will try to restore the edits and fsimage. | dfs.namenode.name.dir.restore | false | dfs_name_dir_restore | false |
NameNode Edits Directories | Directories on the local file system to store the NameNode edits. If not set, the edits are stored in the NameNode's Data Directories. The value of this configuration is automatically generated to be the Quorum-based Storage URI if there are JournalNodes and this NameNode is not Highly Available. | dfs.namenode.edits.dir | dfs_namenode_edits_dir | false | ||
Shared Edits Directory | Directory on a shared storage device, such as a Quorum-based Storage URI or a local directory that is an NFS mount from a NAS, to store the NameNode edits. The value of this configuration is automatically generated to be the Quorum Journal URI if there are JournalNodes and this NameNode is Highly Available. | dfs.namenode.shared.edits.dir | dfs_namenode_shared_edits_dir | false |
Safemode Extension | Determines extension of safemode in milliseconds after the threshold level is reached. | dfs.namenode.safemode.extension | 30 second(s) | dfs_safemode_extension | false | |
Safemode Minimum DataNodes | Specifies the number of DataNodes that must be live before the NameNode exits safemode. Enter a value less than or equal to 0 to ignore the number of live DataNodes when deciding whether to remain in safemode during startup. Values greater than the number of DataNodes in the cluster will make safemode permanent. | dfs.safemode.min.datanodes | 0 | dfs_safemode_min_datanodes | false |
Filesystem Trash Interval | Number of minutes between trash checkpoints. Also controls the number of minutes after which a trash checkpoint directory is deleted. To disable the trash feature, enter 0. | fs.trash.interval | 1 day(s) | fs_trash_interval | false | |
Topology Script File Name | Full path to a custom topology script on the host file system. The topology script is used to determine the rack location of nodes. If left blank, a topology script will be provided that uses your hosts' rack information, visible in the "Hosts" page. | net.topology.script.file.name | topology_script_file_name | false |
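A sketch of the NameNode Data Directories value as it would appear in hdfs-site.xml, following the typical /data/N/dfs/nn pattern from the table above; the paths are placeholders, and each directory should sit on a separate physical disk for the redundancy to be meaningful:

```xml
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/data/1/dfs/nn,/data/2/dfs/nn,/data/3/dfs/nn</value>
</property>
```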
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
NameNode Handler Count | The number of server threads for the NameNode. | dfs.namenode.handler.count | 30 | dfs_namenode_handler_count | false | |
NameNode Service Handler Count | The number of server threads for the NameNode used for service calls. Only used when NameNode Service RPC Port is configured. | dfs.namenode.service.handler.count | 30 | dfs_namenode_service_handler_count | false | |
Hue Thrift Server Max Threadcount | Maximum number of running threads for the Hue Thrift server running on the NameNode | dfs.thrift.threads.max | 20 | dfs_thrift_threads_max | false | |
Hue Thrift Server Min Threadcount | Minimum number of running threads for the Hue Thrift server running on the NameNode | dfs.thrift.threads.min | 10 | dfs_thrift_threads_min | false | |
Hue Thrift Server Timeout | Timeout in seconds for the Hue Thrift server running on the NameNode | dfs.thrift.timeout | 60 | dfs_thrift_timeout | false | |
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Plugins
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
NameNode Plugins | Comma-separated list of NameNode plug-ins to be activated. If one plug-in cannot be loaded, all the plug-ins are ignored. | dfs.namenode.plugins | dfs_namenode_plugins_list | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
NameNode Web UI Port | The base port where the DFS NameNode web UI listens. If the port number is 0, then the server starts on a free port. Combined with the NameNode's hostname to build its HTTP address. | dfs.namenode.http-address | 50070 | dfs_http_port | false | |
Secure NameNode Web UI Port (SSL) | The base port where the secure NameNode web UI listens. | dfs.https.port | 50470 | dfs_https_port | false | |
NameNode Service RPC Port | Optional port for the service-rpc address which can be used by HDFS daemons instead of sharing the RPC address used by the clients. | dfs.namenode.servicerpc-address | dfs_namenode_servicerpc_address | false | ||
Bind NameNode to Wildcard Address | If enabled, the NameNode binds to the wildcard address ("0.0.0.0") on all of its ports. | false | namenode_bind_wildcard | false | ||
NameNode Port | The port where the NameNode runs the HDFS protocol. Combined with the NameNode's hostname to build its address (see the example following this table). | fs.defaultFS | 8020 | namenode_port | false |
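As described in the NameNode Port entry above, the port is combined with the NameNode's hostname to form the filesystem URI that clients use. A sketch of the resulting core-site.xml value, with a placeholder hostname and the default port:

```xml
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://namenode.example.com:8020</value>
</property>
```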
Replication
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Safemode Threshold Percentage | Specifies the percentage of blocks that should satisfy the minimal replication requirement defined by dfs.replication.min. Enter a value less than or equal to 0 to exit safemode without waiting for any particular percentage of blocks. Values greater than 1 will make safemode permanent (see the worked example following this table). | dfs.namenode.safemode.threshold-pct | 0.999 | dfs_safemode_threshold_pct | false |
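As a worked example: with the default threshold of 0.999, a NameNode tracking 1,000,000 blocks stays in safemode until at least 999,000 of them satisfy the minimal replication requirement. A threshold of 0 or below lets the NameNode exit safemode without waiting for any blocks, and a threshold above 1 can never be satisfied, making safemode permanent.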
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Java Heap Size of Namenode in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 1 GiB | namenode_java_heapsize | false | ||
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true | |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true | |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages count toward the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages count toward the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
SecondaryNameNode Default Group
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
SecondaryNameNode Nameservice | Nameservice of this SecondaryNameNode | dfs_secondarynamenode_nameservice | false | |||
Hadoop Metrics2 Advanced Configuration Snippet (Safety Valve) | Advanced Configuration Snippet (Safety Valve) for Hadoop Metrics2. Properties will be inserted into hadoop-metrics2.properties. | hadoop_metrics2_safety_valve | false | |||
SecondaryNameNode Logging Advanced Configuration Snippet (Safety Valve) | For advanced use only, a string to be inserted into log4j.properties for this role only. | log4j_safety_valve | false | |||
Heap Dump Directory | Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it doesn't exist. However, if this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. Note that the heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. | /tmp | oom_heap_dump_dir | false |
Dump Heap When Out of Memory | When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown. | false | oom_heap_dump_enabled | true |
Kill When Out of Memory | When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. | true | oom_sigkill_enabled | true | ||
Automatically Restart Process | When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. | false | process_auto_restart | true | ||
SecondaryNameNode Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml for this role only. | secondarynamenode_config_safety_valve | false | |||
Java Configuration Options for Secondary NameNode | These arguments will be passed as part of the Java command line. Commonly, garbage collection flags or extra debugging flags would be passed here. | -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled | secondarynamenode_java_opts | false |
Checkpointing
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Filesystem Checkpoint Period | The time between two periodic file system checkpoints. | dfs.namenode.checkpoint.period | 1 hour(s) | fs_checkpoint_period | false | |
Filesystem Checkpoint Transaction Threshold | The number of transactions after which the NameNode or SecondaryNameNode will create a checkpoint of the namespace, regardless of whether the checkpoint period has expired. | dfs.namenode.checkpoint.txns | 1000000 | fs_checkpoint_txns | false |
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
SecondaryNameNode Logging Threshold | The minimum log level for SecondaryNameNode logs | INFO | log_threshold | false | ||
SecondaryNameNode Maximum Log File Backups | The maximum number of rolled log files to keep for SecondaryNameNode logs. Typically used by log4j. | 10 | max_log_backup_index | false | ||
SecondaryNameNode Max Log Size | The maximum size, in megabytes, per log file for SecondaryNameNode logs. Typically used by log4j. | 200 MiB | max_log_size | false | ||
SecondaryNameNode Log Directory | Directory where SecondaryNameNode will place its log files. | hadoop.log.dir | /var/log/hadoop-hdfs | secondarynamenode_log_dir | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Health Alerts for this Role | When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
Log Directory Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. | Warning: 10 GiB, Critical: 5 GiB | log_directory_free_space_absolute_thresholds | false | ||
Log Directory Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | log_directory_free_space_percentage_thresholds | false | ||
Rules to Extract Events from Log Files | This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. Each rule has some or all of the following fields: alert (whether matching messages should generate an alert), rate (the maximum number of matching messages to send as events per period), periodminutes (the length of that period, in minutes), threshold (apply the rule only to messages at or above this log level), content (a regular expression matched against the message text), and exceptiontype (the class name of an exception appearing in the message). | | {"version": 0, "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.IOException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketException"}, {"alert": false, "rate": 0, "exceptiontype": "java.net.SocketClosedException"}, {"alert": false, "rate": 0, "exceptiontype": "java.io.EOFException"}, {"alert": false, "rate": 0, "exceptiontype": "java.nio.channels.CancelledKeyException"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Unknown job [^ ]+ being deleted.*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "Error executing shell command .+ No such process.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".*attempt to override final parameter.+"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": "[^ ]+ is a deprecated filesystem name. Use.*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}, {"alert": false, "rate": 1, "threshold": "INFO", "content": "Triggering checkpoint.*"}]} | log_event_whitelist | false |
Role Triggers | The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields: triggerName (the name of the trigger; must be unique for the same entity), triggerExpression (a tsquery expression representing the trigger), streamThreshold (the maximum number of streams that can satisfy the trigger's condition before the action is taken), and enabled (whether the trigger is enabled). For example: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:red", "streamThreshold": 0, "enabled": "true"}] Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future, and as a result backward compatibility is not guaranteed between releases at this time. | | [] | role_triggers | true |
Checkpoint Directories Free Space Monitoring Absolute Thresholds | The health test thresholds for monitoring of free space on the filesystems that contain this role's checkpoint directories. | Warning: 10 GiB, Critical: 5 GiB | secondarynamenode_checkpoint_directories_free_space_absolute_thresholds | false | ||
Checkpoint Directories Free Space Monitoring Percentage Thresholds | The health test thresholds for monitoring of free space on the filesystems that contain this role's checkpoint directories. Specified as a percentage of the capacity on the filesystem. This setting is not used if a Checkpoint Directories Free Space Monitoring Absolute Thresholds setting is configured. | Warning: Never, Critical: Never | secondarynamenode_checkpoint_directories_free_space_percentage_thresholds | false | ||
File Descriptor Monitoring Thresholds | The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. | Warning: 50.0 %, Critical: 70.0 % | secondarynamenode_fd_thresholds | false | ||
Garbage Collection Duration Thresholds | The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. | Warning: 30.0, Critical: 60.0 | secondarynamenode_gc_duration_thresholds | false | ||
Garbage Collection Duration Monitoring Period | The period to review when computing the moving average of garbage collection time. | 5 minute(s) | secondarynamenode_gc_duration_window | false | ||
SecondaryNameNode Host Health Test | When computing the overall SecondaryNameNode health, consider the host's health. | true | secondarynamenode_host_health_enabled | false | ||
SecondaryNameNode Process Health Test | Enables the health test that the SecondaryNameNode's process state is consistent with the role configuration | true | secondarynamenode_scm_health_enabled | false | ||
Web Metric Collection | Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. | true | secondarynamenode_web_metric_collection_enabled | false | ||
Web Metric Collection Duration | The health test thresholds on the duration of the metrics request to the web server. | Warning: 10 second(s), Critical: Never | secondarynamenode_web_metric_collection_thresholds | false | ||
Unexpected Exits Thresholds | The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. | Warning: Never, Critical: Any | unexpected_exits_thresholds | false | ||
Unexpected Exits Monitoring Period | The period to review when computing unexpected exits. | 5 minute(s) | unexpected_exits_window | false |
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
HDFS Checkpoint Directory | Determines where on the local file system the DFS SecondaryNameNode should store the temporary images to merge. For redundancy, enter a comma-delimited list of directories to replicate the image in all of the directories. Typical values are /data/N/dfs/snn for N = 1, 2, 3... | dfs.namenode.checkpoint.dir | fs_checkpoint_dir_list | true |
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Maximum Process File Descriptors | If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. | rlimit_fds | false |
Ports and Addresses
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
SecondaryNameNode Web UI Port | The SecondaryNameNode HTTP port. If the port is 0, then the server starts on a free port. Combined with the SecondaryNameNode's hostname to build its HTTP address. | dfs.namenode.secondary.http-address | 50090 | dfs_secondary_http_port | false | |
Secure SecondaryNameNode Web UI Port (SSL) | The base port where the secure SecondaryNameNode web UI listens. | dfs.secondary.https.port | 50495 | dfs_secondary_https_port | false | |
Bind SecondaryNameNode to Wildcard Address | If enabled, the SecondaryNameNode binds to the wildcard address ("0.0.0.0") on all of its ports. | false | secondary_namenode_bind_wildcard | false |
Resource Management
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Cgroup CPU Shares | Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. | cpu.shares | 1024 | rm_cpu_shares | true | |
Cgroup I/O Weight | Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. | blkio.weight | 500 | rm_io_weight | true | |
Cgroup Memory Hard Limit | Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages count toward the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.limit_in_bytes | -1 MiB | rm_memory_hard_limit | true |
Cgroup Memory Soft Limit | Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous and page cache pages count toward the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager have no limit. | memory.soft_limit_in_bytes | -1 MiB | rm_memory_soft_limit | true |
Java Heap Size of Secondary namenode in Bytes | Maximum size for the Java Process heap memory. Passed to Java -Xmx. Measured in bytes. | 1 GiB | secondary_namenode_java_heapsize | false |
Service-Wide
Advanced
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml | For advanced use only, a string to be inserted into core-site.xml. Applies to all roles and client configurations in this HDFS service as well as all its dependent services. Any configs added here will be overridden by their default values in HDFS (which can be found in hdfs-default.xml). | core_site_safety_valve | false | |||
Enable HDFS Block Metadata API | Enables DataNode support for the experimental DistributedFileSystem.getFileBlockStorageLocations API. Applicable to CDH 4.1 and onwards. | dfs.datanode.hdfs-blocks-metadata.enabled | true | dfs_datanode_hdfs_blocks_metadata_enabled | false |
HDFS Service Advanced Configuration Snippet (Safety Valve) for hadoop-policy.xml | For advanced use only, a string to be inserted into hadoop-policy.xml. Applies to configurations of all roles in this service except client configuration. | hadoop_policy_config_safety_valve | false | |||
Shared Hadoop Group Name | The name of the system group shared by all the core Hadoop services. | hadoop | hdfs_hadoop_group_name | true | ||
HDFS Replication Advanced Configuration Snippet (Safety Valve) | For advanced use only, key-value pairs (one on each line) to be inserted into the environment of HDFS replication jobs. | hdfs_replication_env_safety_valve | false | |||
HDFS Service Advanced Configuration Snippet (Safety Valve) for hdfs-site.xml | For advanced use only, a string to be inserted into hdfs-site.xml. Applies to configurations of all roles in this service except client configuration. | hdfs_service_config_safety_valve | false | |||
HDFS Service Environment Advanced Configuration Snippet (Safety Valve) | For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of all roles in this service except client configuration. A format example appears after this table. | hdfs_service_env_safety_valve | false |
System User's Home Directory | The home directory of the system user on the local filesystem. This setting must reflect the system's configured value; changing it here does not change the actual home directory. | /var/lib/hadoop-hdfs | hdfs_user_home_dir | true |
HDFS Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties | For advanced use only, a string to be inserted into the client configuration for navigator.client.properties. | navigator_client_config_safety_valve | false | |||
System Group | The group that this service's processes should run as (except the HttpFS server, which has its own group) | hdfs | process_groupname | true | ||
System User | The user that this service's processes should run as (except the HttpFS server, which has its own user) | hdfs | process_username | true |
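Environment safety valves, such as the HDFS Service Environment snippet above, take key-value pairs, one per line. A hedged sketch: HADOOP_CLASSPATH is a standard Hadoop environment variable, but whether a given variable is honored depends on the role, and the path shown is a placeholder:

```
HADOOP_CLASSPATH=/opt/custom/lib/*
```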
Cloudera Navigator
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Collection | Enable collection of audit events from the service's roles. | true | navigator_audit_enabled | false | ||
Event Filter | Event filters are defined in a JSON object like the following: { "defaultAction" : ("accept", "discard"), "rules" : [ { "action" : ("accept", "discard"), "fields" : [ { "name" : "fieldName", "match" : "regex" } ] } ] } A filter has a default action and a list of rules, in order of precedence. Each rule defines an action and a list of fields to match against the audit event. A rule is "accepted" if all the listed field entries match the audit event; at that point, the action declared by the rule is taken. If no rules match the event, the default action is taken. Actions default to "accept" if not defined in the JSON object. Fields that can be filtered for HDFS events include username and src, both of which appear in the default filter (pretty-printed after this table). | navigator.event.filter | {"comment": ["Default filter for HDFS services.", "Discards events generated by the internal Cloudera and/or HDFS users", "(hdfs, hbase, mapred and dr.who), and events that affect files in", "the /tmp directory."], "defaultAction": "accept", "rules": [{"action": "discard", "fields": [{"name": "username", "match": "(?:cloudera-scm|hbase|hdfs|mapred|hive|dr.who)(?:/.+)?"}]}, {"action": "discard", "fields": [{"name": "src", "match": "/tmp(?:/.*)?"}]}]} | navigator_audit_event_filter | false |
Queue Policy | Action to take when the audit event queue is full. Drop the event or shutdown the affected process. | navigator.batch.queue_policy | DROP | navigator_audit_queue_policy | false | |
Event Tracker | Configures the rules for event tracking and coalescing. This feature is used to define equivalency between different audit events. When events match, according to a set of configurable parameters, only one entry in the audit list is generated for all the matching events. Tracking works by keeping a reference to events when they first appear and comparing other incoming events against the "tracked" events according to the rules defined here. Event trackers are defined in a JSON object like the following: { "timeToLive" : [integer], "fields" : [ { "type" : [string], "name" : [string] } ] } where timeToLive is the period, in milliseconds, during which events are tracked for comparison, and fields is the list of fields (each with a type and a name) used to decide whether two events are equivalent. | navigator_event_tracker | {"comment": ["Default event tracker for HDFS services.", "Defines equality by comparing username, operation and source path", "of the events."], "timeToLive": 60000, "fields": [{"type": "value", "name": "src"}, {"type": "value", "name": "operation"}, {"type": "username", "name": "username"}]} | navigator_event_tracker | false |
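For readability, the default Event Filter value from the table above, pretty-printed:

```json
{
  "comment": [
    "Default filter for HDFS services.",
    "Discards events generated by the internal Cloudera and/or HDFS users",
    "(hdfs, hbase, mapred and dr.who), and events that affect files in",
    "the /tmp directory."
  ],
  "defaultAction": "accept",
  "rules": [
    {
      "action": "discard",
      "fields": [
        { "name": "username", "match": "(?:cloudera-scm|hbase|hdfs|mapred|hive|dr.who)(?:/.+)?" }
      ]
    },
    {
      "action": "discard",
      "fields": [
        { "name": "src", "match": "/tmp(?:/.*)?" }
      ]
    }
  ]
}
```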
High Availability
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Timeout for Cloudera Manager Fencing Strategy | The timeout, in milliseconds, to use with the Cloudera Manager agent-based fencer. | dfs.ha.fencing.cloudera_manager.timeout_millis | 10000 | dfs_ha_fencing_cloudera_manager_timeout_millis | false | |
HDFS High Availability Fencing Methods | List of fencing methods to use for service fencing. shell(./cloudera_manager_agent_fencer.py) is a fencing mechanism designed to take advantage of the Cloudera Manager agent. The sshfence method uses SSH. If using custom fencers (that may talk to shared store, power units, or network switches), use the shell mechanism to invoke them (see the example following this table). | dfs.ha.fencing.methods | shell(./cloudera_manager_agent_fencer.py) | dfs_ha_fencing_methods | false |
Timeout for SSH Fencing Strategy | SSH connection timeout, in milliseconds, to use with the built-in sshfence fencer. | dfs.ha.fencing.ssh.connect-timeout | 30 second(s) | dfs_ha_fencing_ssh_connect_timeout | false | |
Private Keys for SSH Fencing Strategy | The SSH private key files to use with the built-in sshfence fencer. These files must be accessible to the hdfs user on the machines running the NameNodes. | dfs.ha.fencing.ssh.private-key-files | dfs_ha_fencing_ssh_private_key_files | false |
FailoverProxyProvider Class | Enter a FailoverProxyProvider implementation to configure two URIs to connect to during fail-over. The first configured address is tried first, and on a fail-over event the other address is tried. | dfs.client.failover.proxy.provider | org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider | dfs_ha_proxy_provider | true |
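A sketch of the HDFS High Availability Fencing Methods value listing both mechanisms named above, one per line, so that the Cloudera Manager agent-based fencer is tried first and sshfence is the fallback:

```xml
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(./cloudera_manager_agent_fencer.py)
sshfence</value>
</property>
```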
Logs
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Audit Log Directory | Path to the directory where audit logs will be written. The directory will be created if it doesn't exist. | audit_event_log_dir | /var/log/hadoop-hdfs/audit | audit_event_log_dir | false | |
Number of Audit Logs to Retain | Maximum number of rolled over audit logs to retain. The logs will not be deleted if they contain audit events that have not yet been propagated to Audit Server. | navigator.audit_log_max_backup_index | 10 | navigator_audit_log_max_backup_index | false | |
Maximum Audit Log File Size | Maximum size of audit log file in MB before it is rolled over. | navigator.audit_log_max_file_size | 100 MiB | navigator_audit_log_max_file_size | false |
Monitoring
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Log Event Capture | When set, each role will identify important log events and forward them to Cloudera Manager. | true | catch_events | false | ||
Enable Service Level Health Alerts | When set, Cloudera Manager will send alerts when the health of this service reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold | true | enable_alerts | false | ||
Enable Configuration Change Alerts | When set, Cloudera Manager will send alerts when this entity's configuration changes. | false | enable_config_alerts | false | ||
Failover Controllers Healthy | Enables the health check that verifies that the failover controllers associated with this service are healthy and running. | true | failover_controllers_healthy_enabled | false | ||
HDFS Health Canary Directory | The Service Monitor will use this directory to create files to test whether the HDFS service is healthy. The directory and files are created with the permissions specified by 'HDFS Health Canary Directory Permissions'. | /tmp/.cloudera_health_monitoring_canary_files | firehose_hdfs_canary_directory | false |
HDFS Health Canary Directory Permissions | The Service Monitor will use these permissions to create the directory and files to test whether the HDFS service is healthy. Permissions are specified using the 10-character unix-symbolic format, e.g. '-rwxr-xr-x'. | -rwxrwxrwx | firehose_hdfs_canary_directory_permissions | false |
Active NameNode Detection Window | The tolerance window that will be used in HDFS service tests that depend on detection of the active NameNode. | 3 minute(s) | hdfs_active_namenode_detecton_window | false | ||
Blocks With Corrupt Replicas Monitoring Thresholds | The health check thresholds of the number of blocks that have at least one corrupt replica. Specified as a percentage of the total number of blocks. | Warning: 0.5 %, Critical: 1.0 % | hdfs_blocks_with_corrupt_replicas_thresholds | false | ||
HDFS Canary Health Check | Enables the health check that a client can create, read, write, and delete files | true | hdfs_canary_health_enabled | false | ||
Healthy DataNode Monitoring Thresholds | The health test thresholds of the overall DataNode health. The check returns "Concerning" health if the percentage of "Healthy" DataNodes falls below the warning threshold. The check is unhealthy if the total percentage of "Healthy" and "Concerning" DataNodes falls below the critical threshold. | Warning: 95.0 %, Critical: 90.0 % | hdfs_datanodes_healthy_thresholds | false | ||
HDFS Free Space Monitoring Thresholds | The health check thresholds of free space in HDFS. Specified as a percentage of total HDFS capacity. | Warning: 20.0 %, Critical: 10.0 % | hdfs_free_space_thresholds | false | ||
Missing Block Monitoring Thresholds | The health check thresholds of the number of missing blocks. Specified as a percentage of the total number of blocks. | Warning: Never, Critical: Any | hdfs_missing_blocks_thresholds | false | ||
NameNode Activation Startup Tolerance | The amount of time after NameNode(s) start that the lack of an active NameNode will be tolerated. This is intended to allow either the auto-failover daemon to make a NameNode active, or a specifically issued failover command to take effect. This is an advanced option that does not often need to be changed. | 3 minute(s) | hdfs_namenode_activation_startup_tolerance | false | ||
Active NameNode Role Health Check | When computing the overall HDFS cluster health, consider the health of the active NameNode. | true | hdfs_namenode_health_enabled | false | ||
Standby NameNode Health Check | When computing the overall HDFS cluster health, consider the health of the standby NameNode. | true | hdfs_standby_namenodes_health_enabled | false | ||
Under-replicated Block Monitoring Thresholds | The health check thresholds of the number of under-replicated blocks. Specified as a percentage of the total number of blocks. | Warning: 10.0 %, Critical: 40.0 % | hdfs_under_replicated_blocks_thresholds | false | ||
Log Event Retry Frequency | The frequency, in seconds, with which the log4j event publication appender retries sending undelivered log events to the Event Server. | 30 | log_event_retry_frequency | false | ||
Service Triggers | The configured triggers for this service. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system; every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has all of the following fields:
  * triggerName (mandatory) - The name of the trigger. This value must be unique for the specific service.
  * triggerExpression (mandatory) - A tsquery expression representing the trigger.
  * streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, so that any matching stream causes the condition to fire.
  * enabled (optional) - By default set to 'true'; if set to 'false', the trigger is not evaluated.
For example, the following trigger fires if there are more than 10 DataNodes with more than 500 file descriptors open: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:red", "streamThreshold": 10, "enabled": "true"}] Consult the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change in the future; as a result, backward compatibility is not guaranteed between releases at this time. | [] | service_triggers | true | ||
Service Monitor Client Config Overrides | For advanced use only, a list of configuration properties that will be used by the Service Monitor instead of the current client configuration for the service. | <property><name>dfs.socket.timeout</name><value>3000</value></property><property><name>dfs.datanode.socket.write.timeout</name><value>3000</value></property><property><name>ipc.client.connect.max.retries</name><value>1</value></property><property><name>fs.permissions.umask-mode</name><value>000</value></property> | smon_client_config_overrides | false | ||
Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) | For advanced use only, a list of derived configuration properties that will be used by the Service Monitor instead of the default ones. | smon_derived_configs_safety_valve | false |
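The Service Monitor Client Config Overrides row above takes an inline list of <property> elements, as shown in its default value. As a minimal sketch of a single-property override, assuming a slow network where the monitor's 3-second socket timeout produces false alarms (the 10-second figure is purely illustrative, not a recommendation):

    <property>
      <name>dfs.socket.timeout</name>
      <value>10000</value> <!-- assumed example: 10 s instead of the default 3 s (3000 ms) -->
    </property>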
Other
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
HDFS Block Size | The default block size in bytes for new HDFS files. Note that this value is also used as the HBase Region Server HLog block size. | dfs.blocksize | 128 MiB | dfs_block_size | false | |
Check HDFS Permissions | If false, permission checking is turned off for files in HDFS. | dfs.permissions | true | dfs_permissions | false | |
Default Umask | Default umask for file and directory creation, specified as an octal value (with a leading 0). | fs.permissions.umask-mode | 022 | dfs_umaskmode | false | |
Enable WebHDFS | Enable WebHDFS interface | dfs.webhdfs.enabled | true | dfs_webhdfs_enabled | false | |
Compression Codecs | Comma-separated list of compression codecs that can be used in job or map compression. | io.compression.codecs | org.apache.hadoop.io.compress.DefaultCodec, org.apache.hadoop.io.compress.GzipCodec, org.apache.hadoop.io.compress.BZip2Codec, org.apache.hadoop.io.compress.DeflateCodec, org.apache.hadoop.io.compress.SnappyCodec, org.apache.hadoop.io.compress.Lz4Codec | io_compression_codecs | false | |
ZooKeeper Service | Name of the ZooKeeper service that this HDFS service instance depends on | zookeeper_service | false |
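Several of the rows above correspond to entries in hdfs-site.xml or core-site.xml. As a minimal sketch, assuming a workload that benefits from a 256 MiB block size and a stricter umask (both values are illustrative assumptions, not recommendations), the generated snippet in the relevant *-site.xml would look like:

    <property>
      <name>dfs.blocksize</name>
      <value>268435456</value> <!-- 256 MiB expressed in bytes; illustrative value -->
    </property>
    <property>
      <name>fs.permissions.umask-mode</name>
      <value>027</value> <!-- hypothetical umask, stricter than the default 022 -->
    </property>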
Performance
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
DataNode Local Path Access Users | Comma-separated list of users allowed to perform short-circuit reads. A short-circuit read allows a client co-located with the data to read HDFS file blocks directly from the local file system, bypassing the DataNode. If empty, defaults to the DataNode process's user. | dfs.block.local-path-access.user | dfs_block_local_path_access_user | false | ||
HDFS File Block Storage Location Timeout | Timeout in milliseconds for the parallel RPCs made in DistributedFileSystem#getFileBlockStorageLocations(). This value is only emitted for Impala. | dfs.client.file-block-storage-locations.timeout.millis | 10 second(s) | dfs_client_file_block_storage_locations_timeout | false | |
Enable HDFS Short Circuit Read | Enable HDFS short circuit read. This allows a client co-located with the DataNode to read HDFS file blocks directly. This gives a performance boost to distributed clients that are aware of locality. | dfs.client.read.shortcircuit | true | dfs_datanode_read_shortcircuit | false | |
UNIX Domain Socket path | Path on the DataNode's local file system to a UNIX domain socket used for communication between the DataNode and local HDFS clients. This socket is used for Short Circuit Reads. Only the HDFS System User and "root" should have write access to the parent directory and all of its ancestors. This property is supported in CDH 4.2 or later deployments. | dfs.domain.socket.path | /var/run/hdfs-sockets/dn | dfs_domain_socket_path | false | |
FsImage Transfer Bandwidth | Maximum bandwidth used for image transfer, in bytes per second. This can help keep normal NameNode operations responsive during checkpointing. A value of 0 (the default) disables throttling. | dfs.image.transfer.bandwidthPerSec | 0 B | dfs_image_transfer_bandwidthPerSec | false | |
FsImage Transfer Timeout | The amount of time to wait for HDFS filesystem image transfer from NameNode to complete. | dfs.image.transfer.timeout | 1 minute(s) | dfs_image_transfer_timeout | false |
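Enabling short-circuit reads requires both the client flag and the domain socket path from the table above. Expressed as an hdfs-site.xml snippet using the defaults listed in this table:

    <property>
      <name>dfs.client.read.shortcircuit</name>
      <value>true</value> <!-- table default: short-circuit reads enabled -->
    </property>
    <property>
      <name>dfs.domain.socket.path</name>
      <value>/var/run/hdfs-sockets/dn</value> <!-- table default; parent directories must be writable only by the HDFS user and root -->
    </property>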
Ports and Addresses
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Use DataNode Hostname | Typically, HDFS clients and servers communicate by opening sockets via an IP address. In certain networking configurations, it is preferable to perform a DNS lookup on the hostname and open sockets against the resulting address; enable this property to do so (see the sketch following this table). This property is supported in CDH3u4 or later deployments. | dfs.client.use.datanode.hostname | false | dfs_client_use_datanode_hostname | false |
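A common reason to enable this flag is a multihomed or NAT'd network, where the IP address a DataNode registers with the NameNode is not reachable from clients. A minimal sketch of the corresponding hdfs-site.xml entry, assuming such a network:

    <property>
      <name>dfs.client.use.datanode.hostname</name>
      <value>true</value> <!-- clients resolve DataNode hostnames via DNS instead of using registered IPs -->
    </property>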
Proxy
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
HTTP Proxy User Groups | Comma-delimited list of groups that you want to allow the HTTP user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. This is used by WebHCat. | hadoop.proxyuser.HTTP.groups | * | HTTP_proxy_user_groups_list | false | |
HTTP Proxy User Hosts | Comma-delimited list of hosts where you want to allow the HTTP user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. This is used by WebHCat. | hadoop.proxyuser.HTTP.hosts | * | HTTP_proxy_user_hosts_list | false | |
Flume Proxy User Groups | Allows the flume user to impersonate any members of a comma-delimited list of groups. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. | hadoop.proxyuser.flume.groups | * | flume_proxy_user_groups_list | false | |
Flume Proxy User Hosts | Comma-delimited list of hosts where you want to allow the flume user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. | hadoop.proxyuser.flume.hosts | * | flume_proxy_user_hosts_list | false | |
Hive Proxy User Groups | Comma-delimited list of groups that you want to allow the Hive user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. | hadoop.proxyuser.hive.groups | * | hive_proxy_user_groups_list | false | |
Hive Proxy User Hosts | Comma-delimited list of hosts where you want to allow the Hive user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. | hadoop.proxyuser.hive.hosts | * | hive_proxy_user_hosts_list | false | |
HttpFS Proxy User Groups | Comma-delimited list of groups that you want to allow the HttpFS user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. | hadoop.proxyuser.httpfs.groups | * | httpfs_proxy_user_groups_list | false | |
HttpFS Proxy User Hosts | Comma-delimited list of hosts where you want to allow the HttpFS user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. | hadoop.proxyuser.httpfs.hosts | * | httpfs_proxy_user_hosts_list | false | |
Hue Proxy User Groups | Comma-delimited list of groups that you want to allow the Hue user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. | hadoop.proxyuser.hue.groups | * | hue_proxy_user_groups_list | false | |
Hue Proxy User Hosts | Comma-delimited list of hosts where you want to allow the Hue user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. | hadoop.proxyuser.hue.hosts | * | hue_proxy_user_hosts_list | false | |
Mapred Proxy User Groups | Comma-delimited list of groups that you want to allow the mapred user to impersonate. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. | hadoop.proxyuser.mapred.groups | * | mapred_proxy_user_groups_list | false | |
Mapred Proxy User Hosts | Comma-delimited list of hosts where you want to allow the mapred user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. | hadoop.proxyuser.mapred.hosts | * | mapred_proxy_user_hosts_list | false | |
Oozie Proxy User Groups | Allows the oozie superuser to impersonate any members of a comma-delimited list of groups. The default '*' allows all groups. To disable entirely, use a string that doesn't correspond to a group name, such as '_no_group_'. | hadoop.proxyuser.oozie.groups | * | oozie_proxy_user_groups_list | false | |
Oozie Proxy User Hosts | Comma-delimited list of hosts where you want to allow the oozie user to impersonate other users. The default '*' allows all hosts. To disable entirely, use a string that doesn't correspond to a host name, such as '_no_host'. | hadoop.proxyuser.oozie.hosts | * | oozie_proxy_user_hosts_list | false |
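All of the rows above follow the same hadoop.proxyuser.<user>.hosts / hadoop.proxyuser.<user>.groups pattern in core-site.xml. As a minimal sketch of tightening the Hue defaults from '*' (the host and group names here are hypothetical placeholders):

    <property>
      <name>hadoop.proxyuser.hue.hosts</name>
      <value>hue1.example.com,hue2.example.com</value> <!-- hypothetical hosts running the Hue server -->
    </property>
    <property>
      <name>hadoop.proxyuser.hue.groups</name>
      <value>hue-users</value> <!-- hypothetical group whose members Hue may impersonate -->
    </property>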
Replication
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Replication Factor | Default block replication: the number of replicas to create when a file is written. Used whenever a replication factor is not specified at file creation time (see the example following this table). | dfs.replication | 3 | dfs_replication | false | |
Maximal Block Replication | The maximum allowed block replication factor. | dfs.replication.max | 512 | dfs_replication_max | false | |
Minimal Block Replication | The minimum allowed block replication factor. | dfs.namenode.replication.min | 1 | dfs_replication_min | false |
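Note that dfs.replication applies only to files created after the change; existing files keep the factor they were written with (hdfs dfs -setrep can adjust them per path). A minimal hdfs-site.xml sketch, assuming a space-constrained cluster willing to accept a factor of 2:

    <property>
      <name>dfs.replication</name>
      <value>2</value> <!-- illustrative value; must lie between dfs.namenode.replication.min and dfs.replication.max -->
    </property>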
Security
Display Name | Description | Related Name | Default Value | Unit | API Name | Required |
---|---|---|---|---|---|---|
Enable Data Transfer Encryption | Enable encryption of data transfer between DataNodes and clients, and among DataNodes. For effective data transfer protection, enable Kerberos authentication and choose Privacy Quality of RPC Protection. | dfs.encrypt.data.transfer | false | dfs_encrypt_data_transfer | false | |
Data Transfer Encryption Algorithm | Algorithm to encrypt data transfer between DataNodes and clients, and among DataNodes. 3des is more cryptographically secure, but rc4 is substantially faster. | dfs.encrypt.data.transfer.algorithm | rc4 | dfs_encrypt_data_transfer_algorithm | false | |
Superuser Group | The name of the group of superusers. | dfs.permissions.superusergroup | supergroup | dfs_permissions_supergroup | false | |
Additional Rules to Map Kerberos Principals to Short Names | Additional mapping rules that will be inserted before rules generated from the list of trusted realms and before the default rule. After changing this value and restarting the service, any services depending on this one must be restarted as well. The hadoop.security.auth_to_local property is configured using this information. | extra_auth_to_local_rules | false | |||
Authorized Admin Groups | Comma-separated list of groups authorized to perform admin operations on Hadoop. This is emitted only if authorization is enabled. | hadoop_authorized_admin_groups | false | |||
Authorized Admin Users | Comma-separated list of users authorized to perform admin operations on Hadoop. This is emitted only if authorization is enabled. | * | hadoop_authorized_admin_users | false | ||
Authorized Groups | Comma-separated list of groups authorized to use Hadoop. This is emitted only if authorization is enabled. | hadoop_authorized_groups | false | |||
Authorized Users | Comma-separated list of users authorized to use Hadoop. This is emitted only if authorization is enabled. | * | hadoop_authorized_users | false | ||
Hadoop User Group Mapping Search Base | The search base for the LDAP connection. This is a distinguished name, and will typically be the root of the LDAP directory. | hadoop.security.group.mapping.ldap.base | hadoop_group_mapping_ldap_base | false | ||
Hadoop User Group Mapping LDAP Bind User Password | The password of the bind user. | hadoop.security.group.mapping.ldap.bind.password | hadoop_group_mapping_ldap_bind_passwd | false | ||
Hadoop User Group Mapping LDAP Bind User | The distinguished name of the user to bind as when connecting to the LDAP server. This may be left blank if the LDAP server supports anonymous binds. | hadoop.security.group.mapping.ldap.bind.user | hadoop_group_mapping_ldap_bind_user | false | ||
Hadoop User Group Mapping LDAP Group Search Filter | An additional filter to use when searching for groups. | hadoop.security.group.mapping.ldap.search.filter.group | (objectClass=group) | hadoop_group_mapping_ldap_group_filter | false | |
Hadoop User Group Mapping LDAP Group Name Attribute | The attribute of the group object that identifies the group name. The default will usually be appropriate for all LDAP systems. | hadoop.security.group.mapping.ldap.search.attr.group.name | cn | hadoop_group_mapping_ldap_group_name_attr | false | |
Hadoop User Group Mapping LDAP SSL Keystore | File path to the SSL keystore containing the SSL certificate required by the LDAP server. | hadoop.security.group.mapping.ldap.ssl.keystore | hadoop_group_mapping_ldap_keystore | false | ||
Hadoop User Group Mapping LDAP SSL Keystore Password | The password for the SSL keystore. | hadoop.security.group.mapping.ldap.ssl.keystore.password | hadoop_group_mapping_ldap_keystore_passwd | false | ||
Hadoop User Group Mapping LDAP Group Membership Attribute | The attribute of the group object that identifies the users that are members of the group. The default will usually be appropriate for any LDAP installation. | hadoop.security.group.mapping.ldap.search.attr.member | member | hadoop_group_mapping_ldap_member_attr | false | |
Hadoop User Group Mapping LDAP URL | The URL for the LDAP server to use for resolving user groups when using LdapGroupsMapping. | hadoop.security.group.mapping.ldap.url | hadoop_group_mapping_ldap_url | false | ||
Hadoop User Group Mapping LDAP SSL Enabled | Whether or not to use SSL when connecting to the LDAP server. | hadoop.security.group.mapping.ldap.use.ssl | false | hadoop_group_mapping_ldap_use_ssl | false | |
Hadoop User Group Mapping LDAP User Search Filter | An additional filter to use when searching for LDAP users. The default will usually be appropriate for Active Directory installations. If connecting to a generic LDAP server, 'sAMAccountName' will likely be replaced with 'uid'. {0} is a special string used to denote where the username fits into the filter. | hadoop.security.group.mapping.ldap.search.filter.user | (&(objectClass=user)(sAMAccountName={0})) | hadoop_group_mapping_ldap_user_filter | false | |
Hadoop HTTP Authentication Cookie Domain | The domain to use for the HTTP cookie that stores the authentication token. In order for authentication to work correctly across all Hadoop nodes' web-consoles, the domain must be set correctly. Important: when using IP addresses, browsers ignore cookies with domain settings. For this setting to work properly, all nodes in the cluster must be configured to generate URLs containing hostname.domain names. | hadoop.http.authentication.cookie.domain | hadoop_http_auth_cookie_domain | false | ||
Hadoop RPC Protection | Quality of protection for secured RPC connections between NameNode and HDFS clients. For effective RPC protection, enable Kerberos authentication. | hadoop.rpc.protection | authentication | hadoop_rpc_protection | false | |
Enable Authentication for HTTP Web-Consoles | Enables authentication for Hadoop HTTP web-consoles for all roles of this service. Note: This is effective only if security is enabled for the HDFS service. | false | hadoop_secure_web_ui | false | ||
Hadoop Secure Authentication | Choose the authentication mechanism used by Hadoop. | hadoop.security.authentication | simple | hadoop_security_authentication | false | |
Hadoop Secure Authorization | Enable Hadoop service-level authorization. | hadoop.security.authorization | false | hadoop_security_authorization | false | |
Hadoop User Group Mapping Implementation | Class for user to group mapping (get groups for a given user). | hadoop.security.group.mapping | org.apache.hadoop.security.ShellBasedUnixGroupsMapping | hadoop_security_group_mapping | false | |
HDFS User to Impersonate | The user that the management services impersonate when connecting to HDFS. Defaults to 'hdfs', a superuser. | hdfs.user.to.impersonate | hdfs | hdfs_user_to_impersonate | false | |
Hue's Kerberos Principal Short Name | The short name of Hue's Kerberos principal | hue.kerberos.principal.shortname | hue | hue_kerberos_principal_shortname | false | |
Trusted Kerberos Realms | List of Kerberos realms that Hadoop services should trust. If empty, defaults to the default_realm property configured in the krb5.conf file. After changing this value and restarting the service, all services depending on this service must also be restarted. Adds mapping rules for each domain to the hadoop.security.auth_to_local property in core-site.xml. | trusted_realms | false |
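The LDAP group mapping rows above take effect only once the mapping implementation is switched from the shell-based default to LdapGroupsMapping. A minimal core-site.xml sketch, assuming a hypothetical directory server at ldap.example.com (the URL and all DNs are placeholders; SSL, filters, and attributes are controlled by the other rows in this table):

    <property>
      <name>hadoop.security.group.mapping</name>
      <value>org.apache.hadoop.security.LdapGroupsMapping</value>
    </property>
    <property>
      <name>hadoop.security.group.mapping.ldap.url</name>
      <value>ldap://ldap.example.com:389</value> <!-- hypothetical LDAP server -->
    </property>
    <property>
      <name>hadoop.security.group.mapping.ldap.bind.user</name>
      <value>cn=hadoop-bind,ou=services,dc=example,dc=com</value> <!-- hypothetical bind DN; omit if anonymous binds are allowed -->
    </property>
    <property>
      <name>hadoop.security.group.mapping.ldap.base</name>
      <value>dc=example,dc=com</value> <!-- hypothetical search base -->
    </property>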