Cloudera Management Service

Activity Monitor

Advanced

Display Name Description Related Name Default Value API Name Required
Activity Monitor Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. ACTIVITYMONITOR_role_env_safety_valve false
Event Publication Maximum Queue Size The maximum size of the queue in which events published from this role will be buffered. If this queue becomes full (for example, due to an outage), subsequent events will be dropped. activityevents.event.publish.queue.max 20000 actmon_event_publication_queue_size_max true
Event Publication Retry Period If an event cannot be delivered immediately by this role, this value controls how long to wait before Event Publisher retries delivery. activityevents.event.publish.retry.ms 5000 actmon_event_publication_retry_period true
Java Configuration Options for Activity Monitor These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. firehose_java_opts false
Activity Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf For advanced use only. A string to be inserted into cmon.conf for this role only. firehose_safety_valve false
Activity Monitor Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Heap Dump Directory Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate more metrics than the Service Monitor is able to process. true process_should_monitor true
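The Heap Dump Directory description above asks for more free space than the maximum Java heap size configured for the role. A minimal Python sketch of that check follows. The helper name `has_room_for_heap_dump` is hypothetical and not part of Cloudera Manager; the heap size is assumed to be known in bytes.

```python
import shutil

GIB = 1024 ** 3


def has_room_for_heap_dump(dump_dir: str, max_heap_bytes: int) -> bool:
    """Return True if the filesystem holding dump_dir has more free
    space than the configured maximum heap size (hypothetical helper)."""
    free_bytes = shutil.disk_usage(dump_dir).free
    return free_bytes > max_heap_bytes
```

For example, `has_room_for_heap_dump("/tmp", 1 * GIB)` checks the documented default directory against a 1 GiB heap.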

Database

Display Name Description Related Name Default Value API Name Required
Activity Monitor Database Hostname Name of host where Activity Monitor's database is running. It is highly recommended that this database is on the same host as the Activity Monitor. If the database is not running on its default port, specify the port number using this syntax: 'host:port' localhost firehose_database_host false
Activity Monitor Database Name Name of the Activity Monitor's database. firehose_database_name true
Activity Monitor Database Password Password for logging in to the Activity Monitor database db.hibernate.connection.password firehose_database_password false
Activity Monitor Database Type Type of database to use for Activity Monitor. mysql firehose_database_type false
Activity Monitor Database Username Username for logging in to the Activity Monitor database. db.hibernate.connection.username firehose_database_user true

Logs

Display Name Description Related Name Default Value API Name Required
Activity Monitor Logging Threshold The minimum log level for Activity Monitor logs INFO log_threshold false
Activity Monitor Maximum Log File Backups The maximum number of rolled log files to keep for Activity Monitor logs. Typically used by log4j or logback. 10 max_log_backup_index false
Activity Monitor Max Log Size The maximum size, in megabytes, per log file for Activity Monitor logs. Typically used by log4j or logback. 200 MiB max_log_size false
Activity Monitor Log Directory Location of log files for Activity Monitor /var/log/cloudera-scm-firehose mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Activity Monitor Activity Monitor Pipeline Monitoring Thresholds The health test thresholds for monitoring the Activity Monitor activity monitor pipeline. This specifies the number of dropped messages that will be tolerated over the monitoring time period. Warning: Never, Critical: Any activitymonitor_activity_monitor_pipeline_thresholds false
Activity Monitor Activity Monitor Pipeline Monitoring Time Period The time period over which the Activity Monitor activity monitor pipeline will be monitored for dropped messages. 5 minute(s) activitymonitor_activity_monitor_pipeline_window false
Activity Monitor Activity Tree Pipeline Monitoring Thresholds The health test thresholds for monitoring the Activity Monitor activity tree pipeline. This specifies the number of dropped messages that will be tolerated over the monitoring time period. Warning: Never, Critical: Any activitymonitor_activity_tree_pipeline_thresholds false
Activity Monitor Activity Tree Pipeline Monitoring Time Period The time period over which the Activity Monitor activity tree pipeline will be monitored for dropped messages. 5 minute(s) activitymonitor_activity_tree_pipeline_window false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % activitymonitor_fd_thresholds false
Activity Monitor Host Health Test When computing the overall Activity Monitor health, consider the host's health. true activitymonitor_host_health_enabled false
Pause Duration Thresholds The health test thresholds for the weighted average extra time the pause monitor spent paused. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 activitymonitor_pause_duration_thresholds false
Pause Duration Monitoring Period The period to review when computing the moving average of extra time the pause monitor spent paused. 5 minute(s) activitymonitor_pause_duration_window false
Activity Monitor Process Health Test Enables the health test that the Activity Monitor's process state is consistent with the role configuration true activitymonitor_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true activitymonitor_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never activitymonitor_web_metric_collection_thresholds false
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event may not be promoted to an alert if the ERROR log message contains an exception, because the first rule, with alert = false, will match.
{"version": "0", "rules": [{"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
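The first-match behavior described for this parameter can be sketched in Python. This is an illustration only, not the appender's actual implementation: the function names are hypothetical, regular expressions are assumed to match the whole string, and rate limiting (rate, periodminutes) is not modeled.

```python
import json
import re

# log4j severity levels, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, level, message, exception_type=None):
    """Check one rule against a log message (assumed semantics)."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS.index(level) < LEVELS.index(threshold):
        return False
    content = rule.get("content")
    if content is not None and re.fullmatch(content, message) is None:
        return False
    exc_pattern = rule.get("exceptiontype")
    if exc_pattern is not None:
        # The rule only applies to messages that carry an exception.
        if exception_type is None:
            return False
        if re.fullmatch(exc_pattern, exception_type) is None:
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Return the first rule that matches, or None if none does."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


# The two-rule example from the parameter description.
RULES = json.loads("""[
  {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
  {"alert": true, "rate": 1, "periodminutes": 1, "threshold": "ERROR"}
]""")

with_exc = first_matching_rule(RULES, "ERROR", "boom", "java.io.IOException")
without_exc = first_matching_rule(RULES, "ERROR", "boom")
print(with_exc["alert"], without_exc["alert"])  # prints: False True
```

The ERROR message that carries an exception is caught by the first rule and is not promoted to an alert, while the same message without an exception falls through to the second rule.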
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened:
[{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}]
See the trigger rules documentation for more details on how to write triggers using tsquery.
The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
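Before pasting a value into Role Triggers, it can help to check it against the field list above. The following Python sketch does that under stated assumptions: `validate_triggers` is a hypothetical helper, it checks only the documented mandatory fields and name uniqueness, and it does not parse the tsquery expression itself.

```python
import json

REQUIRED_FIELDS = ("triggerName", "triggerExpression")


def validate_triggers(raw: str) -> list:
    """Parse a role_triggers value and check the documented constraints."""
    triggers = json.loads(raw)
    if not isinstance(triggers, list):
        raise ValueError("role_triggers must be a JSON list")
    seen_names = set()
    for trigger in triggers:
        for field in REQUIRED_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"missing mandatory field: {field}")
        name = trigger["triggerName"]
        if name in seen_names:
            raise ValueError(f"duplicate triggerName: {name}")
        seen_names.add(name)
        # Fill in the documented defaults for optional fields.
        trigger.setdefault("streamThreshold", 0)
        trigger.setdefault("enabled", "true")
    return triggers


SAMPLE = (
    '[{"triggerName": "sample-trigger", "triggerExpression": '
    '"IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) '
    'DO health:bad", "streamThreshold": 0, "enabled": "true"}]'
)
checked = validate_triggers(SAMPLE)
```

The default value `[]` passes the check and yields an empty list.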
Cloudera Manager Descriptor Age Thresholds The health test thresholds for monitoring the time since the Cloudera Manager descriptor was last refreshed. Warning: 60000.0, Critical: 120000.0 scm_descriptor_age_thresholds false
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
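Several parameters in this table pair a Warning and a Critical threshold, for example File Descriptor Monitoring Thresholds (Warning: 50.0 %, Critical: 70.0 %). The Python sketch below shows one way such a pair could be evaluated. It is an assumption-laden illustration: the function names and result labels are hypothetical, "Never" is modeled as `None`, and the comparison is assumed to be "greater than or equal to", which the source does not state.

```python
from typing import Optional


def fd_usage_percent(open_fds: int, fd_limit: int) -> float:
    """File descriptors in use as a percentage of the limit."""
    return 100.0 * open_fds / fd_limit


def classify(value: float,
             warning: Optional[float],
             critical: Optional[float]) -> str:
    """Map a measured value to a label; None means the level never fires."""
    if critical is not None and value >= critical:
        return "critical"
    if warning is not None and value >= warning:
        return "warning"
    return "ok"


# 600 of 1024 descriptors is about 58.6 %, between the two defaults.
state = classify(fd_usage_percent(600, 1024), warning=50.0, critical=70.0)
```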

Other

Display Name Description Related Name Default Value API Name Required
Event Publication Log Quiet Time Period To avoid producing excessive amounts of log output, the Event Publisher component of this role is limited to emitting one message per time period. This value controls the size of that time period. activityevents.event.publish.log.suppress.window.ms 1 minute(s) actmon_event_publication_log_suppress_window true
Use the Authentication Service to enable Single Sign On Use the Authentication Service to enable Single Sign On for the Firehose debug servers. Requires a running Authentication Service. debug.servlet.auth.enabled false debug_servlet_auth_enabled false
Purge Activities Data at This Age In Activity Monitor, purge data about MapReduce jobs and aggregate activities when the data reaches this age in hours. By default, Activity Monitor keeps data about activities for 336 hours (14 days). firehose.activity.purge.duration.hours 14 day(s) firehose_activity_purge_duration_hours false
Purge Attempts Data at This Age In the Activity Monitor, purge data about MapReduce attempts when the data reaches this age in hours. Because attempt data may consume large amounts of database space, you may wish to purge it more frequently than activity data. By default, Activity Monitor keeps data about attempts for 336 hours (14 days). firehose.attempt.purge.duration.hours 14 day(s) firehose_attempt_purge_duration_hours false
Descriptor Fetch Tries Interval The interval between fetch tries for SCM descriptor when Cloudera Management Service roles are starting. mgmt.descriptor.fetch.frequency 2 second(s) mgmt_descriptor_fetch_frequency true
Descriptor Fetch Max Tries Maximum number of tries to fetch the SCM descriptor when Cloudera Management Service roles are starting. If the roles are not able to get the descriptor in this many tries, they exit. mgmt.num.descriptor.fetch.tries 5 mgmt_num_descriptor_fetch_tries true
Purge MapReduce Service Data at This Age The number of hours of past service-level data to keep in the Activity Monitor database, such as total slots running. The default is to keep data for 336 hours (14 days). timeseries.expiration.hours 14 day(s) timeseries_expiration_hours false
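Descriptor Fetch Tries Interval and Descriptor Fetch Max Tries together describe a bounded retry loop. A minimal Python sketch of that pattern follows, using the documented defaults (5 tries, 2 seconds apart). The function `fetch_with_retries` and its `fetch` callback are hypothetical; the real roles exit after the last failed try, which is represented here by re-raising the last error.

```python
import time


def fetch_with_retries(fetch, max_tries=5, interval_seconds=2.0,
                       sleep=time.sleep):
    """Call fetch() up to max_tries times, pausing between failed tries."""
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_tries:
                # Out of tries: let the caller decide how to exit.
                raise
            sleep(interval_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.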

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Bind Activity Monitor to Wildcard Address If enabled, the Activity Monitor binds to the wildcard address ("0.0.0.0") on all of its ports. false amon_bind_wildcard false
Activity Monitor Web UI Port Port for Activity Monitor's Debug page. Set to -1 to disable the debug server. debug.servlet.port 8087 firehose_debug_port false
Activity Monitor Web UI HTTPS Port Port for Activity Monitor's HTTPS Debug page. debug.servlet.https.port 9087 firehose_debug_tls_port false
Activity Monitor Listen Port Port where Activity Monitor is listening for agent messages. firehose.server.port 9999 firehose_listen_port false
Activity Monitor Nozzle Port Port where Activity Monitor's query API is exposed. nozzle.server.port 9998 firehose_nozzle_port false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of Activity Monitor in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB firehose_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous pages and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous pages and page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default, processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
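The documented ranges for Cgroup CPU Shares (2 to 262144) and Cgroup I/O Weight (100 to 1000) can be checked before a value is applied. The Python sketch below is a hypothetical pre-check, not a Cloudera Manager validator; the function name and message wording are illustrative.

```python
def validate_cgroup_settings(cpu_shares: int, io_weight: int) -> list:
    """Return a list of problems; an empty list means both values are in range."""
    problems = []
    if not 2 <= cpu_shares <= 262144:
        problems.append(f"rm_cpu_shares {cpu_shares} is outside 2..262144")
    if not 100 <= io_weight <= 1000:
        problems.append(f"rm_io_weight {io_weight} is outside 100..1000")
    return problems


# The documented defaults (1024 shares, weight 500) are both in range.
default_problems = validate_cgroup_settings(1024, 500)
```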

Security

Display Name Description Related Name Default Value API Name Required
Activity Monitor Kerberos Principal Kerberos principal used by the Activity Monitor. Note: the Activity Monitor should always use the principal used by the Hue service. hue kerberos_role_princ_name true
Enable TLS/SSL for Firehose Debug Server Encrypt communication between clients and Firehose Debug Server using Transport Layer Security (TLS) (formerly known as Secure Socket Layer (SSL)). debug.servlet.https.enabled false ssl_enabled false
Firehose Debug Server TLS/SSL Server JKS Keystore File Location The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when Firehose Debug Server is acting as a TLS/SSL server. The keystore must be in JKS format. debug.servlet.https.keystorePath ssl_server_keystore_location false
Firehose Debug Server TLS/SSL Server JKS Keystore File Password The password for the Firehose Debug Server JKS keystore file. debug.servlet.https.keystorePassword ssl_server_keystore_password false

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Parameter Validation: Activity Monitor Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_activitymonitor_role_env_safety_valve true
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Activity Monitor Database Hostname Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Database Hostname parameter. false role_config_suppression_firehose_database_host true
Suppress Parameter Validation: Activity Monitor Database Name Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Database Name parameter. false role_config_suppression_firehose_database_name true
Suppress Parameter Validation: Activity Monitor Database Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Database Password parameter. false role_config_suppression_firehose_database_password true
Suppress Parameter Validation: Activity Monitor Database Username Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Database Username parameter. false role_config_suppression_firehose_database_user true
Suppress Parameter Validation: Java Configuration Options for Activity Monitor Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Activity Monitor parameter. false role_config_suppression_firehose_java_opts true
Suppress Parameter Validation: Activity Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf parameter. false role_config_suppression_firehose_safety_valve true
Suppress Parameter Validation: Activity Monitor Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Kerberos Principal parameter. false role_config_suppression_kerberos_role_princ_name true
Suppress Parameter Validation: Activity Monitor Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Activity Monitor Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Activity Monitor Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Firehose Debug Server TLS/SSL Server JKS Keystore File Location Whether to suppress configuration warnings produced by the built-in parameter validation for the Firehose Debug Server TLS/SSL Server JKS Keystore File Location parameter. false role_config_suppression_ssl_server_keystore_location true
Suppress Parameter Validation: Firehose Debug Server TLS/SSL Server JKS Keystore File Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Firehose Debug Server TLS/SSL Server JKS Keystore File Password parameter. false role_config_suppression_ssl_server_keystore_password true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Health Test: Activity Monitor Pipeline Whether to suppress the results of the Activity Monitor Pipeline health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_activity_monitor_pipeline true
Suppress Health Test: Activity Tree Pipeline Whether to suppress the results of the Activity Tree Pipeline health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_activity_tree_pipeline true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_file_descriptor true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_log_directory_free_space true
Suppress Health Test: Pause Duration Whether to suppress the results of the Pause Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_pause_duration true
Suppress Health Test: Cloudera Manager Descriptor Age Whether to suppress the results of the Cloudera Manager Descriptor Age health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_scm_descriptor_fetch true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_activity_monitor_web_metric_collection true

Alert Publisher

Advanced

Display Name Description Related Name Default Value API Name Required
Java Configuration Options for Alert Publisher These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. alertpublisher_java_opts false
Alert Publisher Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. ALERTPUBLISHER_role_env_safety_valve false
Alert Publisher Advanced Configuration Snippet (Safety Valve) for alertpublisher.conf For advanced use only. A string to be inserted into alertpublisher.conf for this role only. alertpublisher_safety_valve false
Alert Publisher Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, role user must have write access to this directory. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process. true process_should_monitor true

Logs

Display Name Description Related Name Default Value API Name Required
Alert Publisher Logging Threshold The minimum log level for Alert Publisher logs INFO log_threshold false
Alert Publisher Maximum Log File Backups The maximum number of rolled log files to keep for Alert Publisher logs. Typically used by log4j or logback. 10 max_log_backup_index false
Alert Publisher Max Log Size The maximum size, in megabytes, per log file for Alert Publisher logs. Typically used by log4j or logback. 200 MiB max_log_size false
Alert Publisher Log Directory Directory where Alert Publisher will place its log files. /var/log/cloudera-scm-alertpublisher mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % alertpublisher_fd_thresholds false
Alert Publisher Host Health Test When computing the overall Alert Publisher health, consider the host's health. true alertpublisher_host_health_enabled false
Alert Publisher Process Health Test Enables the health test that the Alert Publisher's process state is consistent with the role configuration true alertpublisher_scm_health_enabled false
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
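
The first-match behavior described for Rules to Extract Events from Log Files can be illustrated with a short Python sketch. This is not the appender's implementation: the function names, the severity ordering in LEVELS, and the use of an unanchored regular-expression search are assumptions made for illustration, and rate limiting (rate, periodminutes) is deliberately left out.

```python
import re

# log4j severity order, lowest to highest (assumed ordering for this sketch).
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, level, message, exception_type=None):
    """Return True if the log message satisfies every filter present in the rule."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS.index(level) < LEVELS.index(threshold):
        return False
    content = rule.get("content")
    if content is not None and re.search(content, message) is None:
        return False
    pattern = rule.get("exceptiontype")
    if pattern is not None:
        # A rule with exceptiontype only applies to messages that carry an exception.
        if exception_type is None or re.search(pattern, exception_type) is None:
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Return the first rule that matches, or None; later rules are not consulted."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


# The two-rule example from the description above.
rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]

# An ERROR message that carries an exception hits the first rule, so no alert.
with_exception = first_matching_rule(rules, "ERROR", "boom", "java.io.IOException")

# An ERROR message without an exception falls through to the second rule.
without_exception = first_matching_rule(rules, "ERROR", "disk full")
```

Under these assumptions, with_exception["alert"] is False and without_exception["alert"] is True, which is why rule order matters when an alerting rule follows a broader non-alerting one.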

Other

Display Name Description Related Name Default Value API Name Required
Alerts: Enable Email Alerts This setting allows you to turn email alert delivery on and off. mailserver.enabled true alert_mailserver_enabled false
Alert: Mail From Address The 'From' address to use for alert emails noreply@localhost alert_mailserver_from_address false
Alerts: Mail Server Hostname The IP address or hostname of the mail server to send alerts to localhost alert_mailserver_hostname true
Alerts: Mail Server Password The password to use to log into the mail server. Warning: this password will be sent over the network to the Alert Publisher host in clear text. In addition, the password will be stored in a plain text file on the Alert Publisher host with restrictive file system permissions. alert_mailserver_password false
Alerts: Mail Server Protocol The protocol to use for sending email alerts. smtp alert_mailserver_protocol true
Alerts: Mail Message Recipients A comma-separated list of email addresses to send alerts to root@localhost alert_mailserver_recipients true
Alerts: Mail Server Username The username to use to log into the mail server alert_mailserver_username false
Custom Alert Script If configured, this script is invoked on the machine hosting the alert publisher role. The script must be readable and executable by the cloudera-scm user. The script is passed, as a single argument, a path to a UTF-8 JSON file containing a list of alerts. Alerts are, by default, batched over time, and the batch size and the batch interval are configurable with the "Alert Publisher: Maximum Batch Size" and "Alert Publisher: Maximum Batch Interval" configuration options. The alerts file is deleted when the script finishes executing. Only one instance of this script is invoked at any given time, and the script must terminate. The standard out and standard error messages from this script are logged to the alert publisher role's log file. alert.script.path alert_script_path false
Alert Publisher: Maximum Batch Size The Alert Publisher can be configured to batch multiple alerts into a single email. This setting specifies the maximum number of alerts that will be batched into a single email (regardless of the batch interval). alert.aggregate.maxSize 32 alertpublisher_aggregate_max_size false
Alert Publisher: Maximum Batch Interval The Alert Publisher can be configured to batch multiple alerts into a single email. This setting specifies the maximum amount of time (in milliseconds) that the Alert Publisher waits before sending an email of the current batch. alert.aggregate.timeout.millis 1 minute(s) alertpublisher_aggregate_timeout false
Alerts: Email footer Optional. If not empty, the text entered here will be inserted verbatim as a footer in HTML and plain-text emails. alert.email.footer alertpublisher_email_footer false
Alerts: Email header Optional. If not empty, the text entered here will be inserted verbatim as a header in HTML and plain-text emails. alert.email.header alertpublisher_email_header false
Alerts: Mail Message Format The format of the email alert message. The 'JSON' format is easy for scripts/programs to parse. The 'HTML' and 'text' formats are designed to be easily read by people. mail.format html mail_format true
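
As an illustration of the Custom Alert Script contract described above (a single argument naming a UTF-8 JSON file that contains a list of alerts), a minimal Python sketch might look like the following. The file name alert_script.py and the function names are hypothetical, and the sketch treats each alert as an opaque JSON value because the alert schema is not described here.

```python
#!/usr/bin/env python3
import json
import sys


def load_alerts(path):
    """Read the UTF-8 JSON file handed to the script and return its list of alerts."""
    with open(path, encoding="utf-8") as handle:
        alerts = json.load(handle)
    if not isinstance(alerts, list):
        raise ValueError("expected a JSON list of alerts")
    return alerts


def main(argv):
    """Entry point: argv[1] is the path to the alerts file."""
    if len(argv) != 2:
        print("usage: alert_script.py <alerts-file>", file=sys.stderr)
        return 2
    alerts = load_alerts(argv[1])
    # Standard output is logged to the Alert Publisher role's log file,
    # so keep it short.
    print(f"received {len(alerts)} alert(s)")
    # The script must terminate, so avoid long-running or blocking work here.
    return 0


# When deployed as the actual script, finish the file with:
#     sys.exit(main(sys.argv))
```

Because the alerts file is deleted when the script finishes, a script that needs the data later would have to copy or forward it before returning.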

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Alerts: Mail Server TCP Port Optional. The TCP port where the mail server is listening. If not specified, defaults to 25 if SMTP is selected, or 465 if SMTPS is selected. alert_mailserver_port false
Alerts: Listen Port Port where the Alert Publisher listens for internal API requests. alertpublisher.internalapi.port 10101 alertpublisher_internalapi_port false
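
The port fallback described for Alerts: Mail Server TCP Port can be written as a small Python sketch. The function name and the lowercase protocol strings "smtp" and "smtps" are assumptions for illustration; only the port numbers 25 and 465 come from the description above.

```python
# Default ports stated in the description: 25 for SMTP, 465 for SMTPS.
DEFAULT_MAIL_PORTS = {"smtp": 25, "smtps": 465}


def effective_mail_port(protocol, configured_port=None):
    """Return the configured port, or the protocol default when none is set."""
    if configured_port is not None:
        return configured_port
    return DEFAULT_MAIL_PORTS[protocol.lower()]
```

For example, effective_mail_port("smtp") falls back to 25, while effective_mail_port("smtps", 587) keeps the explicitly configured 587.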

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of Alert Publisher in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 256 MiB alert_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
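
The value ranges listed in the Resource Management table can be checked before a configuration is submitted. The Python sketch below is a hypothetical helper, not part of Cloudera Manager; its parameter names mirror the API names above, and the rule that a memory limit other than -1 must be positive is an assumption.

```python
def validate_cgroup_settings(rm_cpu_shares=1024, rm_io_weight=500,
                             rm_memory_hard_limit=-1, rm_memory_soft_limit=-1):
    """Return a list of human-readable problems; an empty list means all values pass."""
    errors = []
    if not 2 <= rm_cpu_shares <= 262144:
        errors.append("rm_cpu_shares must be between 2 and 262144")
    if not 100 <= rm_io_weight <= 1000:
        errors.append("rm_io_weight must be between 100 and 1000")
    limits = (("rm_memory_hard_limit", rm_memory_hard_limit),
              ("rm_memory_soft_limit", rm_memory_soft_limit))
    for name, value in limits:
        # -1 means "no limit"; any other value is assumed to need to be positive.
        if value != -1 and value <= 0:
            errors.append(f"{name} must be -1 (no limit) or a positive value")
    return errors
```

Called with no arguments, the helper checks the documented defaults and returns an empty list.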

SNMP

Display Name Description Related Name Default Value API Name Required
SNMP Authentication Protocol Pass Phrase Pass phrase to use for SNMP authentication protocol alert.snmp.auth.password alert_snmp_auth_password false
SNMP Authentication Protocol Authentication algorithm to use for authentication alert.snmp.auth.protocol SHA alert_snmp_auth_protocol false
SNMPv2 Community String Community string to use to identify this service. Generated SNMPv2 traps will use this string for authentication purpose. alert.snmp.community alert_snmp_community false
SNMP Retry Count Number of times to retry before the trap is timed out. If this value is set to '0', the trap will be sent only once. alert.snmp.retries 0 alert_snmp_retries true
SNMP Server Engine Id Engine Id to use for authentication and privacy. Engine Id is normally a hexadecimal number (e.g. 8000173e03a0c095f80c68). Engine Id along with pass phrases are used to generate keys for authentication and privacy protocols. alert.snmp.security.engineid alert_snmp_security_engineid false
SNMP Security Level Level of security to use for SNMP v3 protocol. Currently only 'no authentication' and 'authentication with no privacy' are supported. Select 'SNMPv2' to use 'Community String' based SNMPv2 authentication. alert.snmp.security.level SNMPv2 alert_snmp_security_level true
SNMP NMS Hostname Hostname of the SNMP NMS (network management software). It can be a DNS name or IP address of the host listening for SNMP traps and notifications. For reference, see the Cloudera Manager SNMP MIB. alert.snmp.server.hostname alert_snmp_server_hostname false
SNMP Server Port Port number on which SNMP server is listening. alert.snmp.server.port 162 alert_snmp_server_port true
SNMP Timeout Time to wait before an SNMP trap is resent or timed out. alert.snmp.timeout 5 second(s) alert_snmp_timeout true
SNMP Security UserName Name of a user to use for SNMP security. alert.snmp.username alert_snmp_username false

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Parameter Validation: Alert: Mail From Address Whether to suppress configuration warnings produced by the built-in parameter validation for the Alert: Mail From Address parameter. false role_config_suppression_alert_mailserver_from_address true
Suppress Parameter Validation: Alerts: Mail Server Hostname Whether to suppress configuration warnings produced by the built-in parameter validation for the Alerts: Mail Server Hostname parameter. false role_config_suppression_alert_mailserver_hostname true
Suppress Parameter Validation: Alerts: Mail Server Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Alerts: Mail Server Password parameter. false role_config_suppression_alert_mailserver_password true
Suppress Parameter Validation: Alerts: Mail Message Recipients Whether to suppress configuration warnings produced by the built-in parameter validation for the Alerts: Mail Message Recipients parameter. false role_config_suppression_alert_mailserver_recipients true
Suppress Parameter Validation: Alerts: Mail Server Username Whether to suppress configuration warnings produced by the built-in parameter validation for the Alerts: Mail Server Username parameter. false role_config_suppression_alert_mailserver_username true
Suppress Parameter Validation: Custom Alert Script Whether to suppress configuration warnings produced by the built-in parameter validation for the Custom Alert Script parameter. false role_config_suppression_alert_script_path true
Suppress Parameter Validation: SNMP Authentication Protocol Pass Phrase Whether to suppress configuration warnings produced by the built-in parameter validation for the SNMP Authentication Protocol Pass Phrase parameter. false role_config_suppression_alert_snmp_auth_password true
Suppress Parameter Validation: SNMPv2 Community String Whether to suppress configuration warnings produced by the built-in parameter validation for the SNMPv2 Community String parameter. false role_config_suppression_alert_snmp_community true
Suppress Parameter Validation: SNMP Server Engine Id Whether to suppress configuration warnings produced by the built-in parameter validation for the SNMP Server Engine Id parameter. false role_config_suppression_alert_snmp_security_engineid true
Suppress Parameter Validation: SNMP NMS Hostname Whether to suppress configuration warnings produced by the built-in parameter validation for the SNMP NMS Hostname parameter. false role_config_suppression_alert_snmp_server_hostname true
Suppress Parameter Validation: SNMP Security UserName Whether to suppress configuration warnings produced by the built-in parameter validation for the SNMP Security UserName parameter. false role_config_suppression_alert_snmp_username true
Suppress Parameter Validation: Alerts: Email footer Whether to suppress configuration warnings produced by the built-in parameter validation for the Alerts: Email footer parameter. false role_config_suppression_alertpublisher_email_footer true
Suppress Parameter Validation: Alerts: Email header Whether to suppress configuration warnings produced by the built-in parameter validation for the Alerts: Email header parameter. false role_config_suppression_alertpublisher_email_header true
Suppress Parameter Validation: Java Configuration Options for Alert Publisher Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Alert Publisher parameter. false role_config_suppression_alertpublisher_java_opts true
Suppress Parameter Validation: Alert Publisher Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Alert Publisher Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_alertpublisher_role_env_safety_valve true
Suppress Parameter Validation: Alert Publisher Advanced Configuration Snippet (Safety Valve) for alertpublisher.conf Whether to suppress configuration warnings produced by the built-in parameter validation for the Alert Publisher Advanced Configuration Snippet (Safety Valve) for alertpublisher.conf parameter. false role_config_suppression_alertpublisher_safety_valve true
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Alert Publisher Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Alert Publisher Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Alert Publisher Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Alert Publisher Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Configuration Validator: SNMP Validator Whether to suppress configuration warnings produced by the SNMP Validator configuration validator. false role_config_suppression_snmp_validator true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_file_descriptor true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_log_directory_free_space true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_alert_publisher_unexpected_exits true

Event Server

Advanced

Display Name Description Related Name Default Value API Name Required
Java Configuration Options for Event Server These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. eventserver_java_opts false
Maximum Number of Events Returned by Any Query The maximum number of events that any query can return. Note: A high value can increase the amount of memory required by Event Server, as well as affect query response times. eventcatcher.max.query.events 10000 eventserver_max_query_events true
Maximum Write Queue Length The maximum number of events that can be queued for write before further requests are rejected eventcatcher.ingest.pipeline.max 10000 eventserver_max_write_queue_size true
Number of Core Event Writer Threads The number of threads that Event Server will use to write events to its store concurrently eventcatcher.num.ingest.threads 2 eventserver_num_pipeline_threads true
Event Server Query Timeout The amount of time, in milliseconds, that Cloudera Manager and the Alert Publisher will wait for the Event Server to respond to a query. eventserver.query.timeout 60000 eventserver_query_timeout false
Event Server Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. EVENTSERVER_role_env_safety_valve false
Event Server Advanced Configuration Snippet (Safety Valve) for eventserver.conf For advanced use only. A string to be inserted into eventserver.conf for this role only. eventserver_safety_valve false
Event Server Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, role user must have write access to this directory. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process. true process_should_monitor true

Logs

Display Name Description Related Name Default Value API Name Required
Event Server Logging Threshold The minimum log level for Event Server logs INFO log_threshold false
Event Server Maximum Log File Backups The maximum number of rolled log files to keep for Event Server logs. Typically used by log4j or logback. 10 max_log_backup_index false
Event Server Max Log Size The maximum size, in megabytes, per log file for Event Server logs. Typically used by log4j or logback. 200 MiB max_log_size false
Event Server Log Directory Directory where Event Server will place its log files. /var/log/cloudera-scm-eventserver mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Event Store Capacity Monitoring Thresholds The health test thresholds on the number of events in the event store. Specified as a percentage of the maximum number of events in Event Server store. Warning: 115.0 %, Critical: 130.0 % eventserver_capacity_thresholds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % eventserver_fd_thresholds false
Garbage Collection Duration Thresholds The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 eventserver_gc_duration_thresholds false
Garbage Collection Duration Monitoring Period The period to review when computing the moving average of garbage collection time. 5 minute(s) eventserver_gc_duration_window false
Event Server Host Health Test When computing the overall Event Server health, consider the host's health. true eventserver_host_health_enabled false
Event Server Index Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Event Server Index Directory. Warning: 10 GiB, Critical: 5 GiB eventserver_index_directory_free_space_absolute_thresholds false
Event Server Index Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Event Server Index Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if an Event Server Index Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never eventserver_index_directory_free_space_percentage_thresholds false
Event Server Process Health Test Enables the health test that the Event Server's process state is consistent with the role configuration true eventserver_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true eventserver_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never eventserver_web_metric_collection_thresholds false
Event Server Write Pipeline Monitoring Thresholds The health test thresholds for monitoring the Event Server write pipeline. This specifies the number of dropped messages that will be tolerated over the monitoring time period. Warning: Never, Critical: Any eventserver_write_pipeline_thresholds false
Event Server Write Pipeline Monitoring Time Period The time period over which the Event Server write pipeline will be monitored for dropped messages. 5 minute(s) eventserver_write_pipeline_window false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event may not be promoted to an alert even when an exception appears in an ERROR log message, because the first rule, with alert = false, will match.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger, configured for a DataNode, fires if the DataNode has more than 1500 file descriptors open: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change; as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Cloudera Manager Descriptor Age Thresholds The health test thresholds for monitoring the time since the Cloudera Manager descriptor was last refreshed. Warning: 60000.0, Critical: 120000.0 scm_descriptor_age_thresholds false
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
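
The first-match behavior described for Rules to Extract Events from Log Files in the table above can be illustrated with a small sketch. This is not the appender's implementation. The class and function names are hypothetical, regular expressions are assumed to match the whole string, and a message that exceeds its rule's rate is assumed to be dropped rather than passed to later rules.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# log4j severity levels, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


@dataclass
class Rule:
    rate: int                          # mandatory; a negative value means unlimited
    alert: bool = False                # default per the field list
    periodminutes: int = 1             # default per the field list
    threshold: Optional[str] = None
    content: Optional[str] = None
    exceptiontype: Optional[str] = None
    _window_start: float = field(default=0.0, repr=False)
    _sent: int = field(default=0, repr=False)

    def matches(self, level, message, exception_type=None):
        if self.threshold and LEVELS.index(level) < LEVELS.index(self.threshold):
            return False
        if self.content and not re.fullmatch(self.content, message):
            return False
        if self.exceptiontype:
            # Only messages that carry an exception can match this field.
            if exception_type is None:
                return False
            if not re.fullmatch(self.exceptiontype, exception_type):
                return False
        return True

    def allow(self, now_minutes):
        """Fixed-window limiter: at most `rate` events per `periodminutes`."""
        if self.rate < 0:
            return True
        if now_minutes - self._window_start >= self.periodminutes:
            self._window_start, self._sent = now_minutes, 0
        if self._sent < self.rate:
            self._sent += 1
            return True
        return False


def evaluate(rules, level, message, exception_type=None, now_minutes=0.0):
    """Return "alert", "event", or None for a single log message."""
    for rule in rules:
        if rule.matches(level, message, exception_type):
            # Only the first matching rule is consulted.
            if not rule.allow(now_minutes):
                return None
            return "alert" if rule.alert else "event"
    return None


# The two-rule example from the description.
rules = [
    Rule(rate=1, periodminutes=1, exceptiontype=".*"),
    Rule(rate=1, periodminutes=1, alert=True, threshold="ERROR"),
]
print(evaluate(rules, "ERROR", "boom", "java.io.IOException"))  # event
print(evaluate(rules, "ERROR", "disk failed"))                  # alert
```

The first call shows why rule order matters: the ERROR message carries an exception, so the first rule (alert = false) claims it and the alerting rule is never reached.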

Other

Display Name Description Related Name Default Value API Name Required
Alert On Transitions Out of Alerting Health If set, the health events for transitions out of an alertable health level will also be considered an alert. For example, consider an entity that is configured to alert when it has bad health. If that entity's health becomes bad, an alert will be generated. If this setting is enabled, an alert will also be generated when it returns to good health. If this setting is disabled, then no alert will be generated when it returns to good health. Note that health alerts are generated for an entity only if its per-entity enable_alerts setting is set to true. false eventserver_alert_on_transition_out_of_alerting_health_enabled false
Health Alert Threshold Threshold at which a health event will be considered an alert. Note that health alerts are generated for an entity only if its per-entity enable_alerts setting is set to true. Bad eventserver_health_events_alert_threshold false
Event Server Index Directory Location of the Lucene index for Event Server eventcatcher.server.lucenedir /var/lib/cloudera-scm-eventserver eventserver_index_dir false
Maximum Number of Events in the Event Server Store The maximum size of the Event Server store, in events. Once this size is exceeded, events are deleted, starting with the oldest, until the size of the store returns below this threshold. eventcatcher.event.capacity 5000000 eventserver_max_index_size true
Descriptor Fetch Tries Interval The interval between fetch tries for SCM descriptor when Cloudera Management Service roles are starting. mgmt.descriptor.fetch.frequency 2 second(s) mgmt_descriptor_fetch_frequency true
Descriptor Fetch Max Tries Maximum number of tries to fetch SCM descriptor when Cloudera Management Service roles are starting. If the roles are not able to get the descriptor in this many tries, then they exit. mgmt.num.descriptor.fetch.tries 5 mgmt_num_descriptor_fetch_tries true
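
The interaction between Descriptor Fetch Tries Interval and Descriptor Fetch Max Tries can be sketched as a bounded retry loop. This is an illustration only: the function and parameter names are hypothetical, and it assumes the interval is a pause between consecutive attempts.

```python
import time


def fetch_with_retries(fetch, max_tries=5, interval_seconds=2.0, sleep=time.sleep):
    """Call `fetch` up to `max_tries` times, pausing between failed attempts.

    Raises RuntimeError once every attempt has failed, which stands in
    for the role exiting.
    """
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except Exception as error:
            last_error = error
            if attempt < max_tries:
                sleep(interval_seconds)
    raise RuntimeError(f"descriptor not fetched after {max_tries} tries") from last_error
```

With the defaults (5 tries, 2 seconds apart), this sketch pauses for at most 8 seconds in total, plus the time spent in the fetch attempts themselves, before giving up.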

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Event Server Web UI Port Port for the Event Server's Debug page. Set to -1 to disable debug server. eventcatcher.server.debug.port 8084 eventserver_debug_port false
Event Query Port Port on which the Event Server listens for queries for events. eventcatcher.server.httpport 7185 eventserver_http_port false
Event Publish Port Port on which the Event Server listens for the publication of events. eventcatcher.server.port 7184 eventserver_listen_port false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of EventServer in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB event_server_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Event Server Index Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Event Server Index Directory parameter. false role_config_suppression_eventserver_index_dir true
Suppress Parameter Validation: Java Configuration Options for Event Server Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Event Server parameter. false role_config_suppression_eventserver_java_opts true
Suppress Parameter Validation: Event Server Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Event Server Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_eventserver_role_env_safety_valve true
Suppress Parameter Validation: Event Server Advanced Configuration Snippet (Safety Valve) for eventserver.conf Whether to suppress configuration warnings produced by the built-in parameter validation for the Event Server Advanced Configuration Snippet (Safety Valve) for eventserver.conf parameter. false role_config_suppression_eventserver_safety_valve true
Suppress Parameter Validation: Event Server Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Event Server Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Event Server Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Event Server Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_audit_health true
Suppress Health Test: Event Store Size Whether to suppress the results of the Event Store Size health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_event_store_size true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_file_descriptor true
Suppress Health Test: GC Duration Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_gc_duration true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_host_health true
Suppress Health Test: Event Server Index Directory Free Space Whether to suppress the results of the Event Server Index Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_index_directory_free_space true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_log_directory_free_space true
Suppress Health Test: Cloudera Manager Descriptor Age Whether to suppress the results of the Cloudera Manager Descriptor Age health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_scm_descriptor_fetch true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_web_metric_collection true
Suppress Health Test: Write Pipeline Whether to suppress the results of the Write Pipeline health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_event_server_write_pipeline true

Host Monitor

Advanced

Display Name Description Related Name Default Value API Name Required
Java Configuration Options for Host Monitor These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. firehose_java_opts false
Host Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf For advanced use only. A string to be inserted into cmon.conf for this role only. firehose_safety_valve false
Host Monitor Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. HOSTMONITOR_role_env_safety_valve false
Host Monitor Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, role user must have write access to this directory. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate more metrics than Service Monitor is able to process. true process_should_monitor true
Event Publication Maximum Queue Size The maximum size of the queue in which events published from this role will be buffered. If this queue becomes full (for example, due to an outage), subsequent events will be dropped. health.event.publish.queue.max 20000 svcmon_event_publication_queue_size_max true
Event Publication Retry Period If an event cannot be delivered immediately by this role, this value controls how long to wait before Event Publisher retries delivery. health.event.publish.retry.ms 5000 svcmon_event_publication_retry_period true
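
The drop-when-full behavior described for Event Publication Maximum Queue Size can be pictured with a bounded in-memory queue. The sketch below is illustrative only; the class name and the dropped counter are hypothetical and do not reflect how the role actually buffers events.

```python
import queue


class EventPublicationQueue:
    """Bounded buffer that drops new events once it is full."""

    def __init__(self, max_size=20000):
        self._queue = queue.Queue(maxsize=max_size)
        self.dropped = 0

    def publish(self, event) -> bool:
        """Buffer `event`; return False if it was dropped because the queue is full."""
        try:
            self._queue.put_nowait(event)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def size(self) -> int:
        return self._queue.qsize()


# A tiny queue makes the effect visible: the third event is dropped.
buffer = EventPublicationQueue(max_size=2)
results = [buffer.publish(f"event-{i}") for i in range(3)]
print(results, buffer.dropped)  # [True, True, False] 1
```

In this picture, a longer outage of the receiving side needs either a larger queue or a tolerance for lost events, since nothing is buffered beyond the configured maximum.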

Logs

Display Name Description Related Name Default Value API Name Required
Host Monitor Logging Threshold The minimum log level for Host Monitor logs INFO log_threshold false
Host Monitor Maximum Log File Backups The maximum number of rolled log files to keep for Host Monitor logs. Typically used by log4j or logback. 10 max_log_backup_index false
Host Monitor Max Log Size The maximum size, in megabytes, per log file for Host Monitor logs. Typically used by log4j or logback. 200 MiB max_log_size false
Host Monitor Log Directory Location of log files for Host Monitor /var/log/cloudera-scm-firehose mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Metrics Aggregation Run Duration Thresholds The health test thresholds for monitoring the metrics aggregation run duration. Warning: 10 second(s), Critical: 30 second(s) aggregation_run_duration_thresholds false
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Host Monitor Storage Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Host Monitor Storage Directory. Warning: 10 GiB, Critical: 5 GiB firehose_storage_directory_free_space_absolute_thresholds false
Host Monitor Storage Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Host Monitor Storage Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Host Monitor Storage Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never firehose_storage_directory_free_space_percentage_thresholds false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % hostmonitor_fd_thresholds false
Host Monitor Host Health Test When computing the overall Host Monitor health, consider the host's health. true hostmonitor_host_health_enabled false
Host Monitor Host Pipeline Monitoring Thresholds The health test thresholds for monitoring the Host Monitor host pipeline. This specifies the number of dropped messages that will be tolerated over the monitoring time period. Warning: Never, Critical: Any hostmonitor_host_pipeline_thresholds false
Host Monitor Host Pipeline Monitoring Time Period The time period over which the Host Monitor host pipeline will be monitored for dropped messages. 5 minute(s) hostmonitor_host_pipeline_window false
Pause Duration Thresholds The health test thresholds for the weighted average extra time the pause monitor spent paused. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 hostmonitor_pause_duration_thresholds false
Pause Duration Monitoring Period The period to review when computing the moving average of extra time the pause monitor spent paused. 5 minute(s) hostmonitor_pause_duration_window false
Host Monitor Process Health Test Enables the health test that the Host Monitor's process state is consistent with the role configuration true hostmonitor_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true hostmonitor_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never hostmonitor_web_metric_collection_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used.. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated from an ERROR log message that contains an exception may not be promoted to an alert, because the first rule, with alert = false, matches first.
{"version": "0", "rules": [{"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
Cloudera Manager Metric Schema Age Thresholds The health test thresholds for monitoring the time since the Cloudera Manager metric schema was last refreshed. Warning: 60000.0, Critical: 120000.0 metric_schema_age_thresholds_name false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Cloudera Manager Descriptor Age Thresholds The health test thresholds for monitoring the time since the Cloudera Manager descriptor was last refreshed. Warning: 60000.0, Critical: 120000.0 scm_descriptor_age_thresholds false
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
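The first-match behavior described for Rules to Extract Events from Log Files in the table above can be sketched in Python. This is an illustration only, not the appender's actual implementation: the function names, the severity ordering list, and the choice of full-string regular expression matching are assumptions, and rate limiting (rate, periodminutes) is left out.

```python
import re

# Assumed log4j severity order, lowest to highest (used for "threshold").
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, level, message, exception_type=None):
    """Return True if one rule applies to a log message (hypothetical helper)."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS.index(level) < LEVELS.index(threshold):
        return False
    content = rule.get("content")
    if content is not None and not re.fullmatch(content, message):
        return False
    exc_pattern = rule.get("exceptiontype")
    if exc_pattern is not None:
        # A rule with "exceptiontype" only applies to messages carrying an exception.
        if exception_type is None or not re.fullmatch(exc_pattern, exception_type):
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Evaluate rules in order and return the first one that matches, or None."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


# The two-rule example from the table: the exception rule comes first.
rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]

with_exc = first_matching_rule(rules, "ERROR", "disk failure", "java.io.IOException")
without_exc = first_matching_rule(rules, "ERROR", "disk failure")
```

In this sketch, the ERROR message that carries an exception is caught by the first rule (alert false), while the same message without an exception falls through to the second rule (alert true). This mirrors why rule order matters when alerts are expected.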

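As a rough aid for editing the Role Triggers value by hand, the following Python sketch parses a trigger list and checks the two constraints stated in the field descriptions: mandatory fields are present, and triggerName is unique within the role. The helper name validate_triggers is hypothetical, and the checks are an assumption about what is useful to verify; Cloudera Manager's own validation may differ.

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def validate_triggers(raw):
    """Parse a role_triggers JSON string and check the documented constraints."""
    triggers = json.loads(raw)
    if not isinstance(triggers, list):
        raise ValueError("role_triggers must be a JSON list")
    seen = set()
    for trigger in triggers:
        for field in MANDATORY_FIELDS:
            if field not in trigger:
                raise ValueError(f"missing mandatory field: {field}")
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError(f"duplicate triggerName: {name}")
        seen.add(name)
    return triggers


# The sample trigger from the Role Triggers description.
sample = json.dumps([{
    "triggerName": "sample-trigger",
    "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME "
                         "and last(fd_open) > 1500) DO health:bad",
    "streamThreshold": 0,
    "enabled": "true",
}])
parsed = validate_triggers(sample)
```

The tsquery expression itself is not parsed here; only the JSON envelope is checked.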
Other

Display Name Description Related Name Default Value API Name Required
Use the Authentication Service to enable Single Sign On Use the Authentication Service to enable Single Sign On for the Firehose debug servers. Requires a running Authentication Service. debug.servlet.auth.enabled false debug_servlet_auth_enabled false
Host Monitor Storage Directory The directory where Host Monitor data is stored. The Host Monitor stores metric time series and health information. firehose.storage.base.directory /var/lib/cloudera-host-monitor firehose_storage_dir true
Time-Series Storage The approximate amount of disk space dedicated to storing time series and health data. Once the store has reached its maximum size, older data is deleted to make room for newer data. The disk usage is approximate because data is deleted only when the limit is reached. Note that Cloudera Manager stores time-series data at a number of different data granularities, and these granularities have different effective retention periods. Specifically, Cloudera Manager stores metric data as both raw data points and ten-minutely, hourly, six-hourly, daily, and weekly summary data points. Raw data consumes the bulk of the allocated storage space, weekly summaries the least. As such, raw data is retained for the shortest amount of time, while weekly summary points are unlikely to ever be deleted. See the "Storage" tab on the 'Host Monitor' -> 'Charts Library' -> 'Host Monitor Storage' page for more information on how space is consumed within the Host Monitor. This tab also shows information about the amount of data retained and time window covered by each data granularity. firehose_time_series_storage_bytes 10 GiB firehose_time_series_storage_bytes false
Health Event Startup Policy This setting controls whether health events are emitted when this monitoring role is started. If set to "none", then no health events are emitted. If set to "bad" then health events are emitted for subjects with bad or concerning health. If set to "all" then health events are emitted for all subjects for all health values. The default is "bad". health.event.publish.startup.policy bad health_event_publish_startup_policy false
Descriptor Fetch Tries Interval The interval between fetch tries for SCM descriptor when Cloudera Management Service roles are starting. mgmt.descriptor.fetch.frequency 2 second(s) mgmt_descriptor_fetch_frequency true
Descriptor Fetch Max Tries Maximum number of tries to fetch the SCM descriptor when Cloudera Management Service roles are starting. If the roles are not able to get the descriptor in this many tries, they exit. mgmt.num.descriptor.fetch.tries 5 mgmt_num_descriptor_fetch_tries true
Event Publication Log Quiet Time Period To avoid producing excessive amounts of log output, the Event Publisher component of this role is limited to emitting one message per time period. This value controls the size of that time period. health.event.publish.log.suppress.window.ms 1 minute(s) svcmon_event_publication_log_suppress_window true
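The interaction between Descriptor Fetch Tries Interval and Descriptor Fetch Max Tries in the table above can be pictured as a bounded retry loop. The Python sketch below is an assumption about the general shape of that behavior, not the roles' actual startup code. The function fetch_with_retries and the flaky stand-in are hypothetical, and the defaults reuse the documented values of 5 tries and 2 seconds.

```python
import time


def fetch_with_retries(fetch, max_tries=5, interval_seconds=2.0, sleep=time.sleep):
    """Call fetch() up to max_tries times, sleeping between failed attempts."""
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except Exception as exc:
            last_error = exc
            if attempt < max_tries:
                sleep(interval_seconds)
    # All tries used up: the caller (the role) would exit at this point.
    raise RuntimeError(f"descriptor not fetched after {max_tries} tries") from last_error


# Hypothetical fetch function that fails twice before succeeding.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("not ready")
    return {"descriptor": "ok"}


slept = []  # record sleeps instead of actually waiting
result = fetch_with_retries(flaky, sleep=slept.append)
```

With the defaults, a role in this sketch would wait at most about 8 seconds in total (four sleeps of 2 seconds between five tries) before giving up.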

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Host Monitor Web UI Port Port for Host Monitor's Debug page. Set to -1 to disable the debug server. debug.servlet.port 8091 firehose_debug_port false
Host Monitor Web UI HTTPS Port Port for Host Monitor's HTTPS Debug page. debug.servlet.https.port 9091 firehose_debug_tls_port false
Host Monitor Listen Port Port where Host Monitor is listening for agent messages. firehose.server.port 9995 firehose_listen_port false
Host Monitor Nozzle Port Port where Host Monitor's query API is exposed. nozzle.server.port 9994 firehose_nozzle_port false
Bind Host Monitor to Wildcard Address If enabled, the Host Monitor binds to the wildcard address ("0.0.0.0") on all of its ports. false hmon_bind_wildcard false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of Host Monitor in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB firehose_heapsize false
Maximum Non-Java Memory of Host Monitor The amount of memory the Host Monitor can use off of the Java heap. firehose_non_java_memory_bytes 2 GiB firehose_non_java_memory_bytes false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true

Security

Display Name Description Related Name Default Value API Name Required
Enable TLS/SSL for Firehose Debug Server Encrypt communication between clients and Firehose Debug Server using Transport Layer Security (TLS) (formerly known as Secure Socket Layer (SSL)). debug.servlet.https.enabled false ssl_enabled false
Firehose Debug Server TLS/SSL Server JKS Keystore File Location The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when Firehose Debug Server is acting as a TLS/SSL server. The keystore must be in JKS format. debug.servlet.https.keystorePath ssl_server_keystore_location false
Firehose Debug Server TLS/SSL Server JKS Keystore File Password The password for the Firehose Debug Server JKS keystore file. debug.servlet.https.keystorePassword ssl_server_keystore_password false

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Configuration Validator: Host Monitor Heap Size Validator Whether to suppress configuration warnings produced by the Host Monitor Heap Size Validator configuration validator. false role_config_suppression_firehose_host_monitor_heap_role_validator true
Suppress Configuration Validator: Host Monitor Off Heap Memory Size Validator Whether to suppress configuration warnings produced by the Host Monitor Off Heap Memory Size Validator configuration validator. false role_config_suppression_firehose_host_monitor_non_java_memory_role_validator true
Suppress Parameter Validation: Java Configuration Options for Host Monitor Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Host Monitor parameter. false role_config_suppression_firehose_java_opts true
Suppress Parameter Validation: Host Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf Whether to suppress configuration warnings produced by the built-in parameter validation for the Host Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf parameter. false role_config_suppression_firehose_safety_valve true
Suppress Parameter Validation: Host Monitor Storage Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Host Monitor Storage Directory parameter. false role_config_suppression_firehose_storage_dir true
Suppress Parameter Validation: Host Monitor Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Host Monitor Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_hostmonitor_role_env_safety_valve true
Suppress Parameter Validation: Host Monitor Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Host Monitor Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Host Monitor Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Host Monitor Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Firehose Debug Server TLS/SSL Server JKS Keystore File Location Whether to suppress configuration warnings produced by the built-in parameter validation for the Firehose Debug Server TLS/SSL Server JKS Keystore File Location parameter. false role_config_suppression_ssl_server_keystore_location true
Suppress Parameter Validation: Firehose Debug Server TLS/SSL Server JKS Keystore File Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Firehose Debug Server TLS/SSL Server JKS Keystore File Password parameter. false role_config_suppression_ssl_server_keystore_password true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Health Test: Metrics Aggregation Run Duration Test Whether to suppress the results of the Metrics Aggregation Run Duration Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_aggregation_run_duration true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_file_descriptor true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_host_health true
Suppress Health Test: Host Pipeline Whether to suppress the results of the Host Pipeline health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_host_pipeline true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_log_directory_free_space true
Suppress Health Test: Cloudera Manager Metric Schema Age Whether to suppress the results of the Cloudera Manager Metric Schema Age health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_metric_schema_fetch true
Suppress Health Test: Pause Duration Whether to suppress the results of the Pause Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_pause_duration true
Suppress Health Test: Cloudera Manager Descriptor Age Whether to suppress the results of the Cloudera Manager Descriptor Age health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_scm_descriptor_fetch true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_scm_health true
Suppress Health Test: Host Monitor Storage Directory Free Space Whether to suppress the results of the Host Monitor Storage Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_storage_directory_free_space true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_host_monitor_web_metric_collection true

Navigator Audit Server

Advanced

Display Name Description Related Name Default Value API Name Required
Navigator Audit Server Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Navigator Audit Server Advanced Configuration Snippet (Safety Valve) for db.navigator.properties For advanced use only. A string to be inserted into db.navigator.properties for this role only. navigator_db_safety_valve false
Java Configuration Options for Navigator Audit These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. navigator_java_opts false
Navigator Audit Server Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. NAVIGATOR_role_env_safety_valve false
Navigator Audit Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties For advanced use only. A string to be inserted into cloudera-navigator.properties for this role only. navigator_server_safety_valve false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, role user must have write access to this directory. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
PII Masking Regular Expression Regular expression that identifies the strings to be masked. Changing this expression does not mask the strings in previous entries. Leave blank to bypass masking. This feature is superseded by cluster-wide redaction of logs and SQL queries, as an HDFS service-wide configuration parameter. navigator.pii.masking.regex (4[0-9]{12}(?:[0-9]{3})?)|(5[1-5][0-9]{14})|(3[47][0-9]{13})|(3(?:0[0-5]|[68][0-9])[0-9]{11})|(6(?:011|5[0-9]{2})[0-9]{12})|((?:2131|1800|35\d{3})\d{11}) pii_masking_regex false
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager Agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager Agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process. true process_should_monitor true
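To get a feel for what the PII Masking Regular Expression in the table above matches, the Python sketch below applies a payment-card-style pattern to a sample string. It assumes the default expression uses `{n}` repetition counts exactly as written in the code. The replacement text "XXXX" and the helper mask_pii are hypothetical, since the table does not say how Navigator Audit Server renders masked strings.

```python
import re

# Assumed form of the default masking expression, split per alternative.
PII_PATTERN = re.compile(
    r"(4[0-9]{12}(?:[0-9]{3})?)"
    r"|(5[1-5][0-9]{14})"
    r"|(3[47][0-9]{13})"
    r"|(3(?:0[0-5]|[68][0-9])[0-9]{11})"
    r"|(6(?:011|5[0-9]{2})[0-9]{12})"
    r"|((?:2131|1800|35\d{3})\d{11})"
)


def mask_pii(text, replacement="XXXX"):
    """Replace every substring matching the pattern (hypothetical masking format)."""
    return PII_PATTERN.sub(replacement, text)


# A well-known 16-digit test card number embedded in a query string.
masked = mask_pii("SELECT * FROM orders WHERE card = '4111111111111111'")
```

Testing a candidate expression this way before saving it can help confirm that it masks the intended strings without also matching unrelated digit runs.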

Database

Display Name Description Related Name Default Value API Name Required
Navigator Audit Server Database Hostname Name of the host where Navigator Audit Server's database is running. It is highly recommended that this database is on the same host as Navigator Audit Server. If the database is not running on its default port, specify the port number using this syntax: 'host:port' navigator.db.host localhost navigator_database_host false
Navigator Audit Server Database Name The name of the Navigator Audit Server's database. navigator.db.name nav navigator_database_name true
Navigator Audit Server Database Password The password for Navigator Audit Server's database user account. navigator.db.password navigator_database_password false
Navigator Audit Server Database Type Type of database used for Navigator Audit Server. navigator.db.type mysql navigator_database_type false
Navigator Audit Server Database Username The username to use to log into Navigator Audit Server's database. navigator.db.user nav navigator_database_user true

Logs

Display Name Description Related Name Default Value API Name Required
Navigator Audit Server Logging Threshold The minimum log level for Navigator Audit Server logs INFO log_threshold false
Navigator Audit Server Maximum Log File Backups The maximum number of rolled log files to keep for Navigator Audit Server logs. Typically used by log4j or logback. 10 max_log_backup_index false
Navigator Audit Server Max Log Size The maximum size, in megabytes, per log file for Navigator Audit Server logs. Typically used by log4j or logback. 200 MiB max_log_size false
Navigator Audit Server Log Directory Directory where Navigator Audit Server will place its log files. /var/log/cloudera-scm-navigator mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated from an ERROR log message that contains an exception may not be promoted to an alert, because the first rule, with alert = false, matches first.
{"version": "0", "rules": [{"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % navigator_fd_thresholds false
Garbage Collection Duration Thresholds The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 navigator_gc_duration_thresholds false
Garbage Collection Duration Monitoring Period The period to review when computing the moving average of garbage collection time. 5 minute(s) navigator_gc_duration_window false
Navigator Audit Server Host Health Test When computing the overall Navigator Audit Server health, consider the host's health. true navigator_host_health_enabled false
Navigator Audit Server Process Health Test Enables the health test that the Navigator Audit Server's process state is consistent with the role configuration. true navigator_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true navigator_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never navigator_web_metric_collection_thresholds false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors open: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
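
As a minimal sketch of the Role Triggers format described above, the following Python parses a trigger list and checks the constraints named in the field list: the two mandatory fields, unique trigger names, and the documented defaults for the optional fields. The helper name validate_triggers is hypothetical and is not part of Cloudera Manager, which performs its own validation; the sample trigger is the one from the description.

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def validate_triggers(raw):
    """Parse a role_triggers JSON string and check the documented constraints.

    Hypothetical helper for illustration only.
    """
    triggers = json.loads(raw)
    if not isinstance(triggers, list):
        raise ValueError("role_triggers must be a JSON list")
    seen = set()
    for trigger in triggers:
        for field in MANDATORY_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"missing mandatory field: {field}")
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError(f"duplicate triggerName: {name}")
        seen.add(name)
        # Fill in the documented defaults for the optional fields.
        trigger.setdefault("streamThreshold", 0)
        trigger.setdefault("enabled", "true")
    return triggers


# The sample trigger from the Role Triggers description.
sample = json.dumps([{
    "triggerName": "sample-trigger",
    "triggerExpression": (
        "IF (SELECT fd_open WHERE roleName=$ROLENAME "
        "and last(fd_open) > 1500) DO health:bad"
    ),
}])

checked = validate_triggers(sample)
```

The default value [] is itself a valid trigger list, so validate_triggers("[]") returns an empty list.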

Other

Display Name Description Related Name Default Value API Name Required
Navigator Audit Server Data Expiration Period The number of hours of past audit events to keep in the Navigator Audit Server database. This will affect the size of the database. navigator.db.hours.retained 90 day(s) hours_retained false

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Navigator Audit Server Web UI Port The port where Navigator Audit Server starts a debug web server. Set to -1 to disable debug server. navigator.server.debug.port 8089 navigator_debug_port false
Navigator Audit Server Port The port where Navigator Audit Server listens for requests. navigator.server.port 7186 navigator_server_port false

Publishing

Display Name Description Related Name Default Value API Name Required
Kafka Topic The name of the Kafka topic where Navigator will publish audit events. NavigatorAuditEvents navigator_kafka_publishing_topic false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of Auditing Server in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB navigator_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
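
To make the Cgroup CPU Shares description concrete, here is a small sketch of the arithmetic. It assumes the usual Linux cpu.shares behavior, where processes competing for CPU each receive time in proportion to their shares; the function name and the example share values are illustrative only.

```python
MIN_SHARES = 2
MAX_SHARES = 262144


def cpu_fraction(shares, competing_shares):
    """Fraction of contended CPU time a role receives under proportional sharing.

    shares: the role's cpu.shares value.
    competing_shares: cpu.shares of the other busy processes on the host.
    """
    if not MIN_SHARES <= shares <= MAX_SHARES:
        raise ValueError("cpu.shares must be between 2 and 262144")
    total = shares + sum(competing_shares)
    return shares / total


# A role at the default 1024 competing with one other process at 1024.
equal_split = cpu_fraction(1024, [1024])

# The same role competing with a process that has twice the shares.
one_third = cpu_fraction(1024, [2048])
```

Shares only matter under contention: when the host has idle CPU, a role can use more than its proportional fraction.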

Security

Display Name Description Related Name Default Value API Name Required
Navigator Kerberos Principal Kerberos principal used by Navigator to authenticate to all services except HDFS. Note: Navigator should use the principal used by Hue service if you are using MapReduce1 service in any cluster. hue kerberos_role_princ_name true
Navigator TLS/SSL Client Trust Store File The location on disk of the trust store, in .jks format, used to confirm the authenticity of TLS/SSL servers that Navigator might connect to. This is used when Navigator is the client in a TLS/SSL connection. This trust store must contain the certificate(s) used to sign the service(s) connected to. If this parameter is not provided, the default list of well-known certificate authorities is used instead. navigator_truststore_file false
Navigator TLS/SSL Client Trust Store Password The password for the Navigator TLS/SSL Certificate Trust Store File. This password is not required to access the trust store; this field can be left blank. This password provides optional integrity checking of the file. The contents of trust stores are certificates, and certificates are public information. navigator_truststore_password false
Enable TLS/SSL for NAVIGATOR Encrypt communication between clients and NAVIGATOR using Transport Layer Security (TLS) (formerly known as Secure Socket Layer (SSL)). nav.http.enable_ssl false ssl_enabled false
TLS/SSL Keystore Key Password The password that protects the private key contained in the JKS keystore used when NAVIGATOR is acting as a TLS/SSL server. nav.ssl.keyManagerPassword ssl_server_keystore_keypassword false
TLS/SSL Keystore File Location The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when NAVIGATOR is acting as a TLS/SSL server. The keystore must be in JKS format. nav.ssl.keyStorePath ssl_server_keystore_location false
TLS/SSL Keystore File Password The password for the NAVIGATOR JKS keystore file. nav.ssl.keyStorePassword ssl_server_keystore_password false

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Navigator Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Kerberos Principal parameter. false role_config_suppression_kerberos_role_princ_name true
Suppress Parameter Validation: Navigator Audit Server Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Navigator Audit Server Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Navigator Audit Server Database Hostname Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Database Hostname parameter. false role_config_suppression_navigator_database_host true
Suppress Parameter Validation: Navigator Audit Server Database Name Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Database Name parameter. false role_config_suppression_navigator_database_name true
Suppress Parameter Validation: Navigator Audit Server Database Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Database Password parameter. false role_config_suppression_navigator_database_password true
Suppress Parameter Validation: Navigator Audit Server Database Username Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Database Username parameter. false role_config_suppression_navigator_database_user true
Suppress Parameter Validation: Navigator Audit Server Advanced Configuration Snippet (Safety Valve) for db.navigator.properties Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Advanced Configuration Snippet (Safety Valve) for db.navigator.properties parameter. false role_config_suppression_navigator_db_safety_valve true
Suppress Parameter Validation: Java Configuration Options for Navigator Audit Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Navigator Audit parameter. false role_config_suppression_navigator_java_opts true
Suppress Parameter Validation: Kafka Topic Whether to suppress configuration warnings produced by the built-in parameter validation for the Kafka Topic parameter. false role_config_suppression_navigator_kafka_publishing_topic true
Suppress Parameter Validation: Navigator Audit Server Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_navigator_role_env_safety_valve true
Suppress Parameter Validation: Navigator Audit Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Audit Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties parameter. false role_config_suppression_navigator_server_safety_valve true
Suppress Parameter Validation: Navigator TLS/SSL Client Trust Store File Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator TLS/SSL Client Trust Store File parameter. false role_config_suppression_navigator_truststore_file true
Suppress Parameter Validation: Navigator TLS/SSL Client Trust Store Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator TLS/SSL Client Trust Store Password parameter. false role_config_suppression_navigator_truststore_password true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: PII Masking Regular Expression Whether to suppress configuration warnings produced by the built-in parameter validation for the PII Masking Regular Expression parameter. false role_config_suppression_pii_masking_regex true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: TLS/SSL Keystore Key Password Whether to suppress configuration warnings produced by the built-in parameter validation for the TLS/SSL Keystore Key Password parameter. false role_config_suppression_ssl_server_keystore_keypassword true
Suppress Parameter Validation: TLS/SSL Keystore File Location Whether to suppress configuration warnings produced by the built-in parameter validation for the TLS/SSL Keystore File Location parameter. false role_config_suppression_ssl_server_keystore_location true
Suppress Parameter Validation: TLS/SSL Keystore File Password Whether to suppress configuration warnings produced by the built-in parameter validation for the TLS/SSL Keystore File Password parameter. false role_config_suppression_ssl_server_keystore_password true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_file_descriptor true
Suppress Health Test: GC Duration Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_gc_duration true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_log_directory_free_space true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigator_web_metric_collection true

Navigator Metadata Server

Advanced

Display Name Description Related Name Default Value API Name Required
Allow Usage Data Collection Allows Cloudera to collect usage data, including the use of Google Analytics. nav.allow_usage_data true allow_usage_data true
Navigator Metadata Server Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Navigator Metadata Server Install Dir The directory where Navigator Metadata Server is installed. This allows overriding the version packaged with the Cloudera Manager Server. nav_install_dir false
Navigator Metadata Server Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties For advanced use only, a string to be inserted into the client configuration for navigator.client.properties. navigator_client_config_safety_valve false
Java Configuration Options for Navigator Metadata Server These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. navigator_java_opts false
Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties For advanced use only. A string to be inserted into cloudera-navigator.properties for this role only. navigator_safety_valve false
Navigator Metadata Server Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. NAVIGATORMETASERVER_role_env_safety_valve false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, role user must have write access to this directory. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager Agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager Agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process. true process_should_monitor true

Cloudera Navigator

Display Name Description Related Name Default Value API Name Required
Enable Audit Collection Enable collection of audit events from the service's roles. navigator.audit.enabled true navigator_audit_enabled false
Audit Event Filter Event filters are defined in a JSON object like the following: { "defaultAction" : ("accept", "discard"), "rules" : [ { "action" : ("accept", "discard"), "fields" : [ { "name" : "fieldName", "match" : "regex" } ] } ] } A filter has a default action and a list of rules, in order of precedence. Each rule defines an action, and a list of fields to match against the audit event. A rule is "accepted" if all the listed field entries match the audit event. At that point, the action declared by the rule is taken. If no rules match the event, the default action is taken. Actions default to "accept" if not defined in the JSON object. The following is the list of fields that can be filtered for NAVMS events:
  • operation: Navigator operation performed e.g. auditReport, savedSearch etc.
  • username: the user performing the action.
  • ipAddress: the IP from where the request originated.
  • allowed: whether the operation was allowed or denied.
navigator.event.filter navigator_audit_event_filter false
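
The evaluation order described for the Audit Event Filter can be sketched in Python as follows. This is an illustration under stated assumptions, not Navigator's implementation: the function name evaluate_filter is hypothetical, each regex is assumed to need to match the whole field value, and the username pattern svc_.* is an invented example.

```python
import re


def evaluate_filter(event_filter, event):
    """Return "accept" or "discard" for one audit event (a dict of field values)."""
    for rule in event_filter.get("rules", []):
        fields = rule.get("fields", [])
        # A rule applies only if every listed field matches the event.
        # Assumption: the regex must match the entire field value.
        if all(
            re.fullmatch(f["match"], str(event.get(f["name"], "")))
            for f in fields
        ):
            # Rules are in order of precedence, so the first match decides.
            return rule.get("action", "accept")
    # No rule matched: fall back to the default action.
    return event_filter.get("defaultAction", "accept")


# Discard saved-search events from service accounts, accept everything else.
nav_filter = {
    "defaultAction": "accept",
    "rules": [
        {
            "action": "discard",
            "fields": [
                {"name": "operation", "match": "savedSearch"},
                {"name": "username", "match": "svc_.*"},
            ],
        }
    ],
}

discarded = evaluate_filter(
    nav_filter, {"operation": "savedSearch", "username": "svc_report"}
)
accepted = evaluate_filter(
    nav_filter, {"operation": "auditReport", "username": "svc_report"}
)
```

Because both fields in the rule must match, the second event falls through to the default action.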

Database

Display Name Description Related Name Default Value API Name Required
Navigator Metadata Server Database Hostname Name of the host where the Navigator Metaserver database is running. If the database is not running on its default port, specify the port number using this syntax: 'host:port'. navms.db.host localhost nav_metaserver_database_host false
Navigator Metadata Server Database Name The name of the Navigator Metadata Server database. navms.db.name navms nav_metaserver_database_name true
Navigator Metadata Server Database Password The password for Navigator Metadata Server database user account. navms.db.password nav_metaserver_database_password false
Navigator Metadata Server Database Type Type of database used for Navigator Metadata Server. navms.db.type mysql nav_metaserver_database_type false
Navigator Metadata Server Database Username The username to use to log into the Navigator Metadata Server database. navms.db.user navms nav_metaserver_database_user true

External Authentication

Display Name Description Related Name Default Value API Name Required
Authentication Backend Order The order in which authentication backends are used for authenticating a user. For Cloudera Manager authentication, only users with role 'Full Administrator' and 'Navigator Administrator' are allowed. In addition, users authenticated by Cloudera Manager using external authentication mechanism are not allowed. Navigator will authenticate external users itself and will not rely on Cloudera Manager. nav.auth.backend.order CM_ONLY auth_backend_order true
External Authentication Type The type of external authentication system to use. nav.external.auth.type LDAP external_auth_type true
Cloudera Navigator S3 Lineage AWS Credentials Enable Cloudera Navigator to extract metadata and lineage for data that is written to S3 buckets in this account. nav_extraction_external_account false
LDAP Bind User Distinguished Name Distinguished name of the user to bind as. This is used to connect to LDAP/AD for searching user and group information. This may be left blank if the LDAP server supports anonymous binds. nav.ldap.bind.dn nav_ldap_bind_dn false
LDAP Bind Password The password of the bind user. nav.ldap.bind.pw nav_ldap_bind_pw false
LDAP Distinguished Name Pattern For use with non-Active Directory LDAP systems. This is a pattern used to search for the distinguished name of a user during authentication. Use "{0}" to specify where the username should go, e.g. "uid={0},ou=People". nav.ldap.dn.pattern nav_ldap_dn_pattern false
LDAP Group Search Base A base distinguished name for searching for groups. nav.ldap.group.search.base nav_ldap_group_search_base false
LDAP Group Search Filter For Logged In User A search filter for finding groups that the logged-in user belongs to. Typically, this is (member={0}), where {0} is replaced by the DN of a successfully authenticated user. nav.ldap.group.search.filter nav_ldap_group_search_filter false
LDAP Groups Search Filter A search filter for finding groups based on search term entered by user. The search term entered in the Navigator UI replaces {0} in the search filter. nav.ldap.groups.search.filter (&(objectClass=groupOfNames)(cn=*{0}*)) nav_ldap_groups_search_filter false
LDAP URL The URL of the LDAP server. The URL must be prefixed with ldap:// or ldaps://. The URL can optionally specify a custom port, for example: ldaps://ldap_server.example.com:1636. Note that usernames and passwords will be transmitted in the clear unless either an ldaps:// URL is used, or "Enable LDAP TLS" is turned on (where available). Also note that encryption must be in use between the client and this service for the same reason. For more detail on the LDAP URL format, see RFC 2255. A space-separated list of URLs can be entered; in this case the URLs will each be tried in turn until one replies. nav.ldap.url nav_ldap_url false
LDAP User Search Base A base distinguished name for searching for users. This can be used as a fallback mechanism if the DN pattern does not match any user. nav.ldap.user.search.base nav_ldap_user_search_base false
LDAP User Search Filter A search filter for finding users. Typically, this is (uid={0}), where {0} is replaced by the username that was used at the login screen. nav.ldap.user.search.filter nav_ldap_user_search_filter false
Active Directory Domain This parameter is useful when authenticating against an Active Directory server. This value is appended to all usernames before authenticating against AD. For example, if this parameter is set to "my.domain.com", and the user authenticating is "mike", then "mike@my.domain.com" is passed to AD. If this field is unset, the username remains unaltered before being passed to AD. nav.nt_domain nav_nt_domain false
SAML Entity Base URL The Base URL used to construct redirect URLs reported in this server's SP metadata. Leave this blank to let the server calculate the base URL. nav.saml.entity.base_url nav_saml_entity_base_url false
SAML Entity ID The ID that Navigator Metadata Server uses to identify itself to the IDP. This value should be unique to this Navigator Metadata Server installation. nav.saml.entity.id clouderaNavigator nav_saml_entity_id true
Alias of SAML Sign/Encrypt Private Key The alias used to identify the sign/encrypt private key in the SAML keystore. nav.saml.key.alias nav_saml_key_alias false
SAML Sign/Encrypt Private Key Password The password for the sign/encrypt private key in the SAML keystore. nav.saml.key.password nav_saml_key_password false
SAML Keystore Password The password for the SAML keystore. nav.saml.keystore.password nav_saml_keystore_password false
Path to SAML Keystore File The filesystem path to the keystore file containing the SP private key and any necessary public certificates to validate the IDP. nav.saml.keystore.path nav_saml_keystore_path false
SAML Login URL If your IDP does not support SP-initiated SSO (very uncommon), you use a separate login URL, outside of Navigator Metadata Server. Provide that URL here so that Navigator Metadata Server can use it when a user needs to log in. nav.saml.login.url nav_saml_login_url false
Path to SAML IDP Metadata File The filesystem path to the IDP metadata XML file. nav.saml.metadata.path nav_saml_metadata_path false
SAML Attribute Identifier for User Role The URN OID that identifies the user role in the SAML attributes. Only has an effect when 'Attribute'-based role assignment is used. nav.saml.oid.role urn:oid:2.5.4.11 nav_saml_oid_role true
SAML Attribute Identifier for User ID The URN OID that identifies the user ID in the SAML attributes. nav.saml.oid.user urn:oid:0.9.2342.19200300.100.1.1 nav_saml_oid_user true
SAML Response Binding The SAML binding format that the IDP is asked to use when sending authentication responses. nav.saml.response.binding ARTIFACT nav_saml_response_binding true
SAML Attribute Values for Roles The values that appear in the SAML role attribute for each Navigator Metadata Server role. The first value corresponds to the Full Administrator role. The second value corresponds to the User Administrator role. The third value corresponds to the Auditing Viewer role. The fourth value corresponds to the Lineage Viewer role. The fifth value corresponds to the Metadata Administrator role. The sixth value corresponds to the Policy Viewer role. The seventh value corresponds to the Policy Administrator role. To assign more than one role, the attribute can return values separated by a comma, like "role1, role2". nav.saml.role.map admin useradmin auditingviewer lineageviewer metadataadmin policyviewer policyadmin nav_saml_role_map true
SAML Role Assignment Mechanism The mechanism to use for assigning roles to users. 'Attribute' assigns roles based on a SAML attribute. 'Script' assigns roles based on the result of an external script. nav.saml.role.mapper ATTRIBUTE nav_saml_role_mapper true
Path to SAML Role Assignment Script An external script (or binary) to use to assign roles to SAML users. The username is passed as the first command-line argument. You can configure the return codes for the external script on the Roles page. A negative return value indicates a failure. nav.saml.role.script nav_saml_role_script false
Source of User ID in SAML Response Whether the user ID should be obtained from the SAML response NameID field or from an attribute. nav.saml.user.source ATTRIBUTE nav_saml_user_source true
Cloudera Telemetry Publisher S3 Bucket The name of the S3 bucket to which Cloudera Telemetry Publisher on remote clusters uploads metadata for Cloudera Navigator. nav_telemetry_bucket_name nav_telemetry_bucket_name false
Cloudera Telemetry Publisher AWS Credentials Enable Cloudera Navigator to extract metadata and lineage from other clusters (e.g., Cloudera Altus) collected via Cloudera Telemetry Publisher. nav_telemetry_external_account false
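The positional mapping described for SAML Attribute Values for Roles can be illustrated with a short Python sketch. It is not Navigator's implementation. The function name roles_from_attribute is hypothetical, and the sketch assumes that the role map is a whitespace-separated list of seven values and that unknown attribute values are ignored.

```python
# Hypothetical illustration of the positional role map; not Navigator's own code.
NAVIGATOR_ROLES = [
    "Full Administrator",
    "User Administrator",
    "Auditing Viewer",
    "Lineage Viewer",
    "Metadata Administrator",
    "Policy Viewer",
    "Policy Administrator",
]

# Default value of nav.saml.role.map as listed in the table above.
DEFAULT_ROLE_MAP = (
    "admin useradmin auditingviewer lineageviewer "
    "metadataadmin policyviewer policyadmin"
)


def roles_from_attribute(attribute_value, role_map=DEFAULT_ROLE_MAP):
    """Return the Navigator roles named by a comma-separated SAML attribute value."""
    values = role_map.split()
    if len(values) != len(NAVIGATOR_ROLES):
        raise ValueError("role map must list exactly seven values")
    # The n-th configured value stands for the n-th Navigator role.
    lookup = dict(zip(values, NAVIGATOR_ROLES))
    requested = [v.strip() for v in attribute_value.split(",") if v.strip()]
    # Ignoring unknown values is an assumption made for this sketch.
    return [lookup[v] for v in requested if v in lookup]
```

Under these assumptions, an attribute value of "admin, lineageviewer" resolves to the Full Administrator and Lineage Viewer roles.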

Extractor Filter

Display Name Description Related Name Default Value API Name Required
HDFS Filter Enable Enables HDFS filtering. When enabled, items that match the blacklist are excluded from extraction. nav.filter.hdfs.enable false nav_filter_hdfs_enable false
HDFS Filter Blacklist List of paths to be filtered out. The paths can be regular expressions. nav.filter.hdfs.blacklist nav_filter_hdfs_rules false
S3 Filter Default Action Set to Accept to extract all S3 buckets except for the ones blacklisted. Set to Discard to extract only the buckets that are whitelisted. nav.filter.s3.default.action ACCEPT nav_filter_s3_default_action false
S3 Filter Enable Enables S3 filtering. nav.filter.s3.enable false nav_filter_s3_enable false
S3 Filter list List of S3 buckets to be whitelisted or blacklisted. The strings can be regular expressions. nav.filter.s3.list nav_filter_s3_rules false
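The interaction between S3 Filter Default Action and S3 Filter list can be sketched in Python as follows. This is an illustration under stated assumptions, not Navigator's code: the function name should_extract_s3 is hypothetical, each list entry is assumed to be matched against the whole bucket name, and the spelling DISCARD for the second action is assumed by analogy with the documented default ACCEPT.

```python
import re


def should_extract_s3(bucket, rules, default_action="ACCEPT", enabled=True):
    """Decide whether a bucket would be extracted under the documented semantics."""
    if not enabled:
        # With filtering disabled, nothing is filtered out.
        return True
    # Assumption: a rule applies only if it matches the entire bucket name.
    listed = any(re.fullmatch(rule, bucket) for rule in rules)
    if default_action == "ACCEPT":
        # The list acts as a blacklist.
        return not listed
    if default_action == "DISCARD":
        # The list acts as a whitelist.
        return listed
    raise ValueError("unknown default action: " + default_action)
```

With the rule list ["logs-.*"], a bucket named logs-prod is skipped under ACCEPT and is the only kind of bucket extracted under DISCARD.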

Logs

Display Name Description Related Name Default Value API Name Required
Audit Log Directory Path to the directory where audit logs will be written. The directory will be created if it doesn't exist. audit_event_log_dir /var/log/cloudera-scm-navigator/audit audit_event_log_dir false
Navigator Metadata Server Logging Threshold The minimum log level for Navigator Metadata Server logs. INFO log_threshold false
Navigator Metadata Server Maximum Log File Backups The maximum number of rolled log files to keep for Navigator Metadata Server logs. Typically used by log4j or logback. 10 max_log_backup_index false
Navigator Metadata Server Max Log Size The maximum size, in megabytes, per log file for Navigator Metadata Server logs. Typically used by log4j or logback. 200 MiB max_log_size false
Navigator Metadata Server Log Directory Directory where Navigator Metadata Server will place its log files. /var/log/cloudera-scm-navigator mgmt_log_dir false
Maximum Audit Log File Size Maximum size of audit log file in MB before it is rolled over. navigator.audit_log_max_file_size 100 MiB navigator_audit_log_max_file_size false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated from an ERROR log message that contains an exception may not be promoted to an alert, because the first rule, with alert = false, matches first.
{"version": "0", "rules": [{"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
Navigator Audit Failure Thresholds The health test thresholds for failures encountered when monitoring audits within a recent period specified by the mgmt_navigator_failure_window configuration for the role. The value that can be specified for this threshold is the number of bytes of audit data that is left to be sent to Audit Server. mgmt.navigator.failure.thresholds Warning: Never, Critical: Any mgmt_navigator_failure_thresholds false
Monitoring Period For Audit Failures The period to review when checking if audits are blocked and not getting processed. mgmt.navigator.failure.window 20 minute(s) mgmt_navigator_failure_window false
Navigator Audit Pipeline Health Check Enables the health test of the audit event processing pipeline. The test checks whether audit events from a role that generates audits are failing to be processed by Audit Server. mgmt.navigator.status.check.enabled true mgmt_navigator_status_check_enabled false
Audit Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Audit Log Directory. Warning: 10 GiB, Critical: 5 GiB navigatormetaserver_audit_event_log_directory_free_space_absolute_thresholds false
Audit Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Audit Log Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if an Audit Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never navigatormetaserver_audit_event_log_directory_free_space_percentage_thresholds false
Navigator Metadata Server Storage Dir Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Navigator Metadata Server Storage Dir. Warning: 10 GiB, Critical: 5 GiB navigatormetaserver_data_directory_free_space_absolute_thresholds false
Navigator Metadata Server Storage Dir Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Navigator Metadata Server Storage Dir. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Navigator Metadata Server Storage Dir Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never navigatormetaserver_data_directory_free_space_percentage_thresholds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % navigatormetaserver_fd_thresholds false
Navigator Metadata Server Host Health Test When computing the overall Navigator Metadata Server health, consider the host's health. true navigatormetaserver_host_health_enabled false
Navigator Metadata Server Process Health Test Enables the health test that the Navigator Metadata Server's process state is consistent with the role configuration. true navigatormetaserver_scm_health_enabled false
Solr Element Count Threshold The thresholds at which an alert is raised for the Solr element count. Warning: 5.0E8, Critical: 1.0E9 navigatormetaserver_solr_element_count_threshold false
Solr Relation Count Threshold The thresholds at which an alert is raised for the Solr relation count. Warning: 5.0E8, Critical: 1.0E9 navigatormetaserver_solr_relation_count_threshold false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
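The first-match behavior described for Rules to Extract Events from Log Files can be sketched in Python as follows. This is a simplified illustration, not the log4j appender's implementation. The function names, the log level ordering, and the use of whole-string regular expression matching are assumptions, and rate limiting (rate and periodminutes) is left out.

```python
import json
import re

# Assumed log4j severity order, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, level, message, exception_type=None):
    """Check one rule against a log message; absent fields impose no constraint."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS.index(level) < LEVELS.index(threshold):
        return False
    content = rule.get("content")
    if content is not None and re.fullmatch(content, message) is None:
        return False
    exc_pattern = rule.get("exceptiontype")
    if exc_pattern is not None:
        # The rule applies only to messages that carry an exception.
        if exception_type is None or re.fullmatch(exc_pattern, exception_type) is None:
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Return the first rule that matches, or None if no rule matches."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


# The two-rule example from the property description.
RULES = json.loads("""[
  {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
  {"alert": true, "rate": 1, "periodminutes": 1, "threshold": "ERROR"}
]""")
```

With RULES as above, an ERROR message that carries an exception is handled by the first rule and is therefore not promoted to an alert, while an ERROR message without an exception falls through to the second rule.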

Other

Display Name Description Related Name Default Value API Name Required
Navigator Metadata Server Storage Dir The directory where Navigator Metadata Server data is stored. Note that changing this location does not migrate existing data. nav.data.dir /var/lib/cloudera-scm-navigator data_dir false
Default Facets List of metadata properties used by default for Navigator search facets. If no facets are listed, the facets used are some system properties such as "sourceType", "type", "owner", "clusterTemplate", "tags", "deleted". Your entries here replace these system facets. For example, to include some of the system properties and a managed property "region" in the "sales" namespace, include entries such as "type", "owner", and "sales.region". nav.search.default_facets nav_search_default_facets false

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Policies

Display Name Description Related Name Default Value API Name Required
Enable Expression Input Allows policy properties to be specified using Java expressions. nav.policy.expression.enable false nav_policies_expression_input false
JMS Password The password of the JMS user to which notifications of changes to entities affected by policies are sent. navms.jms.password ****** nav_policies_jms_password false
JMS Queue The JMS queue to which notifications of changes to entities affected by policies are sent. navms.jms.queue Navigator nav_policies_jms_queue false
JMS URL The URL of the JMS server to which notifications of changes to entities affected by policies are sent. navms.jms.url tcp://localhost:61616 nav_policies_jms_url false
JMS User The JMS user to which notifications of changes to entities affected by policies are sent. navms.jms.user admin nav_policies_jms_user false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Navigator Metadata Server Port The port where Navigator Metadata Server listens for requests. nav.http.port 7187 navigator_server_port false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of Navigator Metadata Server in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 2 GiB navigator_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
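The effect of Cgroup CPU Shares under contention can be approximated with a small calculation. The Python sketch below is a simplification, not a description of the kernel scheduler: it assumes that all competing processes sit in sibling cgroups at the same level and are fully CPU-bound, and the function name cpu_fraction is hypothetical.

```python
def cpu_fraction(role_shares, competing_shares):
    """Approximate the CPU fraction a role receives when every cgroup is busy."""
    for shares in [role_shares, *competing_shares]:
        # Range taken from the Cgroup CPU Shares description.
        if not 2 <= shares <= 262144:
            raise ValueError("shares must be between 2 and 262144")
    total = role_shares + sum(competing_shares)
    return role_shares / total
```

For example, a role with 1024 shares competing with one other busy cgroup that also has 1024 shares would receive about half of the contended CPU time. When the host is not under CPU contention, shares do not limit the role.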

Security

Display Name Description Related Name Default Value API Name Required
Navigator Kerberos Principal Kerberos principal used by Navigator to authenticate to all services except HDFS. Note: Navigator should use the principal used by Hue service if you are using MapReduce1 service in any of the clusters. hue kerberos_role_princ_name true
Navigator Kerberos Principal for HDFS Kerberos principal used by Navigator to authenticate to HDFS services. Note: This principal must have administrator and superuser privileges on all HDFS services. hdfs nav_hdfs_kerberos_princ true
Enable TLS/SSL for Navigator Metadata Server Encrypt communication between clients and Navigator Metadata Server using Transport Layer Security (TLS) (formerly known as Secure Socket Layer (SSL)). nav.http.enable_ssl false ssl_enabled false
TLS/SSL Keystore Key Password The password that protects the private key contained in the JKS keystore used when Navigator Metadata Server is acting as a TLS/SSL server. nav.ssl.keyManagerPassword ssl_server_keystore_keypassword false
TLS/SSL Keystore File Location The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when Navigator Metadata Server is acting as a TLS/SSL server. The keystore must be in JKS format. nav.ssl.keyStorePath ssl_server_keystore_location false
TLS/SSL Keystore File Password The password for the Navigator Metadata Server JKS keystore file. nav.ssl.keyStorePassword ssl_server_keystore_password false

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Parameter Validation: Audit Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Audit Log Directory parameter. false role_config_suppression_audit_event_log_dir true
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Navigator Metadata Server Storage Dir Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Storage Dir parameter. false role_config_suppression_data_dir true
Suppress Parameter Validation: Navigator Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Kerberos Principal parameter. false role_config_suppression_kerberos_role_princ_name true
Suppress Parameter Validation: Navigator Metadata Server Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Navigator Metadata Server Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: HDFS Filter Blacklist Whether to suppress configuration warnings produced by the built-in parameter validation for the HDFS Filter Blacklist parameter. false role_config_suppression_nav_filter_hdfs_rules true
Suppress Parameter Validation: S3 Filter list Whether to suppress configuration warnings produced by the built-in parameter validation for the S3 Filter list parameter. false role_config_suppression_nav_filter_s3_rules true
Suppress Parameter Validation: Navigator Kerberos Principal for HDFS Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Kerberos Principal for HDFS parameter. false role_config_suppression_nav_hdfs_kerberos_princ true
Suppress Parameter Validation: Navigator Metadata Server Install Dir Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Install Dir parameter. false role_config_suppression_nav_install_dir true
Suppress Parameter Validation: LDAP Bind User Distinguished Name Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP Bind User Distinguished Name parameter. false role_config_suppression_nav_ldap_bind_dn true
Suppress Parameter Validation: LDAP Bind Password Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP Bind Password parameter. false role_config_suppression_nav_ldap_bind_pw true
Suppress Parameter Validation: LDAP Distinguished Name Pattern Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP Distinguished Name Pattern parameter. false role_config_suppression_nav_ldap_dn_pattern true
Suppress Parameter Validation: LDAP Group Search Base Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP Group Search Base parameter. false role_config_suppression_nav_ldap_group_search_base true
Suppress Parameter Validation: LDAP Group Search Filter For Logged In User Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP Group Search Filter For Logged In User parameter. false role_config_suppression_nav_ldap_group_search_filter true
Suppress Parameter Validation: LDAP Groups Search Filter Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP Groups Search Filter parameter. false role_config_suppression_nav_ldap_groups_search_filter true
Suppress Parameter Validation: LDAP URL Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP URL parameter. false role_config_suppression_nav_ldap_url true
Suppress Parameter Validation: LDAP User Search Base Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP User Search Base parameter. false role_config_suppression_nav_ldap_user_search_base true
Suppress Parameter Validation: LDAP User Search Filter Whether to suppress configuration warnings produced by the built-in parameter validation for the LDAP User Search Filter parameter. false role_config_suppression_nav_ldap_user_search_filter true
Suppress Parameter Validation: Navigator Metadata Server Database Hostname Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Database Hostname parameter. false role_config_suppression_nav_metaserver_database_host true
Suppress Parameter Validation: Navigator Metadata Server Database Name Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Database Name parameter. false role_config_suppression_nav_metaserver_database_name true
Suppress Parameter Validation: Navigator Metadata Server Database Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Database Password parameter. false role_config_suppression_nav_metaserver_database_password true
Suppress Parameter Validation: Navigator Metadata Server Database Username Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Database Username parameter. false role_config_suppression_nav_metaserver_database_user true
Suppress Parameter Validation: Active Directory Domain Whether to suppress configuration warnings produced by the built-in parameter validation for the Active Directory Domain parameter. false role_config_suppression_nav_nt_domain true
Suppress Parameter Validation: JMS Password Whether to suppress configuration warnings produced by the built-in parameter validation for the JMS Password parameter. false role_config_suppression_nav_policies_jms_password true
Suppress Parameter Validation: JMS Queue Whether to suppress configuration warnings produced by the built-in parameter validation for the JMS Queue parameter. false role_config_suppression_nav_policies_jms_queue true
Suppress Parameter Validation: JMS URL Whether to suppress configuration warnings produced by the built-in parameter validation for the JMS URL parameter. false role_config_suppression_nav_policies_jms_url true
Suppress Parameter Validation: JMS User Whether to suppress configuration warnings produced by the built-in parameter validation for the JMS User parameter. false role_config_suppression_nav_policies_jms_user true
Suppress Parameter Validation: SAML Entity Base URL Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Entity Base URL parameter. false role_config_suppression_nav_saml_entity_base_url true
Suppress Parameter Validation: SAML Entity ID Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Entity ID parameter. false role_config_suppression_nav_saml_entity_id true
Suppress Parameter Validation: Alias of SAML Sign/Encrypt Private Key Whether to suppress configuration warnings produced by the built-in parameter validation for the Alias of SAML Sign/Encrypt Private Key parameter. false role_config_suppression_nav_saml_key_alias true
Suppress Parameter Validation: SAML Sign/Encrypt Private Key Password Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Sign/Encrypt Private Key Password parameter. false role_config_suppression_nav_saml_key_password true
Suppress Parameter Validation: SAML Keystore Password Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Keystore Password parameter. false role_config_suppression_nav_saml_keystore_password true
Suppress Parameter Validation: Path to SAML Keystore File Whether to suppress configuration warnings produced by the built-in parameter validation for the Path to SAML Keystore File parameter. false role_config_suppression_nav_saml_keystore_path true
Suppress Parameter Validation: SAML Login URL Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Login URL parameter. false role_config_suppression_nav_saml_login_url true
Suppress Parameter Validation: Path to SAML IDP Metadata File Whether to suppress configuration warnings produced by the built-in parameter validation for the Path to SAML IDP Metadata File parameter. false role_config_suppression_nav_saml_metadata_path true
Suppress Parameter Validation: SAML Attribute Identifier for User Role Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Attribute Identifier for User Role parameter. false role_config_suppression_nav_saml_oid_role true
Suppress Parameter Validation: SAML Attribute Identifier for User ID Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Attribute Identifier for User ID parameter. false role_config_suppression_nav_saml_oid_user true
Suppress Parameter Validation: SAML Attribute Values for Roles Whether to suppress configuration warnings produced by the built-in parameter validation for the SAML Attribute Values for Roles parameter. false role_config_suppression_nav_saml_role_map true
Suppress Parameter Validation: Path to SAML Role Assignment Script Whether to suppress configuration warnings produced by the built-in parameter validation for the Path to SAML Role Assignment Script parameter. false role_config_suppression_nav_saml_role_script true
Suppress Parameter Validation: Default Facets Whether to suppress configuration warnings produced by the built-in parameter validation for the Default Facets parameter. false role_config_suppression_nav_search_default_facets true
Suppress Parameter Validation: Cloudera Telemetry Publisher S3 Bucket Whether to suppress configuration warnings produced by the built-in parameter validation for the Cloudera Telemetry Publisher S3 Bucket parameter. false role_config_suppression_nav_telemetry_bucket_name true
Suppress Parameter Validation: Audit Event Filter Whether to suppress configuration warnings produced by the built-in parameter validation for the Audit Event Filter parameter. false role_config_suppression_navigator_audit_event_filter true
Suppress Parameter Validation: Navigator Metadata Server Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Client Advanced Configuration Snippet (Safety Valve) for navigator.client.properties parameter. false role_config_suppression_navigator_client_config_safety_valve true
Suppress Parameter Validation: Java Configuration Options for Navigator Metadata Server Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Navigator Metadata Server parameter. false role_config_suppression_navigator_java_opts true
Suppress Parameter Validation: Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Advanced Configuration Snippet (Safety Valve) for cloudera-navigator.properties parameter. false role_config_suppression_navigator_safety_valve true
Suppress Parameter Validation: Navigator Metadata Server Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Navigator Metadata Server Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_navigatormetaserver_role_env_safety_valve true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: TLS/SSL Keystore Key Password Whether to suppress configuration warnings produced by the built-in parameter validation for the TLS/SSL Keystore Key Password parameter. false role_config_suppression_ssl_server_keystore_keypassword true
Suppress Parameter Validation: TLS/SSL Keystore File Location Whether to suppress configuration warnings produced by the built-in parameter validation for the TLS/SSL Keystore File Location parameter. false role_config_suppression_ssl_server_keystore_location true
Suppress Parameter Validation: TLS/SSL Keystore File Password Whether to suppress configuration warnings produced by the built-in parameter validation for the TLS/SSL Keystore File Password parameter. false role_config_suppression_ssl_server_keystore_password true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Health Test: Audit Log Directory Free Space Whether to suppress the results of the Audit Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_audit_event_log_directory_free_space true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_audit_health true
Suppress Health Test: Navigator Metadata Server Storage Dir Free Space Whether to suppress the results of the Navigator Metadata Server Storage Dir Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_data_directory_free_space true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_file_descriptor true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_log_directory_free_space true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_navigatormetaserver_unexpected_exits true
Suppress Health Test: Solr Element Count Threshold Test Whether to suppress the results of the Solr Element Count Threshold Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_nms_solr_element_count true
Suppress Health Test: Solr Relation Count Threshold Test Whether to suppress the results of the Solr Relation Count Threshold Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_nms_solr_relation_count true

Reports Manager

Advanced

Display Name Description Related Name Default Value API Name Required
Extra Space Ratio for Indexing Reports Manager uses an array to store the HDFS directory tree during indexing. The size of this array is 3 * number of filesystem objects in HDFS * (1 + extra space ratio). Increasing this ratio allows Reports Manager to create the directory tree faster, but consumes more memory. The extra space ratio must also be small enough that the size of the array stays below the maximum allowed in Java, which is 2^31 - 1. index.space.extra.ratio 0.2 headlamp_index_space_extra_ratio false
Index Writer Thread Pool Queue Size Size of the queue to use for holding index writer tasks before they are executed. For faster indexing performance, consider increasing this to a small multiple of the Maximum Index Writer Threads configured value. index.writer.max.queue.size 4 headlamp_index_writer_max_queue_size false
Maximum Index Writer Threads Maximum number of concurrent threads to use when writing the index. For faster indexing performance, consider increasing it to a small multiple of the number of cores on the Reports Manager host. index.writer.num.threads 2 headlamp_index_writer_num_threads false
Java Configuration Options for Reports Manager These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. headlamp_java_opts false
Maximum Document Buffer Size Amount of memory that can be used for buffering documents before they are flushed to the index. For faster indexing performance, consider increasing this value. lucene.max.buffer.size.mb 32 MiB headlamp_lucene_max_buffer_size_mb false
Index Merge Factor Reports Manager index is built in sections that are merged as the build progresses. This configuration determines how often index sections are merged. With smaller values, less memory is used while indexing, but indexing speed is slower. For faster indexing performance, consider increasing this value. lucene.merge.factor 100 headlamp_lucene_merge_factor false
Publish HBase Space Usage When set, publishes HBase space usage metrics to support HBase usage reporting. This feature is only supported for CDH5+ HBase deployments. publish.hbase.space true headlamp_publish_hbase_metrics false
Reports Manager Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Heap Dump Directory Path to the directory where heap dumps are generated when java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process. true process_should_monitor true
Reports Manager Advanced Configuration Snippet (Safety Valve) for headlamp.db.properties For advanced use only. A string to be inserted into headlamp.db.properties for this role only. reportsmanager_db_safety_valve false
Reports Manager Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. REPORTSMANAGER_role_env_safety_valve false
Reports Manager Advanced Configuration Snippet (Safety Valve) for headlamp.conf For advanced use only. A string to be inserted into headlamp.conf for this role only. reportsmanager_safety_valve false
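
The array-size formula given for Extra Space Ratio for Indexing can be checked before changing the ratio. The following is a minimal sketch in Python, not Reports Manager code: the function names are hypothetical, and the formula is taken directly from the description above.

```python
# Maximum array size allowed in Java, as stated in the description.
JAVA_MAX_ARRAY_SIZE = 2**31 - 1


def index_array_size(num_fs_objects, extra_space_ratio=0.2):
    """Array size per the documented formula: 3 * objects * (1 + ratio)."""
    return 3 * num_fs_objects * (1 + extra_space_ratio)


def fits_in_java_array(num_fs_objects, extra_space_ratio=0.2):
    """True if the computed array size stays within the Java limit."""
    return index_array_size(num_fs_objects, extra_space_ratio) <= JAVA_MAX_ARRAY_SIZE


def max_extra_space_ratio(num_fs_objects):
    """Largest ratio for which the array still fits (hypothetical helper)."""
    return JAVA_MAX_ARRAY_SIZE / (3 * num_fs_objects) - 1


# With 100 million filesystem objects the default ratio of 0.2 fits.
print(fits_in_java_array(100_000_000))
# With 1 billion objects it does not, regardless of a non-negative ratio.
print(fits_in_java_array(1_000_000_000))
```

A negative result from `max_extra_space_ratio` indicates that the array would exceed the limit even with a ratio of 0.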

Database

Display Name Description Related Name Default Value API Name Required
Reports Manager Database Hostname Name of the host where Reports Manager's database is running. It is highly recommended that this database is on the same host as Reports Manager. If the database is not running on its default port, specify the port number using this syntax: 'host:port' com.cloudera.headlamp.db.host localhost headlamp_database_host false
Reports Manager Database Name The name of the Reports Manager's database. com.cloudera.headlamp.db.name headlamp_database_name true
Reports Manager Database Password The password for Reports Manager's database user account. com.cloudera.headlamp.db.password headlamp_database_password false
Reports Manager Database Type Type of database used for Reports Manager. com.cloudera.headlamp.db.type mysql headlamp_database_type false
Reports Manager Database Username The username to use to log into Reports Manager's database. com.cloudera.headlamp.db.user headlamp_database_user true

Logs

Display Name Description Related Name Default Value API Name Required
Reports Manager Logging Threshold The minimum log level for Reports Manager logs. INFO log_threshold false
Reports Manager Maximum Log File Backups The maximum number of rolled log files to keep for Reports Manager logs. Typically used by log4j or logback. 10 max_log_backup_index false
Reports Manager Max Log Size The maximum size, in megabytes, per log file for Reports Manager logs. Typically used by log4j or logback. 200 MiB max_log_size false
Reports Manager Log Directory Directory where Reports Manager will place its log files. /var/log/cloudera-scm-headlamp mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, a generated event may not be promoted to an alert if the ERROR log message contains an exception, because the first rule, with alert = false, will match.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % reportsmanager_fd_thresholds false
Reports Manager Host Health Test When computing the overall Reports Manager health, consider the host's health. true reportsmanager_host_health_enabled false
Pause Duration Thresholds The health test thresholds for the weighted average extra time the pause monitor spent paused. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 reportsmanager_pause_duration_thresholds false
Pause Duration Monitoring Period The period to review when computing the moving average of extra time the pause monitor spent paused. 5 minute(s) reportsmanager_pause_duration_window false
Reports Manager Process Health Test Enables the health test that the Reports Manager's process state is consistent with the role configuration. true reportsmanager_scm_health_enabled false
Reports Manager Working Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Reports Manager Working Directory. Warning: 10 GiB, Critical: 5 GiB reportsmanager_scratch_directory_free_space_absolute_thresholds false
Reports Manager Working Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Reports Manager Working Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Reports Manager Working Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never reportsmanager_scratch_directory_free_space_percentage_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change and, as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
Cloudera Manager Descriptor Age Thresholds The health test thresholds for monitoring the time since the Cloudera Manager descriptor was last refreshed. Warning: 60000.0, Critical: 120000.0 scm_descriptor_age_thresholds false
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
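
The first-match behavior described for Rules to Extract Events from Log Files can be sketched as follows. This is an illustrative Python model, not the log4j appender's implementation: the function names and the severity ordering list are assumptions, regular expressions are assumed to match the whole string, and rate limiting (rate, periodminutes) is deliberately left out.

```python
import json
import re

# Assumed log4j severity ordering, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def rule_matches(rule, level, message, exception_type=None):
    """Check the threshold, content and exceptiontype fields of one rule."""
    if "threshold" in rule and LEVELS.index(level) < LEVELS.index(rule["threshold"]):
        return False
    if "content" in rule and not re.fullmatch(rule["content"], message):
        return False
    if "exceptiontype" in rule:
        # A rule with exceptiontype only applies to messages carrying an exception.
        if exception_type is None or not re.fullmatch(rule["exceptiontype"], exception_type):
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Return the first rule that matches, or None if no rule matches."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None


# The two-rule example from the description above.
rules = json.loads(
    '[{"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},'
    ' {"alert": true, "rate": 1, "periodminutes": 1, "threshold": "ERROR"}]'
)

# An ERROR message with an exception hits the first rule, so no alert is raised.
print(first_matching_rule(rules, "ERROR", "boom", "java.io.IOException")["alert"])
# An ERROR message without an exception falls through to the second rule.
print(first_matching_rule(rules, "ERROR", "boom")["alert"])
```

Because evaluation stops at the first match, rule order matters: placing the broad exceptiontype rule first is what prevents the alert in the first case.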

Other

Display Name Description Related Name Default Value API Name Required
Reports Manager Working Directory Directory for Reports Manager to use for its working files. scratch.dir /var/lib/cloudera-scm-headlamp headlamp_scratch_dir false
Reports Manager Update Frequency Frequency with which Reports Manager refreshes its view of HDFS. update.frequency.seconds 1 hour(s) headlamp_update_frequency_seconds false
Descriptor Fetch Tries Interval The interval between fetch tries for SCM descriptor when Cloudera Management Service roles are starting. mgmt.descriptor.fetch.frequency 2 second(s) mgmt_descriptor_fetch_frequency true
Descriptor Fetch Max Tries Maximum number of tries to fetch SCM descriptor when Cloudera Management Service roles are starting. If the roles are not able to get the descriptor in this many tries, they exit. mgmt.num.descriptor.fetch.tries 5 mgmt_num_descriptor_fetch_tries true

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Bind Reports Manager to Wildcard Address If enabled, the Reports Manager binds to the wildcard address ("0.0.0.0") on all of its ports. false headlamp_bind_wildcard false
Reports Manager Web UI Port The port where Reports Manager starts a debug web server. Set to -1 to disable debug server. debug.server.port 8083 headlamp_debug_port false
Reports Manager Server Port The port where Reports Manager listens for requests. server.port 5678 headlamp_server_port false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of Reports Manager in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB headlamp_heapsize false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
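
The documented ranges for Cgroup CPU Shares (2 to 262144) and Cgroup I/O Weight (100 to 1000) can be checked before a value is submitted. The following Python sketch is only an illustration: the function name is hypothetical, and Cloudera Manager performs its own parameter validation independently of it.

```python
# Ranges as stated in the parameter descriptions above.
CPU_SHARES_RANGE = (2, 262144)
IO_WEIGHT_RANGE = (100, 1000)


def validate_cgroup_settings(cpu_shares=1024, io_weight=500):
    """Return a list of problems; an empty list means both values are in range."""
    errors = []
    if not CPU_SHARES_RANGE[0] <= cpu_shares <= CPU_SHARES_RANGE[1]:
        errors.append(f"rm_cpu_shares {cpu_shares} is outside {CPU_SHARES_RANGE}")
    if not IO_WEIGHT_RANGE[0] <= io_weight <= IO_WEIGHT_RANGE[1]:
        errors.append(f"rm_io_weight {io_weight} is outside {IO_WEIGHT_RANGE}")
    return errors


# The defaults listed in the table (1024 and 500) pass the check.
print(validate_cgroup_settings())
# Out-of-range values are reported individually.
print(validate_cgroup_settings(cpu_shares=1, io_weight=2000))
```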

Security

Display Name Description Related Name Default Value API Name Required
Reports Manager Kerberos Principal Kerberos principal used by Reports Manager. Note: This principal must have administrator and superuser privileges on all HDFS services. hdfs kerberos_role_princ_name true

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Reports Manager Database Hostname Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Database Hostname parameter. false role_config_suppression_headlamp_database_host true
Suppress Parameter Validation: Reports Manager Database Name Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Database Name parameter. false role_config_suppression_headlamp_database_name true
Suppress Parameter Validation: Reports Manager Database Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Database Password parameter. false role_config_suppression_headlamp_database_password true
Suppress Parameter Validation: Reports Manager Database Username Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Database Username parameter. false role_config_suppression_headlamp_database_user true
Suppress Parameter Validation: Java Configuration Options for Reports Manager Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Reports Manager parameter. false role_config_suppression_headlamp_java_opts true
Suppress Parameter Validation: Reports Manager Working Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Working Directory parameter. false role_config_suppression_headlamp_scratch_dir true
Suppress Parameter Validation: Reports Manager Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Kerberos Principal parameter. false role_config_suppression_kerberos_role_princ_name true
Suppress Parameter Validation: Reports Manager Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Reports Manager Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Reports Manager Advanced Configuration Snippet (Safety Valve) for headlamp.db.properties Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Advanced Configuration Snippet (Safety Valve) for headlamp.db.properties parameter. false role_config_suppression_reportsmanager_db_safety_valve true
Suppress Parameter Validation: Reports Manager Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_reportsmanager_role_env_safety_valve true
Suppress Parameter Validation: Reports Manager Advanced Configuration Snippet (Safety Valve) for headlamp.conf Whether to suppress configuration warnings produced by the built-in parameter validation for the Reports Manager Advanced Configuration Snippet (Safety Valve) for headlamp.conf parameter. false role_config_suppression_reportsmanager_safety_valve true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_file_descriptor true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_log_directory_free_space true
Suppress Health Test: Pause Duration Whether to suppress the results of the Pause Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_pause_duration true
Suppress Health Test: Cloudera Manager Descriptor Age Whether to suppress the results of the Cloudera Manager Descriptor Age health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_scm_descriptor_fetch true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_scm_health true
Suppress Health Test: Reports Manager Working Directory Free Space Whether to suppress the results of the Reports Manager Working Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_scratch_directory_free_space true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_reports_manager_unexpected_exits true

Service Monitor

Advanced

Display Name Description Related Name Default Value API Name Required
Java Configuration Options for Service Monitor These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. firehose_java_opts false
Service Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf For advanced use only. A string to be inserted into cmon.conf for this role only. firehose_safety_valve false
Service Monitor Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Heap Dump Directory Path to the directory where heap dumps are generated when a java.lang.OutOfMemoryError is thrown. This directory is automatically created if it does not exist. If this directory already exists, the role user must have write access to it. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates a heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that the Service Monitor is not able to process. true process_should_monitor true
Service Monitor Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. SERVICEMONITOR_role_env_safety_valve false
Event Publication Maximum Queue Size The maximum size of the queue in which events published from this role will be buffered. If this queue becomes full (for example, due to an outage), subsequent events will be dropped. health.event.publish.queue.max 20000 svcmon_event_publication_queue_size_max true
Event Publication Retry Period If an event cannot be delivered immediately by this role, this value controls how long to wait before Event Publisher retries delivery. health.event.publish.retry.ms 5000 svcmon_event_publication_retry_period true

Logs

Display Name Description Related Name Default Value API Name Required
Service Monitor Logging Threshold The minimum log level for Service Monitor logs INFO log_threshold false
Service Monitor Maximum Log File Backups The maximum number of rolled log files to keep for Service Monitor logs. Typically used by log4j or logback. 10 max_log_backup_index false
Service Monitor Max Log Size The maximum size, in megabytes, per log file for Service Monitor logs. Typically used by log4j or logback. 200 MiB max_log_size false
Service Monitor Log Directory Location of log files for Service Monitor /var/log/cloudera-scm-firehose mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Metrics Aggregation Run Duration Thresholds The health test thresholds for monitoring the metrics aggregation run duration. Warning: 10 second(s), Critical: 30 second(s) aggregation_run_duration_thresholds false
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Service Monitor Storage Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Service Monitor Storage Directory. Warning: 10 GiB, Critical: 5 GiB firehose_storage_directory_free_space_absolute_thresholds false
Service Monitor Storage Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Service Monitor Storage Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Service Monitor Storage Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never firehose_storage_directory_free_space_percentage_thresholds false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
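The free-space rows above come in pairs: an absolute threshold and a percentage threshold that is ignored when the absolute one is configured. A minimal Python sketch of that precedence follows. The function name, the dictionary layout, the use of `None` for "Never", and the strict less-than comparison are all assumptions made for illustration, not Cloudera Manager internals.

```python
GIB = 1024 ** 3

def free_space_health(free_bytes, capacity_bytes, absolute=None, percentage=None):
    """Classify free space as 'good', 'concerning' or 'bad'.

    absolute and percentage are hypothetical dicts such as
    {"warning": 10 * GIB, "critical": 5 * GIB}; a missing key or a
    None value stands for the documented "Never".
    """
    if absolute is not None:
        # Absolute thresholds take precedence; percentage is not consulted.
        value, limits = free_bytes, absolute
    elif percentage is not None:
        value, limits = 100.0 * free_bytes / capacity_bytes, percentage
    else:
        return "good"

    critical = limits.get("critical")
    warning = limits.get("warning")
    if critical is not None and value < critical:
        return "bad"
    if warning is not None and value < warning:
        return "concerning"
    return "good"

# Documented defaults for the absolute thresholds: Warning 10 GiB, Critical 5 GiB.
defaults = {"warning": 10 * GIB, "critical": 5 * GIB}
status = free_space_health(7 * GIB, 100 * GIB, absolute=defaults)
```

With the default absolute thresholds, 7 GiB of free space falls between the warning and critical levels, so the sketch reports a concerning state regardless of any percentage setting.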
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Example:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold": "FATAL"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold": "WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold": "WARN"}]} log_event_whitelist false
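The first-match behaviour described for the log event rules can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, the severity ordering is the standard log4j one, regular expressions are applied with `re.search` (the real appender may anchor patterns differently), and the `rate` and `periodminutes` limits are not modelled.

```python
import re

# Standard log4j severities, lowest to highest.
LEVELS = ["TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL"]

def rule_matches(rule, level, message, exception_type=None):
    """Return True if the message satisfies every condition the rule sets."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS.index(level) < LEVELS.index(threshold):
        return False
    content = rule.get("content")
    if content is not None and not re.search(content, message):
        return False
    pattern = rule.get("exceptiontype")
    if pattern is not None:
        # A rule with exceptiontype only applies to messages carrying an exception.
        if exception_type is None or not re.search(pattern, exception_type):
            return False
    return True

def first_matching_rule(rules, level, message, exception_type=None):
    """Evaluate rules in order and return the first match, or None."""
    for rule in rules:
        if rule_matches(rule, level, message, exception_type):
            return rule
    return None

# The two-rule example from the text above.
rules = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]
with_exception = first_matching_rule(rules, "ERROR", "Disk failure", "java.io.IOException")
without_exception = first_matching_rule(rules, "ERROR", "Disk failure")
```

In this sketch an ERROR message that carries an exception is claimed by the first rule (alert false), while an ERROR message without an exception falls through to the second rule and is promoted to an alert.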
Cloudera Manager Metric Schema Age Thresholds The health test thresholds for monitoring the time since the Cloudera Manager metric schema was last refreshed. Warning: 60000.0, Critical: 120000.0 metric_schema_age_thresholds_name false
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger, configured for a DataNode, fires if the DataNode has more than 1500 open file descriptors: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change; as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
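Before pasting a trigger list into the Role Triggers field, it can help to check the documented constraints locally. The Python sketch below is one way to do that; `validate_triggers` is a hypothetical helper, not part of Cloudera Manager, and it checks only what the field descriptions above state (mandatory fields, unique names, documented defaults). It does not parse the tsquery expression.

```python
import json

def validate_triggers(triggers):
    """Check mandatory fields and name uniqueness; fill in documented defaults."""
    seen = set()
    normalized = []
    for trigger in triggers:
        for field in ("triggerName", "triggerExpression"):
            if not trigger.get(field):
                raise ValueError(f"missing mandatory field: {field}")
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError(f"duplicate triggerName: {name}")
        seen.add(name)
        # Documented defaults: streamThreshold 0, enabled 'true'.
        normalized.append({"streamThreshold": 0, "enabled": "true", **trigger})
    return normalized

sample = [{
    "triggerName": "sample-trigger",
    "triggerExpression": (
        "IF (SELECT fd_open WHERE roleName=$ROLENAME "
        "and last(fd_open) > 1500) DO health:bad"
    ),
}]
checked = validate_triggers(sample)
# Serialize to the JSON text expected by the role_triggers setting.
role_triggers_value = json.dumps(checked)
```

The resulting string is a JSON list in the same shape as the sample trigger shown above.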
Cloudera Manager Descriptor Age Thresholds The health test thresholds for monitoring the time since the Cloudera Manager descriptor was last refreshed. Warning: 60000.0, Critical: 120000.0 scm_descriptor_age_thresholds false
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % servicemonitor_fd_thresholds false
Heap Size Thresholds The health test thresholds for the heap used. Warning: 90.0 %, Critical: 95.0 % servicemonitor_heap_size_thresholds false
Service Monitor Host Health Test When computing the overall Service Monitor health, consider the host's health. true servicemonitor_host_health_enabled false
Pause Duration Thresholds The health test thresholds for the weighted average extra time the pause monitor spent paused. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 servicemonitor_pause_duration_thresholds false
Pause Duration Monitoring Period The period to review when computing the moving average of extra time the pause monitor spent paused. 5 minute(s) servicemonitor_pause_duration_window false
Service Monitor Role Pipeline Monitoring Thresholds The health test thresholds for monitoring the Service Monitor role pipeline. This specifies the number of dropped messages that will be tolerated over the monitoring time period. Warning: Never, Critical: Any servicemonitor_role_pipeline_thresholds false
Service Monitor Role Pipeline Monitoring Time Period The time period over which the Service Monitor role pipeline will be monitored for dropped messages. 5 minute(s) servicemonitor_role_pipeline_window false
Service Monitor Process Health Test Enables the health test that checks that the Service Monitor's process state is consistent with the role configuration. true servicemonitor_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true servicemonitor_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never servicemonitor_web_metric_collection_thresholds false
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false
YARN MapReduce Counter Descriptions This JSON document contains metadata that is used by the Service Monitor's YARN application monitoring feature for YARN-based MapReduce counter handling. Each counter description has the following fields:
  • name (mandatory) - the name of the counter, for example, org.apache.hadoop.mapreduce.filesystemcounter.file_bytes_read.
  • units (mandatory) - the units of the counter.
  • attributeName (optional) - the attribute name to use for the counter within Cloudera Manager. This name is used to identify the counter within the YARN Application Monitoring feature and in the Cloudera Manager API. If not specified, the portion of the counter name after the last period is used.
  • displayName (optional) - a display name for the counter. If not specified the full counter name will be used.
  • description (optional) - a description of the counter. If not specified the full counter name will be used.
[
  {"name": "org.apache.hadoop.mapreduce.jobcounter.num_failed_maps", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.num_failed_reduces", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.total_launched_maps", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.total_launched_reduces", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.other_local_maps", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.data_local_maps", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.rack_local_maps", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.slots_millis_maps", "units": "ms"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.slots_millis_reduces", "units": "ms"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.fallow_slots_millis_maps", "units": "ms"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.fallow_slots_millis_reduces", "units": "ms"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.mb_millis_maps", "units": "mb millis"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.mb_millis_reduces", "units": "mb millis"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.vcores_millis_maps", "units": "vcore millis"},
  {"name": "org.apache.hadoop.mapreduce.jobcounter.vcores_millis_reduces", "units": "vcore millis"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.file_bytes_read", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.file_bytes_written", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.file_read_ops", "units": "operations"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.file_large_read_ops", "units": "operations"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.file_write_ops", "units": "operations"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.hdfs_bytes_read", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.hdfs_bytes_written", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.hdfs_read_ops", "units": "operations"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.hdfs_large_read_ops", "units": "operations"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.hdfs_write_ops", "units": "operations"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.s3a_bytes_read", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.s3a_bytes_written", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.adl_bytes_read", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.filesystemcounter.adl_bytes_written", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.map_input_records", "units": "records"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.map_output_records", "units": "records"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.map_output_bytes", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.map_output_materialized_bytes", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.split_raw_bytes", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.combine_input_records", "units": "records"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.combine_output_records", "units": "records"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.reduce_input_groups", "units": "groups"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.reduce_shuffle_bytes", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.reduce_input_records", "units": "records"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.reduce_output_records", "units": "records"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.spilled_records", "units": "records"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.shuffled_maps", "units": "tasks"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.failed_shuffle", "units": "failures"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.merged_map_outputs", "units": "outputs"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.gc_time_millis", "units": "ms"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.cpu_milliseconds", "units": "ms"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.physical_memory_bytes", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.virtual_memory_bytes", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.taskcounter.committed_heap_bytes", "units": "bytes"},
  {"attributeName": "shuffle_errors_bad_id", "name": "shuffle_errors.bad_id", "units": "errors"},
  {"attributeName": "shuffle_errors_connection", "name": "shuffle_errors.connection", "units": "errors"},
  {"attributeName": "shuffle_errors_io", "name": "shuffle_errors.io_error", "units": "errors"},
  {"attributeName": "shuffle_errors_wrong_length", "name": "shuffle_errors.wrong_length", "units": "errors"},
  {"attributeName": "shuffle_errors_wrong_map", "name": "shuffle_errors.wrong_map", "units": "errors"},
  {"attributeName": "shuffle_errors_wrong_reduce", "name": "shuffle_errors.wrong_reduce", "units": "errors"},
  {"name": "org.apache.hadoop.mapreduce.lib.input.fileinputformatcounter.bytes_read", "units": "bytes"},
  {"name": "org.apache.hadoop.mapreduce.lib.output.fileoutputformatcounter.bytes_written", "units": "bytes"}
] yarn_application_mapreduce_counters false
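The fallback rules for the optional counter fields can be illustrated with a short Python sketch. The helper names are hypothetical and the code only restates the documented defaults: a missing attributeName falls back to the part of the counter name after the last period, and a missing displayName or description falls back to the full counter name.

```python
def effective_attribute_name(counter):
    """Return attributeName, or the part of name after the last period."""
    explicit = counter.get("attributeName")
    if explicit:
        return explicit
    return counter["name"].rsplit(".", 1)[-1]

def effective_display_name(counter):
    """Return displayName, or the full counter name when it is not set."""
    return counter.get("displayName") or counter["name"]

def effective_description(counter):
    """Return description, or the full counter name when it is not set."""
    return counter.get("description") or counter["name"]

file_counter = {
    "name": "org.apache.hadoop.mapreduce.filesystemcounter.file_bytes_read",
    "units": "bytes",
}
shuffle_counter = {
    "attributeName": "shuffle_errors_io",
    "name": "shuffle_errors.io_error",
    "units": "errors",
}
file_attr = effective_attribute_name(file_counter)
shuffle_attr = effective_attribute_name(shuffle_counter)
```

Under these rules the first counter is exposed as file_bytes_read. The shuffle error counter would otherwise be reduced to io_error, which is presumably why the default list gives the shuffle_errors entries an explicit attributeName.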

Other

Display Name Description Related Name Default Value API Name Required
Use the Authentication Service to enable Single Sign On Use the Authentication Service to enable Single Sign On for the Firehose debug servers. Requires a running Authentication Service. debug.servlet.auth.enabled false debug_servlet_auth_enabled false
Impala Storage The approximate amount of disk space dedicated to storing Impala query data. Once the store has reached its maximum size, older data is deleted to make room for newer queries. The disk usage is approximate because data is deleted only when the limit is reached. firehose_impala_storage_bytes 1 GiB firehose_impala_storage_bytes false
Reports Time-series Storage The approximate amount of disk space dedicated to storing time series for reporting data. Once the store has reached its maximum size, older data is deleted to make room for newer data. The disk usage is approximate because data is deleted only when the limit is reached. See the "Disk Usage" tab on the Service Monitor page for more information on how space is consumed in the Service Monitor. This tab also shows information about the amount of data retained and the time window covered by each data granularity. firehose_reports_storage_bytes 1 GiB firehose_reports_storage_bytes false
Service Monitor Storage Directory The directory where Service Monitor data is stored. The Service Monitor stores metric time series and health information, as well as Impala query and YARN application metadata if Impala and/or YARN are configured. firehose.storage.base.directory /var/lib/cloudera-service-monitor firehose_storage_dir true
Time-Series Storage The approximate amount of disk space dedicated to storing time series and health data. Once the store has reached its maximum size, older data is deleted to make room for newer data. The disk usage is approximate because data is deleted only when the limit is reached. Note that Cloudera Manager stores time-series data at a number of different data granularities, and these granularities have different effective retention periods. Specifically, Cloudera Manager stores metric data as both raw data points and ten-minutely, hourly, six-hourly, daily, and weekly summary data points. Raw data consumes the bulk of the allocated storage space, weekly summaries the least. As such, raw data is retained for the shortest amount of time, while weekly summary points are unlikely to ever be deleted. See the "Storage" tab on the 'Service Monitor' -> 'Charts Library' -> 'Service Monitor Storage' page for more information on how space is consumed within the Service Monitor. This tab also shows information about the amount of data retained and time window covered by each data granularity. firehose_time_series_storage_bytes 10 GiB firehose_time_series_storage_bytes false
YARN Storage The approximate amount of disk space dedicated to storing YARN application data. Once the store has reached its maximum size, older data is deleted to make room for newer applications. The disk usage is approximate because data is deleted only when the limit is reached. firehose_yarn_storage_bytes 1 GiB firehose_yarn_storage_bytes false
Health Event Startup Policy This setting controls whether health events are emitted when this monitoring role is started. If set to "none", then no health events are emitted. If set to "bad" then health events are emitted for subjects with bad or concerning health. If set to "all" then health events are emitted for all subjects for all health values. The default is "bad". health.event.publish.startup.policy bad health_event_publish_startup_policy false
Descriptor Fetch Tries Interval The interval between fetch tries for SCM descriptor when Cloudera Management Service roles are starting. mgmt.descriptor.fetch.frequency 2 second(s) mgmt_descriptor_fetch_frequency true
Descriptor Fetch Max Tries Maximum number of tries to fetch SCM descriptor when Cloudera Management Service roles are starting. If the roles are not able to get the descriptor in this many tries, then they exit. mgmt.num.descriptor.fetch.tries 5 mgmt_num_descriptor_fetch_tries true
Event Publication Log Quiet Time Period To avoid producing excessive amounts of log output, the Event Publisher component of this role is limited to emitting one message per time period. This value controls the size of that time period. health.event.publish.log.suppress.window.ms 1 minute(s) svcmon_event_publication_log_suppress_window true
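The Descriptor Fetch Tries Interval and Descriptor Fetch Max Tries settings in the table above together describe a bounded retry loop. The Python sketch below shows that pattern with the documented defaults (5 tries, 2 seconds apart). It is an illustration, not the role's actual startup code: `fetch` is a hypothetical callable standing in for the descriptor request, and the sketch re-raises the last error where the real role exits.

```python
import time

def fetch_with_retries(fetch, max_tries=5, interval_seconds=2.0, sleep=time.sleep):
    """Call fetch() up to max_tries times, pausing between failed attempts."""
    for attempt in range(1, max_tries + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_tries:
                # Out of tries: give up (the real role exits at this point).
                raise
            sleep(interval_seconds)

# Demonstration with a stand-in fetch that succeeds on the third call.
calls = []

def flaky_fetch():
    calls.append(1)
    if len(calls) < 3:
        raise IOError("descriptor not available yet")
    return "descriptor"

waits = []
# Record the pauses instead of really sleeping.
result = fetch_with_retries(flaky_fetch, sleep=waits.append)
```

Passing the sleep function as a parameter keeps the sketch testable without waiting; with the defaults, a role that never receives the descriptor would give up after five attempts and four pauses.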

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Service Monitor Web UI Port Port for Service Monitor's Debug page. Set to -1 to disable the debug server. debug.servlet.port 8086 firehose_debug_port false
Service Monitor Web UI HTTPS Port Port for Service Monitor's HTTPS Debug page. debug.servlet.https.port 9086 firehose_debug_tls_port false
Service Monitor Listen Port Port where Service Monitor is listening for agent messages. firehose.server.port 9997 firehose_listen_port false
Service Monitor Nozzle Port Port where Service Monitor's query API is exposed. nozzle.server.port 9996 firehose_nozzle_port false
Bind Service Monitor to Wildcard Address If enabled, the Service Monitor binds to the wildcard address ("0.0.0.0") on all of its ports. false smon_bind_wildcard false

Resource Management

Display Name Description Related Name Default Value API Name Required
Java Heap Size of Service Monitor in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB firehose_heapsize false
Maximum Non-Java Memory of Service Monitor The amount of memory the Service Monitor can use off of the Java heap. firehose_non_java_memory_bytes 2 GiB firehose_non_java_memory_bytes false
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true

Security

Display Name Description Related Name Default Value API Name Required
Role-Specific Kerberos Principal Kerberos principal used by the Service Monitor roles. hue kerberos_role_princ_name true
Enable TLS/SSL for Firehose Debug Server Encrypt communication between clients and Firehose Debug Server using Transport Layer Security (TLS) (formerly known as Secure Socket Layer (SSL)). debug.servlet.https.enabled false ssl_enabled false
Firehose Debug Server TLS/SSL Server JKS Keystore File Location The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when Firehose Debug Server is acting as a TLS/SSL server. The keystore must be in JKS format. debug.servlet.https.keystorePath ssl_server_keystore_location false
Firehose Debug Server TLS/SSL Server JKS Keystore File Password The password for the Firehose Debug Server JKS keystore file. debug.servlet.https.keystorePassword ssl_server_keystore_password false

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stack traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false
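
The retention rule described for Stacks Collection Data Retention (delete the oldest data once the limit is reached) can be sketched as follows. This is only an illustration of the policy, not Cloudera Manager's implementation; the function name files_to_delete and the (name, modification time, size) tuple layout are assumptions made for the example.

```python
from typing import List, Tuple

MIB = 1024 * 1024  # bytes per mebibyte


def files_to_delete(files: List[Tuple[str, float, int]],
                    retention_bytes: int) -> List[str]:
    """Return the names of the oldest files to remove so the rest fit the limit.

    files is a list of (name, modification_time, size_in_bytes) tuples.
    This helper is hypothetical and only models "oldest data is deleted".
    """
    total = sum(size for _, _, size in files)
    doomed = []
    # Walk from oldest to newest, dropping files until the total fits.
    for name, _, size in sorted(files, key=lambda f: f[1]):
        if total <= retention_bytes:
            break
        doomed.append(name)
        total -= size
    return doomed


# Three stacks logs totalling 130 MiB against the default 100 MiB retention.
logs = [("stacks-1.log", 1.0, 60 * MIB),
        ("stacks-2.log", 2.0, 30 * MIB),
        ("stacks-3.log", 3.0, 40 * MIB)]
print(files_to_delete(logs, 100 * MIB))
```

Only the oldest file needs to go in this case, because removing it brings the total to 70 MiB, which is under the limit.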

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Java Configuration Options for Service Monitor Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Service Monitor parameter. false role_config_suppression_firehose_java_opts true
Suppress Parameter Validation: Service Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Advanced Configuration Snippet (Safety Valve) for cmon.conf parameter. false role_config_suppression_firehose_safety_valve true
Suppress Configuration Validator: Service Monitor Heap Size Validator Whether to suppress configuration warnings produced by the Service Monitor Heap Size Validator configuration validator. false role_config_suppression_firehose_service_monitor_heap_role_validator true
Suppress Configuration Validator: Service Monitor Off Heap Memory Size Validator Whether to suppress configuration warnings produced by the Service Monitor Off Heap Memory Size Validator configuration validator. false role_config_suppression_firehose_service_monitor_non_java_memory_role_validator true
Suppress Parameter Validation: Service Monitor Storage Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Storage Directory parameter. false role_config_suppression_firehose_storage_dir true
Suppress Parameter Validation: Role-Specific Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Role-Specific Kerberos Principal parameter. false role_config_suppression_kerberos_role_princ_name true
Suppress Parameter Validation: Service Monitor Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Service Monitor Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Service Monitor Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_servicemonitor_role_env_safety_valve true
Suppress Parameter Validation: Firehose Debug Server TLS/SSL Server JKS Keystore File Location Whether to suppress configuration warnings produced by the built-in parameter validation for the Firehose Debug Server TLS/SSL Server JKS Keystore File Location parameter. false role_config_suppression_ssl_server_keystore_location true
Suppress Parameter Validation: Firehose Debug Server TLS/SSL Server JKS Keystore File Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Firehose Debug Server TLS/SSL Server JKS Keystore File Password parameter. false role_config_suppression_ssl_server_keystore_password true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Parameter Validation: YARN MapReduce Counter Descriptions Whether to suppress configuration warnings produced by the built-in parameter validation for the YARN MapReduce Counter Descriptions parameter. false role_config_suppression_yarn_application_mapreduce_counters true
Suppress Health Test: Metrics Aggregation Run Duration Test Whether to suppress the results of the Metrics Aggregation Run Duration Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_aggregation_run_duration true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_audit_health true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_file_descriptor true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_heap_dump_directory_free_space true
Suppress Health Test: Heap Size Whether to suppress the results of the Heap Size health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_heap_size true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_log_directory_free_space true
Suppress Health Test: Cloudera Manager Metric Schema Age Whether to suppress the results of the Cloudera Manager Metric Schema Age health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_metric_schema_fetch true
Suppress Health Test: Pause Duration Whether to suppress the results of the Pause Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_pause_duration true
Suppress Health Test: Role Pipeline Whether to suppress the results of the Role Pipeline health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_role_pipeline true
Suppress Health Test: Cloudera Manager Descriptor Age Whether to suppress the results of the Cloudera Manager Descriptor Age health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_scm_descriptor_fetch true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_scm_health true
Suppress Health Test: Service Monitor Storage Directory Free Space Whether to suppress the results of the Service Monitor Storage Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_storage_directory_free_space true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_service_monitor_web_metric_collection true

Service-Wide

Advanced

Display Name Description Related Name Default Value API Name Required
Cloudera Management Service Service Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of all roles in this service except client configuration. mgmt_service_env_safety_valve false
Cloudera Management Service Advanced Configuration Snippet (Safety Valve) for ssl-client.xml For advanced use only, a string to be inserted into ssl-client.xml. This setting currently applies to the Reports Manager only. mgmt_ssl_client_safety_valve false
Small Files Reporting: HDFS Service for Data Staging Data collection for small files analysis requires a data staging area in HDFS. If you enable data collection for small files reporting, this property sets which HDFS service stages the data. nav.smallfiles.hdfs.staging.service.name navigator_small_files_staging_hdfs_service_name false
Small Files Reporting: Enable Data Collection When Small Files Reporting is enabled, Navigator passes additional metadata to the Telemetry Publisher so the data can be used by Cloudera Workload XM (WXM). This additional data allows WXM to identify Impala query performance issues caused when data is organized into small files in HDFS. Enable this option only when Telemetry Publisher is enabled. nav.smallfiles.reporting.enabled false navigator_smallfiles_enabled true
Small Files Reporting: HDFS Staging Location Data collection for small files analysis requires a data staging area in HDFS. If you enable data collection for small files reporting, this property sets the HDFS location where Small Files Reporting data is staged. If the directory doesn't already exist, Navigator creates it using the same credentials it uses for HDFS extraction from this service. nav.smallfiles.hdfs.staging.root.path /user/cloudera/navigator/smallfiles navigator_smallfiles_hdfs_path false
System Group The group that this service's processes should run as. cloudera-scm process_groupname true
System User The user that this service's processes should run as. cloudera-scm process_username true

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Log Event Capture When set, each role identifies important log events and forwards them to Cloudera Manager. true catch_events false
Enable Service Level Health Alerts When set, Cloudera Manager will send alerts when the health of this service reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold false enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Log Event Retry Frequency The frequency, in seconds, with which the log4j event publication appender retries sending undelivered log events to the Event Server. 30 log_event_retry_frequency false
Activity Monitor Role Health Test When computing the overall MGMT health, consider Activity Monitor's health true mgmt_activitymonitor_health_enabled false
Alert Publisher Role Health Test When computing the overall MGMT health, consider Alert Publisher's health true mgmt_alertpublisher_health_enabled false
Cloudera Manager Server Clock Offset Thresholds The health test thresholds for monitoring the clock offset between the Cloudera Manager Server and the Service Monitor. Warning: 30 second(s), Critical: 1 minute(s) mgmt_clock_offset_with_smon_thresholds false
Command Storage Directory Free Space Monitoring Thresholds The health test thresholds for monitoring the free space on the filesystem that contains the Cloudera Manager Server command storage directory. Warning: 2 GiB, Critical: 1 GiB mgmt_command_storage_directory_free_space_absolute_thresholds false
Embedded Database Free Space Monitoring Thresholds The health test thresholds for monitoring the free space on the volume for the embedded PostgreSQL database optionally running on the Cloudera Manager Server. If the embedded database is not in use, this has no effect. Warning: 2 GiB, Critical: 1 GiB mgmt_embedded_database_free_space_absolute_thresholds false
Event Server Role Health Test When computing the overall MGMT health, consider Event Server's health true mgmt_eventserver_health_enabled false
Host Monitor Role Health Test When computing the overall MGMT health, consider Host Monitor's health true mgmt_hostmonitor_health_enabled false
Navigator Audit Server Role Health Test When computing the overall MGMT health, consider Navigator Audit Server's health true mgmt_navigator_health_enabled false
Navigator Metadata Server Role Health Test When computing the overall MGMT health, consider Navigator Metadata Server's health true mgmt_navigatormetaserver_health_enabled false
Reports Manager Role Health Test When computing the overall MGMT health, consider Reports Manager's health true mgmt_reportsmanager_health_enabled false
Service Monitor Role Health Test When computing the overall MGMT health, consider Service Monitor's health true mgmt_servicemonitor_health_enabled false
Telemetry Publisher Role Health Test When computing the overall MGMT health, consider Telemetry Publisher's health true mgmt_telemetrypublisher_health_enabled false
Service Triggers The configured triggers for this service. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific service.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger fires if there are more than 10 DataNodes with more than 500 file descriptors opened: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleType = DataNode and last(fd_open) > 500) DO health:bad", "streamThreshold": 10, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change; as a result, backward compatibility is not guaranteed between releases.
[] service_triggers true
Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) For advanced use only, a list of derived configuration properties that will be used by the Service Monitor instead of the default ones. smon_derived_configs_safety_valve false
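
The field rules listed for Service Triggers (two mandatory fields, unique trigger names, documented defaults for streamThreshold and enabled) can be checked before a value is pasted into the configuration. The following is a minimal sketch under those stated rules only; validate_triggers is a hypothetical helper, not part of Cloudera Manager, and it does not parse or verify the tsquery expression itself.

```python
import json

# Fields the Service Triggers description marks as mandatory.
MANDATORY = ("triggerName", "triggerExpression")


def validate_triggers(raw: str) -> list:
    """Parse a service_triggers value and apply the documented field rules.

    Hypothetical helper: checks structure only, not tsquery syntax.
    """
    triggers = json.loads(raw)
    if not isinstance(triggers, list):
        raise ValueError("service_triggers must be a JSON list")
    seen = set()
    for trigger in triggers:
        if not isinstance(trigger, dict):
            raise ValueError("each trigger must be a JSON object")
        for field in MANDATORY:
            if not trigger.get(field):
                raise ValueError(f"trigger is missing mandatory field {field!r}")
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError(f"duplicate triggerName {name!r}")
        seen.add(name)
        # Defaults as documented: threshold 0, trigger enabled.
        trigger.setdefault("streamThreshold", 0)
        trigger.setdefault("enabled", "true")
    return triggers


# The sample trigger from the Service Triggers description.
SAMPLE = json.dumps([{
    "triggerName": "sample-trigger",
    "triggerExpression": ("IF (SELECT fd_open WHERE roleType = DataNode "
                          "and last(fd_open) > 500) DO health:bad"),
    "streamThreshold": 10,
    "enabled": "true",
}])

print(validate_triggers(SAMPLE)[0]["triggerName"])
```

The default value [] passes this check unchanged, since an empty list contains no triggers to validate.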

Other

Display Name Description Related Name Default Value API Name Required
Emit Sensitive Data In Stderr If set, sensitive data, like passwords, are emitted to stderr. false mgmt_emit_sensitive_data_in_stderr true
Minimum Kerberos Ticket Validity Period The minimum Kerberos ticket validity period. The Cloudera Management Services attempt to log in again only after this minimum period of time has elapsed. tgt.login.validity.period 1 hour(s) tgt_login_validity_period false

Publishing

Display Name Description Related Name Default Value API Name Required
Kafka Service The Kafka service where Navigator will publish audit events. navigator_kafka_publishing_service false

Security

Display Name Description Related Name Default Value API Name Required
TLS/SSL Client Truststore File Location Path to the client truststore file used in HTTPS communication. This truststore contains certificates of trusted servers, or of Certificate Authorities trusted to identify servers. If set, this is used to verify certificates in HTTPS communication with CDH services and the Cloudera Manager Server. If not set, the default Java truststore is used to verify certificates. The contents of this truststore can be modified without restarting the Cloudera Management Service roles. By default, changes to its contents are picked up within ten seconds. ssl.client.truststore.location ssl_client_truststore_location false
Cloudera Manager Server TLS/SSL Client Trust Store Password The password for the Cloudera Manager Server TLS/SSL Certificate Trust Store File. This password is not required to access the trust store; this field can be left blank. This password provides optional integrity checking of the file. The contents of trust stores are certificates, and certificates are public information. ssl.client.truststore.password ssl_client_truststore_password false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: Activity Monitor Count Validator Whether to suppress configuration warnings produced by the Activity Monitor Count Validator configuration validator. false service_config_suppression_activitymonitor_count_validator true
Suppress Configuration Validator: Alert Publisher Count Validator Whether to suppress configuration warnings produced by the Alert Publisher Count Validator configuration validator. false service_config_suppression_alertpublisher_count_validator true
Suppress Configuration Validator: Event Server Count Validator Whether to suppress configuration warnings produced by the Event Server Count Validator configuration validator. false service_config_suppression_eventserver_count_validator true
Suppress Configuration Validator: Host Monitor Count Validator Whether to suppress configuration warnings produced by the Host Monitor Count Validator configuration validator. false service_config_suppression_hostmonitor_count_validator true
Suppress Configuration Validator: Cloudera Management Service Host Colocation Validator Whether to suppress configuration warnings produced by the Cloudera Management Service Host Colocation Validator configuration validator. false service_config_suppression_mgmt_colocation_validator true
Suppress Parameter Validation: Cloudera Management Service Service Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Cloudera Management Service Service Environment Advanced Configuration Snippet (Safety Valve) parameter. false service_config_suppression_mgmt_service_env_safety_valve true
Suppress Parameter Validation: Cloudera Management Service Advanced Configuration Snippet (Safety Valve) for ssl-client.xml Whether to suppress configuration warnings produced by the built-in parameter validation for the Cloudera Management Service Advanced Configuration Snippet (Safety Valve) for ssl-client.xml parameter. false service_config_suppression_mgmt_ssl_client_safety_valve true
Suppress Configuration Validator: Navigator Audit Server Count Validator Whether to suppress configuration warnings produced by the Navigator Audit Server Count Validator configuration validator. false service_config_suppression_navigator_count_validator true
Suppress Parameter Validation: Small Files Reporting: HDFS Staging Location Whether to suppress configuration warnings produced by the built-in parameter validation for the Small Files Reporting: HDFS Staging Location parameter. false service_config_suppression_navigator_smallfiles_hdfs_path true
Suppress Configuration Validator: Navigator Metadata Server Count Validator Whether to suppress configuration warnings produced by the Navigator Metadata Server Count Validator configuration validator. false service_config_suppression_navigatormetaserver_count_validator true
Suppress Parameter Validation: System Group Whether to suppress configuration warnings produced by the built-in parameter validation for the System Group parameter. false service_config_suppression_process_groupname true
Suppress Parameter Validation: System User Whether to suppress configuration warnings produced by the built-in parameter validation for the System User parameter. false service_config_suppression_process_username true
Suppress Configuration Validator: Reports Manager Count Validator Whether to suppress configuration warnings produced by the Reports Manager Count Validator configuration validator. false service_config_suppression_reportsmanager_count_validator true
Suppress Parameter Validation: Service Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Triggers parameter. false service_config_suppression_service_triggers true
Suppress Configuration Validator: Service Monitor Count Validator Whether to suppress configuration warnings produced by the Service Monitor Count Validator configuration validator. false service_config_suppression_servicemonitor_count_validator true
Suppress Parameter Validation: Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Service Monitor Derived Configs Advanced Configuration Snippet (Safety Valve) parameter. false service_config_suppression_smon_derived_configs_safety_valve true
Suppress Parameter Validation: TLS/SSL Client Truststore File Location Whether to suppress configuration warnings produced by the built-in parameter validation for the TLS/SSL Client Truststore File Location parameter. false service_config_suppression_ssl_client_truststore_location true
Suppress Parameter Validation: Cloudera Manager Server TLS/SSL Client Trust Store Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Cloudera Manager Server TLS/SSL Client Trust Store Password parameter. false service_config_suppression_ssl_client_truststore_password true
Suppress Configuration Validator: Telemetry Publisher Count Validator Whether to suppress configuration warnings produced by the Telemetry Publisher Count Validator configuration validator. false service_config_suppression_telemetrypublisher_count_validator true
Suppress Health Test: Activity Monitor Health Whether to suppress the results of the Activity Monitor Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_activity_monitor_health true
Suppress Health Test: Alert Publisher Health Whether to suppress the results of the Alert Publisher Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_alert_publisher_health true
Suppress Health Test: Cloudera Manager Server Clock Offset Whether to suppress the results of the Cloudera Manager Server Clock Offset health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_clock_offset_with_smon true
Suppress Health Test: Command Storage Directory Free Space Whether to suppress the results of the Command Storage Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_command_storage_directory_free_space true
Suppress Health Test: Embedded Database Free Space Whether to suppress the results of the Embedded Database Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_embedded_db_free_space true
Suppress Health Test: Event Server Health Whether to suppress the results of the Event Server Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_event_server_health true
Suppress Health Test: Host Monitor Health Whether to suppress the results of the Host Monitor Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_host_monitor_health true
Suppress Health Test: Navigator Audit Server Health Whether to suppress the results of the Navigator Audit Server Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_navigator_health true
Suppress Health Test: Navigator Metadata Server Health Whether to suppress the results of the Navigator Metadata Server Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_navigatormetaserver_health true
Suppress Health Test: Reports Manager Health Whether to suppress the results of the Reports Manager Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_reports_manager_health true
Suppress Health Test: Service Monitor Health Whether to suppress the results of the Service Monitor Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_service_monitor_health true
Suppress Health Test: Telemetry Publisher Health Whether to suppress the results of the Telemetry Publisher Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false service_health_suppression_mgmt_telemetrypublisher_health true
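
The API Name column is the identifier to use when a property is set programmatically rather than through the UI. As a rough sketch, assuming the Cloudera Manager REST API accepts a configuration update as a list of name/value items, a request body that turns on several suppressions could be assembled as shown here. The helper suppression_payload is hypothetical, and the exact endpoint path and API version are not given in this reference, so consult the API documentation for your release before sending anything.

```python
import json
from typing import Dict, Iterable, List


def suppression_payload(api_names: Iterable[str],
                        suppress: bool = True) -> Dict[str, List[Dict[str, str]]]:
    """Build a name/value item list for the given suppression properties.

    Assumption: the target API expects {"items": [{"name": ..., "value": ...}]}.
    """
    value = "true" if suppress else "false"
    return {"items": [{"name": name, "value": value} for name in api_names]}


# API names taken from the service-wide Suppressions table.
payload = suppression_payload([
    "service_health_suppression_mgmt_activity_monitor_health",
    "service_health_suppression_mgmt_telemetrypublisher_health",
])

# Serialized body, ready to be sent by an HTTP client of your choice.
body = json.dumps(payload, indent=2)
print(body)
```

Values are written as the strings "true" and "false" to match how defaults appear in the Default Value column; passing suppress=False produces the payload that restores the defaults.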

Telemetry Publisher

Advanced

Display Name Description Related Name Default Value API Name Required
Telemetry Publisher Export Period The export period in seconds. export.period 1 minute(s) export_period true
Telemetry Publisher Logging Advanced Configuration Snippet (Safety Valve) For advanced use only, a string to be inserted into log4j.properties for this role only. log4j_safety_valve false
Telemetry Publisher Data Directory Storage for tracking persistent state of the role. data.dir /var/lib/cloudera-scm-telemetrypublisher mgmt_data_dir false
Heap Dump Directory Path to directory where heap dumps are generated when java.lang.OutOfMemoryError error is thrown. This directory is automatically created if it does not exist. If this directory already exists, role user must have write access to this directory. If this directory is shared among multiple roles, it should have 1777 permissions. The heap dump files are created with 600 permissions and are owned by the role user. The amount of free space in this directory should be greater than the maximum Java Process heap size configured for this role. oom_heap_dump_dir /tmp oom_heap_dump_dir false
Dump Heap When Out of Memory When set, generates heap dump file when java.lang.OutOfMemoryError is thrown. true oom_heap_dump_enabled true
Kill When Out of Memory When set, a SIGKILL signal is sent to the role process when java.lang.OutOfMemoryError is thrown. true oom_sigkill_enabled true
Telemetry Publisher Polling Period The extractor polling period in seconds. extractor.poll_period 1 minute(s) poll_period true
Automatically Restart Process When set, this role's process is automatically (and transparently) restarted in the event of an unexpected failure. true process_auto_restart true
Enable Metric Collection The Cloudera Manager agent monitors each service and each of its roles by publishing metrics to the Cloudera Manager Service Monitor. Setting this to false stops the Cloudera Manager agent from publishing any metrics for the corresponding service or roles. This is usually helpful for services that generate a large number of metrics that Service Monitor is not able to process. true process_should_monitor true
Java Configuration Options for Telemetry Publisher These arguments will be passed as part of the Java command line. Commonly, garbage collection flags, PermGen, or extra debugging flags would be passed here. telemetrypublisher_java_opts false
Log and Query Redaction Telemetry Publisher recommends and by default requires that Log and Query Redaction be enabled for all CDH clusters. If disabled for any cluster, an alert will be raised during role start. Disable this setting to allow running without redaction. log_query_redaction true telemetrypublisher_log_query_redaction true
Proxy Support for Telemetry Publisher When set, Telemetry Publisher sends telemetry through a proxy server. telemetrypublisher.proxy.enabled false telemetrypublisher_proxy_enabled false
Proxy Password Proxy Server Password. This configuration is used only when proxy support is enabled for Telemetry Publisher. telemetrypublisher.proxy.password telemetrypublisher_proxy_password false
Proxy Port Proxy Server Port. This configuration is used only when proxy support is enabled for Telemetry Publisher. telemetrypublisher.proxy.port telemetrypublisher_proxy_port false
Proxy Server Proxy Server Hostname. This configuration is used only when proxy support is enabled for Telemetry Publisher. telemetrypublisher.proxy.server telemetrypublisher_proxy_server false
Proxy User Proxy Server User. This configuration is used only when proxy support is enabled for Telemetry Publisher. telemetrypublisher.proxy.user telemetrypublisher_proxy_user false
Telemetry Publisher Environment Advanced Configuration Snippet (Safety Valve) For advanced use only, key-value pairs (one on each line) to be inserted into a role's environment. Applies to configurations of this role except client configuration. TELEMETRYPUBLISHER_role_env_safety_valve false
Telemetry Publisher Advanced Configuration Snippet (Safety Valve) for telemetrypublisher.conf For advanced use only. A string to be inserted into telemetrypublisher.conf for this role only. telemetrypublisher_safety_valve false
Telemetry Publisher Thread Pool Size The number of parallel threads used for extractor task execution. extractor.thread_pool_size 10 thread_pool_size true
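
The API Name column above is the key under which each setting is addressed programmatically. As a minimal sketch, the proxy settings could be gathered into a single request body like this. The `{"items": [{"name": ..., "value": ...}]}` shape is an assumption about the Cloudera Manager configuration API and is not taken from this page, as are the helper name `proxy_config_body` and the example host and port. The sketch builds the body only and does not contact any server.

```python
import json


def proxy_config_body(server, port, user=None, password=None):
    """Build a config-update body that enables the Telemetry Publisher proxy.

    The "name" values are the API Names listed in the table above. The
    surrounding {"items": [...]} envelope is an assumed request shape.
    """
    items = [
        {"name": "telemetrypublisher_proxy_enabled", "value": "true"},
        {"name": "telemetrypublisher_proxy_server", "value": server},
        {"name": "telemetrypublisher_proxy_port", "value": str(port)},
    ]
    # User and password are optional: only include them when supplied.
    if user is not None:
        items.append({"name": "telemetrypublisher_proxy_user", "value": user})
    if password is not None:
        items.append({"name": "telemetrypublisher_proxy_password", "value": password})
    return {"items": items}


if __name__ == "__main__":
    # Hypothetical proxy host and port, for illustration only.
    print(json.dumps(proxy_config_body("proxy.example.com", 3128), indent=2))
```

Values are sent as strings here because the table lists every default as text; check the API documentation for your release before relying on that.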

Logs

Display Name Description Related Name Default Value API Name Required
Telemetry Publisher Logging Threshold The minimum log level for Telemetry Publisher logs. INFO log_threshold false
Telemetry Publisher Maximum Log File Backups The maximum number of rolled log files to keep for Telemetry Publisher logs. Typically used by log4j or logback. 10 max_log_backup_index false
Telemetry Publisher Max Log Size The maximum size, in megabytes, per log file for Telemetry Publisher logs. Typically used by log4j or logback. 200 MiB max_log_size false
Telemetry Publisher Log Directory Directory where Telemetry Publisher will place its log files. /var/log/cloudera-scm-telemetrypublisher mgmt_log_dir false

Monitoring

Display Name Description Related Name Default Value API Name Required
Enable Health Alerts for this Role When set, Cloudera Manager will send alerts when the health of this role reaches the threshold specified by the EventServer setting eventserver_health_events_alert_threshold true enable_alerts false
Enable Configuration Change Alerts When set, Cloudera Manager will send alerts when this entity's configuration changes. false enable_config_alerts false
Heap Dump Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Warning: 10 GiB, Critical: 5 GiB heap_dump_directory_free_space_absolute_thresholds false
Heap Dump Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's heap dump directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Heap Dump Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never heap_dump_directory_free_space_percentage_thresholds false
Log Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Warning: 10 GiB, Critical: 5 GiB log_directory_free_space_absolute_thresholds false
Log Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's log directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Log Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never log_directory_free_space_percentage_thresholds false
Rules to Extract Events from Log Files This file contains the rules that govern how log messages are turned into events by the custom log4j appender that this role loads. It is in JSON format, and is composed of a list of rules. Every log message is evaluated against each of these rules in turn to decide whether or not to send an event for that message. If a log message matches multiple rules, the first matching rule is used. Each rule has some or all of the following fields:
  • alert - whether or not events generated from this rule should be promoted to alerts. A value of "true" will cause alerts to be generated. If not specified, the default is "false".
  • rate (mandatory) - the maximum number of log messages matching this rule that can be sent as events every minute. If more than rate matching log messages are received in a single minute, the extra messages are ignored. If rate is less than 0, the number of messages per minute is unlimited.
  • periodminutes - the number of minutes during which the publisher will only publish rate events or fewer. If not specified, the default is one minute.
  • threshold - apply this rule only to messages with this log4j severity level or above. An example is "WARN" for warning level messages or higher.
  • content - match only those messages for which contents match this regular expression.
  • exceptiontype - match only those messages that are part of an exception message. The exception type must match this regular expression.
Examples:
  • {"alert": false, "rate": 10, "exceptiontype": "java.lang.StringIndexOutOfBoundsException"} This rule sends events to Cloudera Manager for every StringIndexOutOfBoundsException, up to a maximum of 10 every minute.
  • {"alert": false, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"}, {"alert": true, "rate": 1, "periodminutes": 1, "threshold":"ERROR"} In this example, an event generated may not be promoted to alert if an exception is in the ERROR log message, because the first rule with alert = false will match.
{"version": "0", "rules": [{"alert": false, "rate": 1, "periodminutes": 1, "threshold":"FATAL"}, {"alert": false, "rate": 0, "threshold":"WARN", "content": ".* is deprecated. Instead, use .*"}, {"alert": false, "rate": 0, "threshold":"WARN", "content": ".* is deprecated. Use .* instead"}, {"alert": false, "rate": 1, "periodminutes": 2, "exceptiontype": ".*"}, {"alert": false, "rate": 1, "periodminutes": 1, "threshold":"WARN"}]} log_event_whitelist false
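
The first-match behavior described for Rules to Extract Events from Log Files can be illustrated with a short sketch. This is not the appender's actual implementation. It assumes that regular expressions must match the whole string, that severity levels follow the usual log4j ordering, and that a rule with no conditions matches everything. Rate limiting (rate, periodminutes) is deliberately left out. The function names are hypothetical.

```python
import re

# Assumed log4j severity ordering, lowest to highest.
LEVELS = {"TRACE": 0, "DEBUG": 1, "INFO": 2, "WARN": 3, "ERROR": 4, "FATAL": 5}


def rule_matches(rule, level, message, exception_type=None):
    """Return True if the log message satisfies every condition the rule sets."""
    threshold = rule.get("threshold")
    if threshold is not None and LEVELS[level] < LEVELS[threshold]:
        return False
    content = rule.get("content")
    if content is not None and re.fullmatch(content, message) is None:
        return False
    exc_pattern = rule.get("exceptiontype")
    if exc_pattern is not None:
        # The rule only applies to messages that carry an exception.
        if exception_type is None or re.fullmatch(exc_pattern, exception_type) is None:
            return False
    return True


def first_matching_rule(rules, level, message, exception_type=None):
    """Return (index, rule) for the first rule that matches, or None."""
    for index, rule in enumerate(rules):
        if rule_matches(rule, level, message, exception_type):
            return index, rule
    return None


# The two-rule example from the description above.
EXAMPLE_RULES = [
    {"alert": False, "rate": 1, "periodminutes": 1, "exceptiontype": ".*"},
    {"alert": True, "rate": 1, "periodminutes": 1, "threshold": "ERROR"},
]

if __name__ == "__main__":
    hit = first_matching_rule(EXAMPLE_RULES, "ERROR", "export failed", "java.io.IOException")
    print(hit[0], hit[1]["alert"])
```

With these two rules, an ERROR message that carries an exception is claimed by the first rule and is not promoted to an alert, while an ERROR message without an exception falls through to the second rule. Ordering rules from most to least specific avoids this kind of shadowing.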
Process Swap Memory Thresholds The health test thresholds on the swap memory usage of the process. This takes precedence over the host level threshold. Warning: 200 B, Critical: Never process_swap_memory_thresholds false
Role Triggers The configured triggers for this role. This is a JSON-formatted list of triggers. These triggers are evaluated as part of the health system. Every trigger expression is parsed, and if the trigger condition is met, the list of actions provided in the trigger expression is executed. Each trigger has the following fields:
  • triggerName (mandatory) - The name of the trigger. This value must be unique for the specific role.
  • triggerExpression (mandatory) - A tsquery expression representing the trigger.
  • streamThreshold (optional) - The maximum number of streams that can satisfy a condition of a trigger before the condition fires. By default set to 0, and any stream returned causes the condition to fire.
  • enabled (optional) - By default set to 'true'. If set to 'false', the trigger is not evaluated.
  • expressionEditorConfig (optional) - Metadata for the trigger editor. If present, the trigger should only be edited from the Edit Trigger page; editing the trigger here can lead to inconsistencies.
For example, the following JSON-formatted trigger configured for a DataNode fires if the DataNode has more than 1500 file descriptors open: [{"triggerName": "sample-trigger", "triggerExpression": "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad", "streamThreshold": 0, "enabled": "true"}] See the trigger rules documentation for more details on how to write triggers using tsquery. The JSON format is evolving and may change; as a result, backward compatibility is not guaranteed between releases.
[] role_triggers true
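
A value for Role Triggers can be assembled and checked before it is pasted into the configuration. The sketch below only enforces what the field list above states: the two mandatory fields and unique trigger names. The helper names `make_trigger` and `triggers_to_json` are hypothetical, and the tsquery expression itself is not validated.

```python
import json

MANDATORY_FIELDS = ("triggerName", "triggerExpression")


def make_trigger(name, expression, stream_threshold=0, enabled=True):
    """Build one trigger using the field names documented above."""
    return {
        "triggerName": name,
        "triggerExpression": expression,
        "streamThreshold": stream_threshold,
        # The documented example writes this flag as a string.
        "enabled": "true" if enabled else "false",
    }


def triggers_to_json(triggers):
    """Check mandatory fields and name uniqueness, then serialize the list."""
    seen = set()
    for trigger in triggers:
        for field in MANDATORY_FIELDS:
            if not trigger.get(field):
                raise ValueError(f"trigger is missing mandatory field {field!r}")
        name = trigger["triggerName"]
        if name in seen:
            raise ValueError(f"duplicate triggerName {name!r}")
        seen.add(name)
    return json.dumps(triggers)


if __name__ == "__main__":
    sample = make_trigger(
        "sample-trigger",
        "IF (SELECT fd_open WHERE roleName=$ROLENAME and last(fd_open) > 1500) DO health:bad",
    )
    print(triggers_to_json([sample]))
```

If expressionEditorConfig is present on an existing trigger, the description above recommends editing it from the Edit Trigger page instead of regenerating it this way.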
Telemetry Publisher Data Directory Free Space Monitoring Absolute Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Telemetry Publisher Data Directory. Warning: 10 GiB, Critical: 5 GiB telemetrypublisher_data_directory_free_space_absolute_thresholds false
Telemetry Publisher Data Directory Free Space Monitoring Percentage Thresholds The health test thresholds for monitoring of free space on the filesystem that contains this role's Telemetry Publisher Data Directory. Specified as a percentage of the capacity on that filesystem. This setting is not used if a Telemetry Publisher Data Directory Free Space Monitoring Absolute Thresholds setting is configured. Warning: Never, Critical: Never telemetrypublisher_data_directory_free_space_percentage_thresholds false
Metrics Data Export Failure Thresholds The health test thresholds for monitoring the data export failure count. Warning: 3.0 time(s), Critical: 5.0 time(s) telemetrypublisher_data_export_failure_thresholds true
Telemetry Publisher Data Export Monitoring Time Period The time period over which the telemetry publisher data export for streams will be monitored for failed export. 5 minute(s) telemetrypublisher_data_export_failure_window true
Metrics Data Ingest Failure Thresholds The health test thresholds for monitoring the data ingest failure count. Warning: 3.0 time(s), Critical: 5.0 time(s) telemetrypublisher_data_ingest_failure_thresholds true
Telemetry Publisher Data Ingest Monitoring Time Period The time period over which the telemetry publisher data ingest for streams will be monitored for failed ingest. 5 minute(s) telemetrypublisher_data_ingest_failure_window true
File Descriptor Monitoring Thresholds The health test thresholds of the number of file descriptors used. Specified as a percentage of file descriptor limit. Warning: 50.0 %, Critical: 70.0 % telemetrypublisher_fd_thresholds false
Garbage Collection Duration Thresholds The health test thresholds for the weighted average time spent in Java garbage collection. Specified as a percentage of elapsed wall clock time. Warning: 30.0, Critical: 60.0 telemetrypublisher_gc_duration_thresholds false
Garbage Collection Duration Monitoring Period The period to review when computing the moving average of garbage collection time. 5 minute(s) telemetrypublisher_gc_duration_window false
Telemetry Publisher Host Health Test When computing the overall Telemetry Publisher health, consider the host's health. true telemetrypublisher_host_health_enabled false
Telemetry Publisher Process Health Test Enables the health test that the Telemetry Publisher's process state is consistent with the role configuration. true telemetrypublisher_scm_health_enabled false
Web Metric Collection Enables the health test that the Cloudera Manager Agent can successfully contact and gather metrics from the web server. true telemetrypublisher_web_metric_collection_enabled false
Web Metric Collection Duration The health test thresholds on the duration of the metrics request to the web server. Warning: 10 second(s), Critical: Never telemetrypublisher_web_metric_collection_thresholds false
Unexpected Exits Thresholds The health test thresholds for unexpected exits encountered within a recent period specified by the unexpected_exits_window configuration for the role. Warning: Never, Critical: Any unexpected_exits_thresholds false
Unexpected Exits Monitoring Period The period to review when computing unexpected exits. 5 minute(s) unexpected_exits_window false

Other

Display Name Description Related Name Default Value API Name Required
Telemetry Publisher Web UI IPaddress. The IP where Telemetry Publisher starts a debug web server. telemetry_publisher.debug.server.interface 0.0.0.0 telemetry_publisher_debug_server_interface false

Performance

Display Name Description Related Name Default Value API Name Required
Maximum Process File Descriptors If configured, overrides the process soft and hard rlimits (also called ulimits) for file descriptors to the configured value. rlimit_fds false

Ports and Addresses

Display Name Description Related Name Default Value API Name Required
Telemetry Publisher Web UI Port. The port where Telemetry Publisher starts a debug web server. Set to -1 to disable debug server. telemetry_publisher.debug.port 10111 telemetry_publisher_debug_port false
Telemetry Publisher Server Port The port where Telemetry Publisher listens for requests. telemetry_publisher.server.port 10110 telemetry_publisher_server_port false

Resource Management

Display Name Description Related Name Default Value API Name Required
Cgroup CPU Shares Number of CPU shares to assign to this role. The greater the number of shares, the larger the share of the host's CPUs that will be given to this role when the host experiences CPU contention. Must be between 2 and 262144. Defaults to 1024 for processes not managed by Cloudera Manager. cpu.shares 1024 rm_cpu_shares true
Cgroup I/O Weight Weight for the read I/O requests issued by this role. The greater the weight, the higher the priority of the requests when the host experiences I/O contention. Must be between 100 and 1000. Defaults to 1000 for processes not managed by Cloudera Manager. blkio.weight 500 rm_io_weight true
Cgroup Memory Hard Limit Hard memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.limit_in_bytes -1 MiB rm_memory_hard_limit true
Cgroup Memory Soft Limit Soft memory limit to assign to this role, enforced by the Linux kernel. When the limit is reached, the kernel will reclaim pages charged to the process if and only if the host is facing memory pressure. If reclaiming fails, the kernel may kill the process. Both anonymous as well as page cache pages contribute to the limit. Use a value of -1 B to specify no limit. By default processes not managed by Cloudera Manager will have no limit. memory.soft_limit_in_bytes -1 MiB rm_memory_soft_limit true
Java Heap Size of TelemetryPublisher in Bytes Maximum size in bytes for the Java Process heap memory. Passed to Java -Xmx. 1 GiB telemetry_publisher_heapsize false
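
To see what a Cgroup CPU Shares value means in practice, the following sketch computes the fraction of CPU time a role would receive when every process on the host is competing for CPU. It assumes the proportional-share behavior of the Linux cpu.shares control and a fully contended host. Without contention, shares do not limit a process. The function name and the example share values are illustrative.

```python
MIN_SHARES = 2
MAX_SHARES = 262144


def cpu_fraction(role_shares, other_shares):
    """Fraction of CPU time for a role when all listed cgroups are busy.

    role_shares: the Cgroup CPU Shares value for this role.
    other_shares: shares of the other cgroups competing on the same host.
    """
    for shares in (role_shares, *other_shares):
        # The documented valid range for this setting.
        if not MIN_SHARES <= shares <= MAX_SHARES:
            raise ValueError(f"shares must be between {MIN_SHARES} and {MAX_SHARES}")
    return role_shares / (role_shares + sum(other_shares))


if __name__ == "__main__":
    # Default 1024 shares competing against one 1024-share and one 2048-share cgroup.
    print(cpu_fraction(1024, [1024, 2048]))
```

In that example the role receives one quarter of the contended CPU time; doubling its shares to 2048 would raise that to 40 percent.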

Security

Display Name Description Related Name Default Value API Name Required
Telemetry Kerberos Principal Kerberos principal used by Telemetry Publisher to authenticate to all services except HDFS. Note: Telemetry should use the principal used by Hue service if you are using MapReduce1 service in any of the clusters. hue kerberos_role_princ_name true
Enable TLS/SSL for Telemetry Publisher Encrypt communication between clients and Telemetry Publisher using Transport Layer Security (TLS) (formerly known as Secure Sockets Layer (SSL)). telemetrypublisher.http.enable_ssl false ssl_enabled false
Telemetry Publisher TLS/SSL Server JKS Keystore Key Password The password that protects the private key contained in the JKS keystore used when Telemetry Publisher is acting as a TLS/SSL server. telemetrypublisher.ssl.keyManagerPassword ssl_server_keystore_keypassword false
Telemetry Publisher TLS/SSL Server JKS Keystore File Location The path to the TLS/SSL keystore file containing the server certificate and private key used for TLS/SSL. Used when Telemetry Publisher is acting as a TLS/SSL server. The keystore must be in JKS format. telemetrypublisher.ssl.keyStore ssl_server_keystore_location false
Telemetry Publisher TLS/SSL Server JKS Keystore File Password The password for the Telemetry Publisher JKS keystore file. telemetrypublisher.ssl.keyStorePassword ssl_server_keystore_password false
Telemetry Kerberos Principal for HDFS Kerberos principal used by Telemetry Publisher to authenticate to HDFS services. Note: This principal must be in the same groups as the principals used by Job History and Spark History Servers. telemetrypublisher.dfs.user hdfs tp_hdfs_kerberos_princ true

Stacks Collection

Display Name Description Related Name Default Value API Name Required
Stacks Collection Data Retention The amount of stacks data that is retained. After the retention limit is reached, the oldest data is deleted. stacks_collection_data_retention 100 MiB stacks_collection_data_retention false
Stacks Collection Directory The directory in which stacks logs are placed. If not set, stacks are logged into a stacks subdirectory of the role's log directory. stacks_collection_directory stacks_collection_directory false
Stacks Collection Enabled Whether or not periodic stacks collection is enabled. stacks_collection_enabled false stacks_collection_enabled true
Stacks Collection Frequency The frequency with which stacks are collected. stacks_collection_frequency 5.0 second(s) stacks_collection_frequency false
Stacks Collection Method The method used to collect stacks. The jstack option involves periodically running the jstack command against the role's daemon process. The servlet method is available for those roles that have an HTTP server endpoint exposing the current stacks traces of all threads. When the servlet method is selected, that HTTP endpoint is periodically scraped. stacks_collection_method jstack stacks_collection_method false

Suppressions

Display Name Description Related Name Default Value API Name Required
Suppress Configuration Validator: CDH Version Validator Whether to suppress configuration warnings produced by the CDH Version Validator configuration validator. false role_config_suppression_cdh_version_validator true
Suppress Parameter Validation: Telemetry Kerberos Principal Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Kerberos Principal parameter. false role_config_suppression_kerberos_role_princ_name true
Suppress Parameter Validation: Telemetry Publisher Logging Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher Logging Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_log4j_safety_valve true
Suppress Parameter Validation: Rules to Extract Events from Log Files Whether to suppress configuration warnings produced by the built-in parameter validation for the Rules to Extract Events from Log Files parameter. false role_config_suppression_log_event_whitelist true
Suppress Parameter Validation: Telemetry Publisher Data Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher Data Directory parameter. false role_config_suppression_mgmt_data_dir true
Suppress Parameter Validation: Telemetry Publisher Log Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher Log Directory parameter. false role_config_suppression_mgmt_log_dir true
Suppress Parameter Validation: Heap Dump Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Heap Dump Directory parameter. false role_config_suppression_oom_heap_dump_dir true
Suppress Parameter Validation: Role Triggers Whether to suppress configuration warnings produced by the built-in parameter validation for the Role Triggers parameter. false role_config_suppression_role_triggers true
Suppress Parameter Validation: Telemetry Publisher TLS/SSL Server JKS Keystore Key Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher TLS/SSL Server JKS Keystore Key Password parameter. false role_config_suppression_ssl_server_keystore_keypassword true
Suppress Parameter Validation: Telemetry Publisher TLS/SSL Server JKS Keystore File Location Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher TLS/SSL Server JKS Keystore File Location parameter. false role_config_suppression_ssl_server_keystore_location true
Suppress Parameter Validation: Telemetry Publisher TLS/SSL Server JKS Keystore File Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher TLS/SSL Server JKS Keystore File Password parameter. false role_config_suppression_ssl_server_keystore_password true
Suppress Parameter Validation: Stacks Collection Directory Whether to suppress configuration warnings produced by the built-in parameter validation for the Stacks Collection Directory parameter. false role_config_suppression_stacks_collection_directory true
Suppress Parameter Validation: Telemetry Publisher Web UI IPaddress. Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher Web UI IPaddress. parameter. false role_config_suppression_telemetry_publisher_debug_server_interface true
Suppress Parameter Validation: Java Configuration Options for Telemetry Publisher Whether to suppress configuration warnings produced by the built-in parameter validation for the Java Configuration Options for Telemetry Publisher parameter. false role_config_suppression_telemetrypublisher_java_opts true
Suppress Parameter Validation: Proxy Password Whether to suppress configuration warnings produced by the built-in parameter validation for the Proxy Password parameter. false role_config_suppression_telemetrypublisher_proxy_password true
Suppress Parameter Validation: Proxy Server Whether to suppress configuration warnings produced by the built-in parameter validation for the Proxy Server parameter. false role_config_suppression_telemetrypublisher_proxy_server true
Suppress Parameter Validation: Proxy User Whether to suppress configuration warnings produced by the built-in parameter validation for the Proxy User parameter. false role_config_suppression_telemetrypublisher_proxy_user true
Suppress Parameter Validation: Telemetry Publisher Environment Advanced Configuration Snippet (Safety Valve) Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher Environment Advanced Configuration Snippet (Safety Valve) parameter. false role_config_suppression_telemetrypublisher_role_env_safety_valve true
Suppress Parameter Validation: Telemetry Publisher Advanced Configuration Snippet (Safety Valve) for telemetrypublisher.conf Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Publisher Advanced Configuration Snippet (Safety Valve) for telemetrypublisher.conf parameter. false role_config_suppression_telemetrypublisher_safety_valve true
Suppress Parameter Validation: Telemetry Kerberos Principal for HDFS Whether to suppress configuration warnings produced by the built-in parameter validation for the Telemetry Kerberos Principal for HDFS parameter. false role_config_suppression_tp_hdfs_kerberos_princ true
Suppress Health Test: Data Export Test For Stream Hive-Query-Audits Whether to suppress the results of the Data Export Test For Stream Hive-Query-Audits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_hive__query__audits_data_export_failure true
Suppress Health Test: Data Ingest Test For Stream Hive-Query-Audits Whether to suppress the results of the Data Ingest Test For Stream Hive-Query-Audits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_hive__query__audits_data_ingest_failure true
Suppress Health Test: Data Export Test For Stream Impala-Query-Profile Whether to suppress the results of the Data Export Test For Stream Impala-Query-Profile health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_impala__query__profile_data_export_failure true
Suppress Health Test: Data Ingest Test For Stream Impala-Query-Profile Whether to suppress the results of the Data Ingest Test For Stream Impala-Query-Profile health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_impala__query__profile_data_ingest_failure true
Suppress Health Test: Data Export Test For Stream Oozie-Workflows Whether to suppress the results of the Data Export Test For Stream Oozie-Workflows health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_oozie__workflows_data_export_failure true
Suppress Health Test: Data Ingest Test For Stream Oozie-Workflows Whether to suppress the results of the Data Ingest Test For Stream Oozie-Workflows health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_oozie__workflows_data_ingest_failure true
Suppress Health Test: Data Export Test For Stream Spark2_on_yarn-Event-Log Whether to suppress the results of the Data Export Test For Stream Spark2_on_yarn-Event-Log health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_spark2_on_yarn__event__log_data_export_failure true
Suppress Health Test: Data Ingest Test For Stream Spark2_on_yarn-Event-Log Whether to suppress the results of the Data Ingest Test For Stream Spark2_on_yarn-Event-Log health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_spark2_on_yarn__event__log_data_ingest_failure true
Suppress Health Test: Audit Pipeline Test Whether to suppress the results of the Audit Pipeline Test health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_audit_health true
Suppress Health Test: Telemetry Publisher Data Directory Free Space Whether to suppress the results of the Telemetry Publisher Data Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_data_directory_free_space true
Suppress Health Test: File Descriptors Whether to suppress the results of the File Descriptors health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_file_descriptor true
Suppress Health Test: GC Duration Whether to suppress the results of the GC Duration health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_gc_duration true
Suppress Health Test: Heap Dump Directory Free Space Whether to suppress the results of the Heap Dump Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_heap_dump_directory_free_space true
Suppress Health Test: Host Health Whether to suppress the results of the Host Health health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_host_health true
Suppress Health Test: Log Directory Free Space Whether to suppress the results of the Log Directory Free Space health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_log_directory_free_space true
Suppress Health Test: Process Status Whether to suppress the results of the Process Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_scm_health true
Suppress Health Test: Swap Memory Usage Whether to suppress the results of the Swap Memory Usage health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_swap_memory_usage true
Suppress Health Test: Unexpected Exits Whether to suppress the results of the Unexpected Exits health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_unexpected_exits true
Suppress Health Test: Web Server Status Whether to suppress the results of the Web Server Status health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_telemetrypublisher_web_metric_collection true
Suppress Health Test: Data Export Test For Stream Yarn-Apps Whether to suppress the results of the Data Export Test For Stream Yarn-Apps health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_yarn__apps_data_export_failure true
Suppress Health Test: Data Ingest Test For Stream Yarn-Apps Whether to suppress the results of the Data Ingest Test For Stream Yarn-Apps health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_yarn__apps_data_ingest_failure true
Suppress Health Test: Data Export Test For Stream Yarn-Jhist Whether to suppress the results of the Data Export Test For Stream Yarn-Jhist health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_yarn__jhist_data_export_failure true
Suppress Health Test: Data Ingest Test For Stream Yarn-Jhist Whether to suppress the results of the Data Ingest Test For Stream Yarn-Jhist health test. The results of suppressed health tests are ignored when computing the overall health of the associated host, role or service, so suppressed health tests will not generate alerts. false role_health_suppression_yarn__jhist_data_ingest_failure true