Apache Ambari User Guide

Alert Types

Alert thresholds and threshold units depend on the alert type. The following table lists the alert types, what each type checks, and its threshold units:

| Alert Type | Description                                                                          | Threshold Units |
|------------|--------------------------------------------------------------------------------------|-----------------|
| WEB        | Connects to a Web URL. Alert status is based on the HTTP response code.              | seconds         |
| PORT       | Connects to a port. Alert status is based on response time.                          | seconds         |
| METRIC     | Checks the value of a service metric. Units vary, based on the metric being checked. | varies          |
| AGGREGATE  | Aggregates the status for another alert.                                             | %               |
| SCRIPT     | Executes a script to handle the alert check.                                         | varies          |
| SERVER     | Executes a server-side runnable class to handle the alert check.                     | varies          |
| RECOVERY   | Ambari Agents check for processes that are restarted after terminating unexpectedly. | varies          |

WEB Alert Type

WEB alerts watch a Web URL on a given component, and the alert status is determined by the HTTP response code. You cannot change which HTTP response codes map to which status. You can customize the response text for each threshold and the overall web connection timeout. A connection timeout is considered a CRITICAL alert. The response codes and corresponding statuses for WEB alerts are:

  • OK status if the Web URL responds with a code below 400.

  • WARNING status if the Web URL responds with code 400 or above.

  • CRITICAL status if Ambari cannot connect to the Web URL.
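
The mapping above can be written out as a small function. This is only an illustrative sketch, not Ambari code: the function name and the use of None to represent a failed or timed-out connection are assumptions made for the example.

```python
from typing import Optional


def web_alert_status(http_code: Optional[int]) -> str:
    """Map an HTTP response code to a WEB alert status.

    http_code is None when no connection could be made, which in this
    sketch also covers a connection timeout (an assumed convention).
    """
    if http_code is None:
        return "CRITICAL"   # cannot connect, or the connection timed out
    if http_code < 400:
        return "OK"         # any response code below 400
    return "WARNING"        # response code 400 or above
```

For example, a 302 redirect still counts as OK, while a 503 response yields WARNING rather than CRITICAL because the URL did respond.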

PORT Alert Type

PORT alerts check the response time to connect to a given port. The threshold units are seconds.
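
A response-time check of this kind might look like the following sketch. It is not Ambari's implementation: the function names, the default threshold values (1.5 and 5.0 seconds), and the choice to treat a failed connection as CRITICAL are all assumptions made for illustration.

```python
import socket
import time
from typing import Optional


def time_connect(host: str, port: int, timeout: float) -> Optional[float]:
    """Return the seconds taken to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def port_alert_status(elapsed_seconds: Optional[float],
                      warning: float = 1.5,
                      critical: float = 5.0) -> str:
    """Compare a connect time in seconds against hypothetical thresholds."""
    if elapsed_seconds is None or elapsed_seconds >= critical:
        return "CRITICAL"   # assumption: a failed connection is CRITICAL
    if elapsed_seconds >= warning:
        return "WARNING"
    return "OK"
```

A caller would combine the two, for example `port_alert_status(time_connect("localhost", 8080, timeout=5.0))`, where the host and port are placeholders.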

METRIC Alert Type

METRIC alerts check the value of a single metric or, if a calculation is performed, of multiple metrics. The metric is accessed from a URL endpoint available on a given component. A connection timeout is considered a CRITICAL alert. The thresholds are adjustable and the units for each threshold are metric-dependent. For example, for “CPU utilization” alerts the unit is “%”, and for “RPC latency” alerts the unit is “milliseconds (ms)”.
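
To illustrate a METRIC check that performs a calculation over more than one metric, the sketch below derives a percentage from two raw values and compares it with adjustable thresholds. The function names, the sample values, and the assumption that a higher value is worse are hypothetical and do not come from Ambari.

```python
def percent_used(used: float, total: float) -> float:
    """Combine two raw metrics into one value, expressed in percent."""
    return 100.0 * used / total


def metric_alert_status(value: float, warning: float, critical: float) -> str:
    """Compare a metric value with its thresholds.

    Assumes a higher value is worse; the thresholds share the metric's unit
    (for example "%" or "ms").
    """
    if value >= critical:
        return "CRITICAL"
    if value >= warning:
        return "WARNING"
    return "OK"


# Hypothetical readings: 850 units used out of 1000, thresholds 80% and 90%.
status = metric_alert_status(percent_used(850, 1000), warning=80, critical=90)
```

With these sample readings the calculated value is 85%, which falls between the two thresholds.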

AGGREGATE Alert Type

AGGREGATE alerts aggregate the alert status as a percentage of the alert instances affected. For example, the “Percent DataNode Process” alert aggregates the “DataNode Process” alert. The threshold units are “%”.
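
The percentage calculation can be sketched as follows. This is an illustration only: it assumes that an "affected" instance is any instance whose status is not OK, and the threshold values of 10% and 30% are hypothetical.

```python
from typing import Sequence


def percent_affected(statuses: Sequence[str]) -> float:
    """Return the percentage of alert instances that are not OK."""
    if not statuses:
        return 0.0
    affected = sum(1 for s in statuses if s != "OK")
    return 100.0 * affected / len(statuses)


def aggregate_alert_status(statuses: Sequence[str],
                           warning: float = 10.0,
                           critical: float = 30.0) -> str:
    """Compare the affected percentage with hypothetical % thresholds."""
    percent = percent_affected(statuses)
    if percent >= critical:
        return "CRITICAL"
    if percent >= warning:
        return "WARNING"
    return "OK"


# Ten hypothetical instances of an underlying alert, two of them failing.
instances = ["OK"] * 8 + ["CRITICAL"] * 2
```

Here two of ten instances are affected, so `percent_affected(instances)` is 20.0 and the aggregate status under these thresholds is WARNING.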

SCRIPT Alert Type

SCRIPT alerts execute a script, and the script determines the alert status, such as OK, WARNING, or CRITICAL. You can customize the response text and the values for the various properties and thresholds for the SCRIPT alert.
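
As a rough illustration, a script of this kind could check disk usage and report a status with response text. The sketch is loosely modeled on the Python alert scripts that ship with Ambari, but the entry-point name `execute`, its arguments, the parameter keys, and the shape of the return value are assumptions here and should be checked against your Ambari version.

```python
import shutil


def execute(configurations=None, parameters=None, host_name=None):
    """Check disk usage and return (status, [response_text]).

    The parameter keys "path", "warning_percent", and "critical_percent"
    are hypothetical names chosen for this example.
    """
    parameters = parameters or {}
    path = parameters.get("path", "/")
    warning = float(parameters.get("warning_percent", 80))
    critical = float(parameters.get("critical_percent", 90))

    usage = shutil.disk_usage(path)
    percent = 100.0 * usage.used / usage.total if usage.total else 0.0

    if percent >= critical:
        status = "CRITICAL"
    elif percent >= warning:
        status = "WARNING"
    else:
        status = "OK"

    # The response text is the part a user would customize.
    return (status, ["{0:.1f}% of {1} is in use".format(percent, path)])
```

Calling `execute(parameters={"path": "."})` returns a status together with a one-element list of response text; the status depends on the disk being checked.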

SERVER Alert Type

SERVER alerts execute a server-side runnable class which determines the alert status such as OK, WARNING or CRITICAL.

RECOVERY Alert Type

RECOVERY alerts are handled by the Ambari Agents that watch for process restarts. The alert status (OK, WARNING, or CRITICAL) is based on the number of times a process has been restarted automatically. This is useful in cases where processes are terminating and Ambari is automatically restarting them.