1. Cluster Monitoring Sources

Using proven open-source monitoring systems including Ganglia and Nagios, Ambari gathers information on the status of both of the hosts and the services that run on them, including the status of any jobs running on those services.

  • Host and System Information: Ambari monitors basic host and system information such as CPU utilization, disk I/O bandwidth and operations per second, average memory and swap space utilization, and average network latency.

  • Service Information: Ambari monitors the health and performance status of each service by presenting information generated by that service. Because services that run in master/slave configurations (HDFS, MapReduce, and HBase) are fault tolerant in regard to service slaves, master information is presented individually, whereas slave information is presented largely in aggregate.

  • Job Information: Ambari monitors job status by using information from MapReduce's JobTracker, which produces detailed data both on job status, current and historical, and on job scheduling.

  • Alert Information: Using Nagios with Hadoop-specific plugins and configurations, Ambari Web can issue alerts based on service states defined on three basic levels:

    • OK

    • Warning

    • Critical

    The thresholds for these alerts can be tuned using configuration files, and new alerts can be added. See Using Nagios with Hadoop for more information.


loading table of contents...