Chapter 4. Log Aggregation for Long-running Applications

This chapter describes how to use log aggregation to collect logs for long-running YARN applications.

In Hadoop logs are collected for an application only when it finishes running. This works well for applications that only run for a short time, but is not ideal for long-running applications such as Storm and HBase running on YARN using Slider.

If an application runs for days or weeks, it is useful to collect log information while the application is running. This log information can be used to:

  • Debug hardware and software issues

  • Review and optimize application performance

  • Review application resource utilization

A long-running application may generate a large amount of log information. The application can use log rotation to prevent the log file from becoming excessively large. The YARN NodeManager will periodically aggregate the latest completed log files and make them accessible.


loading table of contents...