Log Aggregation File Controllers

By default, log aggregation supports two file controllers: TFile and IFile. You can also add your own custom file controller.
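As a rough illustration of how file controllers are selected, the following Python sketch renders a yarn-site.xml fragment that enables both built-in controllers. The property names (yarn.log-aggregation.file-formats and yarn.log-aggregation.file-controller.<name>.class) and the class names are taken from upstream Apache Hadoop and are assumed to apply to your distribution; verify them against your own yarn-default.xml before use. The helper names build_properties and render_fragment are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: class names as shipped in upstream Apache Hadoop.
# Check your distribution's yarn-default.xml before relying on them.
CONTROLLER_CLASSES = {
    "TFile": "org.apache.hadoop.yarn.logaggregation.filecontroller."
             "tfile.LogAggregationTFileController",
    "IFile": "org.apache.hadoop.yarn.logaggregation.filecontroller."
             "ifile.LogAggregationIndexedFileController",
}


def build_properties(formats):
    """Return (name, value) pairs that enable the given file controllers."""
    props = [("yarn.log-aggregation.file-formats", ",".join(formats))]
    for name in formats:
        props.append((
            f"yarn.log-aggregation.file-controller.{name}.class",
            CONTROLLER_CLASSES[name],
        ))
    return props


def render_fragment(props):
    """Render the pairs as a <configuration> fragment for yarn-site.xml."""
    root = ET.Element("configuration")
    for name, value in props:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(root)  # pretty-printing; requires Python 3.9 or later
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_fragment(build_properties(["IFile", "TFile"])))
```

A custom file controller would presumably be registered the same way, by adding its name to the formats list and pointing the matching class property at your own implementation.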

TFile

TFile is the legacy file controller in YARN. It is reliable and well tested, and its buffer and chunk sizes are configurable. It deletes the local log files during rolling sessions, so long-running applications do not store all of their logs locally. However, it creates a new file for each log aggregation session, so the number of aggregated files grows as a long-running application keeps rolling.

TFile provides the following features:
  • Block compression
  • Named metadata blocks
  • Sorted or unsorted keys
  • Seek by key or by file offset
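To make the features above more concrete, here is a toy Python model of a container that keeps sorted keys in compressed blocks and seeks by key. It is only a conceptual sketch: it does not reproduce the real TFile on-disk format or the Hadoop API, and the class name ToyBlockStore and its block_size parameter are invented for illustration. Keys and values are assumed to contain no tab or newline characters.

```python
import bisect
import zlib


class ToyBlockStore:
    """Toy model only: sorted key/value pairs packed into compressed blocks."""

    def __init__(self, items, block_size=2):
        pairs = sorted(items.items())
        self.first_keys = []  # first key of each block, used to pick a block
        self.blocks = []      # zlib-compressed block payloads
        for i in range(0, len(pairs), block_size):
            chunk = pairs[i:i + block_size]
            self.first_keys.append(chunk[0][0])
            payload = "\n".join(f"{k}\t{v}" for k, v in chunk).encode("utf-8")
            self.blocks.append(zlib.compress(payload))

    def get(self, key):
        """Seek by key: decompress only the block that could hold the key."""
        # The candidate is the last block whose first key is <= key.
        idx = bisect.bisect_right(self.first_keys, key) - 1
        if idx < 0:
            return None
        text = zlib.decompress(self.blocks[idx]).decode("utf-8")
        for line in text.split("\n"):
            k, v = line.split("\t", 1)
            if k == key:
                return v
        return None


if __name__ == "__main__":
    store = ToyBlockStore({
        "container_02": "second log",
        "container_01": "first log",
        "container_03": "third log",
    })
    print(store.get("container_02"))
```

Because the keys are sorted, a lookup has to decompress a single block instead of scanning the whole file, which is the point of combining sorted keys with block compression.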

IFile

IFile is a newer file controller than TFile. It uses TFile internally, so it provides the same features.

An IFile indexes the log files it contains, so searching the aggregated log file is faster than with a regular TFile. It uses checksums and temporary files, which help to prevent failures. In addition to the configuration options of TFile, its buffer sizes and rollover file size are configurable.
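The ideas behind indexing, checksums, and temporary files can be sketched with a small Python model. This is an illustration under stated assumptions, not the actual IFile layout: the file format (payloads, then a JSON index, then a fixed-size footer), the CRC32 checksum, and the function names write_indexed, read_log, and verify are all invented for this example.

```python
import json
import os
import struct
import tempfile
import zlib

# Hypothetical footer: index length (8 bytes) + CRC32 of the content (4 bytes).
FOOTER = struct.Struct(">QI")


def write_indexed(path, logs):
    """Write {name: text} as payloads + JSON index + footer."""
    index = {}
    body = bytearray()
    for name, text in logs.items():
        data = text.encode("utf-8")
        index[name] = [len(body), len(data)]  # offset and length of this log
        body += data
    index_bytes = json.dumps(index).encode("utf-8")
    content = bytes(body) + index_bytes
    footer = FOOTER.pack(len(index_bytes), zlib.crc32(content))
    # Write to a temporary file first, then rename it into place, so a
    # crash mid-write never leaves a half-written file under the final name.
    target_dir = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=target_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(content + footer)
    os.replace(tmp, path)


def read_log(path, name):
    """Use the index to seek straight to one log instead of scanning."""
    with open(path, "rb") as f:
        f.seek(-FOOTER.size, os.SEEK_END)
        index_len, _crc = FOOTER.unpack(f.read(FOOTER.size))
        f.seek(-(FOOTER.size + index_len), os.SEEK_END)
        index = json.loads(f.read(index_len))
        offset, length = index[name]
        f.seek(offset)
        return f.read(length).decode("utf-8")


def verify(path):
    """Return True if the stored checksum matches the file content."""
    with open(path, "rb") as f:
        raw = f.read()
    content, footer = raw[:-FOOTER.size], raw[-FOOTER.size:]
    _index_len, crc = FOOTER.unpack(footer)
    return zlib.crc32(content) == crc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "node1.log")
        write_indexed(target, {"stdout": "hello\n", "stderr": "oops\n"})
        print(read_log(target, "stderr"), end="")
        print(verify(target))
```

In this model, read_log touches only the footer, the index, and the requested payload, while verify reads the whole file. Separating the two keeps lookups fast and still lets corruption be detected when needed.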