Log aggregation file controllers

By default, log aggregation supports two file controllers, TFile and IFile. You can also add your own custom file controller.

By default IFile is used to write the aggregated logs.

TFile

TFile is the legacy file controller in YARN. It is reliable and well tested. Its buffer and chunk sizes are configurable.

TFile provides the following features:
  • Block compression

  • Named metadata blocks

  • Sorted or unsorted keys

  • Seek by key or by file offset

IFile

IFile is a newer file controller than TFile. It also uses TFile internally so it provides the same features as TFile.

In an IFile the files are indexed so it is faster to search in the aggregated log file than in a regular TFile. It uses checksums and temporary files which help to prevent failures. Its buffer sizes and rollover file size are configurable on top of the configuration options of TFile.