A free, open source compression library. LZO compression provides a good balance between data size and speed of compression. The LZO compression algorithm is the most efficient of the
codecs, using very little CPU. Its compression ratios are not as good as others, but its compression is still significant compared to the uncompressed file sizes. Unlike some other formats, LZO
compressed files are splittable, enabling MapReduce to process splits in parallel.
LZO is published under the GNU General Public License and so is not included in CDH but can be used with CDH components; the Cloudera public Git repository hosts the hadoop-lzo package that provides a version of LZO that can be used with CDH.