Chapter 7. Configuring HDFS Compression
This section describes how to configure HDFS compression on Linux.
Linux supports GzipCodec, DefaultCodec, BZip2Codec, LzoCodec, and SnappyCodec. Typically, GzipCodec is used for HDFS compression. Use the following instructions to enable GzipCodec.
Option I: To use GzipCodec for a one-time job:

    hadoop jar hadoop-examples-1.1.0-SNAPSHOT.jar sort \
      "-Dmapred.compress.map.output=true" \
      "-Dmapred.map.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec" \
      "-Dmapred.output.compress=true" \
      "-Dmapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec" \
      -outKey org.apache.hadoop.io.Text \
      -outValue org.apache.hadoop.io.Text \
      input output
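When the one-time job is launched from a script, it can be easier to assemble the same arguments from a table of properties than to maintain a long quoted command line. The following is a minimal Python sketch under that assumption; `build_sort_command` is a hypothetical helper, not part of Hadoop, and it only builds and prints the command shown above without running it.

```python
import shlex

GZIP = "org.apache.hadoop.io.compress.GzipCodec"


def build_sort_command(jar, input_dir, output_dir, codec=GZIP):
    """Return the argument list for the one-time compressed sort job."""
    # Same four properties as the command in Option I.
    props = {
        "mapred.compress.map.output": "true",
        "mapred.map.output.compression.codec": codec,
        "mapred.output.compress": "true",
        "mapred.output.compression.codec": codec,
    }
    argv = ["hadoop", "jar", jar, "sort"]
    argv += [f"-D{name}={value}" for name, value in props.items()]
    argv += [
        "-outKey", "org.apache.hadoop.io.Text",
        "-outValue", "org.apache.hadoop.io.Text",
        input_dir, output_dir,
    ]
    return argv


if __name__ == "__main__":
    cmd = build_sort_command("hadoop-examples-1.1.0-SNAPSHOT.jar", "input", "output")
    # shlex.join (Python 3.8+) quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would launch the job on a host where the `hadoop` command is on the PATH.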
Option II: To enable GzipCodec as the default compression:

Edit the core-site.xml file on the NameNode host machine:

    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
      <description>A list of the compression codec classes that can be used for compression/decompression.</description>
    </property>
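Before restarting anything, it can help to confirm that the edited file is still well-formed XML and that GzipCodec appears in the codec list. The following Python sketch assumes the file uses the usual `<configuration>` root element; the `SAMPLE` string and the `registered_codecs` helper are illustrative names, not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Illustrative stand-in for the contents of core-site.xml.
SAMPLE = """<configuration>
  <property>
    <name>io.compression.codecs</name>
    <value>org.apache.hadoop.io.compress.GzipCodec,
           org.apache.hadoop.io.compress.DefaultCodec</value>
  </property>
</configuration>"""


def registered_codecs(xml_text):
    """Return the codec class names listed under io.compression.codecs."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is malformed
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "io.compression.codecs":
            value = prop.findtext("value", "")
            # Split on commas and drop surrounding whitespace and empty entries.
            return [c.strip() for c in value.split(",") if c.strip()]
    return []


if __name__ == "__main__":
    codecs = registered_codecs(SAMPLE)
    print("GzipCodec registered:",
          "org.apache.hadoop.io.compress.GzipCodec" in codecs)
```

To check a real file, read it with `open(path).read()` and pass the text to `registered_codecs`; the path depends on the installation.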
Edit the mapred-site.xml file on the JobTracker host machine:

    <property>
      <name>mapreduce.map.output.compress</name>
      <value>true</value>
    </property>
    <property>
      <name>mapreduce.map.output.compress.codec</name>
      <value>org.apache.hadoop.io.compress.GzipCodec</value>
    </property>
    <property>
      <name>mapreduce.output.fileoutputformat.compress.type</name>
      <value>BLOCK</value>
    </property>
(Optional) To enable job output compression, set the following two configuration parameters. Edit the mapred-site.xml file on the Resource Manager host machine:

    <property>
      <name>mapreduce.output.fileoutputformat.compress</name>
      <value>true</value>
    </property>
    <property>
      <name>mapreduce.output.fileoutputformat.compress.codec</name>
      <value>org.apache.hadoop.io.compress.GzipCodec</value>
    </property>
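If the same settings have to be applied on several hosts, generating the property blocks from a table of name/value pairs avoids copy-and-paste errors. The following Python sketch renders the two job output compression parameters above; `render_properties` is a hypothetical helper, and the output still has to be placed inside the existing mapred-site.xml by hand or by other tooling.

```python
import xml.etree.ElementTree as ET


def render_properties(settings):
    """Render name/value pairs as Hadoop-style <property> elements."""
    blocks = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        ET.indent(prop)  # pretty-print; requires Python 3.9 or later
        blocks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(blocks)


OUTPUT_COMPRESSION = {
    "mapreduce.output.fileoutputformat.compress": "true",
    "mapreduce.output.fileoutputformat.compress.codec":
        "org.apache.hadoop.io.compress.GzipCodec",
}

if __name__ == "__main__":
    print(render_properties(OUTPUT_COMPRESSION))
```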
Restart the cluster using the applicable commands in Controlling HDP Services Manually.