Enable GzipCodec as the default compression codec
For the MapReduce framework, update the relevant properties in core-site.xml and mapred-site.xml to enable GzipCodec (org.apache.hadoop.io.compress.GzipCodec) as the default compression codec.
- Edit the core-site.xml file on the NameNode host machine.
    <property>
      <name>io.compression.codecs</name>
      <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,org.apache.hadoop.io.compress.SnappyCodec</value>
      <description>A list of the compression codec classes that can be used for compression/decompression.</description>
    </property>
- Edit the mapred-site.xml file on the JobTracker host machine.
    <property>
      <name>mapreduce.map.output.compress</name>
      <value>true</value>
    </property>
    <property>
      <name>mapreduce.map.output.compress.codec</name>
      <value>org.apache.hadoop.io.compress.GzipCodec</value>
    </property>
    <property>
      <name>mapreduce.output.fileoutputformat.compress.type</name>
      <value>BLOCK</value>
    </property>
- Optional: To enable job output compression, set the following two configuration parameters. Edit the mapred-site.xml file on the Resource Manager host machine.
    <property>
      <name>mapreduce.output.fileoutputformat.compress</name>
      <value>true</value>
    </property>
    <property>
      <name>mapreduce.output.fileoutputformat.compress.codec</name>
      <value>org.apache.hadoop.io.compress.GzipCodec</value>
    </property>
- Restart the cluster.
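
Before restarting, it can help to confirm that the edited files actually contain the values from the steps above. The following is a minimal sketch in Python, not part of Hadoop: the helper names (read_properties, codec_list, missing_settings) are hypothetical, the sample XML is inline rather than read from a real configuration directory, and values are compared as exact strings.

```python
import xml.etree.ElementTree as ET

GZIP = "org.apache.hadoop.io.compress.GzipCodec"

# Expected mapred-site.xml values, copied from the steps above.
EXPECTED_MAPRED = {
    "mapreduce.map.output.compress": "true",
    "mapreduce.map.output.compress.codec": GZIP,
    "mapreduce.output.fileoutputformat.compress.type": "BLOCK",
}


def read_properties(xml_text):
    """Return {name: value} for every <property> element in a *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def codec_list(props):
    """Split io.compression.codecs into class names, dropping stray whitespace."""
    raw = props.get("io.compression.codecs", "")
    return [item.strip() for item in raw.split(",") if item.strip()]


def missing_settings(props, expected):
    """Return the expected settings that are absent or set to a different value."""
    return {key: value for key, value in expected.items() if props.get(key) != value}


if __name__ == "__main__":
    # Inline sample standing in for a real mapred-site.xml (assumption).
    sample = """
    <configuration>
      <property>
        <name>mapreduce.map.output.compress</name>
        <value>true</value>
      </property>
    </configuration>
    """
    for key, value in missing_settings(read_properties(sample), EXPECTED_MAPRED).items():
        print("not set as expected:", key, "should be", value)
```

To check real files, pass the contents of core-site.xml or mapred-site.xml to read_properties, for example confirming that GZIP appears in codec_list(...) for core-site.xml.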
 
