ModifyCompression

Description:

Changes the compression algorithm used for the contents of a FlowFile. The processor decompresses the incoming content using the configured input compression strategy, then recompresses it using the configured output compression strategy and level. It operates in a memory-efficient way, so objects much larger than the heap size can generally be processed.
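The decompress-then-recompress flow described above can be illustrated with a minimal Python sketch. It is not the processor's implementation: the function name `recompress_gzip_to_bzip2` and the 64 KiB buffer size are assumptions made for the example, and gzip-to-bzip2 is just one of the possible format pairs. The point is only that copying through a fixed-size buffer keeps memory use independent of content size.

```python
import bz2
import gzip
import io
import shutil

# Assumed buffer size, chosen only for illustration.
CHUNK_SIZE = 64 * 1024


def recompress_gzip_to_bzip2(source, sink, chunk_size=CHUNK_SIZE):
    """Read gzip data from `source` and write the same payload as bzip2 to `sink`.

    Both arguments are binary file-like objects. Data is copied in
    fixed-size chunks, so the full payload is never held in memory.
    """
    with gzip.GzipFile(fileobj=source, mode="rb") as reader:
        with bz2.BZ2File(sink, mode="wb") as writer:
            shutil.copyfileobj(reader, writer, chunk_size)


# Usage: in-memory buffers stand in for FlowFile content streams.
original = b"example line\n" * 10_000
source = io.BytesIO(gzip.compress(original))
sink = io.BytesIO()
recompress_gzip_to_bzip2(source, sink)
```

After the call, `sink` holds bzip2 data that decompresses back to `original`.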

Tags:

content, compress, recompress, gzip, bzip2, lzma, xz-lzma2, snappy, snappy-hadoop, snappy framed, lz4-framed, deflate, zstd, brotli

Properties:

In the list below, the names of required properties appear in bold. Any other properties (not in bold) are considered optional. The table also indicates any default values.

Display Name | API Name | Default Value | Allowable Values | Description
Input Compression Strategy | Input Compression Strategy | no compression
  • no compression No Compression
  • use mime.type attribute Use the [mime.type] attribute from the input FlowFile to determine the format
  • gzip GZIP
  • deflate Deflate
  • bzip2 BZIP2
  • xz-lzma2 XZ-LZMA2
  • lzma LZMA
  • snappy Snappy
  • snappy-framed Snappy-Framed
  • lz4-framed LZ4
  • zstd ZSTD
  • brotli Brotli
The strategy to use for decompressing input FlowFiles
Output Compression Strategy | Output Compression Strategy | no compression
  • no compression No Compression
  • gzip GZIP
  • deflate Deflate
  • bzip2 BZIP2
  • xz-lzma2 XZ-LZMA2
  • lzma LZMA
  • snappy Snappy
  • snappy-hadoop Snappy-Hadoop
  • snappy-framed Snappy-Framed
  • lz4-framed LZ4
  • zstd ZSTD
  • brotli Brotli
The strategy to use for compressing output FlowFiles
Output Compression Level | Output Compression Level | 1
  • 0
  • 1
  • 2
  • 3
  • 4
  • 5
  • 6
  • 7
  • 8
  • 9
The compression level for output FlowFiles, for formats that support it. A lower value results in faster processing but less compression. A value of 0 means no compression (that is, simple archiving) for gzip, or minimal compression for xz-lzma2. Higher levels can use much more memory, as is the case with levels 7-9 for xz-lzma2, so choose the level with the available heap size in mind.

This Property is only considered if the [Output Compression Strategy] Property is set to one of the following values: [zstd], [deflate], [brotli], [gzip], [xz-lzma2]
Output Filename Strategy | Output Filename Strategy | Updated
  • Original Retain the filename attribute value from the input FlowFile
  • Updated Remove the filename extension when decompressing and add a new extension for compressed output FlowFiles
Processing strategy for filename attribute on output FlowFiles
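The speed-versus-size trade-off described for Output Compression Level can be observed with Python's standard `gzip` and `lzma` modules. This is a sketch under stated assumptions: the sample `payload` and the helper `gzip_sizes` are made up for the example, and Python's level and preset numbers are used as a stand-in that may not behave identically to the libraries the processor uses.

```python
import gzip
import lzma

# Made-up, highly repetitive sample data (an assumption for this example).
payload = b"timestamp,level,message\n" + b"2024-01-01T00:00:00Z,INFO,heartbeat ok\n" * 5_000


def gzip_sizes(data, levels=(0, 1, 9)):
    """Return a dict mapping each gzip level to the compressed size in bytes."""
    return {level: len(gzip.compress(data, compresslevel=level)) for level in levels}


sizes = gzip_sizes(payload)
for level, size in sizes.items():
    print(f"gzip level {level}: {size} bytes (input: {len(payload)} bytes)")

# xz presets also range from 0 to 9; higher presets use larger dictionaries
# and therefore more memory while compressing.
xz_fast = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=0)
xz_best = lzma.compress(payload, format=lzma.FORMAT_XZ, preset=9)
print(f"xz preset 0: {len(xz_fast)} bytes, xz preset 9: {len(xz_best)} bytes")
```

At gzip level 0 the data is stored without compression, so the output is slightly larger than the input because of the gzip header and trailer.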

Relationships:

Name | Description
failure | FlowFiles will be transferred to the failure relationship on compression modification errors
success | FlowFiles will be transferred to the success relationship on compression modification success
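The routing rule in the table above can be pictured with a small Python sketch. It only illustrates the success/failure split for gzip input: the function name `route_gzip_decompression` is hypothetical, and the exact error conditions the processor treats as failures are not specified here.

```python
import gzip
import zlib


def route_gzip_decompression(content: bytes):
    """Return (relationship, content) for an attempted gzip decompression.

    On success the decompressed bytes are returned. On a decompression
    error the original bytes are returned unchanged with "failure".
    """
    try:
        return "success", gzip.decompress(content)
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError; truncated input raises EOFError.
        return "failure", content


good = route_gzip_decompression(gzip.compress(b"hello"))
bad = route_gzip_decompression(b"this is not gzip data")
```

Here `good` carries the "success" relationship with the decompressed bytes, while `bad` carries "failure" with the input left as it was.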

Reads Attributes:

Name | Description
mime.type | If the Input Compression Strategy is set to 'use mime.type attribute', this attribute is used to determine the decompression type. Otherwise, this attribute is ignored.

Writes Attributes:

Name | Description
mime.type | The appropriate MIME Type is set based on the value of the Output Compression Strategy property. If the Output Compression Strategy is 'no compression', this attribute is removed because the MIME Type is no longer known.
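The attribute handling described above, together with the 'Updated' option of Output Filename Strategy, can be sketched in Python as follows. Everything named here is an assumption for illustration: the `FORMAT_INFO` table lists commonly used extensions and MIME types for a few formats and is not the processor's actual mapping, and `resolve_input_format` and `update_attributes` are hypothetical helpers. What happens when `mime.type` matches no known format is not stated in this document, so the sketch simply returns `None` in that case.

```python
# Hypothetical table: format name -> (filename extension, MIME type).
# The values are common conventions, not the processor's confirmed mapping.
FORMAT_INFO = {
    "gzip": (".gz", "application/gzip"),
    "bzip2": (".bz2", "application/x-bzip2"),
    "xz-lzma2": (".xz", "application/x-xz"),
    "zstd": (".zst", "application/zstd"),
}


def resolve_input_format(attributes, input_strategy):
    """Pick the input format, consulting mime.type only when asked to."""
    if input_strategy != "use mime.type attribute":
        return input_strategy
    mime_type = attributes.get("mime.type")
    for name, (_, known_mime) in FORMAT_INFO.items():
        if mime_type == known_mime:
            return name
    return None  # unrecognized MIME type; handling is left to the caller


def update_attributes(attributes, input_format, output_format, filename_strategy="Updated"):
    """Return a copy of `attributes` with filename and mime.type adjusted."""
    updated = dict(attributes)
    if filename_strategy == "Updated" and "filename" in updated:
        name = updated["filename"]
        if input_format in FORMAT_INFO:
            extension = FORMAT_INFO[input_format][0]
            if name.lower().endswith(extension):
                name = name[: -len(extension)]  # drop the old extension
        if output_format in FORMAT_INFO:
            name += FORMAT_INFO[output_format][0]  # add the new extension
        updated["filename"] = name
    if output_format in FORMAT_INFO:
        updated["mime.type"] = FORMAT_INFO[output_format][1]
    else:
        # Covers 'no compression' (and any format missing from the table).
        updated.pop("mime.type", None)
    return updated


attrs = {"filename": "report.csv.gz", "mime.type": "application/gzip"}
detected = resolve_input_format(attrs, "use mime.type attribute")
recompressed = update_attributes(attrs, detected, "bzip2")
decompressed = update_attributes(attrs, detected, "no compression")
```

With these inputs, `recompressed` has the filename `report.csv.bz2` and a bzip2 MIME type, while `decompressed` has the filename `report.csv` and no `mime.type` attribute.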

State management:

This component does not store state.

Restricted:

This component is not restricted.

Input requirement:

This component requires an incoming relationship.

System Resource Considerations:

Resource | Description
CPU | An instance of this component can cause high usage of this system resource. Multiple instances or high concurrency settings may result in a degradation of performance.
MEMORY | An instance of this component can cause high usage of this system resource. Multiple instances or high concurrency settings may result in a degradation of performance.