Splits a binary-encoded Avro datafile into smaller files based on the configured Output Size. The Output Strategy determines whether the smaller files are Avro datafiles or bare Avro records with metadata in the FlowFile attributes. The output is always binary encoded.
Display Name | API Name | Default Value | Allowable Values | Description |
---|---|---|---|---|
Split Strategy | Split Strategy | Record | Record | The strategy for splitting the incoming datafile. The Record strategy will read the incoming datafile by de-serializing each record. |
Output Size | Output Size | 1 | | The number of Avro records to include per split file. If the incoming file has fewer records than the Output Size, or if the total number of records is not evenly divisible by the Output Size, a split file can contain fewer records than the Output Size. |
Output Strategy | Output Strategy | Datafile | Datafile, Bare Record | Determines the format of the output: either an Avro datafile or a bare record. Bare record output is intended only for systems that already require it and should not be needed for normal use. |
Transfer Metadata | Transfer Metadata | true | | Whether to transfer metadata from the parent datafile to the children. If the Output Strategy is Bare Record, the metadata is stored as FlowFile attributes; otherwise, it is stored in the datafile header. |
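
The effect of Output Size on the number and size of split files can be sketched in plain Python. This illustrates the batching arithmetic only and is not the processor's implementation: `split_records` is a hypothetical helper, the records are stand-in integers, and no Avro decoding or encoding takes place.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def split_records(records: Iterable[T], output_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most output_size records.

    Hypothetical helper: it mirrors the documented Output Size behaviour,
    not the processor's actual code.
    """
    if output_size < 1:
        raise ValueError("output_size must be a positive integer")
    it = iter(records)
    while True:
        # Take up to output_size records; an empty batch means we are done.
        batch = list(islice(it, output_size))
        if not batch:
            return
        yield batch


# Ten records with an Output Size of 4: the last split holds the remainder.
print([len(b) for b in split_records(range(10), 4)])  # [4, 4, 2]

# Fewer records than the Output Size: a single, smaller split.
print([len(b) for b in split_records(range(3), 5)])  # [3]
```

With the default Output Size of 1, the same logic produces one split per record.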