Adding Functionality to Apache NiFi

FlowFile

A FlowFile is a logical notion that correlates a piece of data with a set of Attributes about that data. Such attributes include a FlowFile's unique identifier, as well as its name, size, and any number of other flow-specific values. The content and attributes associated with a FlowFile can change as it moves through a flow, but each FlowFile object is itself immutable. Modifications are made through the ProcessSession, which hands back a FlowFile object that reflects the change.
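
The immutability described above can be illustrated with a minimal, self-contained sketch. `FlowFileSketch` and `withAttribute` are hypothetical names invented for this illustration and are not part of the NiFi API; in NiFi itself, the equivalent step is a call such as `ProcessSession.putAttribute(flowFile, key, value)`, whose returned FlowFile reference should be used from then on.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a FlowFile; NOT the NiFi FlowFile interface.
final class FlowFileSketch {
    private final Map<String, String> attributes;

    FlowFileSketch(Map<String, String> attributes) {
        // Defensive copy so later changes to the caller's map cannot leak in.
        this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
    }

    String getAttribute(String key) {
        return attributes.get(key);
    }

    // Returns a new object carrying the extra attribute; "this" is left untouched.
    FlowFileSketch withAttribute(String key, String value) {
        Map<String, String> copy = new HashMap<>(attributes);
        copy.put(key, value);
        return new FlowFileSketch(copy);
    }

    public static void main(String[] args) {
        FlowFileSketch original = new FlowFileSketch(Map.of("filename", "data.csv"));
        FlowFileSketch updated = original.withAttribute("mime.type", "text/csv");
        System.out.println(original.getAttribute("mime.type")); // null
        System.out.println(updated.getAttribute("mime.type"));  // text/csv
    }
}
```

The point of the pattern is that code holding the old reference never observes the change, so a caller must keep working with the returned object.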

The core attributes for FlowFiles are defined in the org.apache.nifi.flowfile.attributes.CoreAttributes enum. The most common attributes you'll see are filename, path, and uuid. In the list below, the string in parentheses is the key of the attribute as defined in the CoreAttributes enum, which is also how it appears in the UI/API.

  • Filename (filename): The filename of the FlowFile. The filename should not contain any directory structure.

  • UUID (uuid): A Universally Unique Identifier assigned to this FlowFile that distinguishes the FlowFile from other FlowFiles in the system.

  • Path (path): The FlowFile's path indicates the relative directory to which a FlowFile belongs and does not contain the filename.

  • Absolute Path (absolute.path): The FlowFile's absolute path indicates the absolute directory to which a FlowFile belongs and does not contain the filename.

  • Priority (priority): A numeric value indicating the FlowFile priority.

  • MIME Type (mime.type): The MIME Type of this FlowFile.

  • Discard Reason (discard.reason): Specifies the reason that a FlowFile is being discarded.

  • Alternate Identifier (alternate.identifier): Indicates an identifier other than the FlowFile's UUID that is known to refer to this FlowFile.
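
Because `path` never contains the filename and `filename` never contains directory structure, the two are typically combined when a full relative location is needed. The following is a minimal sketch under stated assumptions: attributes are modeled as a plain `Map<String, String>` keyed by the strings shown in parentheses above, `/` is assumed as the separator, and `CoreAttributeSketch` and `relativeLocation` are hypothetical names, not NiFi API.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical helper; attributes are modeled as a plain map for illustration.
final class CoreAttributeSketch {

    // Joins the "path" and "filename" attributes into one relative location.
    // Assumes "/" as the separator; a missing path is treated as empty.
    static String relativeLocation(Map<String, String> attributes) {
        String filename = Objects.requireNonNull(
                attributes.get("filename"), "filename attribute is required");
        String path = attributes.getOrDefault("path", "");
        if (path.isEmpty() || path.endsWith("/")) {
            return path + filename;
        }
        return path + "/" + filename;
    }

    public static void main(String[] args) {
        Map<String, String> attributes = Map.of(
                "filename", "report.csv",
                "path", "incoming/2024/");
        System.out.println(relativeLocation(attributes)); // incoming/2024/report.csv
    }
}
```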

Additional Common Attributes

While these attributes are not members of the CoreAttributes enum, they are de facto standards across the system and found on most FlowFiles.

  • File Size (fileSize): The size of the FlowFile content in bytes.

  • Entry Date (entryDate): The date and time at which the FlowFile entered the system (i.e., was created). The value of this attribute is a number that represents the number of milliseconds since midnight, Jan. 1, 1970 (UTC).

  • Lineage Start Date (lineageStartDate): Whenever a FlowFile is cloned, merged, or split, a "child" FlowFile is created. As those children are in turn cloned, merged, or split, a chain of ancestors is built. This value represents the date and time at which the oldest ancestor entered the system. Another way to think about this is that the attribute lets you determine the latency of the FlowFile through the system: the time elapsed since this date is how long the data has been in the flow. The value is a number that represents the number of milliseconds since midnight, Jan. 1, 1970 (UTC).
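
Since Entry Date and Lineage Start Date are both expressed as milliseconds since the epoch, they map directly onto `java.time`. The sketch below assumes the value arrives as a decimal string and that the caller supplies the current time; `LineageLatencySketch` and its method names are hypothetical and chosen only for this illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper showing how epoch-millisecond values can be interpreted.
final class LineageLatencySketch {

    // Converts an epoch-millisecond value (as a decimal string) into an Instant.
    static Instant toInstant(String epochMillis) {
        return Instant.ofEpochMilli(Long.parseLong(epochMillis));
    }

    // Time elapsed between the oldest ancestor entering the system and "now".
    static Duration lineageLatency(String lineageStartDate, Instant now) {
        return Duration.between(toInstant(lineageStartDate), now);
    }

    public static void main(String[] args) {
        String lineageStartDate = "1700000000000";           // example value only
        Instant now = Instant.ofEpochMilli(1700000004500L);  // fixed "now" for a repeatable run
        System.out.println(toInstant(lineageStartDate));     // ISO-8601 timestamp in UTC
        System.out.println(lineageLatency(lineageStartDate, now).toMillis()); // 4500
    }
}
```

Passing `now` in explicitly, rather than calling `Instant.now()` inside the helper, keeps the calculation deterministic and easy to test.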