Thus far, when we have discussed the ProcessSession, we have typically referred to it simply as a mechanism for accessing FlowFiles. However, it provides another very important capability: transactionality. All methods that are called on a ProcessSession happen as part of a transaction. When we decide to end the transaction, we can do so either by calling commit() or by calling rollback(). Typically, this is handled by the AbstractProcessor class: if the onTrigger method throws an Exception, the AbstractProcessor will catch the Exception, call session.rollback(), and then re-throw the Exception. Otherwise, the AbstractProcessor will call commit() on the ProcessSession.
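The commit-or-rollback pattern described above can be sketched in plain Java. This is a minimal illustration, not NiFi's actual code: ToySession, ToyTrigger, RecordingSession, and TransactionWrapperSketch are hypothetical names invented for this example, and the sketch only handles unchecked exceptions.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for the two ProcessSession methods discussed above (not the NiFi interface).
interface ToySession {
    void commit();
    void rollback();
}

// Toy stand-in for the user-supplied onTrigger logic (hypothetical).
interface ToyTrigger {
    void onTrigger(ToySession session);
}

// Records which transaction-ending call was made, for demonstration only.
class RecordingSession implements ToySession {
    final List<String> calls = new ArrayList<>();
    @Override public void commit() { calls.add("commit"); }
    @Override public void rollback() { calls.add("rollback"); }
}

class TransactionWrapperSketch {
    // Mirrors the described pattern: roll back and re-throw on failure, otherwise commit.
    static void run(ToySession session, ToyTrigger trigger) {
        try {
            trigger.onTrigger(session);
        } catch (RuntimeException e) {
            session.rollback();
            throw e;
        }
        session.commit();
    }

    public static void main(String[] args) {
        RecordingSession ok = new RecordingSession();
        run(ok, s -> { /* work succeeds */ });
        System.out.println(ok.calls); // [commit]

        RecordingSession failed = new RecordingSession();
        try {
            run(failed, s -> { throw new IllegalStateException("simulated failure"); });
        } catch (IllegalStateException e) {
            System.out.println(failed.calls); // [rollback]
        }
    }
}
```

Because the wrapper ends the transaction either way, the trigger logic in this sketch never has to call commit() or rollback() itself.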
There are times, however, that developers will want to roll back a session explicitly. This can be accomplished at any time by calling the rollback() or rollback(boolean) method. If using the latter, the boolean indicates whether or not those FlowFiles that have been pulled from queues (via the ProcessSession get methods) should be penalized before being added back to their queues.
When rollback is called, any modifications that have occurred to the FlowFiles in that session are discarded, including both content and attribute modifications. Additionally, all Provenance Events are rolled back (with the exception of any SEND event that was emitted by passing a value of true for the force argument). The FlowFiles that were pulled from the input queues are then transferred back to the input queues (and optionally penalized) so that they can be processed again.
On the other hand, when the commit method is called, the FlowFile's new state is persisted in the FlowFile Repository, and any Provenance Events that occurred are persisted in the Provenance Repository. The previous content is destroyed (unless another FlowFile references the same piece of content), and the FlowFiles are transferred to the outbound queues so that the next Processors can operate on the data.
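To make the difference between rolling back and committing concrete, here is a small toy model in plain Java. It is only a sketch of the behavior described above: ToyFlowFile, ToyQueueSession, and RollbackCommitSketch are hypothetical classes, there are no repositories or Provenance Events, and attributes stand in for all FlowFile state.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy FlowFile: just attributes plus a penalized flag (hypothetical, not NiFi's FlowFile).
class ToyFlowFile {
    final Map<String, String> attributes = new HashMap<>();
    boolean penalized = false;
}

// Toy session over one input queue and one outbound queue.
class ToyQueueSession {
    final Deque<ToyFlowFile> input;
    final Deque<ToyFlowFile> outbound;
    private final List<ToyFlowFile> pulled = new ArrayList<>();  // untouched originals
    private final List<ToyFlowFile> working = new ArrayList<>(); // modified copies

    ToyQueueSession(Deque<ToyFlowFile> input, Deque<ToyFlowFile> outbound) {
        this.input = input;
        this.outbound = outbound;
    }

    // Pull one FlowFile and hand back a working copy, leaving the original untouched.
    ToyFlowFile get() {
        ToyFlowFile original = input.poll();
        if (original == null) {
            return null;
        }
        ToyFlowFile copy = new ToyFlowFile();
        copy.attributes.putAll(original.attributes);
        pulled.add(original);
        working.add(copy);
        return copy;
    }

    // Discard all working copies and return the originals to the input queue.
    void rollback(boolean penalize) {
        for (ToyFlowFile original : pulled) {
            if (penalize) {
                original.penalized = true;
            }
            input.add(original);
        }
        pulled.clear();
        working.clear();
    }

    // Make the modified copies the new state and pass them downstream.
    void commit() {
        outbound.addAll(working);
        pulled.clear();
        working.clear();
    }
}

class RollbackCommitSketch {
    public static void main(String[] args) {
        Deque<ToyFlowFile> in = new ArrayDeque<>();
        Deque<ToyFlowFile> out = new ArrayDeque<>();
        in.add(new ToyFlowFile());
        ToyQueueSession session = new ToyQueueSession(in, out);

        ToyFlowFile flowFile = session.get();
        flowFile.attributes.put("status", "half-done");
        session.rollback(true); // the change is lost; the original is requeued and penalized
        System.out.println(in.peek().attributes + " penalized=" + in.peek().penalized);

        flowFile = session.get();
        flowFile.attributes.put("status", "done");
        session.commit(); // the change is kept and the FlowFile moves downstream
        System.out.println(out.peek().attributes + " inputSize=" + in.size());
    }
}
```

In this model the rolled-back FlowFile reappears on the input queue with no "status" attribute, while the committed one arrives on the outbound queue with the attribute set.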
It is also important to note how this behavior is affected by using the org.apache.nifi.annotation.behavior.SupportsBatching annotation. If a Processor utilizes this annotation, calls to ProcessSession.commit may not take effect immediately. Rather, these commits may be batched together in order to provide higher throughput. However, if at any point the Processor rolls back the ProcessSession, all changes since the last call to commit will be discarded and all "batched" commits will take effect. These "batched" commits are not rolled back.
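The interaction between batched commits and a rollback can be modeled with a few lists. This is a toy sketch of the behavior described above, not of how the framework implements batching: ToyBatchingSession, BatchingSketch, and the change and flush methods are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a session whose commits are batched (hypothetical; not NiFi's implementation).
class ToyBatchingSession {
    private final List<String> uncommitted = new ArrayList<>(); // changes since the last commit()
    private final List<String> batched = new ArrayList<>();     // committed, but not yet applied
    final List<String> applied = new ArrayList<>();             // changes that have taken effect

    void change(String description) {
        uncommitted.add(description);
    }

    // With batching, commit() only queues the changes; they take effect later.
    void commit() {
        batched.addAll(uncommitted);
        uncommitted.clear();
    }

    // Rollback drops uncommitted work but lets already-batched commits take effect.
    void rollback() {
        uncommitted.clear();
        flush();
    }

    // Stand-in for the framework deciding to apply the batch.
    void flush() {
        applied.addAll(batched);
        batched.clear();
    }
}

class BatchingSketch {
    public static void main(String[] args) {
        ToyBatchingSession session = new ToyBatchingSession();
        session.change("A");
        session.commit();
        System.out.println(session.applied); // [] because the commit is only batched so far

        session.change("B");
        session.rollback();
        System.out.println(session.applied); // [A] because B is discarded and the batched commit applies
    }
}
```

The point of the model is the ordering: a rollback never reaches back past the last commit, even when that commit has not yet taken effect.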