Session Rollback
Thus far, when we have discussed the ProcessSession, we have typically referred to it simply as a mechanism for accessing FlowFiles. However, it provides another very important capability: transactionality. All methods that are called on a ProcessSession happen as part of a transaction. When we decide to end the transaction, we can do so either by calling commit() or by calling rollback(). Typically, this is handled by the AbstractProcessor class: if the onTrigger method throws an Exception, the AbstractProcessor will catch the Exception, call session.rollback(), and then re-throw the Exception. Otherwise, the AbstractProcessor will call commit() on the ProcessSession.
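The commit-or-rollback pattern described above can be sketched in plain Java. This is a minimal illustration, not the actual AbstractProcessor source: the Session, Work, and RecordingSession types and the trigger method are hypothetical stand-ins introduced only so that the example compiles without the NiFi API on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class TriggerWrapperSketch {
    /** Hypothetical stand-in for ProcessSession; only the two transaction methods are modeled. */
    interface Session {
        void commit();
        void rollback();
    }

    /** Hypothetical stand-in for the Processor logic that runs inside onTrigger. */
    interface Work {
        void run(Session session);
    }

    /** Records which transaction method was called so the outcome can be inspected. */
    static class RecordingSession implements Session {
        final List<String> calls = new ArrayList<>();
        @Override public void commit() { calls.add("commit"); }
        @Override public void rollback() { calls.add("rollback"); }
    }

    /** Runs the work; commits on success, rolls back and re-throws on failure. */
    static void trigger(Work work, Session session) {
        try {
            work.run(session);
            session.commit();
        } catch (RuntimeException e) {
            session.rollback();
            throw e; // the caller still sees the original failure
        }
    }

    public static void main(String[] args) {
        RecordingSession ok = new RecordingSession();
        trigger(s -> { }, ok);
        System.out.println(ok.calls); // prints [commit]

        RecordingSession failed = new RecordingSession();
        try {
            trigger(s -> { throw new IllegalStateException("simulated failure"); }, failed);
        } catch (IllegalStateException e) {
            System.out.println(failed.calls); // prints [rollback]
        }
    }
}
```

Because the wrapper owns the transaction boundary, the Processor logic itself only needs to throw an Exception to signal that the work should be undone.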
There are times, however, when developers will want to roll back a session explicitly. This can be accomplished at any time by calling the rollback() or rollback(boolean) method. If using the latter, the boolean indicates whether or not those FlowFiles that have been pulled from queues (via the ProcessSession get methods) should be penalized before being added back to their queues.
When rollback is called, any modifications made to the FlowFiles in that session are discarded, including both content modifications and attribute modifications. Additionally, all Provenance Events are rolled back (with the exception of any SEND event that was emitted by passing a value of true for the force argument). The FlowFiles that were pulled from the input queues are then transferred back to the input queues (and optionally penalized) so that they can be processed again.
On the other hand, when the commit method is called, the FlowFile's new state is persisted in the FlowFile Repository, and any Provenance Events that occurred are persisted in the Provenance Repository. The previous content is destroyed (unless another FlowFile references the same piece of content), and the FlowFiles are transferred to the outbound queues so that the next Processors can operate on the data.
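To make the difference between rollback and commit concrete, the following is a toy model of the semantics described above. It is a simplified sketch under stated assumptions, not NiFi's implementation: SessionModel, Item, and every method on them are hypothetical names, only attribute changes are modeled, and content and Provenance Events are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SessionModel {
    /** Simplified FlowFile: committed attributes plus a penalized flag. */
    static class Item {
        final Map<String, String> attributes = new HashMap<>();
        boolean penalized;
    }

    final Deque<Item> inputQueue = new ArrayDeque<>();
    final Deque<Item> outputQueue = new ArrayDeque<>();

    // Uncommitted state: items pulled in this transaction and their pending attribute changes.
    private final List<Item> pulled = new ArrayList<>();
    private final Map<Item, Map<String, String>> pending = new HashMap<>();

    /** Pulls the next item from the input queue into the current transaction, or returns null. */
    Item get() {
        Item item = inputQueue.poll();
        if (item != null) {
            pulled.add(item);
            pending.put(item, new HashMap<>());
        }
        return item;
    }

    /** Stages an attribute change; nothing is visible on the item until commit. */
    void putAttribute(Item item, String key, String value) {
        pending.get(item).put(key, value);
    }

    /** Applies the staged changes and moves the items to the outbound queue. */
    void commit() {
        for (Item item : pulled) {
            item.attributes.putAll(pending.get(item));
            outputQueue.add(item);
        }
        pulled.clear();
        pending.clear();
    }

    /** Discards the staged changes and puts the items back at the head of the input queue. */
    void rollback(boolean penalize) {
        for (int i = pulled.size() - 1; i >= 0; i--) {
            Item item = pulled.get(i);
            if (penalize) {
                item.penalized = true;
            }
            inputQueue.addFirst(item); // reverse iteration restores the original order
        }
        pulled.clear();
        pending.clear();
    }

    public static void main(String[] args) {
        SessionModel session = new SessionModel();
        session.inputQueue.add(new Item());

        Item item = session.get();
        session.putAttribute(item, "status", "processed");
        session.rollback(true);
        System.out.println(item.attributes.containsKey("status")); // prints false
        System.out.println(session.inputQueue.size());             // prints 1

        item = session.get();
        session.putAttribute(item, "status", "processed");
        session.commit();
        System.out.println(item.attributes.get("status"));         // prints processed
        System.out.println(session.outputQueue.size());            // prints 1
    }
}
```

In this model, staging changes separately from the item is what makes rollback cheap: discarding the staged map is enough to restore the item to its last committed state.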
It is also important to note how this behavior is affected by using the org.apache.nifi.annotation.behavior.SupportsBatching annotation. If a Processor utilizes this annotation, calls to ProcessSession.commit may not take effect immediately. Rather, these commits may be batched together in order to provide higher throughput. However, if at any point the Processor rolls back the ProcessSession, all changes since the last call to commit will be discarded and all "batched" commits will take effect. These "batched" commits are not rolled back.
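The interaction between batched commits and rollback can be illustrated with another toy model. As before, this is only a sketch of the behavior described above: BatchedCommitModel and its methods are hypothetical names, and the point at which the framework flushes a batch when no rollback occurs is not modeled.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedCommitModel {
    final List<String> uncommitted = new ArrayList<>(); // changes since the last commit
    final List<String> batched = new ArrayList<>();     // committed, but not yet persisted
    final List<String> persisted = new ArrayList<>();   // durable state

    /** Records a change in the current transaction. */
    void change(String description) {
        uncommitted.add(description);
    }

    /** With batching, a commit only queues the changes; it does not persist them yet. */
    void commit() {
        batched.addAll(uncommitted);
        uncommitted.clear();
    }

    /** Rollback discards uncommitted changes, and the batched commits take effect. */
    void rollback() {
        uncommitted.clear();
        persisted.addAll(batched);
        batched.clear();
    }

    public static void main(String[] args) {
        BatchedCommitModel session = new BatchedCommitModel();
        session.change("A");
        session.commit();                      // "A" is batched, not yet persisted
        System.out.println(session.persisted); // prints []

        session.change("B");
        session.rollback();                    // "B" is discarded, "A" takes effect
        System.out.println(session.persisted); // prints [A]
    }
}
```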