Apache Storm Component Guide
Also available as:
PDF
loading table of contents...

Saving Window State

One of the issues with windowing is that tuples cannot be acked until they fall completely out of the window.

For example, consider a one hour window that slides every minute. The tuples in the window are evaluated (passed to the bolt execute method) every minute, but tuples that arrived during the first minute are acked only after one hour and one minute. If there is a system outage after one hour, Storm will replay all tuples from start through the sixtieth minute. The bolt’s execute method will be invoked with the same set of tuples (60 times); every window will be reevaluated. One way to avoid this is to track tuples that have already been evaluated, save this information in an external durable location, and use this information to prune the duplicate window evaluation during recovery.

For more information about state management and how it can be used to avoid duplicate window evaluations, see Implementing State Management.