Creating checkpoints and savepoints in Flink

You can use checkpoints and savepoints to make Flink applications fault tolerant across the whole pipeline. Both provide a backup from which you can restore the entire application, with or without its state, after a failure or an upgrade.

Flink contains a fault tolerance mechanism that continuously creates snapshots of the data stream. Each snapshot includes not only the dataflow but also the state attached to it. If a failure occurs, the latest snapshot is selected and the system recovers from that checkpoint. This guarantees that the result of the computation can always be restored consistently.
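The recovery behavior described above can be illustrated with a small, self-contained simulation. This is only a conceptual sketch and does not use Flink's API: the class CheckpointSketch, the Snapshot type, and the parameters interval and failAt are hypothetical names chosen for illustration. It assumes a replayable input and a single simulated failure. Each snapshot stores the input position together with the state (a running sum) at that position, so restoring the latest snapshot and replaying from its offset yields the same result as a run without failure.

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual model only: this is NOT Flink code and uses no Flink classes.
public class CheckpointSketch {

    /** Immutable snapshot: the input position plus the state at that position. */
    static final class Snapshot {
        final int offset;
        final long runningSum;

        Snapshot(int offset, long runningSum) {
            this.offset = offset;
            this.runningSum = runningSum;
        }
    }

    /**
     * Sums the input and takes a snapshot after every `interval` records.
     * If `failAt` is a valid index, one failure is simulated just before that
     * record is processed; the job then restores the latest snapshot and
     * replays the input from the snapshot's offset.
     */
    static long run(int[] input, int interval, int failAt) {
        List<Snapshot> snapshots = new ArrayList<>();
        snapshots.add(new Snapshot(0, 0L)); // empty state before any record
        int offset = 0;
        long sum = 0L;
        boolean failed = false;

        while (offset < input.length) {
            if (!failed && offset == failAt) {
                // Simulated failure: discard in-flight progress and roll back
                // to the latest snapshot (position and state together).
                failed = true;
                Snapshot latest = snapshots.get(snapshots.size() - 1);
                offset = latest.offset;
                sum = latest.runningSum;
                continue;
            }
            sum += input[offset];
            offset++;
            if (offset % interval == 0) {
                snapshots.add(new Snapshot(offset, sum));
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] input = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        long withoutFailure = run(input, 3, -1); // no failure simulated
        long withFailure = run(input, 3, 8);     // fails before record 8
        System.out.println(withoutFailure + " " + withFailure);
    }
}
```

Because position and state are captured together, the records processed after the last snapshot are simply replayed after the rollback, and no record is counted twice in the final sum.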

While checkpoints are created and managed by Flink, savepoints are controlled by the user. A savepoint can be described as a user-controlled backup of the running job.