Enabling checkpoints for Flink applications
To make your Flink application fault tolerant, you need to enable automatic checkpointing. When an error or a failure occurs, Flink will automatically restarts and restores the state from the last successful checkpoint. Checkpointing is not enabled by default.
While it is possible to enable checkpointing programmatically through the
StreamExecutionEnvironment
, Cloudera recommends to enable checkpointing
either using the configuration file for each job, or as a default configuration for all Flink
applications through Cloudera Manager.
To enable checkpointing, you need to set the
execution.checkpointing.interval
configuration option to a value larger
than 0. It is recommended to start with a checkpoint interval of 10 minutes (600000
milliseconds).