RocksDB state backend configuration

Learn about configuring RocksDB state backend for Flink applications with large state.

The default RocksDB state backend configuration available in Flink is not suitable for YARN applications with large state, as it does not effectively limit the native memory usage of the embedded RocksDB database. To effectively enforce a limit on memory usage, support for Write Buffer Manager is introduced in the ClouderaConfigurableOptionsFactory.

ClouderaConfigurableOptionsFactory can be added to the Flink configuration file to get a good initial configuration of the state backend:
state.backend: rocksdb
state.backend.rocksdb.options-factory: org.apache.flink.contrib.streaming.state.ClouderaConfigurableOptionsFactory

For the default RocksDB state backend configuration, see the Apache Flink documentation. See more information about the Write Buffer Manager here.