File systems
You can use file systems in Flink to consume and persistently store data for application results, fault tolerance and data recovery.
The file system used for a particular file is determined by its URI scheme. For example,
file:///home/user/text.txt
refers to a file in the local file system. Flink has
built-in support for the file system of the local machine, including any NFS or SAN drives
mounted into that local file system. It can be used by default without additional
configuration.
For all schemes where Flink cannot find a directly supported file system, it falls back to
Hadoop. All Hadoop file systems are automatically available when flink-runtime and the Hadoop
libraries are on the classpath. This way, Flink seamlessly supports all of Hadoop file systems
implementing the org.apache.hadoop.fs.FileSystem
interface, and all
Hadoop-compatible file systems (HCFS).
- Amazon S3
- Azure Data Lake Store Gen2
- Azure Blob Storage
- Google Cloud Storage
For more information about how to use the supported file systems, see the Apache Flink documentation.
Using Ozone with Flink
You can use Ozone as a sink with Flink via the implementation of Hadoop file system, but it has some limitations.
- Limitations using Ozone File System (OFS) with DataStream connector
- When using Ozone as a file system, you must use the
OnCheckpointRollingPolicy
configuration, which rolls part files on every checkpoint. This configuration is needed because in case part files traverse the checkpoint interval, upon recovery from a failure, theFileSink
may use thetruncate()
method of the file system to discard uncommitted data from the in-progress file. This method is not supported by OFS and Flink will throw an exception. - Limitations using OFS with Table API connector
- Because of the limitation present for the DataStream connector, it is required to roll the files on every checkpoint when using the Table API connector. If checkpointing is enabled and a bulk format is used, then this is done automatically. In case of a row format, automatic compaction has to be turned on.