HDFS Sink
Learn more about the HDFS Sink connector.
The HDFS Sink connector can be used to transfer data from Kafka topics to files on HDFS clusters. Each partition of every topic results in a collection of files named in the following pattern:
{topic name}_{partition number}_{end_offset}.{file extension}

For example, running the HDFS Sink connector on partition 0 of a topic named sourceTopic can yield the following series of files:

sourceTopic_0_50.avro - holding records 0 to 50
sourceTopic_0_79.avro - holding records 51 to 79
...

The HDFS Sink connector periodically commits records to final result files. Each commit results in a separate "chunk" file.
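To illustrate the naming pattern, the following Python sketch builds and parses chunk file names and derives the record range held by each file. It is not part of the connector: the names chunk_file_name, parse_chunk_file_name, and offset_ranges are hypothetical helpers, and the sketch assumes that end offsets are inclusive, that the first chunk of a partition starts at offset 0, and that consecutive chunks are contiguous, as in the sourceTopic example.

```python
import re
from typing import List, NamedTuple, Tuple


class ChunkFile(NamedTuple):
    """Parts of a file name of the form {topic}_{partition}_{end_offset}.{ext}."""
    topic: str
    partition: int
    end_offset: int
    extension: str


# The topic group is greedy, so topic names that contain underscores still
# parse: the last two numeric fields are taken as partition and end offset.
_CHUNK_NAME = re.compile(
    r"^(?P<topic>.+)_(?P<partition>\d+)_(?P<end_offset>\d+)\.(?P<ext>[^.]+)$"
)


def chunk_file_name(topic: str, partition: int, end_offset: int, extension: str) -> str:
    """Build a file name following the documented pattern."""
    return f"{topic}_{partition}_{end_offset}.{extension}"


def parse_chunk_file_name(name: str) -> ChunkFile:
    """Split a chunk file name into its parts, or raise ValueError."""
    match = _CHUNK_NAME.match(name)
    if match is None:
        raise ValueError(f"not a chunk file name: {name!r}")
    return ChunkFile(
        topic=match.group("topic"),
        partition=int(match.group("partition")),
        end_offset=int(match.group("end_offset")),
        extension=match.group("ext"),
    )


def offset_ranges(names: List[str]) -> List[Tuple[str, int, int]]:
    """Return (file name, first offset, last offset) for one partition's chunks.

    Assumption: the first chunk starts at offset 0 and each later chunk starts
    right after the previous chunk's (inclusive) end offset.
    """
    ordered = sorted(names, key=lambda n: parse_chunk_file_name(n).end_offset)
    ranges = []
    start = 0
    for name in ordered:
        end = parse_chunk_file_name(name).end_offset
        ranges.append((name, start, end))
        start = end + 1
    return ranges


if __name__ == "__main__":
    for name, first, last in offset_ranges(
        ["sourceTopic_0_79.avro", "sourceTopic_0_50.avro"]
    ):
        print(f"{name}: records {first} to {last}")
```

Because each commit produces its own chunk file, a consumer of the output directory can use a helper like offset_ranges to check which records a given file covers without opening it, provided the assumptions above hold for the deployment.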