HDFS Sink Connector
Learn more about the HDFS Sink Connector.
The HDFS Sink Connector can be used to transfer data from Kafka topics to files on HDFS
clusters. Each partition of every topic results in a collection of files named in the
following pattern:
{topic name}_{partition number}_{end_offset}.{file extension}
For example, running the HDFS Sink Connector on partition 0 of a topic named sourceTopic
can yield the following series of files:
sourceTopic_0_50.avro - holding records 0 ~ 50
sourceTopic_0_79.avro - holding records 51 ~ 79
...
The HDFS Sink Connector periodically commits records to final result files. Each commit
results in a separate "chunk" file.
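
The naming pattern and the per-commit chunking can be illustrated with a short Python sketch. This is not the connector's implementation: the helper names chunk_file_name and chunk_ranges are hypothetical, and the sketch assumes that the end offset in a file name is inclusive and that each chunk starts right after the previous chunk's end offset, as the sourceTopic example suggests.

```python
from typing import List, Tuple


def chunk_file_name(topic: str, partition: int, end_offset: int, extension: str) -> str:
    """Build a name following {topic name}_{partition number}_{end_offset}.{file extension}."""
    return f"{topic}_{partition}_{end_offset}.{extension}"


def chunk_ranges(
    topic: str,
    partition: int,
    commit_end_offsets: List[int],
    extension: str = "avro",
    first_offset: int = 0,
) -> List[Tuple[str, int, int]]:
    """Return (file name, first record, last record) for each commit.

    Assumption: end offsets are inclusive and chunks are contiguous,
    so every chunk begins one record after the previous chunk ended.
    """
    chunks = []
    start = first_offset
    for end in commit_end_offsets:
        chunks.append((chunk_file_name(topic, partition, end, extension), start, end))
        start = end + 1
    return chunks


if __name__ == "__main__":
    # Two commits on partition 0 of sourceTopic, ending at offsets 50 and 79.
    for name, first, last in chunk_ranges("sourceTopic", 0, [50, 79]):
        print(f"{name} - holding records {first} ~ {last}")
```

Because a topic name can itself contain underscores, any code that parses these file names would likely need to split from the right-hand side (for example with str.rsplit) rather than from the left.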