Ingesting Data from Kafka
KafkaSpout reads from Kafka topics. To do so, it must connect to the Kafka broker, locate the topic from which it will read, and store consumer offset information (using the ZooKeeper root and consumer group ID). If a failure occurs, KafkaSpout uses the stored offset to resume reading messages from the point where processing stopped.
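As a sketch of how those connection details map onto the classic storm-kafka configuration objects: the ZooKeeper address, topic name, ZooKeeper root path, and consumer group ID below are placeholder values, not values from this document.

```java
import backtype.storm.spout.SchemeAsMultiScheme;
import storm.kafka.BrokerHosts;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

public class KafkaSpoutConfigExample {

    public static KafkaSpout buildSpout() {
        // ZooKeeper connection string, used to discover the Kafka brokers.
        BrokerHosts hosts = new ZkHosts("zkserver1:2181,zkserver2:2181");

        // Topic to read, ZooKeeper root for offset storage, and consumer
        // group ID. After a failure, the spout resumes from the offset
        // stored under /<zkRoot>/<consumerGroupId>.
        SpoutConfig spoutConfig = new SpoutConfig(
                hosts, "my-topic", "/kafka-offsets", "my-consumer-group");

        // Deserialize each Kafka message as a single string field.
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        return new KafkaSpout(spoutConfig);
    }
}
```

The spout returned here can be registered in a topology with `TopologyBuilder.setSpout`.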
The storm-kafka components include a core Storm spout and a fully transactional Trident spout. Storm-Kafka spouts provide the following key features:
- 'Exactly once' tuple processing with the Trident API
- Dynamic discovery of Kafka brokers and partitions
You should use the Trident API unless your application requires sub-second latency.
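For the exactly-once path mentioned above, a minimal Trident wiring might look like the following sketch; the ZooKeeper address, topic name, and stream name are illustrative placeholders.

```java
import storm.kafka.ZkHosts;
import storm.kafka.trident.OpaqueTridentKafkaSpout;
import storm.kafka.trident.TridentKafkaConfig;
import storm.trident.TridentTopology;

public class TridentKafkaExample {

    public static TridentTopology buildTopology() {
        // Brokers and partitions are discovered dynamically via ZooKeeper.
        TridentKafkaConfig kafkaConfig =
                new TridentKafkaConfig(new ZkHosts("zkserver1:2181"), "my-topic");

        // The opaque spout gives exactly-once semantics: a batch that is
        // partially processed and then replayed is not double-counted.
        OpaqueTridentKafkaSpout spout = new OpaqueTridentKafkaSpout(kafkaConfig);

        TridentTopology topology = new TridentTopology();
        topology.newStream("kafka-stream", spout);
        // ... attach Trident operations (each, groupBy, persistentAggregate, ...)
        return topology;
    }
}
```

Trident processes tuples in batches, which is the source of the added latency that the guidance above trades against exactly-once semantics.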