Building a High-Throughput Messaging System with Apache Kafka
Apache Kafka is a fast, scalable, durable, fault-tolerant publish-subscribe messaging system. Common use cases include:
- Stream processing
- Messaging
- Website activity tracking
- Metrics collection and monitoring
- Log aggregation
- Event sourcing
- Distributed commit logging
Kafka works with Apache Storm and Apache Spark for real-time analysis and rendering of streaming data. Combining a messaging layer with a processing layer in this way enables stream processing that scales linearly.
For example, Apache Storm ships with support for Kafka as a data source using Storm’s core API or the higher-level, micro-batching Trident API. Storm’s Kafka integration also includes support for writing data to Kafka, which enables complex data flows between components in a Hadoop-based architecture.
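The read-from-Kafka, write-back-to-Kafka flow described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `ToyBroker`, `count_words_stage`, and the topic names are hypothetical and are not part of the Kafka or Storm APIs. It shows topics as append-only logs, per-group read offsets, and a processing stage that consumes from one topic and publishes to another.

```python
from collections import defaultdict


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (not the Kafka API).

    Each topic is an append-only list; each consumer group tracks its own
    read position, so groups read the same records independently.
    """

    def __init__(self):
        self.topics = defaultdict(list)   # topic name -> ordered records
        self.offsets = defaultdict(int)   # (group, topic) -> next offset to read

    def publish(self, topic, record):
        """Append a record to the topic and return its offset."""
        self.topics[topic].append(record)
        return len(self.topics[topic]) - 1

    def poll(self, group, topic, max_records=10):
        """Return unread records for this group and advance its offset."""
        start = self.offsets[(group, topic)]
        batch = self.topics[topic][start:start + max_records]
        self.offsets[(group, topic)] = start + len(batch)
        return batch


def count_words_stage(broker, source, sink, group="word-counter"):
    """Hypothetical processing stage: read lines from `source` and
    publish a per-line word count to `sink`."""
    for line in broker.poll(group, source):
        broker.publish(sink, {"line": line, "words": len(line.split())})


broker = ToyBroker()
broker.publish("raw-lines", "kafka feeds storm")
broker.publish("raw-lines", "storm writes back")

# The stage plays the role of a topology that uses one topic as its
# data source and another as its output.
count_words_stage(broker, "raw-lines", "line-stats")

# A downstream component reads the derived topic with its own offset.
results = broker.poll("reporter", "line-stats")
```

Because each group keeps its own offset, a second consumer could read `raw-lines` from the beginning without affecting the word-counting stage, which is the property that lets several components share one stream.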