Apache Kafka

Apache Kafka is a distributed commit log service that functions much like a publish/subscribe messaging system, but with higher throughput and with built-in partitioning, replication, and fault tolerance. Kafka provides the following:

  • Persistent messaging with O(1) disk structures that provide constant-time performance, even with terabytes of stored messages.
  • High throughput, supporting hundreds of thousands of messages per second, even on modest hardware.
  • Explicit support for partitioning messages over Kafka servers and distributing consumption over a cluster of consumer machines while maintaining per-partition ordering semantics.
  • Support for parallel data load into Hadoop.
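
The partitioning and ordering behavior listed above can be illustrated with a minimal in-memory sketch. This is not Kafka's client API or its storage implementation. The names PartitionedLog, partition_for, and NUM_PARTITIONS are hypothetical, and CRC32 is used only as a stand-in hash, so the partition chosen for a given key here will not match what a real Kafka cluster chooses. The sketch shows only that messages with the same key land in the same append-only partition and are read back in the order they were written.

```python
import zlib

NUM_PARTITIONS = 3  # hypothetical partition count, chosen for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to a partition index.

    CRC32 is an assumption made for this sketch, not Kafka's actual hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


class PartitionedLog:
    """Toy append-only log split into partitions (in memory, single process)."""

    def __init__(self, num_partitions: int = NUM_PARTITIONS):
        # One list per partition; a message's offset is its index in the list.
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: str, value):
        """Append a message and return (partition, offset)."""
        p = partition_for(key, len(self.partitions))
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition: int, offset: int = 0):
        """Return all messages in one partition starting at the given offset."""
        return self.partitions[partition][offset:]


if __name__ == "__main__":
    log = PartitionedLog()
    for i in range(3):
        log.append("user-a", f"event-{i}")
        log.append("user-b", f"event-{i}")
    # All "user-a" messages share one partition and keep their write order.
    p = partition_for("user-a")
    print([v for k, v in log.read(p) if k == "user-a"])
```

Ordering in this sketch holds only within a partition, which mirrors the per-partition ordering semantics described above. Messages with different keys may land in different partitions, and no order is defined between them.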
Setting up Kafka
Describes how to set up Kafka after installation.
Integrating with Kafka
Describes how to configure security, manage multiple Kafka versions, manage topics across multiple Kafka clusters, set up an end-to-end streaming pipeline, develop Kafka clients, and manage metrics.
Administering Kafka
Describes how to administer Kafka.
Kafka Performance Tuning
Describes best practices for Kafka performance tuning.
Using Kafka Streams
Describes where to get information about using Kafka Streams.