SRM has the following main features.
Remote topics are replicated topics referencing a source cluster via a naming convention. For example, the “topic1” topic from the “us-west” source cluster creates the “us-west.topic1” remote topic on the target cluster. SRM automatically applies this configurable "replication policy" across your organization, enabling tooling to distinguish remote topics from source topics. For more information regarding remote topics, see Understanding Replication Flows.
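The naming convention can be illustrated with a minimal Python sketch. It assumes the source cluster alias and the topic name are joined by a "." separator, as in the "us-west.topic1" example. The function names are hypothetical and are not part of SRM.

```python
from typing import Optional, Tuple


def remote_topic_name(source_alias: str, topic: str, separator: str = ".") -> str:
    """Build the remote topic name for a topic replicated from source_alias."""
    return f"{source_alias}{separator}{topic}"


def split_remote_topic(name: str, separator: str = ".") -> Tuple[Optional[str], str]:
    """Split a topic name at the first separator.

    Returns (source_alias, topic), or (None, name) when the name contains
    no separator. Assumption: source topic names do not themselves contain
    the separator; otherwise a plain source topic would be misread as remote.
    """
    alias, sep, rest = name.partition(separator)
    if not sep:
        return None, name
    return alias, rest
```

Tooling that follows such a convention can tell remote topics from source topics by name alone, which is the property the replication policy is meant to provide.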
Partitioning and record offsets are synchronized between replicated clusters to ensure consumers can migrate from one cluster to another without losing data or skipping records.
Cross-cluster configuration
Topic-level configuration properties and ACL policies are synced across clusters. For example, principals that can read a source topic are automatically able to read any corresponding remote topics. This simplifies managing topics across multiple Kafka clusters.
Consumer group checkpoints
In addition to data and configuration, SRM replicates consumer group progress via periodic
checkpoints. At configurable intervals, checkpoint records are emitted to downstream
clusters, encoding the latest offsets for allowlisted consumer groups and topic-partitions.
As with topics, groups are matched against an allowlist, which can be updated dynamically with
srm-control. Normally, consumer group offsets are not portable between
Kafka clusters, as offsets are not consistent between otherwise identical topic-partitions
on different clusters. SRM’s checkpoint records account for this by including offsets which
are automatically translated from one cluster to another. This offset translation feature
works in both directions; a consumer group can be migrated from one cluster to another
(failover) and then back again (failback) without skipping records or losing progress.
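The idea behind checkpoint-based offset translation can be sketched in Python as follows. This is a simplified model, not SRM's implementation: the `Checkpoint` fields and the `translate_offset` helper are hypothetical names chosen for illustration, and the sketch assumes each checkpoint pairs a committed offset on the source (upstream) cluster with its equivalent on the target (downstream) cluster.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical model of one checkpoint record."""
    group: str
    topic: str
    partition: int
    upstream_offset: int    # committed offset on the source cluster
    downstream_offset: int  # equivalent offset on the target cluster


def translate_offset(
    checkpoints: Iterable[Checkpoint], group: str, topic: str, partition: int
) -> Optional[int]:
    """Return the target-cluster offset from the most advanced checkpoint
    for the given group and topic-partition, or None if none exists."""
    latest: Optional[Checkpoint] = None
    for cp in checkpoints:
        if (cp.group, cp.topic, cp.partition) != (group, topic, partition):
            continue
        if latest is None or cp.upstream_offset > latest.upstream_offset:
            latest = cp
    return None if latest is None else latest.downstream_offset
```

In this model, a consumer group that fails over would resume on the target cluster from the returned offset. Because checkpoints are emitted periodically, the group may reprocess records consumed after the last checkpoint.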
Automatic topic and partition detection
SRM monitors Kafka clusters for new topics, partitions, and consumer groups as they are created. These are compared with configurable allowlists, which may include regular expressions.
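Matching newly detected names against a list of literal names and regular expressions can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes each list entry must match the whole topic or group name. SRM's actual matching rules may differ.

```python
import re
from typing import Iterable, List


def is_allowed(name: str, allowlist: Iterable[str]) -> bool:
    """True if name fully matches any entry; entries are treated as regexes,
    so a literal name such as "payments" matches only itself."""
    return any(re.fullmatch(pattern, name) for pattern in allowlist)


def select_for_replication(discovered: Iterable[str], allowlist: List[str]) -> List[str]:
    """Filter newly discovered topic or group names down to the allowed ones."""
    return [name for name in discovered if is_allowed(name, allowlist)]
```

For example, with the entry `orders.*`, the discovered topics `orders` and `orders-eu` would be selected, while `payments` would not.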
Tooling to automate consumer migration
SRM tooling enables operators to translate offsets between clusters and to migrate consumer groups while preserving state.
Centralized configuration for multi-cluster environments
SRM uses a single top-level configuration file to enable replication across multiple Kafka clusters. In addition, command-line tooling can change which topics and consumer groups are replicated in real time.
Because cluster replication is mainly used for highly critical Kafka applications, customers must be able to monitor replication easily and reliably. The custom extensions included with SRM collect and aggregate Kafka replication metrics and make them available through a REST API. Streams Messaging Manager (SMM) uses this REST API to display metrics. Customers can also use the REST API to implement their own monitoring solution or to integrate with third-party solutions. The metrics make the state of cluster replication visible to end users, who can then take corrective action if needed.