Topic configuration syncing
Streams Replication Manager replicates topic configurations and Access Control Lists in addition to data. Configuring how default values are handled ensures consistency between source and target clusters.
By default, Streams Replication Manager applies the default configuration values of the target cluster for any property that uses the default value on the source cluster. If the source and target clusters use different default configuration values, this behavior results in inconsistencies between the source and replicated topics.
The following example scenario illustrates how data loss can occur:
- Differing Defaults: The retention.ms property is not explicitly set on the topic, so it uses the cluster default. The source cluster is configured to keep data for seven days, while the target cluster keeps it for only one day.
- Log Rolling: As data is replicated, the target cluster writes it to log segments. Once a segment is full or the time limit is reached, the log rolls.
- Premature Deletion: The target cluster checks its local retention policy, identifies the closed segments that are older than one day, and deletes them.
- Data Inconsistency: The source cluster still retains this data because of its seven-day retention policy.
Result: The replicated data is removed from the target cluster while it still exists on the source. This breaks data consistency: consumers that fail over to the target cluster encounter missing historical data.
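The scenario above can be sketched as a small Python simulation. This is an illustration only, not how Kafka or Streams Replication Manager implements retention: the segment ages are hypothetical, the helper name retained_segments is made up for this example, and retention is simplified to a plain age comparison on closed segments.

```python
from datetime import timedelta


def retained_segments(segment_ages, retention):
    """Return the closed segments a cluster keeps under the given retention.

    Simplified model (assumption): a segment is kept while its age does not
    exceed the retention period.
    """
    return [age for age in segment_ages if age <= retention]


# Hypothetical closed log segments, identified by their age.
segment_ages = [timedelta(hours=h) for h in (6, 30, 72, 150)]

source_retention = timedelta(days=7)  # source cluster default from the scenario
target_retention = timedelta(days=1)  # target cluster default from the scenario

on_source = retained_segments(segment_ages, source_retention)
on_target = retained_segments(segment_ages, target_retention)

# Segments that still exist on the source but were deleted on the target.
missing_on_target = [age for age in on_source if age not in on_target]

for age in missing_on_target:
    hours = age.total_seconds() / 3600
    print(f"segment aged {hours:.0f}h: on source, deleted on target")
```

In this sketch, every segment older than one day but younger than seven days exists only on the source, which is the gap a consumer would see after failing over.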
Configuring default value replication
To prevent configuration inconsistencies, you can configure the use.defaults.from property in the Streams Replication Manager Driver configuration. This property controls whether the replication process respects the default values of the source cluster or the target cluster.
By default, this property is set to target. To ensure that default values from the source cluster are replicated to the target topic, set the property to source.
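The effect of the two settings can be modeled with a short Python sketch. It is a simplified model of the resulting topic configuration, not the Streams Replication Manager implementation: the function name resolve_target_topic_config, its parameters, and the sample values are assumptions made for illustration.

```python
def resolve_target_topic_config(source_overrides, source_defaults,
                                target_defaults, use_defaults_from="target"):
    """Model the effective configuration of a replicated topic.

    source_overrides: properties explicitly set on the source topic.
    source_defaults / target_defaults: cluster-level defaults (sample values).
    use_defaults_from: "target" (the default) or "source".
    """
    if use_defaults_from not in ("source", "target"):
        raise ValueError("use_defaults_from must be 'source' or 'target'")

    # Properties left at their default on the source take the defaults
    # of whichever cluster the setting points at.
    if use_defaults_from == "source":
        effective = dict(source_defaults)
    else:
        effective = dict(target_defaults)

    # Explicitly set properties are applied on top of the chosen defaults.
    effective.update(source_overrides)
    return effective


# Sample values matching the scenario: seven days versus one day, in ms.
source_defaults = {"retention.ms": "604800000"}
target_defaults = {"retention.ms": "86400000"}
source_overrides = {}  # retention.ms is not explicitly set on the topic

with_target = resolve_target_topic_config(
    source_overrides, source_defaults, target_defaults, "target")
with_source = resolve_target_topic_config(
    source_overrides, source_defaults, target_defaults, "source")

print("use.defaults.from=target ->", with_target["retention.ms"])
print("use.defaults.from=source ->", with_source["retention.ms"])
```

With target, the replicated topic in this model ends up with the one-day retention of the target cluster. With source, it keeps the seven-day retention of the source cluster.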
