Configuration example for two active Kafka clusters setup with bidirectional
replication.
A typical scenario involves two active Kafka clusters within the same region but in
separate availability zones. Clients can connect to either cluster in case one is
temporarily unavailable. This example demonstrates the steps required to set up a deployment
with two clusters configured with bidirectional replication. Additionally, it also provides
example commands to start replication between clusters.
The steps shown here have to be carried out on all clusters in a given
deployment. Configuration properties presented in Steps 3-5 are configured identically on
all clusters. The configuration property presented in Step 7 will differ for each
cluster.
In Cloudera Manager select Streams Replication Manager.
Go to Configuration.
Specify cluster aliases:
Find the Streams Replication Manager
Cluster alias. property.
Add a comma
delimited list of cluster aliases. For example:
primary, secondary
Cluster aliases are arbitrary names
defined by the user. Aliases specified here are used in other configuration properties
and with the srm-control tool to refer to the clusters added for
replication.
Specify cluster connection
information:
Find the
streams.replication.manager's replication configs
property.
Click the add button and add new lines
for each cluster alias you have specified in the Streams Replication
Manager Cluster alias. property
Add connection information for your
clusters. For example:
Enter a Reason for change, and then click Save
Changes to commit the changes.
Add
Streams Replication Manager Driver role instances to all Kafka broker
hosts:
Go to
Instances.
Click Add Role
Instances.
Click Select
Hosts.
Select all Kafka broker hosts
and click Ok.
Click
Continue.
Find the streams
replication manager driver target cluster property.
Add the cluster aliases that you want
the driver role to target. For
example:
On the primary cluster:
primary
On the secondary cluster:
secondary
The streams replication manager driver target
cluster property allows you to specify which clusters the driver should
write to. In this example, the drivers read data from all clusters, but only write to
the cluster they are running on. This allows you to distribute replication
workloads.
Click
Continue.
Restart Streams Replication Manager.
Replicate data between clusters with the following commands: