Bidirectional Replication of Active Clusters

Configuration example for two active Kafka clusters set up with bidirectional replication.

A typical scenario involves two active Kafka clusters within the same region but in separate availability zones. Clients can connect to either cluster, so they can keep working if one cluster is temporarily unavailable. This example walks through the steps required to set up a deployment of two clusters with bidirectional replication. It also provides example commands to start replication between the clusters.

Figure 1. Bidirectional Replication of Active Clusters

The steps shown here must be carried out on every cluster in the deployment. The configuration properties presented in Steps 3-5 are configured identically on all clusters. The configuration property presented in Step 7 differs for each cluster.
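As a rough preview of the values entered in the steps below, the following sketch prints the Replication Configs entries that are the same on every cluster and the Driver Target Cluster value that differs per cluster. It is an illustration only: the script and its function names (shared_config, driver_target) are hypothetical, the host names and port are the placeholders used in this example, and the values themselves are still entered through Cloudera Manager.

```shell
#!/usr/bin/env bash
# Sketch only: prints the values this example enters in Cloudera Manager.
# Host names and port 9092 are the placeholders used in the steps below.

aliases=(primary secondary)

# Entries for the Replication Configs property (same on every cluster).
shared_config() {
  local src tgt
  # Connection information: one line per cluster alias.
  for src in "${aliases[@]}"; do
    echo "${src}.bootstrap.servers=${src}_host1:9092,${src}_host2:9092,${src}_host3:9092"
  done
  # One enabled replication per ordered pair of distinct aliases.
  for src in "${aliases[@]}"; do
    for tgt in "${aliases[@]}"; do
      if [ "$src" != "$tgt" ]; then
        echo "${src}->${tgt}.enabled=true"
      fi
    done
  done
}

# Value for the Driver Target Cluster property (differs per cluster):
# in this example each driver writes only to the cluster it runs on.
driver_target() {
  echo "$1"
}

echo "Cluster alias: $(IFS=,; echo "${aliases[*]}")"
shared_config
for a in "${aliases[@]}"; do
  echo "Driver target on ${a}: $(driver_target "$a")"
done
```

Adding a third alias to the aliases array would extend the output to every ordered pair of clusters, which may be more replications than a given deployment needs.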

  1. In Cloudera Manager, select Streams Replication Manager.
  2. Go to Configuration.
  3. Specify cluster aliases:
    1. Find the Streams Replication Manager Cluster alias property.
    2. Add a comma-delimited list of cluster aliases. For example:
      primary, secondary
      Cluster aliases are arbitrary names defined by the user. Aliases specified here are used in other configuration properties and with the srm-control tool to refer to the clusters added for replication.
  4. Specify cluster connection information:
    1. Find the Streams Replication Manager's Replication Configs property.
    2. Click the add button and add a new line for each cluster alias you specified in the Streams Replication Manager Cluster alias property.
    3. Add connection information for your clusters. For example:
      primary.bootstrap.servers=primary_host1:9092,primary_host2:9092,primary_host3:9092
      secondary.bootstrap.servers=secondary_host1:9092,secondary_host2:9092,secondary_host3:9092

      Add each cluster on its own line. If a cluster has multiple hosts, list them on the same line, separated by commas.

  5. Add and enable replications:
    1. Find the Streams Replication Manager's Replication Configs property.
    2. Click the add button and add a new line for each replication you want to add and enable.
    3. Add and enable your replications. For example:
      primary->secondary.enabled=true
      secondary->primary.enabled=true
      
  6. Enter a Reason for change, and then click Save Changes to commit the changes.
  7. Add Streams Replication Manager Driver role instances to all Kafka broker hosts:
    1. Go to Instances.
    2. Click Add Role Instances.
    3. Click Select Hosts.
    4. Select all Kafka broker hosts and click Ok.
    5. Click Continue.
    6. Find the Streams Replication Manager Driver Target Cluster property.
    7. Add the cluster aliases that you want the driver role to target. For example:
      • On the primary cluster:
        primary
        
      • On the secondary cluster:
        secondary
        
      The Streams Replication Manager Driver Target Cluster property allows you to specify which clusters the driver should write to. In this example, the drivers read data from all clusters, but only write to the cluster they are running on. This allows you to distribute replication workloads.
    8. Click Continue.
  8. Restart Streams Replication Manager.
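  9. Start replicating data between the clusters. The following commands add all topics matching the regular expression ".*" (that is, every topic) to replication in both directions: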
    srm-control topics --source primary --target secondary --add ".*"
    
    srm-control topics --source secondary --target primary --add ".*"
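The two commands in Step 9 can be wrapped in a small script so that both directions are always added together. This is a minimal sketch, not part of the product: the replicate_both and run functions and the DRY_RUN switch are hypothetical helpers of this script, and only the srm-control invocation itself is taken from the step above. By default the script prints the commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: add every topic to replication in both directions.
# DRY_RUN is a switch of this script, not an srm-control option.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"   # trace only; shell quoting is not preserved in the output
  else
    "$@"
  fi
}

# Issue the same topics command once per direction.
replicate_both() {
  local first="$1" second="$2"
  run srm-control topics --source "$first" --target "$second" --add ".*"
  run srm-control topics --source "$second" --target "$first" --add ".*"
}

replicate_both primary secondary
```

Running the script with DRY_RUN=0 on a host where srm-control is available executes the commands for real. If your version of srm-control supports a --list option for the topics subcommand, it can be used afterwards to check which topics are being replicated; treat that as an assumption to verify against your release.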