Configure SRM for Failover and Failback

Learn how to configure SRM for failover and failback.

To prepare for a failover or failback scenario, you have to set up SRM with bidirectional replication. Additionally, you have to make sure that all mission-critical topics and consumer groups are whitelisted on both the primary and secondary clusters. Optionally, you can also enable automatic group offset synchronization, which can simplify the steps you need to take when migrating consumer groups.

  1. In Cloudera Manager, select Streams Replication Manager.
  2. Go to Configuration.
  3. Set up bidirectional replication between clusters:
    1. Find the Streams Replication Manager Cluster alias property.
    2. Add a comma-delimited list of cluster aliases. For example:
      primary, secondary
    3. Find the Streams Replication Manager's Replication Configs property.
    4. Click the add button and add a new line for each cluster alias you have specified in the Streams Replication Manager Cluster alias property.
    5. Add connection information for your clusters. For example:
      primary.bootstrap.servers=primary_host1:9092,primary_host2:9092,primary_host3:9092
      secondary.bootstrap.servers=secondary_host1:9092,secondary_host2:9092,secondary_host3:9092

      Add each cluster on its own line. If a cluster has multiple hosts, list them on the same line, separated by commas.

    6. Click the add button and add new lines for each unique replication you want to add and enable.
    7. Add and enable your replications. For example:
      primary->secondary.enabled=true
      secondary->primary.enabled=true
      
  4. Optional: Enable and configure automatic group offset synchronization.

    To do this, set sync.group.offsets.enabled to true. To customize how often offsets are synchronized, also set sync.group.offsets.interval.seconds. Add both properties to Streams Replication Manager's Replication Configs. For example:

    sync.group.offsets.enabled = true
    sync.group.offsets.interval.seconds = [***TIME IN SECONDS***]
  5. Enter a Reason for change, and then click Save Changes to commit the changes.
  6. Restart Streams Replication Manager.
  7. Whitelist required consumer groups and topics on the primary cluster.
    1. Whitelist groups:
      srm-control groups --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --add [GROUP1],[GROUP2]
    2. Whitelist topics:
      srm-control topics --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --add [TOPIC1],[TOPIC2]
  8. Whitelist required remote topics and consumer groups on the secondary cluster.
    1. Whitelist remote groups:
      srm-control groups --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --add [GROUP1],[GROUP2]
      
    2. Whitelist remote topics:
      srm-control topics --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --add [PRIMARY_CLUSTER.TOPIC1],[PRIMARY_CLUSTER.TOPIC2]
  9. Verify that all required topics and consumer groups are whitelisted.
    1. Verify consumer groups:
      srm-control groups --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --list
      srm-control groups --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --list
      
    2. Verify topics:
      srm-control topics --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --list
      srm-control topics --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --list
      

SRM is set up with bidirectional replication, and all mission-critical topics and consumer groups are whitelisted on both the primary and secondary clusters.
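
As a rough sketch, the whitelisting and verification steps (steps 7 to 9) could be collected into a small Bash script. The cluster aliases primary and secondary, the group and topic names, the DRY_RUN variable, and the helper functions run, remote_topics, and configure_whitelists are illustrative assumptions and not part of SRM. Only the srm-control subcommands and options shown in the procedure are used. By default, the script prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch only: aliases, group names, and topic names below are placeholders.
set -euo pipefail

PRIMARY="primary"             # assumed alias of the primary cluster
SECONDARY="secondary"         # assumed alias of the secondary cluster
GROUP_LIST="group1,group2"    # hypothetical consumer groups
TOPIC_LIST="topic1,topic2"    # hypothetical topics

# Print the command by default; execute it only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Prefix each topic in a comma-separated list with "<prefix>.",
# matching the remote topic names used in step 8 of the procedure.
remote_topics() {
  local prefix="$1" csv="$2" out="" topic
  local IFS=','
  for topic in $csv; do
    out+="${out:+,}${prefix}.${topic}"
  done
  printf '%s\n' "$out"
}

configure_whitelists() {
  local kind

  # Step 7: whitelist groups and topics for primary -> secondary.
  run srm-control groups --source "$PRIMARY" --target "$SECONDARY" --add "$GROUP_LIST"
  run srm-control topics --source "$PRIMARY" --target "$SECONDARY" --add "$TOPIC_LIST"

  # Step 8: whitelist groups and remote topics for secondary -> primary.
  run srm-control groups --source "$SECONDARY" --target "$PRIMARY" --add "$GROUP_LIST"
  run srm-control topics --source "$SECONDARY" --target "$PRIMARY" \
    --add "$(remote_topics "$PRIMARY" "$TOPIC_LIST")"

  # Step 9: list the whitelists in both directions.
  for kind in groups topics; do
    run srm-control "$kind" --source "$PRIMARY" --target "$SECONDARY" --list
    run srm-control "$kind" --source "$SECONDARY" --target "$PRIMARY" --list
  done
}

configure_whitelists
```

Review the printed commands first, then rerun the script with DRY_RUN=0 on a host where srm-control is available to apply them.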