Configure SRM for Failover and Failback

Learn how to configure Streams Replication Manager (SRM) for failover and failback.

To prepare for a failover or failback scenario, you must set up SRM with bidirectional replication. Additionally, you must ensure that all mission-critical topics and consumer groups are added to the allowlist on both the primary and secondary clusters. Optionally, you can also enable automatic group offset synchronization, which can simplify the steps you need to take when migrating consumer groups.

  • Ensure that the SRM service is configured and that all clusters taking part in the replication process are defined and added to SRM's configuration. For more information, see Defining and adding clusters for replication.
  • Ensure that the srm-control tool is configured. For more information, see Configuring srm-control.
  • If you plan on enabling automatic group offset synchronization, ensure that you review Configuring automatic group offset synchronization. Although the basic steps are provided here, the feature has some limitations that you must be aware of.
  • The following steps and examples are for a replication scenario with two clusters. These are referred to as primary and secondary.
  1. Set up bidirectional replication between clusters:
    1. In Cloudera Manager, go to Clusters and select the Streams Replication Manager service.
    2. Go to Configuration.
    3. Find the Streams Replication Manager's Replication Configs property.
    4. Click the add button to add a new line for each unique replication you want to enable.
    5. Add and enable your replications.

      For example, assume that you have two clusters, primary and secondary. To enable bidirectional replication between these clusters, you must add two unique replications: one that replicates from primary to secondary, and another that replicates from secondary to primary.

      primary->secondary.enabled=true
      secondary->primary.enabled=true
      
  2. Optional: Enable and configure automatic group offset synchronization.

    To do this, set sync.group.offsets.enabled to true. Optionally, if you want to customize the frequency of offset synchronization, you can also set sync.group.offsets.interval.seconds. Both properties are configured by adding them to Streams Replication Manager's Replication Configs. For example:

    sync.group.offsets.enabled = true
    sync.group.offsets.interval.seconds = [***TIME IN SECONDS***]
  3. Enter a Reason for change, and then click Save Changes to commit the changes.
  4. Restart Streams Replication Manager.
  5. Add the required consumer groups and topics to the allowlist on the primary cluster.
    • Add groups:
      srm-control groups --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --add [GROUP1],[GROUP2]
    • Add topics:
      srm-control topics --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --add [TOPIC1],[TOPIC2]
  6. Add the required consumer groups and topics to the allowlist on the secondary cluster.
    • Add groups:
      srm-control groups --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --add [GROUP1],[GROUP2]
      
    • Add topics:
      srm-control topics --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --add [PRIMARY_CLUSTER.TOPIC1],[PRIMARY_CLUSTER.TOPIC2]
  7. Verify that all required topics and consumer groups are added to the allowlist.
    • Verify consumer groups:
      srm-control groups --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --list
      srm-control groups --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --list
      
    • Verify topics:
      srm-control topics --source [PRIMARY_CLUSTER] --target [SECONDARY_CLUSTER] --list
      srm-control topics --source [SECONDARY_CLUSTER] --target [PRIMARY_CLUSTER] --list
      

SRM is set up with bidirectional replication and all mission-critical topics and consumer groups are added to the allowlist on both the primary and secondary clusters.
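
The allowlist steps (5 to 7) repeat the same srm-control invocation in both directions, so they can be generated from a small script. The following is a minimal dry-run sketch, not an official tool: it only prints the commands so that they can be reviewed before being run. The variable names (PRIMARY, SECONDARY, GROUP_LIST, TOPIC_LIST), their default values, and the helper functions are hypothetical and introduced for illustration. The only srm-control options used are the ones shown in the steps above. As in step 6, topics added in the secondary to primary direction are prefixed with the primary cluster alias.

```shell
#!/bin/sh
# Dry-run sketch: prints the srm-control commands for steps 5-7
# instead of running them. All names and defaults are illustrative.
set -eu

# Hypothetical defaults; override them through the environment.
PRIMARY="${PRIMARY:-primary}"
SECONDARY="${SECONDARY:-secondary}"
GROUP_LIST="${GROUP_LIST:-group1,group2}"
TOPIC_LIST="${TOPIC_LIST:-topic1,topic2}"

# Prefix each topic in a comma-separated list ($2) with "<alias>." ($1).
# The body runs in a subshell so the IFS change does not leak out.
prefix_topics() (
  out=""
  IFS=','
  for t in $2; do
    out="${out:+$out,}$1.$t"
  done
  printf '%s\n' "$out"
)

build_commands() {
  remote_topics="$(prefix_topics "$PRIMARY" "$TOPIC_LIST")"
  # Steps 5 and 6: add groups and topics in both directions.
  echo "srm-control groups --source $PRIMARY --target $SECONDARY --add $GROUP_LIST"
  echo "srm-control topics --source $PRIMARY --target $SECONDARY --add $TOPIC_LIST"
  echo "srm-control groups --source $SECONDARY --target $PRIMARY --add $GROUP_LIST"
  echo "srm-control topics --source $SECONDARY --target $PRIMARY --add $remote_topics"
  # Step 7: list the allowlists in both directions.
  for kind in groups topics; do
    echo "srm-control $kind --source $PRIMARY --target $SECONDARY --list"
    echo "srm-control $kind --source $SECONDARY --target $PRIMARY --list"
  done
}

build_commands
```

Review the printed commands and, once they match your cluster aliases, groups, and topics, run them on a host where srm-control is configured.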