Configuring clusters and replications

You can expand an existing deployment of Streams Replication Manager by adding new clusters and replications to the configuration. To do this, you need to specify cluster aliases and cluster connection information, as well as add and enable replications.

Specifying your clusters and enabling replications does not by itself start data replication. When you add clusters and replications to the configuration using the following steps, SRM connects to the clusters and sets up communication with them, but does not automatically replicate any data. To start replicating data, you must specify which topics to replicate using the srm-control command line tool.

Use the following steps as a reference when you want to add new clusters or replications to your deployment.

  • If you are planning on replicating data to or from a Kafka service running in either a CDH 5.x or 6.x cluster and you are using Sentry for authorization, make sure that the streamsrepmgr user is added to the Kafka Super users property. You can find the Super users property by going to Kafka service > Configuration. Do this on all CDH 5.x or 6.x clusters where data replication will happen.

  • If you are planning on replicating data to or from a Kafka service running in Runtime 7.x and you are using Ranger for authorization, make sure that the streamsrepmgr user has all required permissions assigned to it in Ranger. Do this on all Runtime 7.x clusters where data replication will happen.

  1. In Cloudera Manager, select Streams Replication Manager.
  2. Go to Configuration.
  3. Specify cluster aliases:
    1. Find the Streams Replication Manager Cluster alias property.
    2. Add a comma-delimited list of cluster aliases. For example:
      primary, secondary
      Cluster aliases are arbitrary names defined by the user. Aliases specified here are used in other configuration properties and with the srm-control tool to refer to the clusters added for replication.
  4. Specify cluster connection information:
    1. Find the Streams Replication Manager's Replication Configs property.
    2. Click the add button and add a new line for each cluster alias you specified in the Streams Replication Manager Cluster alias property.
    3. Add connection information for your clusters. For example:
      primary.bootstrap.servers=primary_host1:9092,primary_host2:9092,primary_host3:9092
      secondary.bootstrap.servers=secondary_host1:9092,secondary_host2:9092,secondary_host3:9092

      Add each cluster on its own line. If a cluster has multiple hosts, list them on the same line, separated by commas.

  5. Add and enable replications:
    1. Find the Streams Replication Manager's Replication Configs property.
    2. Click the add button and add a new line for each unique replication you want to add and enable.
    3. Add and enable your replications. For example:
      primary->secondary.enabled=true
      secondary->primary.enabled=true
      
  6. Enter a Reason for change, and then click Save Changes to commit the changes.
  7. Restart Streams Replication Manager.
Replicating data to or from the specified clusters is now possible.
Use the srm-control tool to kick off replication by adding topics or groups to the allowlist.
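
As a rough illustration of steps 3 to 5, the following Python sketch assembles the values you would enter in Cloudera Manager from a description of your clusters, and checks that every replication refers to a defined alias. The helper names (build_replication_configs, srm_control_topics_command) and the topic name topic1 are hypothetical and not part of SRM. The srm-control arguments assume the tool's topics subcommand with --source, --target, and --add options; verify them against the help output of your SRM version before use.

```python
def build_replication_configs(clusters, replications):
    """Return the lines to add to the Replication Configs property.

    clusters: dict mapping a cluster alias to a list of "host:port" strings
    replications: list of (source_alias, target_alias) pairs to enable
    """
    lines = []
    # One bootstrap.servers line per cluster; hosts are comma-delimited.
    for alias, servers in clusters.items():
        if not servers:
            raise ValueError(f"cluster {alias!r} has no bootstrap servers")
        lines.append(f"{alias}.bootstrap.servers={','.join(servers)}")
    # One enabled=true line per unique source->target replication.
    for source, target in replications:
        for alias in (source, target):
            if alias not in clusters:
                raise ValueError(f"unknown cluster alias: {alias!r}")
        if source == target:
            raise ValueError(f"source and target are both {source!r}")
        lines.append(f"{source}->{target}.enabled=true")
    return lines


def srm_control_topics_command(source, target, topics):
    """Build (but do not run) an srm-control command that adds topics
    to the allowlist. Option names are an assumption; check your version."""
    return ["srm-control", "topics", "--source", source,
            "--target", target, "--add", ",".join(topics)]


# Example values taken from the steps above.
clusters = {
    "primary": ["primary_host1:9092", "primary_host2:9092", "primary_host3:9092"],
    "secondary": ["secondary_host1:9092", "secondary_host2:9092", "secondary_host3:9092"],
}
replications = [("primary", "secondary"), ("secondary", "primary")]

# Value for the Cluster alias property.
cluster_alias_value = ",".join(clusters)
config_lines = build_replication_configs(clusters, replications)

print(cluster_alias_value)
for line in config_lines:
    print(line)
# "topic1" is a placeholder topic name.
print(" ".join(srm_control_topics_command("primary", "secondary", ["topic1"])))
```

The sketch only generates text. You still enter the values in Cloudera Manager, save the changes, and restart Streams Replication Manager as described in the steps above.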