Add and Configure SRM

Learn how to add SRM to your cluster.

Streams Replication Manager is comprised of two roles:
  • Streams Replication Manager Driver role: This role is responsible for connecting to the specified clusters and performing replication between them. The driver can be installed on one or more hosts.
  • Streams Replication Manager Service role: This role consist of a REST API and a Kafka Streams application to aggregate and expose cluster, topic and consumer group metrics. The service can be installed on one host only.
You can install Streams Replication Manager independent of the clusters that replication is happening between.

The following steps walk you through the process of adding SRM to your cluster. The configuration examples on this page are simple examples that are meant to demonstrate the type of information that you have to enter. For comprehensive deployment and configuration examples, see Deployment recommendations and Configuration examples.

If Kafka is configured to use Sentry authorization, make sure that streamsrepmgr is added to the Kafka Super users property.
  1. From the Cloudera Manager Home page, select the drop-down to the right of your cluster, and select Add a Service and select Streams Replication Manager. You may install one service at a time.
  2. Select Streams Replication Manager from the list of services and click Continue.
  3. Assign role instances to hosts:
    1. Click the field below Streams Replication Manager Driver to display a dialog box containing a list of hosts.
    2. Select 1 or more hosts that the Streams Replication Manager Driver should be assigned to and Click Ok.
    3. Click the field below Streams Replication Manager Service to display a dialog box containing a list of hosts.
    4. Select 1 host that the Streams Replication Manager Service should be assigned to and Click Ok.
  4. Click Continue.
  5. Specify cluster aliases:
    1. Find the Streams Replication Manager Cluster alias. property.
    2. Add a comma delimited list of cluster aliases. For example:
      primary, secondary
      Cluster aliases are arbitrary names defined by the user. Aliases specified here are used in other configuration properties and with the srm-control tool to refer to the clusters added for replication.
  6. Specify cluster connection information:
    1. Find the streams.replication.manager's replication configs property.
    2. Click the add button and add new lines for each cluster alias you have specified in the Streams Replication Manager Cluster alias. property
    3. Add connection information for your clusters. For example:
      primary.bootstrap.servers=primary_host1:9092,primary_host2:9092,primary_host3:9092
      secondary.bootstrap.servers=secondary_host1:9092,secondary_host2:9092,secondary_host3:9092

      Each cluster has to be added to a new line. If a cluster has multiple hosts, add them to the same line but delimit them with commas.

  7. Add and enable replications:
    1. Find the streams.replication.manager's replication configs property.
    2. Click the add button and add new lines for each unique replication you want to add and enable.
    3. Add and enable your replications. For example:
      primary->secondary.enabled=true
      secondary->primary.enabled=true
      
  8. Specify the Streams Replication Manager Service role target cluster:
    1. Find the streams replication manager service target cluster property.
    2. Add the target cluster alias. For example:
      secondary

    The target cluster is where the service gathers replication information from. Cloudera recommends that you deploy the service on every cluster and configure each instance of the service to target the cluster that it is running on.

  9. Optional: Specify the Streams Replication Manager Driver role target clusters:
    1. Find the streams replication manager driver target cluster property.
    2. Add the cluster aliases that you want the driver role to target. For example:
      primary, secondary

    You can use the streams replication manager driver target cluster property to specify a subset of clusters that the driver should target or in other words write to. When this property is left empty (default) the driver will read from and write to all clusters added to SRMs configuration. When this property is set, the driver will collect data from all clusters, but will only write to the clusters specified in this property. This property becomes essential when you have an advanced deployment as it allows you to distribute replication workloads.

  10. Configure properties not exposed in Cloudera Manager:
    SRM accepts a number of additional configuration properties that are not available in Cloudera Manager. Depending on your requirements you may need to configure these properties as well. You can find a comprehensive list of these properties in Configuration Properties Reference.
    1. Find the streams.replication.manager's replication configs property.
    2. Click the add button and add new lines for each additional property you want to configure.
    3. Add configuration properties. For example:
      replication.factor=3
  11. Depending on your requirement, review and configure other properties available on this page.
  12. Click Continue and wait until Streams Replication Manager is started.
  13. Click Continue then click Finish.
  • Replicating data to or from the specified clusters is now possible.
  • The SRM service REST API Swagger UI is available at one of the following addresses:
    • http://<srm-service-host>:<srm-service-port/swagger
    • https://<srm-service-host>:<srm-service-port/swagger
  • Enable Kerberos and TLS/SSL for SRM.
  • Use the srm-control tool located at /opt/cloudera/parcels/STREAMS_REPLICATION_MANAGER/bin/ to kick off replication by adding topics or groups to the whitelist.