What's New in Streams Replication Manager

Learn about the new features of Streams Replication Manager in Cloudera Runtime 7.1.8.

New property Metric Reporting Period

A new property, Metric Reporting Period (metrics.period), is introduced for the SRM service. This property configures the frequency (in seconds) at which metrics should be reported.

New properties related metrics processor batching

Two new properties are introduced for the SRM Service role. These properties enable configuring the batching behavior of the metrics processor. The new properties are as follows:
  • SRM Service Streams Cache Maximum Bytes Buffering (streams.replication.manager.service.streams.cache.max.bytes.buffering)
  • SRM Service Streams Commit Interval (streams.replication.manager.service.streams.commit.interval.ms)

Internal SRM topics are now automatically added to the replication deny list

SRM now ignores internal SRM topics when replicating data. If required, this behavior can be disabled by adding srm.internal.topic.exclude.enable=false to Streams Replication Manager's Replication Configs in Cloudera Manager.

Log format changes

The SRM Log Format property was moved to the service level from the SRM Service role level, and now applies to both SRM Service and SRM Driver. Changes made to the SRM Log Formatproperty before upgrading to the new version are lost and need to be applied again after the upgrade. The name of the log files of the SRM Driver and SRM Service will contain the host name. Additionally, the SRM Service now accepts all log configurations specified on the Cloudera Manager UI, and does a size-based rotation of the log files instead of a time based rotation.

SRM Service new intra cluster protocol and internal changelog data format

A new intra cluster protocol and a new internal changelog data format when aggregating raw metrics are introduced to the SRM Service. Both the protocol and data format are incompatible with earlier versions. Two new configuration properties are introduced that force the SRM service to use the legacy formats. During an upgrade both of these properties are automatically set to true to maintain backward compatibility. After a successful upgrade, both properties must be cleared, otherwise some SRM features will not function. The properties are as follows:
  • Use Legacy Intra Cluster Host Name Format (streams.replication.manager.use.legacy.intra.cluster.host.name.format)
  • Use Legacy Internal Changelog Data Format(streams.replication.manager.use.legacy.internal.changelog.data.format)

New properties for SRM explicit topic creation

SRM now supports user configured values for partition count and/or replication factor of it's internal topics. Multiple dedicated configurations are introduced:

  • Metrics Topics Partition Count (metrics.topic.partition.count)
  • SRM Control Topics Replication Factor (control.topic.replication.factor)
  • SRM Service Advertisement Topics Replication Factor For Remote Queries (treams.replication.manager.service.remote.advertisement.topic.replication.factor)
  • Streams Applications Internal Topics Replication Factor (streams.replication.manager.service.streams.replication.factor)

Improved error message for the srm-control tool when SECURESTOREPASS is not set

The error message that is displayed when the SECURESTOREPASS environment variable is not set for the srm-control tool has been improved. Instead of failing with a NullPointerException, an error message was added that notifies users that the SECURESTOREPASS is not configured.

SRM Service Remote Querying

A new feature, Remote Querying, is introduced for the SRM Service. Remote Querying enables the SRM Service to query other, remote SRM Services and fetch the metrics gathered by the remote SRM Services. This allows users to monitor all replications of a deployment that has multiple instances of SRM with a single SRM Service. For more information, see Remote Querying and Configuring Remote Querying.

SRM Service Multi Target Cluster Support

The SRM Service can now target and monitor multiple Kafka clusters. Previously, the Service could only target a single cluster. Targeting multiple clusters enables you to monitor multiple remote Kafka clusters using a single SRM deployment. Configuration is done in Cloudera Manager by adding multiple target Kafka cluster aliases to the Streams Replication Manager Service Target Cluster property. For more information, see Configuring the service role target cluster.

New property, Cluster Alias, introduced for Kafka Credentials

A new optional property, Cluster Alias, is introduced for Kafka Credentials. This property can be used to specify the alias of the Kafka cluster that the Kafka Credential defines. If you choose to use this property, the Name property of a Kafka Credential does not have to match the alias of the Kafka cluster that the Kafka Credential defines. If an alias is not specified in Cluster Alias, the value specified in the Name property is used as the alias.

Increase the default replication factor of internal topics to 3

The internal topics used by SRM are now created with a replication factor of 3 by default. As a result, SRM is now more resistant to host failures. Additionally, Cruise Control can now automatically heal SRM’s internal topics in the event of a single host failure.

SRM now creates all internal topics with correct configurations at startup

The internal topics used by SRM are now automatically created with correct configurations at startup. These are the metrics topics, the topics used by the srm-control tool, and the topics used by the SRM Service for service discovery. Additionally, SRM also verifies that the topics are created with correct configurations. If the topics are not configured as expected, SRM fails to start. This improvement fixes CDPD-31745.

Enhance replication status metrics with task information

The SRM Service now reports the status of the Connectors and Tasks in the replication status with detailed descriptions. The status and detailed descriptions can be viewed on the Replications tab of SMM UI.

Auto topic creation is disabled by default on source clusters

A new Connector property, disable.source.topic.auto.creation, is introduced. This property controls whether auto topic creation is enabled for SRM’s internal consumers that fetch data from source clusters. This property is enabled by default meaning that auto topic creation is disabled by default. This property is not directly available for configuration in Cloudera Manager, and does not have a dedicated configuration entry on the UI. Instead, it can be configured through the Streams Replication Manager's Replication Configs Cloudera Manager property. Configuration is possible on a global or replication level.

SRM configuration properties can be configured globally for Connect workers and Connect connectors

The SRM Driver now accepts configuration properties prefixed with the workers. and connectors. prefixes. Configuration properties added to Streams Replication Manager's Replication Configs that use these prefixes are applied globally to all Connect workers or Connect connectors that the SRM Driver creates. For more information regarding the prefixes, see Understanding SRM properties, their configuration and hierarchy. For more information on Connect workers and connectors, see Streams Replication Manager Architecture.

SRM Driver monitoring using Cloudera Manager

Cloudera Manager’s ability to monitor the SRM Driver, its replications, and the overall health of SRM is improved. Most notably, the health status of SRM is based on the health of the network and the availability of replication sources and targets. As a result of this improvement, two new metrics, a new health test, and several new configuration properties are introduced for the SRM Driver in Cloudera Manager.

New metrics and health test
The new metrics are as follows:
  • SRM Driver Distributed Herder Status (streams_replication_manager_distributed_herder_status)
  • Aggregated Status Code of SRM Driver Replication Flows (streams_replication_manager_aggregated_herder_status)

The distributed metric describes the status of individual replications. The aggregate metric provides the aggregate status of all replications.

The new health test is called DISTRIBUTED_HERDER_STATUS. This health test is based on the aggregate metric and provides information on the overall status of SRM and its replications.

New properties
The new monitoring related properties are as follows:
  • Path for driver plugins (plugin.path)
  • Enable HTTP(S) Metrics Reporter (mm.metrics.servlet.enable)
  • SSL Encryption for the Metrics Reporter (metrics.jetty.server.ssl.enabled)
  • HTTP Metrics Reporter Port (metrics.jetty.server.port)
  • HTTPS Metrics Reporter Port (metrics.jetty.server.secureport)
  • Enable Basic Authentication for Metrics Reporter (metrics.jetty.server.authentication.enabled)
  • Metrics Reporter User Name (metrics.jetty.server.auth.username)
  • Metrics Reporter Password (metrics.jetty.server.auth.password)

For more information, see Cloudera Manager Configuration Properties Reference.

SRM Driver now automatically retries setting up replications for unavailable target Kafka clusters

Previously, if any of the Kafka clusters that were targeted by the SRM Driver were unavailable at startup, the SRM Driver stopped. As a result of an improvement, the SRM Driver now instead sets up replications for all target Kafka clusters that are available and continuously retries to set up replication for unavailable clusters. Retry behaviour is configurable in Cloudera Manager. The new properties related to retry behaviour are as follows:
  • Retry Count for SRM Driver (mm.replication.restart.count)
  • Retry Delay for SRM Driver (mm.replication.restart.delay.ms)

For more information see, Cloudera Manager Configuration Properties Reference or Configuring SRM Driver retry behaviour.

Disabled replications can now be fully deactivated by configuring heartbeat emission

As a result of theKafka rebase in this version of Cloudera Runtime, an improvement is introduced in connection with heartbeat emission. From now on, you can fine tune your deployment and fully deactivate any unnecessary replications that are set up by default by configuring heartbeat emission. This can help with minimizing any performance overhead caused by unnecessary replications.

To support this change, an improvement was made for the SRM service in Cloudera Manager. A dedicated configuration property, Enable Heartbeats, is introduced. You can use this property to configure emit.heartbeats.enabled on a global level directly in Cloudera Manager. Replication level overrides are still supported. This can be done by adding emit.heartbeats.enabled with a valid replication prefix to Streams Replication Manager's Replication Configs. For more information on configuring heartbeat emission, see Configuring SRM Driver heartbeat emission.

IdentityReplicationPolicy now available

The IdentityReplicationPolicy is available for use with Streams Replication Manager. This replication policy does not rename remote (replicated) topics. Streams Replication Manager can be configured to use this replication policy by adding the following entry to Streams Replication Manager's Replication Configs:
replication.policy.class=org.apache.kafka.connect.mirror.IdentityReplicationPolicy
For more information, see KAFKA-9726.

SRM Service Basic Authentication support

The SRM Service can now be secured using Basic Authentication. Once Basic Authentication is set up and enabled, the REST API of the SRM Service becomes secured. Any clients or services that connect to the REST API will be required to present valid credentials for access. Configuration is done in Cloudera Manager using SRM configuration properties and external accounts. For more information, see Configuring Basic Authentication for the SRM Service.

SRM automatically creates a Basic Authentication credential for co-located services

SRM automatically creates a Basic Authentication credential for co-located services (you can change the credentials using SRM Service Co-Located Service Username and SRM Service Co-Located Service User Password). When Basic Authentication is enabled, this user is automatically accepted by the SRM Service. For more information, see Configuring Basic Authentication for the SRM Service.