Remote Querying

Remote Querying in Streams Replication Manager is the ability of a Streams Replication Manager Service to query other, remote Streams Replication Manager Services and fetch their replication metrics. This allows you to monitor all replications of a deployment that has multiple instances of Streams Replication Manager through a single Streams Replication Manager Service.

Overview

The Streams Replication Manager Service role gathers, aggregates, and exposes metrics related to cluster replications. While a single cluster of Streams Replication Manager Service roles (Streams Replication Manager Service cluster) can be configured to target and gather metrics from multiple clusters, a setup like this can result in heavily loaded Service roles, which might not be suitable for your deployment. Instead, you can choose to have a single Streams Replication Manager Service cluster connect to other, remote Streams Replication Manager Service clusters and fetch metrics from them. This is called Remote Querying.

Using Remote Querying makes it possible to designate a Streams Replication Manager Service cluster in your deployment to act as a monitoring gateway. The designated Streams Replication Manager Service cluster can then provide you with information on all clusters and replications in your deployment. In addition, if you have Streams Messaging Manager integrated with the Streams Replication Manager Service cluster acting as the gateway, information regarding all replications is available in the UI of that Streams Messaging Manager instance.

How it works

Remote Streams Replication Manager Service clusters are discovered through Kafka. Each Streams Replication Manager Service cluster advertises itself through its target Kafka cluster by writing data into a heartbeats topic. The advertised information is the Service role's protocol, host, port, and root API path. When Remote Querying is configured for a specific Streams Replication Manager Service cluster, that Streams Replication Manager Service cluster connects to the specified external Kafka clusters, consumes their heartbeats topics, and uses the advertised information to discover the remote Streams Replication Manager Service clusters.
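The discovery step can be illustrated with a minimal Python sketch. It is not the actual Streams Replication Manager implementation. The record layout (the field names protocol, host, port, and root_path), the class and function names, and the hostnames and port numbers are all assumptions made for illustration. The sketch only shows how the four advertised values could be combined into a base URL for each remote Service role, with repeated heartbeats collapsed into one entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceAdvertisement:
    """Hypothetical shape of the data a Service role advertises in a heartbeats topic."""
    protocol: str
    host: str
    port: int
    root_path: str

    def base_url(self) -> str:
        # Normalize the root API path so the URL has no doubled or trailing slashes.
        path = self.root_path.strip("/")
        suffix = f"/{path}" if path else ""
        return f"{self.protocol}://{self.host}:{self.port}{suffix}"


def discover_remote_services(heartbeats_by_cluster):
    """Map each remote cluster alias to the distinct base URLs advertised for it.

    heartbeats_by_cluster stands in for the records consumed from the
    heartbeats topic of each configured external Kafka cluster.
    """
    discovered = {}
    for cluster, records in heartbeats_by_cluster.items():
        urls = []
        for record in records:
            url = ServiceAdvertisement(**record).base_url()
            if url not in urls:  # the same role advertises itself repeatedly
                urls.append(url)
        discovered[cluster] = urls
    return discovered


# Example input with made-up hosts and ports.
heartbeat = {"protocol": "https", "host": "srm-b.example.com", "port": 6671, "root_path": "/api/"}
remote_services = discover_remote_services({"B": [heartbeat, heartbeat]})
```

In this sketch, remote_services maps cluster B to a single base URL, even though the same advertisement was consumed twice.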

Following discovery, a Streams Replication Manager Service cluster can cooperate with its remote counterparts and fetch the metrics related to remote replications. These metrics can then be queried using the Streams Replication Manager REST API, or viewed on the Streams Messaging Manager UI.

Data locality

When the feature is enabled, all metrics are still processed locally. Each Streams Replication Manager Service cluster processes the metrics of its target Kafka cluster only. The Streams Replication Manager Service cluster configured to be the gateway does not take over and process the metrics of the remote Streams Replication Manager Service clusters. It only communicates with the remote Streams Replication Manager Service clusters to fetch and then serve their metrics.
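The division of work described above can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions, not the product's code. The ServiceCluster class, its method names, and the record counts are hypothetical, and an in-process method call stands in for the network request that the gateway makes to a remote Service cluster.

```python
from collections import defaultdict


class ServiceCluster:
    """Hypothetical model of one Service cluster and the Kafka cluster it targets."""

    def __init__(self, target):
        self.target = target
        self.processed_records = 0        # raw metric records aggregated locally
        self._metrics = defaultdict(int)  # replication name -> replicated record count
        self._remotes = []

    def process(self, replication, record_count):
        """Aggregate raw metrics of the local target cluster."""
        self.processed_records += 1
        self._metrics[replication] += record_count

    def local_metrics(self):
        return dict(self._metrics)

    def enable_remote_querying(self, remotes):
        self._remotes = list(remotes)

    def serve_metrics(self):
        """Serve local metrics plus already aggregated metrics fetched from remotes."""
        result = {self.target: self.local_metrics()}
        for remote in self._remotes:
            # Fetch only: the gateway never calls process() on remote data.
            result[remote.target] = remote.local_metrics()
        return result


gateway = ServiceCluster("A")
remote_b = ServiceCluster("B")
gateway.process("B->A", 10)
remote_b.process("A->B", 5)
remote_b.process("A->B", 7)
gateway.enable_remote_querying([remote_b])
combined = gateway.serve_metrics()
```

After this runs, the gateway has processed only its own record, while combined contains the aggregated metrics of both clusters. The extra cost for the gateway is the fetch itself, which is the additional traffic to account for.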

However, because metrics processing remains local, enabling the feature generates additional traffic between the gateway and the remote Streams Replication Manager Service clusters. Take this into consideration, especially if one or more of your Streams Replication Manager installations are located in a public cloud environment. For more information on the amount of data generated, see Streams Replication Manager Service data traffic reference.

Remote Querying example

Consider the following deployment:

There are three clusters: Cluster A, Cluster B, and Cluster C. All clusters have Kafka and Streams Replication Manager deployed on them. Cluster A has Streams Messaging Manager installed as well. Bidirectional replication is set up between Cluster A and Cluster B. Additionally, unidirectional replication is set up from Cluster A to Cluster C. Each Streams Replication Manager Service cluster targets its co-located Kafka cluster.

In this scenario, Remote Querying is configured and enabled for Streams Replication Manager Service cluster A. This enables you to monitor all replications in the deployment using Streams Replication Manager Service cluster A. Additionally, information regarding all replications can be viewed in Streams Messaging Manager deployed in Cluster A.

Without Remote Querying, Streams Messaging Manager would only be able to display replications that are targeting Cluster A. If you wanted to monitor any other replications, you would need to manually query each Streams Replication Manager Service cluster separately using the REST API, or set up separate instances of Streams Messaging Manager on each of the clusters.
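The difference between the two cases can be worked through with a small Python sketch of the example deployment. The function name and the tuple representation of replications are assumptions for illustration only. The sketch relies on the rule stated earlier: each Streams Replication Manager Service cluster holds the metrics of the replications that target its own Kafka cluster.

```python
def visible_replications(replications, local_target, remote_targets=()):
    """Return the replications whose metrics can be reached from one Service cluster.

    A replication is a (source, target) pair. Its metrics are held by the
    Service cluster that targets the replication's target cluster.
    """
    reachable = {local_target, *remote_targets}
    return [(source, target) for (source, target) in replications if target in reachable]


# Replications of the example deployment: A <-> B and A -> C.
REPLICATIONS = [("A", "B"), ("B", "A"), ("A", "C")]

# Service cluster A without Remote Querying.
without_remote = visible_replications(REPLICATIONS, "A")

# Service cluster A with Remote Querying configured for clusters B and C.
with_remote = visible_replications(REPLICATIONS, "A", remote_targets=("B", "C"))
```

Here without_remote contains only the replication from Cluster B to Cluster A, while with_remote contains all three replications of the deployment.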