Configuring snapshots
Learn about snapshots and configuring snapshot behavior in Cloudera Surveyor. Snapshot settings control how frequently the data presented on the UI is refreshed.
About snapshots
Cloudera Surveyor automatically collects snapshots of Kafka cluster data at configured intervals. These snapshots capture the current state of topics, partitions, consumer groups, and other cluster metadata. Most data presented on the UI is based on these snapshots.
Because the UI relies on snapshots, most changes in Kafka clusters appear on the UI only after the next scheduled snapshot is taken. For example, if snapshots occur every 10 minutes, a new topic created in Kafka might not be visible in the UI for up to 10 minutes.
The snapshot system provides multiple configuration settings to control data collection intervals, timeouts, and resource usage. Configuration enables you to fine-tune data collection behavior. For example, more frequent snapshots provide fresher data but consume more resources, while less frequent snapshots reduce resource usage but may show older data.
Configuration levels and time format
Most snapshot settings can be configured at two levels:
- Global settings – Apply to all clusters by default. Configured using surveyorConfig.surveyor.* properties.
- Per-cluster overrides – Override global settings for specific clusters. Configured using clusterConfigs.clusters[*].* properties.
Per-cluster settings take precedence over global settings, allowing you to customize behavior for individual clusters while maintaining sensible defaults for all others.
All duration and interval values throughout the snapshot configuration are specified
in ISO-8601 format. For example, PT5M for 5 minutes,
PT1H for 1 hour, and PT30S for 30 seconds.
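As an illustrative sketch of how the two configuration levels and the duration format combine, the following fragment uses only the snapshot interval properties named in this document. The cluster name and bootstrap address are hypothetical placeholders, and the durations are arbitrary example values.

```yaml
surveyorConfig:
  surveyor:
    # Global default: applies to every cluster without an override.
    globalSnapshotInterval: PT1H          # ISO-8601 duration, 1 hour
clusterConfigs:
  clusters:
    - clusterName: "example-cluster"          # hypothetical name
      bootstrapServers: "example-kafka:9092"  # hypothetical address
      # Per-cluster override: takes precedence over the global value.
      snapshotInterval: PT5M              # ISO-8601 duration, 5 minutes
```

In this sketch, example-cluster is snapshotted every 5 minutes, while any other registered cluster without its own snapshotInterval falls back to the 1 hour global default.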
Configuring snapshot intervals
Configure how frequently Cloudera Surveyor collects snapshots from registered Kafka clusters.
The snapshot interval directly affects how fresh the data presented on the UI is. A shorter interval means the UI reflects cluster changes sooner.
Configure the snapshot interval globally for all clusters
using surveyorConfig.surveyor.globalSnapshotInterval or on a
per-cluster basis using clusterConfigs.clusters[*].snapshotInterval.
High-activity clusters might benefit from more frequent snapshots, while stable
clusters can use longer intervals to reduce resource usage.
The following example configures a global snapshot interval of 10 minutes and per-cluster overrides for a production and a development cluster. A staging cluster is registered without an override.
#...
surveyorConfig:
  surveyor:
    globalSnapshotInterval: PT10M
clusterConfigs:
  clusters:
    - clusterName: "production-cluster"
      bootstrapServers: "prod-kafka-1:9092,prod-kafka-2:9092"
      snapshotInterval: PT5M
    - clusterName: "development-cluster"
      bootstrapServers: "dev-kafka:9092"
      snapshotInterval: PT30M
    - clusterName: "staging-cluster"
      bootstrapServers: "staging-kafka:9092"
The production-cluster has a decreased interval for more frequent
updates, the development-cluster has a much longer interval, while the
staging-cluster has no overrides and uses the global default.
Configuring snapshot reliability settings
Configure advanced snapshot settings to optimize reliability and performance for clusters with specific network conditions, resource constraints, or availability requirements.
Beyond basic snapshot intervals, Cloudera Surveyor provides additional properties to handle various operational scenarios. Use these settings to fine-tune snapshotting behavior. Timeouts prevent hanging on slow clusters, time-to-live (TTL) settings maintain data availability during temporary failures, and resource management settings control system load.
