Setting capacity estimations and goals
Cruise Control rebalancing works using capacity estimations and goals. You need to configure the capacity estimates based on your resources, and set the goals for Cruise Control to achieve the Kafka partition rebalancing that meets your requirements.
When configuring Cruise Control, you need to make sure that the Kafka topics and partitions, the capacity estimates, and the proper goals are provided so the rebalancing process works as expected.
- Go to your cluster in Cloudera Manager.
- Select Cloudera Manager from the services.
- Select Cruise Control from the list of Services.
- Click Configuration.
- Select Main from the Filters.
Configuring capacity estimations
The values for capacity estimation need to be provided based on your available CPU and network resources. Besides the capacity estimation, you also need to provide information about the broker and partition metrics. You can set the capacity estimations and Kafka properties in Cloudera Manager.
Capacity | Description |
---|---|
capacity.default.cpu | 100 by default |
capacity.default.network-in | Given by the internet provider |
capacity.default.network-out | Given by the internet provider |
The optimizers in Cruise Control use the incoming and outgoing network capacities to define a boundary for optimization. The capacity estimates are generated and read by Cruise Control. A capacity.json file is generated when Cruise Control is started. When a new broker is added, Cruise Control uses the default broker capacity values. However, if disk-related goals are used, Cruise Control must be restarted to load the actual disk capacity metrics of the new broker.
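As a rough illustration of what a default capacity entry might look like, the following minimal sketch builds one in Python and prints it as JSON. The key names (brokerCapacities, brokerId, capacity, CPU, NW_IN, NW_OUT) and the use of broker ID -1 for the default entry are assumptions modeled on the sample capacity file of the open-source Cruise Control project. The file that Cruise Control generates in your deployment may differ, and the network values are placeholders rather than recommendations.

```python
import json


def default_capacity_entry(cpu=100, network_in=10000, network_out=10000):
    """Build a single capacity entry.

    Assumption: broker ID "-1" denotes the default capacity that applies
    to brokers without an entry of their own.
    """
    return {
        "brokerId": "-1",
        "capacity": {
            "CPU": str(cpu),             # corresponds to capacity.default.cpu
            "NW_IN": str(network_in),    # corresponds to capacity.default.network-in
            "NW_OUT": str(network_out),  # corresponds to capacity.default.network-out
        },
    }


# Hypothetical file content; the network figures are placeholders.
capacity_file = {"brokerCapacities": [default_capacity_entry()]}
print(json.dumps(capacity_file, indent=2))
```

The sketch only shows how the three default capacity values map into a single structure; disk capacity is left out because this section covers only CPU and network defaults.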
The following table lists the configurations that you need to set to tailor Cruise Control to your environment:
Configuration | Description |
---|---|
num.metric.fetchers | Number of parallel threads for fetching metrics from the Cloudera Manager database |
partition.metric.sample.store.topic | Topic for storing Cruise Control partition metrics |
broker.metric.sample.store.topic | Topic for storing Cruise Control broker metrics |
partition.metrics.window.ms | Time window size for partition metrics |
broker.metrics.window.ms | Time window size for broker metrics |
num.partition.metrics.windows | Number of stored partition windows |
num.broker.metrics.windows | Number of stored broker windows |
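The window size and the number of stored windows together determine how much metric history is kept. The following minimal sketch multiplies the two settings to estimate that span. The function name and the example values are hypothetical and are not the Cloudera Manager defaults.

```python
def retained_history_ms(window_ms: int, num_windows: int) -> int:
    """Approximate metric history kept: window size times number of windows."""
    return window_ms * num_windows


# Hypothetical values for partition.metrics.window.ms and
# num.partition.metrics.windows; substitute your own settings.
partition_window_ms = 300_000
num_partition_windows = 5

history_ms = retained_history_ms(partition_window_ms, num_partition_windows)
print(f"Partition metric history: {history_ms / 60_000:.0f} minutes")
```

The same calculation applies to broker.metrics.window.ms and num.broker.metrics.windows. Larger windows or more windows smooth out short load spikes but make the rebalancing proposals react more slowly to changes.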
Configuring goals
After setting the capacity estimates, you can specify which goals need to be used for the rebalancing process in Cloudera Manager. The provided goals are used for the optimization proposal of your Kafka cluster.
Example of Cruise Control goal configuration
By default, Cruise Control is configured with a set of Default, Supported, Hard, Self-healing and Anomaly detection goals in Cloudera Manager. The default configurations can be changed based on what you would like to achieve with the rebalancing.
The following example configures Cruise Control to do the following:
- Find dead or failed brokers and create an anomaly to remove load from them (self.healing.broker.failure.enabled)
- Move load back to the brokers when the brokers are available again (self.healing.goal.violation.enabled and added goals)
- Prevent overly frequent rebalances to reduce cluster costs (incremented thresholds, reduced self.healing.goals set)
- Keep the cluster always balanced from the point of view of replicas and leader replicas
- Avoid enabling every type of self-healing method when it is not required (only two types of self-healing are enabled)
self.healing.goal.violation.enabled=true
self.healing.broker.failure.enabled=true
self.healing.exclude.recently.removed.brokers=false
anomaly.notifier.class=com.linkedin.kafka.cruisecontrol.detector.notifier.SelfHealingNotifier
replica.count.balance.threshold=1.25
leader.replica.count.balance.threshold=1.25
com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.ReplicaDistributionGoal
com.linkedin.kafka.cruisecontrol.analyzer.goals.LeaderReplicaDistributionGoal
Other configurations can remain as set by default.
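To make the effect of replica.count.balance.threshold=1.25 more concrete, the following simplified sketch flags brokers whose replica count falls outside a band around the cluster average. The upper bound reflects the usual reading of the threshold (a broker should not hold more than 1.25 times the average replica count). The symmetric lower bound and the function names are assumptions made for illustration, and the actual goal implementation in Cruise Control may apply additional margins.

```python
def replica_count_bounds(replica_counts, balance_threshold):
    """Return (lower, upper) replica-count limits around the cluster average.

    Assumption: the lower limit mirrors the upper one, so a threshold of
    1.25 gives a band from 0.75x to 1.25x of the average.
    """
    average = sum(replica_counts.values()) / len(replica_counts)
    upper = average * balance_threshold
    lower = average * max(0.0, 2.0 - balance_threshold)
    return lower, upper


def out_of_balance(replica_counts, balance_threshold=1.25):
    """Return the brokers whose replica count lies outside the allowed band."""
    lower, upper = replica_count_bounds(replica_counts, balance_threshold)
    return {
        broker: count
        for broker, count in replica_counts.items()
        if count < lower or count > upper
    }


# Hypothetical replica counts per broker ID.
counts = {1: 100, 2: 100, 3: 40}
print(out_of_balance(counts))
```

With these hypothetical counts the average is 80, so the band runs from 60 to 100 and only broker 3 falls outside it. Raising the threshold widens the band, which is how the incremented thresholds in the example reduce the frequency of rebalances.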
kafka_assigner parameter is set to true in the corresponding request (for example, with the rebalance request as shown in the Cruise Control documentation).