Performance Tuning Overview

You can use these very high level steps to tune your CCP topologies. For more detailed performance tuning information, see the instructions for tuning a parser, enrichment topology, and indexing topology.

  1. Start the tuning process with a single worker.
    After tuning the bolts within a single worker, scale out with additional worker processes.
  2. Initially set the thread pool size to 1.
    Increase this value slowly only after tuning the other parameters first. Consider that each worker has its own thread pool and the total size of this thread pool should be far less than the total number of cores available in the cluster.
  3. Initially set each bolt parallelism hint to the number of partitions on the input Kafka topic.
    Monitor bolt capacity and increase the parallelism hint for any bolt whose capacity is close to or exceeds 1.
  4. If the topology is not able to keep-up with a given input, then increasing the parallelism is the primary means to scale up.
  5. Parallelism units can be used for determining how to distribute processing tasks across the topology.
    The sum of parallelism can be close to, but should not far exceed this value.
    (number of worker nodes in cluster * number cores per worker node) - (number of acker tasks)
  6. The throughput that the topology is able to sustain should be relatively consistent.
    If the throughput fluctuates greatly, increase back pressure using topology.max.spout.pending.
    When you restart the topologies, ensure that the Kafka offset strategy is set to LATEST.