Chapter 2. General Tuning Suggestions
Tuning Hortonworks Cybersecurity Platform (HCP) depends in large part on tuning three areas: Kafka, Storm, and indexing.
Indexing is where most of your tuning issues are likely to occur because it is the most IO intensive.
The second area that needs tuning is parallelism in both Kafka and Storm. The performance of the Storm topology. and therefore the performance of Metron, degrades when it cannot ingest data fast enough to keep up with the data source. Therefore, much of Metron tuning focuses on adjusting the data throughput of the Storm topologies. For more information on tuning a Storm topology, see Tuning an Apache Storm Topology.
The third area that requires analysis and tuning is consumer lags on the key Kafka topics: enrichment, indexing, parser
When tuning your Metron configuration, consider the following:
Look at Elasticsearch and Solr tuning
Assign small values for parallelism, and increase values incrementally
Aim for an even balance across your topologies
Check your system logs for the following:
Empty results - may indicate that your data is broken
Kafka - Consumer lags on key Kafka topics
Load average or system latency - a high load average might indicate underlying stress on the machine
Exceptions - Any exceptions shown in the Storm log or key topologies can indicate possible problems with underlying systems and data
What topology do I want to tune?
What is the capacity of Storm topology?
It is also important to consider the growth of your cluster and data flow. You might want to set the number of tasks higher than the number of executors to accommodate for future performance tuning and rebalancing without the need to bring down your topologies.