General Tuning Suggestions

Tuning Cloudera Cybersecurity Platform (CCP) depends in large part on tuning three areas: Kafka, Storm, and indexing.

Indexing is where most of your tuning issues are likely to occur because it is the most IO intensive.

The second area that needs tuning is parallelism in both Kafka and Storm. The performance of the Storm topology and therefore the performance of Metron, degrades when it cannot ingest data fast enough to keep up with the data source. Therefore, much of Metron tuning focuses on adjusting the data throughput of the Storm topologies. For more information on tuning a Storm topology, see Apache Storm Overview.

The third area that requires analysis and tuning is consumer lags on the key Kafka topics: enrichment, indexing, parser.

When tuning your Metron configuration, consider the following:

  • Look at Elasticsearch and Solr tuning

  • Assign small values for parallelism, and increase values incrementally

  • Aim for an even balance across your topologies

  • Check your system logs for the following:

    • Empty results - may indicate that your data is broken

    • Kafka - Consumer lags on key Kafka topics

    • Load average or system latency - a high load average might indicate underlying stress on the machine

    • Exceptions - Any exceptions shown in the Storm log or key topologies can indicate possible problems with underlying systems and data

  • What topology do I want to tune?

  • What is the capacity of Storm topology?

It is also important to consider the growth of your cluster and data flow. You might want to set the number of tasks higher than the number of executors to accommodate for future performance tuning and rebalancing without the need to bring down your topologies.