Parser Tuning Example
This example uses the Bro sensor. The parser and PCAP topologies are built with a builder utility, whereas the enrichment and indexing topologies use Flux.
The following example of parser tuning starts with a single partition for the inbound Kafka topics and eventually increases to 48 partitions.
In the storm-bro.config file, set the topology.max.spout.pending value to 2000.

    {
      ...
      "topology.max.spout.pending" : 2000
      ...
    }

The default is null, which results in no limit.
In the spout-bro.config file, set the following settings to use the default values:

    {
      ...
      "spout.pollTimeoutMs" : 200,
      "spout.maxUncommittedOffsets" : 10000000,
      "spout.offsetCommitPeriodMs" : 30000
    }
Because we are using the default settings, you can optionally omit these settings.
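As a minimal sketch, the two configuration files described above could be written and sanity-checked from a shell. CONFIG_DIR is a hypothetical variable introduced only for this illustration, and the files here contain only the keys named in the text; a real storm-bro.config may carry additional options where the "..." appears. The JSON check assumes python3 is installed and is skipped otherwise.

```shell
#!/bin/sh
set -e

# CONFIG_DIR is a hypothetical location for this sketch; the example command
# in this section reads the files from a .storm directory instead.
CONFIG_DIR="${CONFIG_DIR:-$(mktemp -d)}"
mkdir -p "$CONFIG_DIR"

# Storm topology options: limit pending (un-acked) tuples per spout task.
cat > "$CONFIG_DIR/storm-bro.config" <<'EOF'
{
  "topology.max.spout.pending" : 2000
}
EOF

# Kafka spout options, spelled out with the default values from the text.
cat > "$CONFIG_DIR/spout-bro.config" <<'EOF'
{
  "spout.pollTimeoutMs" : 200,
  "spout.maxUncommittedOffsets" : 10000000,
  "spout.offsetCommitPeriodMs" : 30000
}
EOF

# Fail early on malformed JSON (only if python3 is available).
if command -v python3 > /dev/null 2>&1; then
  for f in storm-bro.config spout-bro.config; do
    python3 -m json.tool "$CONFIG_DIR/$f" > /dev/null
    echo "valid JSON: $CONFIG_DIR/$f"
  done
fi
```

Validating the files before starting the topology catches a stray comma or quote, which is easy to introduce when editing these maps by hand.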
Run the Bro parser topology with the following options:
    $METRON_HOME/bin/start_parser_topology.sh \
        -e ~metron/.storm/storm-bro.config \
        -esc ~/.storm/spout-bro.config \
        -k $BROKERLIST \
        -ksp SASL_PLAINTEXT \
        -nw 1 \
        -ot enrichments \
        -pnt 24 \
        -pp 24 \
        -s bro \
        -snt 24 \
        -sp 24 \
        -z $ZOOKEEPER
The spout and parser parallelism in this example (24) does not match the number of Kafka partitions one-to-one, though you could match them if necessary. Notice that the example needs only one worker.
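The relationship between partitions and parallelism can be checked with simple arithmetic. The sketch below assumes the topic has reached the 48 partitions mentioned at the start of this example and uses the spout parallelism of 24 from the command; the variable names are illustrative only.

```shell
#!/bin/sh
# Assumed values: 48 partitions (end state of the tuning), -sp 24 from the command.
KAFKA_PARTITIONS=48
SPOUT_PARALLELISM=24

# Integer division plus remainder shows how evenly the partitions
# spread across the spout executors.
per_executor=$((KAFKA_PARTITIONS / SPOUT_PARALLELISM))
remainder=$((KAFKA_PARTITIONS % SPOUT_PARALLELISM))

echo "partitions per spout executor: $per_executor (remainder $remainder)"
# prints: partitions per spout executor: 2 (remainder 0)
```

A remainder of zero means every spout executor reads the same number of partitions. A non-zero remainder would leave some executors with one more partition than others.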
The usage documentation describes the options used:

    usage: start_parser_topology.sh
     -e,--extra_topology_options <JSON_FILE>
            Extra options in the form of a JSON file with a map for content.
     -esc,--extra_kafka_spout_config <JSON_FILE>
            Extra spout config options in the form of a JSON file with a map
            for content. Possible keys are: retryDelayMaxMs,
            retryDelayMultiplier, retryInitialDelayMs, stateUpdateIntervalMs,
            bufferSizeBytes, fetchMaxWait, fetchSizeBytes, maxOffsetBehind,
            metricsTimeBucketSizeInSecs, socketTimeoutMs
     -k,--kafka <BROKER_URL>
            Kafka Broker URL
     -ksp,--kafka_security_protocol <SECURITY_PROTOCOL>
            Kafka Security Protocol
     -nw,--num_workers <NUM_WORKERS>
            Number of Workers
     -ot,--output_topic <KAFKA_TOPIC>
            Output Kafka Topic
     -pnt,--parser_num_tasks <NUM_TASKS>
            Parser Num Tasks
     -pp,--parser_p <PARALLELISM_HINT>
            Parser Parallelism Hint
     -s,--sensor <SENSOR_TYPE>
            Sensor Type
     -snt,--spout_num_tasks <NUM_TASKS>
            Spout Num Tasks
     -sp,--spout_p <SPOUT_PARALLELISM_HINT>
            Spout Parallelism Hint
     -z,--zk <ZK_QUORUM>
            ZooKeeper Quorum URL (zk1:2181,zk2:2181,...
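To see how these options fit together, the following sketch assembles the same command from named variables and prints it instead of running it. The variables SENSOR, WORKERS, PARALLELISM, DRY_RUN, and ARGS, and the placeholder fallbacks for METRON_HOME, BROKERLIST, and ZOOKEEPER, are hypothetical and exist only for this illustration; substitute the values for your own cluster.

```shell
#!/bin/sh
# Placeholder fallbacks (hypothetical); a real run should export these variables.
METRON_HOME="${METRON_HOME:-/path/to/metron}"
BROKERLIST="${BROKERLIST:-broker1:6667}"
ZOOKEEPER="${ZOOKEEPER:-zk1:2181}"

SENSOR=bro
WORKERS=1
PARALLELISM=24   # used for spout and parser tasks and parallelism hints alike

# Build the argument list once so it can be printed or executed unchanged.
set -- \
  -e ~metron/.storm/storm-bro.config \
  -esc ~/.storm/spout-bro.config \
  -k "$BROKERLIST" \
  -ksp SASL_PLAINTEXT \
  -nw "$WORKERS" \
  -ot enrichments \
  -pnt "$PARALLELISM" \
  -pp "$PARALLELISM" \
  -s "$SENSOR" \
  -snt "$PARALLELISM" \
  -sp "$PARALLELISM" \
  -z "$ZOOKEEPER"

ARGS="$*"
CMD="$METRON_HOME/bin/start_parser_topology.sh"

# DRY_RUN defaults to 1 so the sketch only prints the command.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "$CMD $ARGS"
else
  "$CMD" "$@"
fi
```

Keeping the parallelism in a single variable makes it harder to change one of the four related options (-pnt, -pp, -snt, -sp) and forget the others during a tuning pass.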