Stream grouping allows Storm developers to control how tuples are routed to bolts in a workflow. The following table describes the stream groupings available.
Table 1.2. Stream Groupings
Stream Grouping | Description |
---|---|
Shuffle | Sends tuples to bolt instances at random, so that each instance receives a roughly equal share. Use for atomic operations, such as math. |
Fields | Sends tuples to a bolt based on one or more fields in the tuple. Use to segment an incoming stream and to count tuples of a specified type. |
All | Sends a single copy of each tuple to all instances of a receiving bolt. Use to send a signal, such as clear cache or refresh state, to all bolts. |
Custom | Routes tuples according to developer-defined logic. Use to get maximum flexibility of topology processing based on factors such as data types, load, and seasonality. |
Direct | Source decides which bolt receives a tuple. |
Global | Sends tuples generated by all instances of a source to a single target instance. Use for global counting operations. |
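
To make the routing rules in the table concrete, the following is a minimal sketch in plain Java of how a target task could be chosen for four of the groupings. It is a simplified model, not Storm's implementation: the class `GroupingSketch`, its method names, the hash-modulo scheme for fields grouping, and the choice of task 0 for global grouping are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical model of stream-grouping target selection; not part of Storm.
final class GroupingSketch {

    // Shuffle: pick a task at random so load spreads roughly evenly.
    static List<Integer> shuffle(int numTasks, Random random) {
        return Collections.singletonList(random.nextInt(numTasks));
    }

    // Fields: hash the grouping field values so that equal values
    // always map to the same task (assumed hash-modulo scheme).
    static List<Integer> fields(int numTasks, Object... groupingValues) {
        int hash = Arrays.deepHashCode(groupingValues);
        return Collections.singletonList(Math.floorMod(hash, numTasks));
    }

    // All: every task receives its own copy of the tuple.
    static List<Integer> all(int numTasks) {
        List<Integer> targets = new ArrayList<>();
        for (int task = 0; task < numTasks; task++) {
            targets.add(task);
        }
        return targets;
    }

    // Global: the entire stream goes to one task (task 0 in this model).
    static List<Integer> global(int numTasks) {
        return Collections.singletonList(0);
    }

    public static void main(String[] args) {
        int numTasks = 4;
        System.out.println("shuffle -> " + shuffle(numTasks, new Random()));
        System.out.println("fields  -> " + fields(numTasks, "storm"));
        System.out.println("all     -> " + all(numTasks));
        System.out.println("global  -> " + global(numTasks));
    }
}
```

In this model, calling `fields` twice with the same value returns the same task index, which is the property that makes fields grouping suitable for per-key counting.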
Storm developers specify the stream grouping for each bolt using methods on the TopologyBuilder.BoltGetter inner class, as shown in the following excerpt from the WordCountTopology.java example included with storm-starter.

    ...
    TopologyBuilder builder = new TopologyBuilder();
    builder.setSpout("spout", new RandomSentenceSpout(), 5);
    builder.setBolt("split", new SplitSentence(), 8).shuffleGrouping("spout");
    builder.setBolt("count", new WordCount(), 12).fieldsGrouping("split", new Fields("word"));
    ...

The first bolt, split, uses shuffle grouping to receive the random sentences generated by the RandomSentenceSpout and splits them into words.
The second bolt, count, uses fields grouping on the word field to segment the stream, so that all occurrences of a given word reach the same bolt instance and can be counted there.
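
The effect of fields grouping on the word count can be simulated without a Storm cluster. The sketch below is a plain-Java approximation under stated assumptions: the class `FieldsGroupingWordCount` and its `run` method are hypothetical, each simulated task is just an in-memory map, and routing uses the hash of the word modulo the number of tasks rather than Storm's own routing code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation of the split -> count stage; not Storm code.
final class FieldsGroupingWordCount {

    // Returns one word-count map per simulated "count" task.
    static List<Map<String, Integer>> run(List<String> sentences, int countTasks) {
        List<Map<String, Integer>> perTask = new ArrayList<>();
        for (int i = 0; i < countTasks; i++) {
            perTask.add(new HashMap<>());
        }
        for (String sentence : sentences) {
            // Stand-in for the split bolt: break the sentence into words.
            for (String word : sentence.split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                // Stand-in for fields grouping on "word": the same word
                // always hashes to the same task index.
                int task = Math.floorMod(word.hashCode(), countTasks);
                // Stand-in for the count bolt: update that task's local count.
                perTask.get(task).merge(word, 1, Integer::sum);
            }
        }
        return perTask;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "the cow jumped over the moon",
                "the man went to the store");
        List<Map<String, Integer>> perTask = run(sentences, 12);
        for (int task = 0; task < perTask.size(); task++) {
            System.out.println("task " + task + ": " + perTask.get(task));
        }
    }
}
```

Because every occurrence of a word lands in the same map, no task ever holds a partial count for a word owned by another task. With shuffle grouping in place of the hash, the counts for one word would be scattered across several tasks and would need a further merge step.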