Chapter 5. Moving Data Into and Out of Apache Storm Using Spouts and Bolts
This chapter describes how to move data into and out of Apache Storm using spouts and bolts. Spouts ingest data into a topology by reading from external sources. Bolts consume input streams, process the data, and either emit new streams or write results to persistent storage. The rest of this chapter concentrates on bolts that move data from Storm to external destinations.
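To make the spout/bolt model concrete, the following minimal sketch shows a bolt that consumes an input stream, transforms each tuple, and emits a new stream. The class name UppercaseBolt and the "word" field are hypothetical; the base class and interfaces are part of the standard Storm API (package names assume Storm 1.0 or later).

import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// Hypothetical bolt: reads a "word" field from each incoming tuple,
// uppercases it, and emits the result as a new stream.
// BaseBasicBolt acks each tuple automatically after execute() returns.
public class UppercaseBolt extends BaseBasicBolt {

    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        String word = input.getStringByField("word");
        collector.emit(new Values(word.toUpperCase()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // Declare the schema of the stream this bolt emits.
        declarer.declare(new Fields("word"));
    }
}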
The following spouts are available in HDP 2.5:
Kafka spout based on Kafka 0.7.x/0.8.x (a configuration sketch follows this list), plus a new Kafka consumer spout available as a technical preview (not for production use)
HDFS
EventHubs
Kinesis (technical preview)
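As an illustration of how the Kafka spout listed above can be configured, the following sketch uses the storm-kafka (0.8.x-based) API. The ZooKeeper address, topic name, and offset path are placeholder values, not defaults.

import java.util.UUID;

import org.apache.storm.kafka.BrokerHosts;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.StringScheme;
import org.apache.storm.kafka.ZkHosts;
import org.apache.storm.spout.SchemeAsMultiScheme;

// ZooKeeper quorum that the Kafka brokers register with (placeholder host).
BrokerHosts hosts = new ZkHosts("zk-host:2181");

// Args: broker hosts, Kafka topic, ZooKeeper root path for storing consumer
// offsets, and a unique ID for this consumer.
SpoutConfig spoutConfig = new SpoutConfig(hosts, "my-topic", "/my-topic",
        UUID.randomUUID().toString());

// Deserialize each Kafka message into a single string field named "str".
spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

KafkaSpout kafkaSpout = new KafkaSpout(spoutConfig);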
The following bolts are available in HDP 2.5:
Kafka
HDFS (a configuration sketch follows this list)
EventHubs
HBase
Hive
JDBC (supports Phoenix)
Solr
Cassandra
MongoDB
ElasticSearch
Redis
OpenTSDB (technical preview)
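As an illustration of a bolt configuration, the following sketch sets up an HDFS bolt from the Storm-HDFS connector that writes pipe-delimited tuples, syncs to the filesystem every 1,000 tuples, and rotates output files at 5 MB. The filesystem URL and output path are placeholders.

import org.apache.storm.hdfs.bolt.HdfsBolt;
import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
import org.apache.storm.hdfs.bolt.format.FileNameFormat;
import org.apache.storm.hdfs.bolt.format.RecordFormat;
import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
import org.apache.storm.hdfs.bolt.sync.SyncPolicy;

// Write tuple fields separated by "|".
RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter("|");

// Sync the filesystem after every 1,000 tuples.
SyncPolicy syncPolicy = new CountSyncPolicy(1000);

// Start a new output file once the current one reaches 5 MB.
FileRotationPolicy rotationPolicy = new FileSizeRotationPolicy(5.0f, Units.MB);

// Output path in HDFS (placeholder).
FileNameFormat fileNameFormat = new DefaultFileNameFormat().withPath("/storm/output/");

HdfsBolt hdfsBolt = new HdfsBolt()
        .withFsUrl("hdfs://namenode-host:8020")   // placeholder NameNode URL
        .withFileNameFormat(fileNameFormat)
        .withRecordFormat(format)
        .withRotationPolicy(rotationPolicy)
        .withSyncPolicy(syncPolicy);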
Supported connectors are located at /usr/lib/storm/contrib. Each connector directory contains a .jar file with the connector's packaged classes and dependencies, and a second .jar file with javadoc reference documentation.
This chapter describes how to use the Kafka spout, HDFS spout, Kafka bolt, Storm-HDFS connector, and Storm-HBase connector APIs. For information about connecting to components on a Kerberos-enabled cluster, see Configuring Connectors for a Secure Cluster.
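To show how these pieces fit together, the following sketch wires the Kafka spout and HDFS bolt configured in the earlier sketches into a topology and submits it to the cluster. The component and topology names are placeholders.

import org.apache.storm.Config;
import org.apache.storm.StormSubmitter;
import org.apache.storm.topology.TopologyBuilder;

TopologyBuilder builder = new TopologyBuilder();

// One instance of the Kafka spout reads from the topic.
builder.setSpout("kafka-spout", kafkaSpout, 1);

// The HDFS bolt subscribes to the spout's stream with shuffle grouping,
// which distributes tuples evenly across bolt instances.
builder.setBolt("hdfs-bolt", hdfsBolt, 1).shuffleGrouping("kafka-spout");

Config conf = new Config();
StormSubmitter.submitTopology("kafka-to-hdfs", conf, builder.createTopology());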