Streaming Data into HCP Overview
The first task in adding a new telemetry data source is to stream all raw events from that source into its own Kafka topic.
Although HCP includes parsers for several data sources (for example, Bro, Snort, and YAF), you must still stream the raw data into HCP through a Kafka topic.
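For example, assuming a topic named snort and the Kafka command-line tools that ship with HDP (the install path and ZooKeeper address below are assumptions about your environment), you might create and verify the topic as follows:

    # Create a dedicated Kafka topic for the new telemetry source
    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh \
        --zookeeper $ZOOKEEPER_HOST:2181 \
        --create --topic snort \
        --partitions 1 --replication-factor 1

    # Confirm that the topic was created
    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh \
        --zookeeper $ZOOKEEPER_HOST:2181 --list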
If you choose to use the Snort telemetry data source, you must meet the following configuration requirements:
- When you install and configure Snort, ensure that indexing and analytics function properly by configuring Snort to include the year in the timestamp. To do so, modify the snort.conf file as follows:

    # Configure Snort to show year in timestamps
    config show_year
- By default, the Snort parser uses ZoneId.systemDefault() as the source timeZone for the incoming data and MM/dd/yy-HH:mm:ss.SSSSSS as the default dateFormat. Valid time zones are defined by Java's ZoneId.getAvailableZoneIds(), and date formats should use the patterns defined in https://docs.oracle.com/javase/8/docs/api/java/time/format/DateTimeFormatter.html. The following sample configuration shows the dateFormat and timeZone values explicitly set in the parser configuration:

    "parserConfig": {
        "dateFormat": "MM/dd/yy-HH:mm:ss.SSSSSS",
        "timeZone": "America/New_York"
    }
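For context, a complete Snort sensor parser configuration might look like the following minimal sketch. The parserClassName and sensorTopic values shown here are assumptions (the default Metron Snort parser class and the example topic name used earlier) and should be adjusted to match your deployment:

    {
        "parserClassName": "org.apache.metron.parsers.snort.BasicSnortParser",
        "sensorTopic": "snort",
        "parserConfig": {
            "dateFormat": "MM/dd/yy-HH:mm:ss.SSSSSS",
            "timeZone": "America/New_York"
        }
    }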
Depending on the type of data you are streaming into HCP, you can use one of the following methods:
- NiFi

  This streaming method works for most types of data sources. To use NiFi with HCP, you must install it manually and run it on port 8089 (see the nifi.properties sketch after this list). For information on installing NiFi, see the NiFi documentation.

  Important: NiFi cannot be installed on top of HDP, so you must install it manually to use it with HCP.
- Performant network ingestion probes

  This streaming method is ideal for streaming high-volume packet data.
- Real-time and batch threat intelligence feed loaders

  This streaming method works for intelligence feeds that you want to view in real time or collect in batches to view or query at a later date.
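For the NiFi option above, the port requirement is typically met by editing NiFi's conf/nifi.properties file before starting the service. The property name is standard NiFi configuration; the file path is relative to your NiFi installation directory:

    # conf/nifi.properties
    # Listen for the NiFi web UI on port 8089, as required for use with HCP
    nifi.web.http.port=8089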