Bulk Loading CSV Threat Intelligence Information
Metron is designed to work with STIX/TAXII threat feeds, but it can also be bulk loaded with threat data from a CSV file. Comma-separated values (CSV) is a simple file format for storing tabular data, such as the contents of a spreadsheet or database table. Files in CSV format can be imported into and exported from programs that store data in tables.
The $METRON_HOME/bin/flatfile_loader.sh script reads data from the local disk and loads the threat intelligence data into an HBase table. This loader uses the special configuration parameter inputFormatHandler to specify how the input data is divided into units for the Extractor. The two implementations are BY_LINE and org.apache.metron.dataloads.extractor.inputformat.WholeFileFormat. The default is BY_LINE, which makes sense for a CSV file in which each line is one unit of information to be imported. However, if you are importing a set of STIX documents, you want each whole document to be passed to the Extractor as a single input, which is what WholeFileFormat does.
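The difference between the two inputFormatHandler settings can be pictured with a short sketch. The Python below is illustrative only and is not Metron's implementation: the function name split_into_units and the sample data are hypothetical, and skipping blank lines in BY_LINE mode is an assumption made for the example.

```python
# Illustrative sketch only: mimics how the two inputFormatHandler
# settings divide input into units. This is NOT Metron code; the
# function name and sample data are hypothetical.
BY_LINE = "BY_LINE"
WHOLE_FILE = "org.apache.metron.dataloads.extractor.inputformat.WholeFileFormat"


def split_into_units(text, handler=BY_LINE):
    """Return the units that would each be handed to the extractor."""
    if handler == BY_LINE:
        # One unit per non-empty line, as for a CSV list of indicators.
        # (Skipping blank lines is an assumption of this sketch.)
        return [line for line in text.splitlines() if line.strip()]
    if handler == WHOLE_FILE:
        # The entire file is a single unit, as for one STIX document.
        return [text]
    raise ValueError("unknown handler: " + handler)


csv_text = "10.0.0.1,malicious_ip\n10.0.0.2,malicious_ip\n"
by_line_units = split_into_units(csv_text)
whole_file_units = split_into_units(csv_text, WHOLE_FILE)
print(len(by_line_units), len(whole_file_units))
```

With the two-line sample, the BY_LINE mode yields one unit per CSV row, while the whole-file mode yields a single unit containing the full text, which is the behavior a multi-line STIX document needs.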