Administration

General Purpose Parsers

General purpose parsers are primarily designed for lower-velocity topologies or for quickly setting up a temporary parser for a new telemetry source. They are defined using a config file, so you do not need to recompile the topology to change them. HCP supports two general purpose parsers: Grok and CSV.

Grok parser

The Grok parser class name (parserClassName) is org.apache.metron.parsers.GrokParser.

Grok has the following entries and predefined patterns for parserConfig:

grokPath

The path in HDFS (or in the Jar) to the Grok statement.

patternLabel

The pattern label to use from the Grok statement

timestampField

The field to use for timestamp

timeFields

A list of fields to be treated as time

dateFormat

The date format to use to parse the time fields

timezone

The timezone to use. UTC is the default.
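
As a rough illustration of how these entries fit together, the following Python sketch assembles a Grok parser configuration and serializes it to JSON. Only the key names and the class name come from the text above. The path, pattern label, field names, and date format are hypothetical placeholders, the date format string assumes a Java-style pattern, and a real configuration file may need additional top-level keys that are not covered here.

```python
import json

# Sketch only: every value inside parserConfig is a hypothetical placeholder.
grok_parser_config = {
    "parserClassName": "org.apache.metron.parsers.GrokParser",
    "parserConfig": {
        "grokPath": "/patterns/example_sensor",     # placeholder HDFS path
        "patternLabel": "EXAMPLE_PATTERN",          # placeholder label in the Grok statement
        "timestampField": "timestamp",              # placeholder field name
        "timeFields": ["start_time", "end_time"],   # placeholder field names
        "dateFormat": "yyyy-MM-dd HH:mm:ss",        # assumed Java-style pattern
        "timezone": "UTC",                          # UTC is the documented default
    },
}

# Serialize to the JSON text that would go into the parser config file.
grok_config_json = json.dumps(grok_parser_config, indent=2)
print(grok_config_json)
```

Building the configuration as a dictionary and serializing it keeps the JSON well-formed, which helps avoid quoting mistakes when the file is edited by hand.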

CSV parser

The CSV parser class name (parserClassName) is org.apache.metron.parsers.csv.CSVParser.

CSV has the following entries and predefined patterns for parserConfig:

timestampFormat

The date format of the timestamp to use. If unspecified, the parser assumes the timestamp is in milliseconds since the UNIX epoch.

columns

A map of the column names to extract from the CSV to their zero-based offsets. For example, { 'name' : 1, 'profession' : 3 } is a column map that extracts the second and fourth columns from a CSV.

separator

The column separator. The default value is ",".
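
The column map described above can be sketched in Python as follows. The configuration uses the documented key names and the column map from the example. The helper extract_columns and the sample record are hypothetical and only mimic the offset lookup with Python's csv module. This is not how the parser itself is implemented, and timestampFormat is left out for brevity.

```python
import csv
import io
import json

# Sketch only: the column map mirrors the example in the text above.
csv_parser_config = {
    "parserClassName": "org.apache.metron.parsers.csv.CSVParser",
    "parserConfig": {
        "columns": {"name": 1, "profession": 3},
        "separator": ",",  # the documented default
    },
}
csv_config_json = json.dumps(csv_parser_config, indent=2)


def extract_columns(line, columns, separator=","):
    """Hypothetical helper: pick fields from one CSV line by zero-based offset."""
    row = next(csv.reader(io.StringIO(line), delimiter=separator))
    return {name: row[offset] for name, offset in columns.items()}


# Made-up record with four columns; offsets 1 and 3 are the 2nd and 4th.
sample_line = "42,Ada,London,engineer"
parser_config = csv_parser_config["parserConfig"]
extracted = extract_columns(
    sample_line, parser_config["columns"], parser_config["separator"]
)
print(extracted)  # {'name': 'Ada', 'profession': 'engineer'}
```

Offsets 1 and 3 select the second and fourth fields of the record, matching the zero-based reading of the column map.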