Running the Enrichment Loader
After the enrichment source and enrichment configuration are defined, you must run the loader to move the data from the enrichment source to the HCP enrichment store (HBase) and store the enrichment configuration in ZooKeeper.
Note | |
---|---|
There is a special configuration parameter to the Extractor config that is only considered during this loader:
The default is |
Use the loader to move the enrichment source to the enrichment store in ZooKeeper.
Perform the following from the location containing your extractor and enrichment configuration files and your enrichment source. In our example, this information is located at
$METRON_HOME/config
.$METRON_HOME/bin/flatfile_loader.sh -n enrichment_config.json -i whois_ref.csv -t enrichment -c t -e extractor_config.json
The parameters for the utility are as follows:
Short Code Long Code Required Description -h No Generate the help screen/set of options -e --extractor_config Yes JSON document describing the extractor for this input data source -t --hbase_table Yes The HBase table to import into -c --hbase_cf Yes The HBase table column family to import into -i --input Yes The input data location on local disk. If this is a file, then that file will be loaded. If this is a directory, then the files will be loaded recursively under that directory. -l --log4j No The log4j properties file to load -n --enrichment_config No The JSON document describing the enrichments to configure. Unlike other loaders, this is run first if specified. HCP loads the enrichment data into Apache HBase and establishes a ZooKeeper mapping. The data is extracted using the extractor and configuration defined in the
extractor_config.json
file and populated into an HBase table calledenrichment
.Verify that the logs were properly ingested into HBase:
hbase shell scan 'enrichment'
Verify that the ZooKeeper enrichment tag was properly populated:
$METRON_HOME/bin/zk_load_configs.sh -m DUMP -z $ZOOKEEPER_HOST:2181
Generate some data by using the Squid client to execute requests.
Use ssh to access the host for Squid.
Start Squid and navigate to
/var/log/squid
:sudo service squid start sudo su - cd /var/log/squid
Generate some data by entering the following:
squidclient http://www.cnn.com