Run the Enrichment Loader

After the enrichment source and enrichment configuration are defined, you must run the loader to move the data from the enrichment source to the CCP enrichment store (HBase) and store the enrichment configuration in ZooKeeper.

  1. If you will be bulk loading enrichments using the MR mode, add the metron user to the access control list of users and groups in the Yarn Queue Manager.
    You do not need to perform this step if you use the LOCAL mode.
    1. In Ambari, click (Ambari views) and choose Yarn Queue Manager.
    2. In the upper left corner of the window, under Add Queue, select default.
      Ambari displays the default information on the right side of the window.
    3. By Submit Applications, click Custom.
    4. In the Users field, add metron.
      For example:
  2. Use the loader to move the enrichment source to the enrichment store in ZooKeeper.
    Perform the following from the location containing your extractor and enrichment configuration files and your enrichment source. In our example, this information is located at $METRON_HOME/config.
    $METRON_HOME/bin/ -n enrichment_config.json -i whois_ref.csv -t enrichment -c t -e 
    The parameters for the utility are as follows:
    Short Code Long Code Required Description
    -h No Generate the help screen/set of options
    -e --extractor_config Yes JSON document describing the extractor for this input data source
    -t --hbase_table Yes The HBase table to import into
    -c --hbase_cf Yes The HBase table column family to import into
    -i --input Yes The input data location on local disk. If this is a file, then that file will be loaded. If this is a directory, then the files will be loaded recursively under that directory.
    -l --log4j No The log4j properties file to load
    -n --enrichment_config No The JSON document describing the enrichments to configure. Unlike other loaders, this is run first if specified.
    CCP loads the enrichment data into Apache HBase and establishes a ZooKeeper mapping. The data is extracted using the extractor and configuration defined in the extractor_config.json file and populated into an HBase table called enrichment.
  3. Verify that the logs were properly ingested into HBase:
    hbase shell
    scan 'enrichment'
  4. Verify that the ZooKeeper enrichment tag was properly populated:
  5. Generate some data by using the Squid client to execute requests.
    1. Use ssh to access the host for Squid.
    2. Start Squid and navigate to /var/log/squid:
      sudo service squid start
      sudo su - 
      cd /var/log/squid
    3. Generate some data by entering the following: