Configure the Batch Profiler
You can customize the batch profiler to specify where the profiler runs, the format of the telemetry input, and various other properties available in the batch profiler properties file.
To run the batch profiler using Spark on Yarn, specify the location in
spark.master=yarnBy default, the batch profiler instructs Spark to run in local mode:
spark.master=local. This mode is only useful for testing with a limited set of data.
You might also want to set the YARN deploy mode to cluster:
spark.submit.deployMode=clusterIn cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application. See the Spark documentation for more information.
Specify the appropriate input format for the batch profiler to consume by modifying
profiler.batch.input.format=text profiler.batch.input.path=hdfs://localhost:8020/apps/metron/indexing/indexed/*/*The Profiler can consume archived telemetry stored in a variety of input formats. By default, it is configured to consume the text/json that CCP archives in HDFS. This is often not the best format for archiving telemetry.
Review the batch profiler's properties located at
$METRON_HOME/config/batch-profiler.propertiesto customize them for your profiler needs.See Batch Profiler Properties for more information on these properties.