In addition to turning on and off the profiler configurations, the individual
profilers can be run with their own execution parameters. These parameters are for
submission of the profiler job onto Spark. You can edit the configuration of profilers and update these parameters to run profiler jobs.
-
Click Profilers in the main navigation menu on the
left.
-
Click Configs to view all of the configured
profilers.
- Select the cluster for which you need to edit profiler configuration.
The list of profilers for the selected clusters is displayed.
- Click the name of the profiler whose configuration you wish to edit.
The Profiler Configuration tab is displayed in the right panel.
-
Select a schedule to run the profiler. This is implemented as a quartz cron
expression.
-
Select Last Run Check.
Last Run Check configuration enables profilers like Hive Column
Statistics and Cluster Sensitivity Profiler to avoid profiling
the same asset on each scheduled run. If you have scheduled a cron job, say for
about an hour, and have enabled the Last Run Check configuration for two
days, this set-up ensures that the job scheduler filters out any asset which was
already profiled in the last two days.
If the Last Run Check configuration is disabled,
assets will be picked up for profiling as per the job cron schedule, honoring
the asset filter rules, if any.
- Update the advanced options.
- Number of Executors - Enter the number of executors to launch for running this profiler.
- Executor Cores - Enter the number of cores to be used for each executor.
- Executor Memory - Enter the amount of memory in GB to be used per executor process.
- Driver Cores - Enter the number of cores to be used for the driver process.
- Driver Memory - Enter the memory to be used for the driver processes.
- Toggle the state of the profiler from Active to Inactive as needed.
- Click Save to apply the configuration changes to the selected profiler. The changes should appear in the profiler description.