Configuring the Activity Profiler

In addition to the generic configuration, the Activity Profiler has optional parameters that you can edit to control its schedule and the resources available to it.

  1. Go to Profilers and select your data lake.
  2. Go to Profilers > Activity Profiler > Profiler Details > Configuration > All Configurations.
  3. Select a schedule for running the profiler, using either a UNIX cron expression or the Basic scheduler.
    Figure 1. Profiler schedule with cron expression
    Figure 2. Profiler schedule with natural language
  4. Continue with the resource settings:
    1. Set the Maximum number of executors

      Indicates the maximum number of executor processes that the distributed computing framework can use. At least four executors are recommended.

    2. Set the Maximum cores per executor

      Indicates the maximum number of cores that can be allocated to an executor.

    3. Set the Executor memory limit in GBs
    4. Set the Number of driver cores

      Indicates the maximum number of driver cores. Increase the number of cores to improve the speed of profiler job scheduling.

    5. Set the Maximum driver memory in GBs

      Indicates the maximum amount of memory that can be allocated to a driver core. Increasing the available memory accelerates the profiling of larger and more complex tables and helps prevent out-of-memory errors.

  5. Click Save to apply the configuration changes to the selected profiler.
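
Before entering values in the UI, it can help to sanity-check them offline. The following is a minimal Python sketch, not part of the product: the `ProfilerResources` class, its field names, and the helper functions are hypothetical and only mirror the settings listed above. It assumes that the driver memory limit applies to the driver as a whole and that the cron option takes the standard five-field UNIX format; verify both against your environment.

```python
from dataclasses import dataclass


@dataclass
class ProfilerResources:
    """Hypothetical container mirroring the resource fields in the UI."""
    max_executors: int
    max_cores_per_executor: int
    executor_memory_gb: int
    driver_cores: int
    driver_memory_gb: int


def check_resources(r: ProfilerResources) -> list[str]:
    """Return human-readable warnings; an empty list means no issues found."""
    warnings = []
    for name, value in vars(r).items():
        if value < 1:
            warnings.append(f"{name} must be at least 1 (got {value})")
    # The procedure above recommends at least four executors.
    if 1 <= r.max_executors < 4:
        warnings.append("max_executors is below the recommended minimum of 4")
    return warnings


def peak_footprint(r: ProfilerResources) -> dict[str, int]:
    """Upper bound on cores and memory if every executor is in use.

    Assumption: driver_memory_gb is a limit for the driver as a whole.
    """
    return {
        "cores": r.max_executors * r.max_cores_per_executor + r.driver_cores,
        "memory_gb": r.max_executors * r.executor_memory_gb + r.driver_memory_gb,
    }


def looks_like_unix_cron(expr: str) -> bool:
    """Shallow check only: five whitespace-separated fields."""
    return len(expr.split()) == 5


# Example values chosen for illustration, not recommendations.
settings = ProfilerResources(
    max_executors=4,
    max_cores_per_executor=2,
    executor_memory_gb=4,
    driver_cores=1,
    driver_memory_gb=2,
)
print(check_resources(settings))
print(peak_footprint(settings))
print(looks_like_unix_cron("0 2 * * *"))  # minute hour day-of-month month day-of-week
```

With these example values the peak footprint is 9 cores and 18 GB, which you can compare against the capacity actually available to the profiler before you click Save.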