Configuring the Activity Profiler

In addition to the generic configuration, you can configure scheduling and available resources for the Activity Profiler.

  1. Go to Profilers and select your data lake.
  2. Go to Profilers > Activity Profiler > Profiler Details > Configuration > All Configurations.
  3. Select a schedule for running the profiler, using either a UNIX cron expression or the Basic scheduler.
    Figure 1. Profiler schedule with cron expression
    Figure 2. Profiler schedule with natural language
  4. Configure the resources.
    1. Set the Maximum number of executors.

      Specifies the maximum number of executor processes that the distributed computing framework can use. The recommended value is at least four executors.

    2. Set the Maximum cores per executor.

      Specifies the maximum number of cores that can be allocated to an executor.

    3. Set the Executor memory limit in GBs.
    4. Set the Number of driver cores.

      Specifies the maximum number of driver cores. Increase the number of cores to improve the speed of profiler job scheduling.

    5. Set the Maximum driver memory in GBs.

      Specifies the maximum amount of memory that can be allocated to a driver core. Increasing the available memory accelerates the profiling of larger and more complex tables and helps prevent out-of-memory errors.

  5. Click Save to apply the configuration changes to the selected profiler.
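
Before saving, it can help to sanity-check the resource values from step 4 and estimate the most capacity the profiler could request at once. The following is a minimal sketch, not part of the product: the `ProfilerResources` class, its field names, and both helper functions are hypothetical and only mirror the five fields in the UI. It assumes the driver memory limit applies once to the driver as a whole, and it ignores any per-process overhead that the underlying framework may add.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfilerResources:
    """Hypothetical container mirroring the five resource fields in step 4."""
    max_executors: int
    max_cores_per_executor: int
    executor_memory_gb: int
    driver_cores: int
    driver_memory_gb: int


def check(resources: ProfilerResources) -> list[str]:
    """Return readable problems; an empty list means nothing was flagged."""
    problems = []
    # Every field is a count or a size, so values below 1 make no sense.
    for name, value in vars(resources).items():
        if value < 1:
            problems.append(f"{name} must be at least 1, got {value}")
    # The documentation recommends at least four executors.
    if 1 <= resources.max_executors < 4:
        problems.append("max_executors is below the recommended minimum of 4")
    return problems


def peak_footprint(resources: ProfilerResources) -> tuple[int, int]:
    """Upper bound on (cores, memory in GB) when every executor is running.

    Assumption: driver memory is counted once, and framework overhead is ignored.
    """
    cores = (resources.max_executors * resources.max_cores_per_executor
             + resources.driver_cores)
    memory_gb = (resources.max_executors * resources.executor_memory_gb
                 + resources.driver_memory_gb)
    return cores, memory_gb


if __name__ == "__main__":
    settings = ProfilerResources(
        max_executors=4,
        max_cores_per_executor=2,
        executor_memory_gb=4,
        driver_cores=1,
        driver_memory_gb=2,
    )
    print(check(settings))           # no problems flagged for these values
    print(peak_footprint(settings))  # (9, 18): 4*2+1 cores, 4*4+2 GB
```

Comparing the resulting core and memory totals against the capacity available in your environment gives a rough indication of whether the chosen limits are realistic.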