Configuring the Ranger Audit Profiler

In addition to the generic configuration, you can configure scheduling and available resources for the Ranger Audit Profiler.

  1. Go to Profilers and select your data lake.
  2. Go to Profilers > Configs.
  3. Select Ranger Audit Profiler.
    The Detail page is displayed.
  4. Click the toggle to enable or disable the profiler.
  5. Select a schedule to run the profiler using a Quartz cron expression.
  6. Set the Input block size.
  7. Continue with the Pod Configurations and set the Kubernetes job resources.

    Pod configurations specify the resources that are allocated to a pod when the profiler job starts to run. Because all profilers are submitted as Kubernetes jobs, you must decide whether to add or reduce resources to handle workloads of various sizes.

    • Pod CPU Limit – Specifies the maximum number of CPU cores that can be allocated to a pod. The value range is from 1 through 8.
    • Pod CPU Requirement – Specifies the minimum number of CPUs that are allocated to a pod when it is provisioned. If the node where a pod is running has enough resources available, a container can use more of a resource than its request specifies. However, a container is not allowed to exceed its resource limit. The value range is from 1 through 8.
    • Pod Memory Limit – Specifies the maximum amount of memory that can be allocated to a pod. The value range is from 1 through 256.
    • Pod Memory Requirement – Specifies the minimum amount of RAM that is allocated to a pod when it is provisioned. If the node where a pod is running has enough resources available, a container can use more of a resource than its request specifies. However, a container is not allowed to exceed its resource limit. The value range is from 1 through 256.
  8. Update the Executor Configurations.

    Executor configurations specify the runtime settings. Change these configurations when you change the pod configurations or when you require additional compute power.

      • Number of workers – Specifies the number of processes that are used by the distributed computing framework. The value range is from 1 through 8.
      • Number of threads per worker – Specifies the number of threads used by each worker to complete the job. The value range is from 1 through 8.
      • Worker Memory limit in GB – Enforces a memory usage threshold for a given worker. For example, with an 8 GB pod and 4 threads, set this parameter to 2 GB. The value range is from 1 through 4.
  9. Click Save to apply the configuration changes to the selected profiler.
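As a concrete scheduling example, the Quartz cron expression `0 0 2 * * ?` runs the profiler daily at 2:00 AM (Quartz expressions include a seconds field and use `?` for an unspecified day-of-week). The resource rules above can also be sketched in code. The following is an illustrative sketch only, not part of any profiler API: the helper names are hypothetical, and it assumes the worker-memory example (8 GB pod, 4 threads, 2 GB limit) divides pod memory by the total thread count.

```python
# Illustrative sketch only: these helpers are hypothetical and not part of any
# profiler API. They mirror the documented value ranges and the worker-memory
# example above.

def validate_pod_config(cpu_request: int, cpu_limit: int,
                        mem_request_gb: int, mem_limit_gb: int) -> None:
    """Reject values outside the documented ranges (CPU 1-8, memory 1-256 GB)."""
    if not (1 <= cpu_request <= cpu_limit <= 8):
        raise ValueError("CPU request and limit must be 1-8, with request <= limit")
    if not (1 <= mem_request_gb <= mem_limit_gb <= 256):
        raise ValueError("memory request and limit must be 1-256 GB, with request <= limit")

def suggested_worker_memory_gb(pod_memory_gb: int, workers: int,
                               threads_per_worker: int) -> int:
    """Divide pod memory across all worker threads, clamped to the 1-4 GB range.

    Assumption: the documented example (8 GB pod, 4 threads -> 2 GB) divides
    pod memory by the total thread count.
    """
    per_thread = pod_memory_gb // (workers * threads_per_worker)
    return max(1, min(per_thread, 4))

# Example: a mid-sized pod passes validation, and the 8 GB / 4 thread case
# reproduces the 2 GB worker memory limit from the documentation.
validate_pod_config(cpu_request=1, cpu_limit=4, mem_request_gb=4, mem_limit_gb=8)
print(suggested_worker_memory_gb(8, 1, 4))  # prints 2
```

The clamp to 1 through 4 reflects the documented range of the Worker Memory limit; raising the pod memory without also raising workers or threads eventually saturates at 4 GB per worker.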