
Ranger audit log event summarization

You can summarize Ranger audit events that differ only by timestamp to reduce the number of events logged in a busy system.

When summarization is enabled and a Ranger plugin logs consecutive audit events that differ only by timestamp, the plugin coalesces all such events into a single event. It sets event_count to the number of events logged and event_dur_ms to the time difference, in milliseconds, between the first and the last event.
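The coalescing rule described above can be sketched as follows. This is a minimal illustration of the documented behavior, not Ranger's implementation: the `AuditEvent` class and its `user`, `resource`, `access_type`, `result`, and `event_time_ms` fields are hypothetical simplifications, and the input is assumed to be in time order.

```python
from dataclasses import dataclass


@dataclass
class AuditEvent:
    # Hypothetical, simplified event; real Ranger audit events carry more fields.
    user: str
    resource: str
    access_type: str
    result: str
    event_time_ms: int
    event_count: int = 1
    event_dur_ms: int = 0


def _identity(event):
    # Every field except the timestamp and the two summary fields.
    return (event.user, event.resource, event.access_type, event.result)


def summarize(events):
    """Coalesce runs of consecutive events that differ only by timestamp."""
    summarized = []
    for event in events:
        if summarized and _identity(summarized[-1]) == _identity(event):
            head = summarized[-1]
            head.event_count += 1
            # Duration spans from the first event of the run to the latest one.
            head.event_dur_ms = event.event_time_ms - head.event_time_ms
        else:
            # Start a new run with a fresh copy so the input is not mutated.
            summarized.append(
                AuditEvent(event.user, event.resource, event.access_type,
                           event.result, event.event_time_ms)
            )
    return summarized


events = [
    AuditEvent("alice", "db1.t1", "select", "allowed", 1000),
    AuditEvent("alice", "db1.t1", "select", "allowed", 1400),
    AuditEvent("alice", "db1.t1", "select", "allowed", 1900),
    AuditEvent("bob", "db1.t1", "select", "denied", 2000),
]
for summary in summarize(events):
    print(summary.user, summary.event_count, summary.event_dur_ms)
```

In this example the three events for alice collapse into one event with event_count 3 and event_dur_ms 900, while the single event for bob is passed through unchanged.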

To enable this feature, set the following properties in the Ranger plugin's configuration. For example, to enable audit summarization for the Hive plugin, add these properties to the ranger-hive-audit.xml file in the Hive configuration:
Configurations Description
xasecure.audit.provider.summary.enabled

Set this property to true to enable summarization. Audit messages are then summarized before they are sent to the configured sinks.

By default it is set to false, which means audit summarization is disabled.

xasecure.audit.provider.queue.size

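Controls the size of the summary queue. If unspecified, this value defaults to 1048576, which means the queue is sized to store 1M (1024 * 1024) messages.

Note that, unlike the other two properties, the name of this property does not contain "summary", even though it controls the size of the summary queue.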

xasecure.audit.provider.summary.interval.ms

The maximum time interval, in milliseconds, over which messages are summarized.

If unspecified, it defaults to 5000, which means 5 seconds.

Summarization Batch size

Note that, regardless of this time interval, at most 100k messages at a time are considered for aggregation during summarization. Thus, if more than 100k messages are logged during the interval, similar messages can show up as multiple summarized audit messages even though they were logged within the configured time interval.
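The effect of the batch limit can be illustrated with a short sketch. It assumes that "100k" means exactly 100,000 messages and that each batch is summarized independently; the function name and the synthetic timestamps are hypothetical and serve only to show why one burst of identical events can yield several summaries.

```python
BATCH_LIMIT = 100_000  # assumption: "100k" taken as exactly 100,000 messages


def summarize_identical_events(timestamps_ms, batch_limit=BATCH_LIMIT):
    """Summarize identical events, given only their timestamps, batch by batch.

    Returns one (event_count, event_dur_ms) tuple per batch.
    """
    summaries = []
    for start in range(0, len(timestamps_ms), batch_limit):
        batch = timestamps_ms[start:start + batch_limit]
        summaries.append((len(batch), batch[-1] - batch[0]))
    return summaries


# 250,000 identical events spread evenly over 5 seconds (50 events per millisecond).
timestamps = [i // 50 for i in range(250_000)]
print(summarize_identical_events(timestamps))
# prints [(100000, 1999), (100000, 1999), (50000, 999)]
```

Although all 250,000 events fall inside a single 5000 ms interval, they are reported as three summarized events because no batch holds more than the limit.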

Currently, this 100k limit is not user-configurable. It is mentioned here to help explain the summarization logic.
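As a sketch, the three properties in the table above might be written as follows. The `<configuration>`/`<property>` layout is the Hadoop-style format that a file such as ranger-hive-audit.xml is assumed to use here. The values shown are the documented defaults, except that summarization is switched on. The Python wrapper is only a convenience for checking that the fragment parses and contains the expected names.

```python
import xml.etree.ElementTree as ET

# Assumed Hadoop-style layout; values are the documented defaults,
# except summary.enabled, which is set to true to turn the feature on.
AUDIT_SUMMARY_XML = """
<configuration>
  <property>
    <name>xasecure.audit.provider.summary.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.provider.queue.size</name>
    <value>1048576</value>
  </property>
  <property>
    <name>xasecure.audit.provider.summary.interval.ms</name>
    <value>5000</value>
  </property>
</configuration>
"""


def read_properties(xml_text):
    """Return a {name: value} dict from a Hadoop-style configuration document."""
    root = ET.fromstring(xml_text.strip())
    return {
        prop.findtext("name"): prop.findtext("value")
        for prop in root.iter("property")
    }


props = read_properties(AUDIT_SUMMARY_XML)
for name, value in props.items():
    print(f"{name} = {value}")
```

Only the first property is strictly required to turn the feature on; the other two can be omitted to accept the defaults described above.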