Cloudera Navigator Auditing Architecture
The Cloudera Navigator auditing component provides data auditing and access features. Its architecture is illustrated below.
When the Cloudera Navigator auditing component is configured, plug-ins that enable collection of audit events are added to the HDFS, HBase, and Hive (that is, the HiveServer2 and Beeswax servers) services. The plug-ins write the audit events to an audit log on the local filesystem. Cloudera Impala and Sentry record audit events directly in an audit log file.
The Cloudera Manager Agent monitors the audit log files and sends the events to the Navigator Audit Server, retrying any event that it fails to transmit. Because no transient in-memory buffer is involved, an audit event that has been written to the audit log file is guaranteed to be delivered, as long as the filesystem remains available. The Agent keeps track of the offset of the last audit event that it has successfully transmitted, so after a crash or restart it resumes from the last successfully sent position. Audit logs are rotated, and the Agent follows the rotation. The Agent also purges old audit logs once they have been successfully transmitted to the Navigator Audit Server. If a plug-in fails to write an audit event to the audit log file, it either drops the event or shuts down the process in which it is running, depending on the configured queue policy.
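The offset-tracking behavior described above can be illustrated with a minimal sketch. This is not the Cloudera Manager Agent's actual implementation: the function names, the one-event-per-line log format, the separate offset file, and the send callable are all assumptions made for illustration, and log rotation and purging are left out. The idea shown is only that the offset is advanced and persisted after a successful send, so a failure or restart resumes from the last delivered event.

```python
import os


def read_offset(offset_path):
    """Return the last persisted byte offset, or 0 if none was saved."""
    try:
        with open(offset_path, "r", encoding="utf-8") as f:
            return int(f.read().strip() or 0)
    except FileNotFoundError:
        return 0


def write_offset(offset_path, offset):
    """Persist the offset via write-then-rename so a crash never leaves a partial value."""
    tmp_path = offset_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(str(offset))
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, offset_path)


def ship_new_events(log_path, offset_path, send):
    """Send each complete line after the saved offset; return the number sent.

    `send` is a hypothetical callable standing in for the transfer to the
    audit server; it returns True when the event was delivered.
    """
    offset = read_offset(offset_path)
    sent = 0
    with open(log_path, "rb") as log:
        log.seek(offset)
        while True:
            line = log.readline()
            if not line.endswith(b"\n"):
                break  # end of file, or an event that is only partially written
            if not send(line.rstrip(b"\n").decode("utf-8")):
                break  # delivery failed: leave the offset alone and retry later
            offset = log.tell()
            write_offset(offset_path, offset)  # advance only after success
            sent += 1
    return sent
```

Calling ship_new_events repeatedly (for example, from a polling loop) retries from the first undelivered event each time. Because the offset is saved only after a send succeeds, a crash between the send and the save can cause that one event to be sent again on restart, so this sketch gives at-least-once delivery.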
The Navigator Audit DB stores audit events.