Configuring query audit logs to include caller context

Learn how to configure Hive to audit the partitions that are scanned as part of a Hive query. The audit information comprises the Query ID and User ID, which helps in meeting compliance requirements, such as controlling user access to data from specific regions or from particular time periods.

The audit information is written to the HDFS audit log. You can access the audit log from either Cloudera Manager or the command line:
  • From Cloudera Manager:

    Access the log file from Clusters > HDFS-1 > NameNode Web UI > Utilities > logs > hdfs-audit.log

  • From the command line:

    Access the log file from /var/log/hadoop-hdfs/hdfs-audit.log

  1. Log in to Cloudera Manager as an administrator and go to Clusters > HDFS-1.
  2. In the HDFS-1 service page, click the Configuration tab and search for the "hadoop.caller.context.enabled" property.
  3. Enable the property to allow the caller context to be included in the audit logs.
    This allows additional fields to be written into HDFS NameNode audit log records for auditing coarse-grained operations.
  4. Click Save Changes and restart the HDFS-1 service.
  5. After the service restarts, run your Hive queries.
You can view the audit information of the queries by accessing the hdfs-audit.log file.
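
Once caller context is enabled, the audit records can also be filtered programmatically. The following is a minimal sketch, not an official tool. It assumes that each audit record is a line of tab-separated key=value fields following an "FSNamesystem.audit: " prefix, and that the caller context appears in a field named callerContext. The sample line, the function names, and the HIVE_QUERY_ID value shown are hypothetical illustrations; check them against the records in your own hdfs-audit.log file.

```python
# Assumption: audit records are tab-separated key=value fields that follow
# an "FSNamesystem.audit: " prefix. Verify against your own log file.
AUDIT_PREFIX = "FSNamesystem.audit: "

# Hypothetical sample record, made up for illustration only.
SAMPLE_LINE = (
    "2024-01-01 12:00:00,000 INFO FSNamesystem.audit: allowed=true\t"
    "ugi=hive (auth:SIMPLE)\tip=/10.0.0.1\tcmd=listStatus\t"
    "src=/warehouse/sales/region=emea\tdst=null\tperm=null\tproto=rpc\t"
    "callerContext=HIVE_QUERY_ID:example_query_id"
)


def parse_audit_line(line):
    """Return the key=value fields of one audit record as a dict."""
    # Keep only the text after the logger prefix (or the whole line if absent).
    payload = line.rstrip("\n").split(AUDIT_PREFIX, 1)[-1]
    fields = {}
    for part in payload.split("\t"):
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = part.partition("=")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def records_with_caller_context(lines):
    """Yield parsed records that carry a callerContext field."""
    for line in lines:
        fields = parse_audit_line(line)
        if "callerContext" in fields:
            yield fields


if __name__ == "__main__":
    record = parse_audit_line(SAMPLE_LINE)
    print(record["ugi"], record["src"], record["callerContext"])
```

To scan a real log, open /var/log/hadoop-hdfs/hdfs-audit.log and pass the file object to records_with_caller_context. Records written before the property was enabled carry no callerContext field and are skipped.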