HDFS ACL permissions model

As administrator, you must understand the permissions model supported in CDP Data Center and later. If you do not use Ranger for security, you need to add users to an HDFS access control list to permit access to the Hive warehouse for running DML queries.

Hive 3 supports the HDFS access control model in place of the earlier Hive permission inheritance, which was based on the hive.warehouse.subdir.inherit.perms parameter setting. In Hive 3, a directory inherits permissions from the Default ACL of its parent. The Default ACL serves as a template from which the Access ACLs of new subdirectories and files are built. The Access ACL, derived from the Default ACL, controls user access to those child directories and files.

Modifications to the Default ACL of a parent are propagated to the Access ACL or Default ACL of new children only. Existing children are unchanged.
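The propagation rule above can be illustrated with a small in-memory model. This is a simplified sketch, not HDFS code: the Node class, the create_child method, and the directory names are hypothetical, and the model ignores the creation mode that real HDFS also applies when it builds a child's Access ACL.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

Acl = Dict[str, str]  # e.g. {"group:sales": "rwx"}


@dataclass
class Node:
    """Hypothetical stand-in for an HDFS file or directory."""
    name: str
    is_dir: bool
    access_acl: Acl = field(default_factory=dict)
    default_acl: Optional[Acl] = None  # only directories carry a Default ACL
    children: Dict[str, "Node"] = field(default_factory=dict)

    def create_child(self, name: str, is_dir: bool) -> "Node":
        # The parent's Default ACL is the template for the new child.
        template = dict(self.default_acl or {})
        child = Node(
            name=name,
            is_dir=is_dir,
            access_acl=dict(template),
            # A new subdirectory also receives the template as its own Default ACL.
            default_acl=dict(template) if is_dir else None,
        )
        self.children[name] = child
        return child


warehouse = Node("warehouse", is_dir=True, default_acl={"group:sales": "r-x"})
old_table = warehouse.create_child("old_table", is_dir=True)

# Changing the parent's Default ACL affects only children created afterwards.
warehouse.default_acl["group:sales"] = "rwx"
new_table = warehouse.create_child("new_table", is_dir=True)
new_file = warehouse.create_child("notes.txt", is_dir=False)

print("existing child:", old_table.access_acl)
print("new directory: ", new_table.access_acl, new_table.default_acl)
print("new file:      ", new_file.access_acl, new_file.default_acl)
```

In this model the existing child keeps r-x for the sales group, while the directory and file created after the change receive rwx.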

The structures of Default ACLs and Access ACLs are identical:
Entity Type                 Entity          Permissions
Owners                      Owning user     r-w-x
Owners                      Owning group    r-w-x
Named groups and users      marketing       r-w-x
Named groups and users      jane            r-w-x
Unnamed groups and users    other           r-w-x
Not applicable              mask            r-w-x

HDFS permissions

The following table describes read, write, and execute permissions on directories and files:
Permission     File                                   Directory
Read (r)       Able to read a file                    Requires r-x to list the directory contents
Write (w)      Able to write or append to a file      Requires w-x to create subdirectories
Execute (x)    Able to parse and run file commands    Requires x to traverse the directory

Permission examples

Consider an example in which you want the sales group to access the contents of a table for all Hive operations. In this case, you must set the Default ACL permissions for the group as default:group:sales:rwx. The default mask on a directory restricts the permissions granted through Default ACLs. The Default ACL for the hive group is default:group:hive:rwx. This entry gives read, write, and execute access to the hive group and sets permissions on the base directory of databases.
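How the mask restricts the permissions granted by ACL entries can be sketched in a few lines of Python. This is an illustrative model rather than HDFS code: the helper names (parse_entry, apply_mask, effective_perms) are hypothetical, and the mask value r-x and the owner, owning-group, and other entries are assumed for the example. It follows the HDFS rule that the mask filters named users, named groups, and the owning group, but not the owning user or other.

```python
from typing import Dict, List, NamedTuple


class AclEntry(NamedTuple):
    scope: str  # "access" or "default"
    kind: str   # "user", "group", "mask", or "other"
    name: str   # empty for the owning user/group, mask, and other
    perms: str  # three characters, e.g. "rwx" or "r-x"


def parse_entry(text: str) -> AclEntry:
    """Parse an entry such as 'default:group:sales:rwx' or 'mask::r-x'."""
    parts = text.split(":")
    scope = "access"
    if parts[0] == "default":
        scope, parts = "default", parts[1:]
    if len(parts) != 3:
        raise ValueError(f"malformed ACL entry: {text!r}")
    kind, name, perms = parts
    return AclEntry(scope, kind, name, perms)


def apply_mask(perms: str, mask: str) -> str:
    """Keep a permission bit only where the mask also allows it."""
    return "".join(p if m != "-" else "-" for p, m in zip(perms, mask))


def effective_perms(entries: List[AclEntry]) -> Dict[str, str]:
    mask = next((e.perms for e in entries if e.kind == "mask"), "rwx")
    result = {}
    for e in entries:
        if e.kind == "mask":
            continue
        # The mask limits named users, named groups, and the owning group.
        masked = e.kind == "group" or (e.kind == "user" and e.name != "")
        result[f"{e.kind}:{e.name}"] = apply_mask(e.perms, mask) if masked else e.perms
    return result


# Assumed Default ACL for illustration; only the sales and hive entries
# come from the text above.
entries = [parse_entry(t) for t in [
    "default:user::rwx",
    "default:group::r-x",
    "default:group:sales:rwx",
    "default:group:hive:rwx",
    "default:mask::r-x",
    "default:other::---",
]]
effective_by_entry = effective_perms(entries)
for key, perms in effective_by_entry.items():
    print(f"{key:<14} {perms}")
```

With the assumed mask of r-x, the sales and hive groups end up with r-x even though their entries grant rwx, so the mask must allow rwx for those groups to run all Hive operations.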

[Figure: directory and file permissions required to read sales_report.txt]

[Figure: directory and file permissions required to append to sales_report.txt]

[Figure: directory permissions required to delete sales_report.txt]
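The permissions needed for these three operations can be sketched as a small Python function. This is a simplified model based on general HDFS permission semantics, not a reproduction of the original examples: the function name required_permissions and the path /warehouse/sales/sales_report.txt are hypothetical, and the sticky bit is ignored.

```python
from pathlib import PurePosixPath
from typing import Dict


def required_permissions(path: str, operation: str) -> Dict[str, str]:
    """Return the minimum permissions needed on each path component."""
    target = PurePosixPath(path)
    parent = target.parent
    # Every directory on the path must be traversable (x).
    required = {str(d): "--x" for d in [parent, *parent.parents]}
    if operation == "read":
        required[str(target)] = "r--"   # read the file itself
    elif operation == "append":
        required[str(target)] = "-w-"   # write to the file itself
    elif operation == "delete":
        required[str(parent)] = "-wx"   # modify the parent directory; the
                                        # file's own permissions do not matter
    else:
        raise ValueError(f"unsupported operation: {operation!r}")
    return required


# Hypothetical location of the file used in the examples.
report = "/warehouse/sales/sales_report.txt"
for op in ("read", "append", "delete"):
    print(op)
    for component, perms in sorted(required_permissions(report, op).items()):
        print(f"  {perms}  {component}")
```

Reading and appending depend on the file's own r or w permission plus x on every enclosing directory, whereas deleting depends only on directory permissions.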

Impersonating the user to control access to YARN queues

As administrator, if you do not use the recommended Ranger security, you can enable the storage-based authorization (SBA) doAs impersonation parameter to control access to YARN queues. You use the Cloudera Manager Safety Valve feature to enable the following property: hive.server2.enable.doAs. In Cloudera Manager, under HiveServer2 Enable Impersonation, you select the HIVE_ON_TEZ-1 service.

Click HIVE_ON_TEZ-1 > Configuration and search for hive.server2.enable . . .

By enabling impersonation, you can allow HiveServer to authorize access to YARN queues for the original user who submitted the query while running the Tez application as the hive user. You configure parameters as follows:

  • To disable impersonation: doAs=false (the Hive default and recommended setting)

    Result: The Tez app is submitted as hive, and no access check for the user, for example joe, is performed.

  • To enable impersonation: doAs=true

    When a user, for example joe, submits the query through HiveServer to the YARN queue, for example y_q, the Tez app is started for joe and access to y_q is checked for this user.
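The two outcomes listed above can be summarized in a short Python sketch. It only restates the described behavior and does not call any Hive or YARN API; the names Submission and submit_query are hypothetical.

```python
from typing import NamedTuple, Optional


class Submission(NamedTuple):
    tez_app_started_for: str                  # user the Tez app is submitted as
    queue_access_checked_for: Optional[str]   # None means no per-user check


def submit_query(end_user: str, do_as: bool, service_user: str = "hive") -> Submission:
    """Model the effect of hive.server2.enable.doAs on a submitted query."""
    if do_as:
        # doAs=true: the app is started for the end user, and access to the
        # YARN queue is checked for that user.
        return Submission(end_user, end_user)
    # doAs=false: the app is submitted as hive, with no check for the end user.
    return Submission(service_user, None)


print(submit_query("joe", do_as=False))
print(submit_query("joe", do_as=True))
```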