HDFS ACL permissions model
As an administrator, you must understand the permissions model supported in CDP Data Center and later. If you do not use Ranger for security, you need to add users to an HDFS access control list (ACL) to permit access to the Hive warehouse for running DML queries.
Hive 3 supports the HDFS access control model instead of the earlier Hive permission inheritance, which was based on the hive.warehouse.subdir.inherit.perms parameter setting. In Hive 3, a directory inherits permissions from the Default ACL. The Default ACL serves as a template from which Access ACLs for subdirectories and files are built. The Access ACL manages user access to child directories and files derived from the Default ACL.
Modifications to the Default ACL of a parent are propagated to the Access ACL or Default ACL of new children only. Existing children are unchanged.
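The propagation rule above can be illustrated with a small Python sketch. This is a toy in-memory model, not an HDFS API: the Directory class, the mkdir method, and the directory names are hypothetical, and the sketch ignores masks and umask handling.

```python
from dataclasses import dataclass, field


@dataclass
class Directory:
    """Toy model of a directory that carries an Access ACL and a Default ACL."""
    name: str
    access_acl: dict = field(default_factory=dict)   # controls access to this directory
    default_acl: dict = field(default_factory=dict)  # template for new children
    children: list = field(default_factory=list)

    def mkdir(self, name: str) -> "Directory":
        # A new subdirectory copies the parent's Default ACL into both its
        # Access ACL and its own Default ACL at creation time.
        child = Directory(name, dict(self.default_acl), dict(self.default_acl))
        self.children.append(child)
        return child


warehouse = Directory("warehouse", default_acl={"group:hive": "rwx"})
old_db = warehouse.mkdir("old.db")

# Changing the parent's Default ACL affects only children created afterwards.
warehouse.default_acl["group:sales"] = "rwx"
new_db = warehouse.mkdir("new.db")

print(old_db.access_acl)  # {'group:hive': 'rwx'}
print(new_db.access_acl)  # {'group:hive': 'rwx', 'group:sales': 'rwx'}
```

The existing child old.db keeps the ACL it received when it was created, which is why existing directories must be updated explicitly if the sales group also needs access to them.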
|Named groups and users||marketing||r-w-x|
|Unnamed groups and users||other||r-w-x|
|Permission||File||Directory|
|Read (r)||Able to read a file||Requires r-x to list the directory contents|
|Write (w)||Able to write or append to a file||Requires w-x to create subdirectories|
|Execute (x)||Able to parse and run file commands||Requires x to traverse the directory|
Consider an example in which you want the sales group to access the contents of a table for all Hive operations. In this case, you must set the Default ACL permission for the group to default:group:sales:rwx. The default mask on the directories restricts the permissions granted using Default ACLs. The Default ACL for the hive group is default:group:hive:rwx. This entry gives read, write, and execute access to the hive group and sets permissions on the base directory of databases.
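As a minimal sketch of the sales group example, the following Python helpers build the argument list for the hdfs dfs -setfacl -m command and show how a mask limits what an ACL entry actually grants. The helper names and the table path /warehouse/sales.db/orders are assumptions made for illustration. The sketch only builds the command and does not run it.

```python
def setfacl_args(path: str, group: str, perms: str = "rwx") -> list:
    """Build the argv for `hdfs dfs -setfacl -m` that adds a Default ACL entry for a group."""
    return ["hdfs", "dfs", "-setfacl", "-m", f"default:group:{group}:{perms}", path]


def effective(entry: str, mask: str) -> str:
    """Keep a permission only where both the ACL entry and the mask grant it."""
    return "".join(e if e == m and e != "-" else "-" for e, m in zip(entry, mask))


# Hypothetical table directory; substitute the real warehouse path.
print(" ".join(setfacl_args("/warehouse/sales.db/orders", "sales")))

# A mask of r-x strips write access from an rwx entry.
print(effective("rwx", "r-x"))  # r-x
print(effective("rwx", "rwx"))  # rwx
```

If the effective permissions for a named group come out narrower than the entry you set, the mask on the directory is the first thing to inspect.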
The following example shows directory and file permissions required to read sales_report.txt:
The following example shows directory and file permissions required to append to sales_report.txt:
The following example shows directory permissions required to delete sales_report.txt:
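The three cases above can be summarized in a short Python sketch. It is not the original example output. It assumes the general HDFS rules that every directory on the path must be traversable (x), that reading or appending is decided by the file's own r or w permission, and that deleting is decided by the w permission on the parent directory. The paths, the permission strings, and the can function are hypothetical, and the root directory is omitted for brevity.

```python
# Hypothetical layout with the permissions that apply to one user.
FILE = "/warehouse/sales.db/sales_report.txt"
perms = {
    "/warehouse": "r-x",
    "/warehouse/sales.db": "rwx",
    FILE: "rw-",
}


def can(operation: str, path: str, perms: dict) -> bool:
    """Return True if the permissions in `perms` allow the operation on `path`."""
    parts = path.strip("/").split("/")
    ancestors = ["/" + "/".join(parts[:i]) for i in range(1, len(parts))]
    # Every directory on the path must be traversable.
    if not all("x" in perms[d] for d in ancestors):
        return False
    if operation == "read":
        return "r" in perms[path]           # r on the file
    if operation == "append":
        return "w" in perms[path]           # w on the file
    if operation == "delete":
        return "w" in perms[ancestors[-1]]  # w on the parent directory
    raise ValueError(f"unknown operation: {operation}")


for op in ("read", "append", "delete"):
    print(op, can(op, FILE, perms))
```

Note that deleting depends on the parent directory rather than on the file itself, so a user can delete a file that the user cannot read.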
Impersonating the user to control access to YARN queues
As an administrator, if you do not use the recommended Ranger security, you can enable the storage-based authorization (SBA) doAs impersonation parameter to control access to YARN queues. You use the Cloudera Manager Safety Valve feature to enable the following parameter: hive.server2.enable.doAs. In Cloudera Manager, in HiveServer2 Enable Impersonation, you select the HIVE_ON_TEZ-1 service. To find the setting, search for hive.server2.enable . . .
By enabling impersonation, you allow HiveServer to run the Tez application as the original user who submitted the query, so that access to YARN queues is authorized for that user instead of for the hive user. You configure parameters as follows:
- To disable impersonation, set doAs=false (the Hive default and recommended setting).
Result: The Tez app is submitted as hive, and no access check is performed for the user, for example joe.
- To enable impersonation, set doAs=true.
Result: When a user, for example joe, submits the query through HiveServer to the YARN queue, for example y_q, the Tez app is started for joe and access to y_q is checked for this user.
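The two settings can be compared with a minimal Python sketch. It only models the outcomes described in the list above and does not call Hive or YARN. The Submission type and the submit function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Submission(NamedTuple):
    run_as: str                       # user the Tez app is started for
    queue_checked_for: Optional[str]  # user whose queue access is checked, if any


def submit(end_user: str, do_as: bool, service_user: str = "hive") -> Submission:
    """Model the effect of hive.server2.enable.doAs on a query submission."""
    if do_as:
        # Impersonation: the app runs for the end user and queue access is checked for that user.
        return Submission(run_as=end_user, queue_checked_for=end_user)
    # No impersonation: the app runs as hive and no check is made for the end user.
    return Submission(run_as=service_user, queue_checked_for=None)


print(submit("joe", do_as=False))
print(submit("joe", do_as=True))
```

With doAs=false, queue access for individual users such as joe cannot be enforced by YARN, because every Tez app arrives as hive.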