Authorizing Apache Hive Access

As an administrator, you need to understand that the default Hive authorization for running Hive queries is insecure, and what you need to do to secure your data. Your security options are to set up Ranger, to set up Storage Based Authorization (SBA), which is based on impersonation and HDFS access control lists (ACLs), or to combine these methods.

To limit Apache Hive access to approved users, Cloudera recommends Ranger. Authorization is the process that checks user permissions to perform select operations, such as creating, reading, and writing data, as well as editing table metadata. Apache Ranger provides centralized authorization for all Cloudera Runtime services.

You can set up Ranger to protect managed, ACID tables or external tables using a Hadoop SQL policy. You can protect external table data on the file system by using an HDFS policy in Ranger.

You can set up SBA plus HDFS ACLs to protect external tables and external table data. Storage-based authorization (SBA) does not work for giving users access to ACID tables.

Preloaded Ranger Policies

In Ranger, preloaded Hive policies are available by default. Users covered by these policies can perform Hive operations. All users need to use the default database, perform basic operations such as listing database names, and query the information schema. To provide this access, the preloaded default database tables columns policy and information_schema database policy are enabled for group public (all users). Keeping these policies enabled for group public is recommended. For example, if the default database tables columns policy is disabled, which prevents use of the default database, the following error appears:

hive> USE default;
Error: Error while compiling statement: FAILED: HiveAccessControlException
Permission denied: user [hive] does not have [USE] privilege on [default]

Apache Ranger policy authorization

Apache Ranger provides centralized policy management for authorization and auditing of all Cloudera Runtime services, including Hive. All Cloudera Runtime services are installed with a Ranger plugin used to intercept authorization requests for that service, as shown in the following illustration.

Authorizing Hive through Ranger instead of using SBA is highly recommended.

Storage-based authorization

As the name implies, storage-based authorization relies on the authorization provided by the storage layer. The storage layer is HDFS, which provides both POSIX and ACL permissions. Hive is one of many Cloudera Runtime services that share storage on HDFS. The model controls access to metadata and checks permissions on the corresponding directories of the HDFS file system. Traditional POSIX permissions for the HDFS directories where tables reside determine access to those tables. This authorization model does not support column-level security or giving users access to ACID tables.

In addition to the traditional POSIX permissions model, HDFS also provides ACLs, or access control lists, as described in ACLs on HDFS. An ACL consists of a set of ACL entries, and each entry names a specific user or group and grants or denies read, write, and execute permissions for the specified user or group. These ACLs are also based on POSIX specifications, and they are compatible with the traditional POSIX permissions model.
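The way ACL entries combine with the POSIX owner, group, and other permissions can be sketched in a few lines of Python. This is a simplified model, not HDFS code: the class `AclFile`, the function `is_allowed`, and the sample users and groups are hypothetical names chosen for illustration, and the evaluation order is assumed to follow the one described in the HDFS Permissions Guide (owner first, then named users, then groups, then other, with named entries and the group entry filtered by the mask).

```python
from dataclasses import dataclass, field

READ, WRITE, EXECUTE = 4, 2, 1  # permission bits, as in POSIX mode digits


def parse_perms(text):
    """Convert a string such as 'r-x' into permission bits."""
    bits = 0
    for char, bit in zip(text, (READ, WRITE, EXECUTE)):
        if char != "-":
            bits |= bit
    return bits


@dataclass
class AclFile:
    """Hypothetical stand-in for a file or directory that carries an ACL."""
    owner: str
    group: str
    owner_perms: int
    group_perms: int
    other_perms: int
    mask: int = READ | WRITE | EXECUTE
    named_users: dict = field(default_factory=dict)   # user name -> bits
    named_groups: dict = field(default_factory=dict)  # group name -> bits


def is_allowed(f, user, groups, wanted):
    """Return True if `user`, a member of `groups`, gets the `wanted` bits."""
    # 1. The owner is checked against the owner permissions only.
    if user == f.owner:
        return (f.owner_perms & wanted) == wanted
    # 2. A named user entry applies next, filtered by the mask.
    if user in f.named_users:
        return (f.named_users[user] & f.mask & wanted) == wanted
    # 3. The owning group and any matching named group entries are tried;
    #    if at least one matches but none grants access, access is denied.
    candidates = []
    if f.group in groups:
        candidates.append(f.group_perms)
    candidates.extend(p for g, p in f.named_groups.items() if g in groups)
    if candidates:
        return any((p & f.mask & wanted) == wanted for p in candidates)
    # 4. Everyone else falls through to the "other" permissions.
    return (f.other_perms & wanted) == wanted


# Sample directory: rwxr-x--- owned by hive:hive, with two extra ACL entries.
sales_dir = AclFile(
    owner="hive",
    group="hive",
    owner_perms=parse_perms("rwx"),
    group_perms=parse_perms("r-x"),
    other_perms=parse_perms("---"),
    mask=parse_perms("r-x"),
    named_users={"etl": parse_perms("rwx")},
    named_groups={"analysts": parse_perms("r-x")},
)
```

In this sketch a member of the hypothetical `analysts` group can read the directory but not write to it, and the named user `etl` is limited to read and execute because the mask filters out the write bit of its entry.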

HDFS ACL permissions provide administrators with authorization control over databases, tables, and table partitions on the HDFS file system. For example, an administrator can create a role with a set of grants on specific HDFS tables, then grant the role to a group of users. Roles allow administrators to easily reuse permission grants. Cloudera recommends relying on POSIX permissions and a small number of ACLs to augment the POSIX permissions for exceptions and edge cases.

A file with an ACL incurs additional memory cost to the NameNode due to the alternate algorithm used for permission checks on such files.

HDFS permissions

SBA relies heavily on HDFS access control lists (ACLs). ACLs are an extension to the permissions system in HDFS. CDP Private Cloud Base turns on ACLs in HDFS by default, providing you with the following advantages:
  • Increased flexibility when giving multiple groups and users specific permissions
  • Convenient application of permissions to a directory tree rather than by individual files
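As an illustration of applying permissions to a directory tree, the following sketch builds the argument list for an `hdfs dfs -setfacl` call that modifies ACL entries recursively. The helper `build_setfacl_command`, the path, and the group names are hypothetical; only the `-setfacl`, `-R`, and `-m` options and the comma-separated ACL specification come from the HDFS file system shell. The code only constructs the command and does not run it.

```python
def build_setfacl_command(path, entries, recursive=True):
    """Return the argv list for an `hdfs dfs -setfacl -m` call.

    `entries` is a list of ACL entries such as 'group:analysts:r-x'.
    """
    if not entries:
        raise ValueError("at least one ACL entry is required")
    command = ["hdfs", "dfs", "-setfacl"]
    if recursive:
        command.append("-R")  # apply to the whole directory tree
    # -m modifies the ACL: new entries are added, existing ones are kept.
    command += ["-m", ",".join(entries), path]
    return command


# Hypothetical external table location and groups.
command = build_setfacl_command(
    "/warehouse/external/sales",
    ["group:analysts:r-x", "group:etl:rwx"],
)
print(" ".join(command))
```

On a host with the HDFS client installed and suitable privileges, the resulting list could be passed to `subprocess.run(command, check=True)`; running `hdfs dfs -getfacl` on the same path afterwards shows the resulting entries.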

Authorization model comparison

In addition to Apache Ranger, Hive supports storage-based authorization (SBA) for external tables. SBA does not work for giving users access to ACID tables. Ranger and SBA can co-exist in CDP Private Cloud Base. The following comparison summarizes the authorization models:

Apache Ranger
  • Fine-grained authorization (column, row level): Yes
  • Privilege management using GRANT/REVOKE statements: Yes
  • Centralized management GUI: Yes

Storage-based authorization (SBA)
  • Fine-grained authorization (column, row level): No authorization at SQL layer in HiveServer. Provides Metastore server authorization for the Metastore API only.
  • Privilege management using GRANT/REVOKE statements: No. Table privilege based on HDFS permission.

Hive default
  • Security: Not secure. No restriction on which users can run GRANT statements.

When Apache Ranger is enabled and you run GRANT or REVOKE statements, a Ranger policy is created or removed accordingly.