Impala Authorization

Authorization determines which users are allowed to access which resources, and what operations they are allowed to perform. You use Apache Ranger to enable and manage authorization in Impala.

You set up privileges through the GRANT and REVOKE statements in either Impala or Hive. Then both components use those same privileges automatically.

By default, when authorization is not enabled, Impala does all read and write operations with the privileges of the impala user, which is suitable for a development/test environment but not for a secure production environment. When authorization is enabled, Impala uses the OS user ID of the user who runs impala-shell or other client programs, and associates various privileges with each user.

Privilege Model

Privileges can be granted on different objects in the schema and are associated with a level in the object hierarchy. If a privilege is granted on a parent object in the hierarchy, the child object automatically inherits it. This is the same privilege model as Hive and other database systems.

The objects in the Impala schema hierarchy are:

Server
    URI
    Database
        Table
            Column

The table-level privileges apply to views as well. Anywhere you specify a table name, you can specify a view name instead.

You can specify privileges for individual columns.

The table below lists the minimum level of privileges and the scope required to execute SQL statements in Impala. The following notations are used:
  • ANY denotes the SELECT, INSERT, CREATE, or REFRESH privilege.
  • ALL privilege denotes the SELECT, INSERT, CREATE, and REFRESH privileges.
  • The owner of an object effectively has the ALL privilege on the object.
  • The parent levels of the specified scope are implicitly supported. For example, if a privilege is listed with the TABLE scope, the same privilege granted on DATABASE and SERVER will allow the user to execute that specific SQL statement on TABLE.

For example, to be able to execute CREATE VIEW, you need the CREATE privilege on the database and the SELECT privilege on the source table.

SQL Statement Privileges Object Type /

Resource Type

SELECT SELECT TABLE
WITH SELECT SELECT TABLE
EXPLAIN SELECT SELECT TABLE
INSERT INSERT TABLE
EXPLAIN INSERT INSERT TABLE
TRUNCATE INSERT TABLE
LOAD INSERT TABLE
ALL URI
CREATE DATABASE CREATE SERVER
CREATE DATABASE LOCATION CREATE SERVER
ALL URI
CREATE TABLE CREATE DATABASE
CREATE TABLE LIKE CREATE DATABASE
SELECT, INSERT, or REFRESH TABLE
CREATE TABLE AS SELECT CREATE DATABASE
INSERT DATABASE
SELECT TABLE
EXPLAIN CREATE TABLE AS SELECT CREATE DATABASE
INSERT DATABASE
SELECT TABLE
CREATE TABLE LOCATION CREATE TABLE
ALL URI
CREATE VIEW CREATE DATABASE
SELECT TABLE
ALTER DATABASE SET OWNER ALL WITH GRANT DATABASE
ALTER TABLE ALL TABLE
ALTER TABLE SET LOCATION ALL TABLE
ALL URI
ALTER TABLE RENAME CREATE DATABASE
ALL TABLE
ALTER TABLE SET OWNER ALL WITH GRANT TABLE
ALTER VIEW ALL TABLE
SELECT TABLE
ALTER VIEW RENAME CREATE DATABASE
ALL TABLE
ALTER VIEW SET OWNER ALL WITH GRANT VIEW
DROP DATABASE ALL DATABASE
DROP TABLE ALL TABLE
DROP VIEW ALL TABLE
CREATE FUNCTION CREATE DATABASE
ALL URI
DROP FUNCTION ALL DATABASE
COMPUTE STATS ALL TABLE
DROP STATS ALL TABLE
INVALIDATE METADATA REFRESH SERVER
INVALIDATE METADATA <table> REFRESH TABLE
REFRESH <table> REFRESH TABLE
REFRESH AUTHORIZATION REFRESH SERVER
REFRESH FUNCTIONS REFRESH DATABASE
COMMENT ON DATABASE ALL DATABASE
COMMENT ON TABLE ALL TABLE
COMMENT ON VIEW ALL TABLE
COMMENT ON COLUMN ALL TABLE
DESCRIBE DATABASE SELECT, INSERT, or REFRESH DATABASE
DESCRIBE <table/view>

If the user has the SELECT privilege at the COLUMN level, only the columns the user has access will show.

SELECT, INSERT, or REFRESH TABLE
SELECT COLUMN
USE ANY TABLE
SHOW DATABASES ANY TABLE
SHOW TABLES ANY TABLE
SHOW FUNCTIONS SELECT, INSERT, or REFRESH DATABASE
SHOW PARTITIONS SELECT, INSERT, or REFRESH TABLE
SHOW TABLE STATS SELECT, INSERT, or REFRESH TABLE
SHOW COLUMN STATS SELECT, INSERT, or REFRESH TABLE
SHOW FILES SELECT, INSERT, or REFRESH TABLE
SHOW CREATE TABLE SELECT, INSERT, or REFRESH TABLE
SHOW CREATE VIEW SELECT, INSERT, or REFRESH TABLE
SHOW CREATE FUNCTION SELECT, INSERT, or REFRESH DATABASE
SHOW RANGE PARTITIONS (Kudu only) SELECT, INSERT, or REFRESH TABLE
UPDATE (Kudu only) ALL TABLE
EXPLAIN UPDATE (Kudu only) ALL TABLE
UPSERT (Kudu only) ALL TABLE
WITH UPSERT (Kudu only) ALL TABLE
EXPLAIN UPSERT (Kudu only) ALL TABLE
DELETE (Kudu only) ALL TABLE
EXPLAIN DELETE (Kudu only) ALL TABLE

The privileges not listed in the table above will be silently ignored by Impala.

Changing Privileges in Impala

Privileges are managed via the GRANT and REVOKE SQL statements that require the Ranger service enabled.

Privileges can be also managed in Ranger UI. Especially, for attribute-based access control, Ranger UI is required to manage authorization.

Impala authorization policies are listed in the Hive service section in Ranger UI.

REFRESH AUTHORIZATION is not required when you make the changes to privileges within Impala. The changes are automatically propagated.

Changing Privileges from Outside of Impala

If you make a change to privileges in Ranger from outside of Impala, e.g. adding a user, removing a user, modifying privileges, there are two options to propagate the change:

  • Use the ranger.plugin.hive.policy.pollIntervalMs property to specify how often to do a Ranger refresh. The property is specified in ranger-hive-security.xml in the conf directory under your Impala home directory.
  • Run the INVALIDATE METADATA or REFRESH AUTHORIZATION statement to force a refresh.

Granting Privileges on URI

URIs represent the file paths you specify as part of statements such as CREATE EXTERNAL TABLE and LOAD DATA. Typically, you specify what look like UNIX paths, but these locations can also be prefixed with hdfs:// to make clear that they are really URIs. To set privileges for a URI, specify the name of a directory, and the privilege applies to all the files in that directory and any directories underneath it.

URIs must start with hdfs://, s3a://, adl://, or file://. If a URI starts with an absolute path, the path will be appended to the default filesystem prefix. For example, if you specify:

GRANT ALL ON URI '/tmp';
The above statement effectively becomes the following where the default filesystem is HDFS.

GRANT ALL ON URI 'hdfs://localhost:20500/tmp';
When defining URIs for HDFS, you must also specify the NameNode. For example:
GRANT ALL ON URI file:///path/to/dir TO <role>
GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO <role>