Impala Authorization
Authorization determines which users are allowed to access which resources, and what operations they are allowed to perform. You use Apache Ranger to enable and manage authorization in Impala.
You set up privileges through the GRANT
and
REVOKE
statements in either Impala or Hive. Then both
components use those same privileges automatically.
By default, when authorization is not enabled, Impala does all read and
write operations with the privileges of the impala
user,
which is suitable for a development/test environment but not for a secure
production environment. When authorization is enabled, Impala uses the OS
user ID of the user who runs impala-shell or other
client programs, and associates various privileges with each user.
Privilege Model
Privileges can be granted on different objects in the schema and are associated with a level in the object hierarchy. If a privilege is granted on a parent object in the hierarchy, the child object automatically inherits it. This is the same privilege model as Hive and other database systems.
The objects in the Impala schema hierarchy are:
Server
URI
Database
Table
Column
The table-level privileges apply to views as well. Anywhere you specify a table name, you can specify a view name instead.
You can specify privileges for individual columns.
- ANY denotes the
SELECT
,INSERT
,CREATE
, orREFRESH
privilege. - ALL privilege denotes the
SELECT
,INSERT
,CREATE
, andREFRESH
privileges. - The owner of an object effectively has the
ALL
privilege on the object. - The parent levels of the specified scope are implicitly supported. For example, if a privilege is listed with the TABLE scope, the same privilege granted on DATABASE and SERVER will allow the user to execute that specific SQL statement on TABLE.
For example, to be able to execute CREATE VIEW
, you
need the CREATE
privilege on the database and the
SELECT
privilege on the source table.
SQL Statement | Privileges | Object Type / Resource Type |
SELECT | SELECT | TABLE |
WITH SELECT | SELECT | TABLE |
EXPLAIN SELECT | SELECT | TABLE |
INSERT | INSERT | TABLE |
EXPLAIN INSERT | INSERT | TABLE |
TRUNCATE | INSERT | TABLE |
LOAD | INSERT | TABLE |
ALL | URI | |
CREATE DATABASE | CREATE | SERVER |
CREATE DATABASE LOCATION | CREATE | SERVER |
ALL | URI | |
CREATE TABLE | CREATE | DATABASE |
CREATE TABLE LIKE | CREATE | DATABASE |
SELECT, INSERT, or REFRESH | TABLE | |
CREATE TABLE AS SELECT | CREATE | DATABASE |
INSERT | DATABASE | |
SELECT | TABLE | |
EXPLAIN CREATE TABLE AS SELECT | CREATE | DATABASE |
INSERT | DATABASE | |
SELECT | TABLE | |
CREATE TABLE LOCATION | CREATE | TABLE |
ALL | URI | |
CREATE VIEW | CREATE | DATABASE |
SELECT | TABLE | |
ALTER DATABASE SET OWNER | ALL WITH GRANT | DATABASE |
ALTER TABLE | ALL | TABLE |
ALTER TABLE SET LOCATION | ALL | TABLE |
ALL | URI | |
ALTER TABLE RENAME | CREATE | DATABASE |
ALL | TABLE | |
ALTER TABLE SET OWNER | ALL WITH GRANT | TABLE |
ALTER VIEW | ALL | TABLE |
SELECT | TABLE | |
ALTER VIEW RENAME | CREATE | DATABASE |
ALL | TABLE | |
ALTER VIEW SET OWNER | ALL WITH GRANT | VIEW |
DROP DATABASE | ALL | DATABASE |
DROP TABLE | ALL | TABLE |
DROP VIEW | ALL | TABLE |
CREATE FUNCTION | CREATE | DATABASE |
ALL | URI | |
DROP FUNCTION | ALL | DATABASE |
COMPUTE STATS | ALL | TABLE |
DROP STATS | ALL | TABLE |
INVALIDATE METADATA | REFRESH | SERVER |
INVALIDATE METADATA <table> | REFRESH | TABLE |
REFRESH <table> | REFRESH | TABLE |
REFRESH AUTHORIZATION | REFRESH | SERVER |
REFRESH FUNCTIONS | REFRESH | DATABASE |
COMMENT ON DATABASE | ALL | DATABASE |
COMMENT ON TABLE | ALL | TABLE |
COMMENT ON VIEW | ALL | TABLE |
COMMENT ON COLUMN | ALL | TABLE |
DESCRIBE DATABASE | SELECT, INSERT, or REFRESH | DATABASE |
DESCRIBE <table/view> If the user has the SELECT privilege at the COLUMN level, only the columns the user has access will show. |
SELECT, INSERT, or REFRESH | TABLE |
SELECT | COLUMN | |
USE | ANY | TABLE |
SHOW DATABASES | ANY | TABLE |
SHOW TABLES | ANY | TABLE |
SHOW FUNCTIONS | SELECT, INSERT, or REFRESH | DATABASE |
SHOW PARTITIONS | SELECT, INSERT, or REFRESH | TABLE |
SHOW TABLE STATS | SELECT, INSERT, or REFRESH | TABLE |
SHOW COLUMN STATS | SELECT, INSERT, or REFRESH | TABLE |
SHOW FILES | SELECT, INSERT, or REFRESH | TABLE |
SHOW CREATE TABLE | SELECT, INSERT, or REFRESH | TABLE |
SHOW CREATE VIEW | SELECT, INSERT, or REFRESH | TABLE |
SHOW CREATE FUNCTION | SELECT, INSERT, or REFRESH | DATABASE |
SHOW RANGE PARTITIONS (Kudu only) | SELECT, INSERT, or REFRESH | TABLE |
UPDATE (Kudu only) | ALL | TABLE |
EXPLAIN UPDATE (Kudu only) | ALL | TABLE |
UPSERT (Kudu only) | ALL | TABLE |
WITH UPSERT (Kudu only) | ALL | TABLE |
EXPLAIN UPSERT (Kudu only) | ALL | TABLE |
DELETE (Kudu only) | ALL | TABLE |
EXPLAIN DELETE (Kudu only) | ALL | TABLE |
The privileges not listed in the table above will be silently ignored by Impala.
Changing Privileges in Impala
Privileges are managed via the GRANT
and
REVOKE
SQL statements that require the Ranger service
enabled.
Privileges can be also managed in Ranger UI. Especially, for attribute-based access control, Ranger UI is required to manage authorization.
Impala authorization policies are listed in the Hive service section in Ranger UI.
REFRESH AUTHORIZATION
is not required when you make the
changes to privileges within Impala. The changes are automatically
propagated.
Changing Privileges from Outside of Impala
If you make a change to privileges in Ranger from outside of Impala, e.g. adding a user, removing a user, modifying privileges, there are two options to propagate the change:
- Use the
ranger.plugin.hive.policy.pollIntervalMs
property to specify how often to do a Ranger refresh. The property is specified inranger-hive-security.xml
in theconf
directory under your Impala home directory. - Run the
INVALIDATE METADATA
orREFRESH AUTHORIZATION
statement to force a refresh.
Granting Privileges on URI
URIs represent the file paths you specify as part of statements such
as CREATE EXTERNAL TABLE
and LOAD
DATA
. Typically, you specify what look like UNIX paths, but
these locations can also be prefixed with hdfs://
to
make clear that they are really URIs. To set privileges for a URI,
specify the name of a directory, and the privilege applies to all the
files in that directory and any directories underneath it.
hdfs://
,
s3a://
, adl://
, or
file://
. If a URI starts with an absolute path, the
path will be appended to the default filesystem prefix. For example, if
you specify:
GRANT ALL ON URI '/tmp';
The above
statement effectively becomes the following where the default filesystem
is HDFS.
GRANT ALL ON URI 'hdfs://localhost:20500/tmp';
GRANT ALL ON URI file:///path/to/dir TO <role>
GRANT ALL ON URI hdfs://namenode:port/path/to/dir TO <role>