Ranger policy options for RAZ-enabled AWS environment
After you register the RAZ-enabled AWS environment, you can log in to Ranger to create the policies for granular access to the environment’s cloud storage location.
Ranger includes a set of preloaded resource-based services and
policies. You need the following additional policies for granular access to the
environment’s cloud storage location:
- Policies for Spark jobs
- A Spark job on an S3 path requires an S3 policy for the end user on the specific path. For information to create the policies, see Creating Ranger policy to use in RAZ-enabled AWS environment.
- Policies for Hive external tables and Spark jobs
-
Running the
create external table [***table definition***] location ‘s3a://bucket/data/logs/tabledata’
command in Hive requires the following Ranger policies:- An S3 policy in the cm_s3 repo on s3a://bucket/data/logs/tabledata for hive user to perform recursive read/write.
- An S3 policy in the cm_s3 repo on s3a://bucket/data/logs/tabledata for the end user.
- A Hive URL authorization policy in the Hadoop SQL repo on s3a://bucket/data/logs/tabledata for the end user.
Access to the same external table location using Spark shell requires an S3 policy (Ranger policy) in the cm_s3 repo on s3a://bucket/data/logs/tabledata for the end user.
- Policies for Hive managed tables and Spark jobs
- Operations such as create, insert, delete, select, and so on, on a Hive managed table do not require any custom Ranger policies.