Learn how to set up Ranger policies on a staging location. This location temporarily stores the Hive files that users need to read from Spark through the Hive Warehouse Connector (HWC) secure access mode.
Before using secure access mode to read Hive data, you must set up two Ranger S3 access policies on the staging location, one for the Hive user and one for Spark session users, and then set up a Hive URL authorization policy on the same location.
As an example, let us consider S3 as the cloud storage service and assume the S3 staging location to be s3a://s3-hwc/stagingHWC/. In the following procedure, as an administrator, you set up Ranger policies on the files and directories in the staging location. These policies secure both managed (ACID) and external tables.
1. In Cloudera Manager, click Hive on Tez > Configuration, search for hive.server2.enable.doAs, and, if the option is selected, clear it to disable impersonation.
   Hive requires this setting to manage ACID tables.
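   Clearing this option corresponds to setting the property to false in hive-site.xml. A minimal sketch of the equivalent configuration, for reference only (in Cloudera Manager deployments, change the value through the UI or a safety valve rather than by editing the file directly):

       <property>
         <name>hive.server2.enable.doAs</name>
         <value>false</value>
       </property>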
2. Log in to the Ranger Admin UI and, on the Service Manager page, click the preloaded S3 resource-based service, for example, cm_s3.
3. Click Add New Policy to create an access policy for the end user who launches spark-shell and initiates a Spark session.
4. On the Create Policy page, specify a policy name, the S3 bucket name, and the path of the staging location.
5. In the Allow Conditions section, select {USER}, add the read and write permissions, and then click Add to create the policy.
   Spark session users must have access to the staging location. Because {USER} is a macro that resolves to the current user at access time, this single Ranger policy covers all users.
   The policy is created and displayed in the list of available S3 policies.
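   If you prefer to script policy creation, Ranger also exposes a public REST API. The following sketch uses the public v2 endpoint; the policy name is arbitrary, and the resource keys (bucket, path) are assumptions based on the fields shown in the Ranger UI for the S3 service, so verify them against your cm_s3 service definition:

       curl -u admin:<password> -H "Content-Type: application/json" \
         -X POST "http://<ranger-host>:6080/service/public/v2/api/policy" \
         -d '{
           "service": "cm_s3",
           "name": "hwc-staging-user-access",
           "resources": {
             "bucket": { "values": ["s3-hwc"] },
             "path":   { "values": ["/stagingHWC"], "isRecursive": true }
           },
           "policyItems": [{
             "users": ["{USER}"],
             "accesses": [
               { "type": "read",  "isAllowed": true },
               { "type": "write", "isAllowed": true }
             ]
           }]
         }'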
6. Click Add New Policy again to add another S3 policy that grants the hive user read and write privileges on the staging path.
7. On the Create Policy page, specify a policy name, the S3 bucket name, and the path of the staging location.
8. In the Allow Conditions section, select hive as the user, add the read and write permissions, and then click Add to create the policy.
   The policy is created and displayed in the list of available S3 policies.
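   Scripted through the same REST API, this second policy differs from the previous one only in its name and in the user list, which names the hive user explicitly instead of the {USER} macro:

       "policyItems": [{
         "users": ["hive"],
         "accesses": [
           { "type": "read",  "isAllowed": true },
           { "type": "write", "isAllowed": true }
         ]
       }]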
9. Click the Service Manager link in the breadcrumb trail, and then click the preloaded Hadoop SQL resource-based service.
10. Click Add New Policy to create the Hive URL authorization policy.
11. On the Create Policy page, select url from the resource drop-down list, and specify a policy name and the S3 staging location.
12. In the Allow Conditions section, select {USER}, add all permissions, and then click Add to create the policy.
    The policy is created and displayed in the list of available Hadoop SQL policies.
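    Scripted through the REST API, the URL authorization policy targets the Hadoop SQL service and its url resource. A sketch, assuming the preloaded service is named cm_hive and that the all access type covers the permissions selected above; check both against your Ranger instance:

        {
          "service": "cm_hive",
          "name": "hwc-staging-url-auth",
          "resources": {
            "url": { "values": ["s3a://s3-hwc/stagingHWC/*"] }
          },
          "policyItems": [{
            "users": ["{USER}"],
            "accesses": [{ "type": "all", "isAllowed": true }]
          }]
        }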
You can now use HWC secure access mode to securely read Hive data from Spark.
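For example, you could launch spark-shell with secure access mode enabled and read a Hive table as follows. This is a minimal sketch: the HWC assembly jar path, the exact configuration property names, and the sales.transactions table are assumptions to adapt to your deployment and HWC version:

    spark-shell --jars /opt/cloudera/parcels/CDH/jars/hive-warehouse-connector-assembly-<version>.jar \
      --conf spark.datasource.hive.warehouse.read.mode=secure_access \
      --conf spark.datasource.hive.warehouse.load.staging.dir=s3a://s3-hwc/stagingHWC/

    import com.hortonworks.hwc.HiveWarehouseSession

    // Build an HWC session on top of the active SparkSession.
    val hive = HiveWarehouseSession.session(spark).build()

    // The query runs through HiveServer2; results are staged under the
    // secured staging location and read back as the Spark session user.
    hive.sql("SELECT * FROM sales.transactions LIMIT 10").show()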