Setting up secure access mode in Data Hub

Learn how to set up Ranger policies on a staging location. This location is used to temporarily store Hive files that users need to read from Spark using the HWC secure access mode.

Before using secure access mode to read Hive data, you must set up two Ranger S3 access policies on the staging location, one for the Spark session users and one for the hive user, and then set up a Hive URL authorization policy on the same location.

As an example, consider S3 as the cloud storage service and assume the S3 staging location to be s3a://s3-hwc/stagingHWC/. In the following procedure, you, as an administrator, set up Ranger policies on the files and directories in the staging location. These policies secure both managed (ACID) and external tables.
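
For context, the staging location that these policies protect is the same path that the Spark session points HWC at. The following is a minimal sketch of the relevant session properties; the property names are assumptions based on commonly documented HWC settings, so verify them against the HWC documentation for your version.

```python
from pyspark.sql import SparkSession

# Sketch only: the property names below are assumptions based on commonly
# documented HWC settings. In practice they are usually passed with --conf
# when the session is launched together with the HWC assembly jar.
spark = (
    SparkSession.builder
    .appName("hwc-secure-access-example")
    # Ask HWC to read Hive data through secure access mode.
    .config("spark.datasource.hive.warehouse.read.mode", "secure_access")
    # The staging location that the Ranger policies in this procedure secure.
    .config("spark.datasource.hive.warehouse.load.staging.dir", "s3a://s3-hwc/stagingHWC/")
    .getOrCreate()
)
```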

  1. In Cloudera Manager, click Hive on Tez > Configuration, search for hive.server2.enable.doAs, and, if necessary, clear the selection to disable it.
    Hive requires that doAs is disabled in order to manage ACID tables.
  2. Log in to the Ranger Admin UI and, on the Service Manager page, click the preloaded S3 resource-based service, for example, cm_s3.
  3. Click Add New Policy to create an access policy for the end user who launches the spark-shell and initiates a Spark session. (A scripted alternative that uses the Ranger REST API is sketched after these steps.)
  4. In the Create Policy page, specify a policy name, S3 bucket name, and path of the staging location.

    Creating a Ranger S3 policy for Spark user
  5. In the Allow Conditions section, select {USER} and add the read and write permissions, and then click Add to create the policy.

    Setting allow conditions for Ranger S3 policy for Spark user

    All Spark session users must have access to the staging location. Selecting the {USER} variable creates a single Ranger policy that covers every such user.

    The policy is created and displayed in the list of available S3 policies.

  6. Click Add New Policy again to add another S3 policy that grants the hive user read and write access on the staging path.
  7. In the Create Policy page, specify the policy name, S3 bucket name, and path of the staging location.

    Creating a Ranger S3 policy for hive user
  8. In the Allow Conditions section, select hive as the user and add the read and write permissions, and then click Add to create the policy.

    Setting allow conditions for Ranger S3 policy for Hive user

    The policy is created and displayed in the list of available S3 policies.

  9. Click the Service Manager link in the breadcrumb trail, and then click the preloaded Hadoop SQL resource-based service.
  10. Click Add New Policy to create the Hive URL authorization policy.
  11. In the Create Policy page, select url from the drop-down list, and specify a policy name and the S3 staging location.

    Creating a Hive URL policy
  12. In the Allow Conditions section, select {USER} and add all permissions, and then click Add to create the policy.

    Setting allow conditions for URL policy

    The policy is created and displayed in the list of available Hadoop SQL policies.
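
If you prefer to script these steps instead of clicking through the Ranger Admin UI, the same policies can be created through Ranger's public REST API (POST /service/public/v2/api/policy). The sketch below creates the S3 policy for Spark session users from step 5; the hive-user S3 policy and the Hadoop SQL url policy follow the same pattern with different users, resources, and permissions. The host name, credentials, resource keys, and access types shown here are assumptions, so check them against your cm_s3 and Hadoop SQL service definitions.

```python
import requests

# Hypothetical endpoint and credentials; replace with your Ranger Admin details.
RANGER_URL = "https://ranger-admin.example.com:6182"
RANGER_AUTH = ("admin", "admin-password")

# S3 access policy for Spark session users on the staging location (step 5).
# Resource keys ("bucket", "path") and access types are assumptions based on
# the fields shown in the Ranger UI; verify against the cm_s3 service definition.
spark_user_policy = {
    "service": "cm_s3",
    "name": "HWC staging - Spark session users",
    "resources": {
        "bucket": {"values": ["s3-hwc"]},
        "path": {"values": ["/stagingHWC"], "isRecursive": True},
    },
    "policyItems": [
        {
            "users": ["{USER}"],
            "accesses": [
                {"type": "read", "isAllowed": True},
                {"type": "write", "isAllowed": True},
            ],
        }
    ],
}

response = requests.post(
    f"{RANGER_URL}/service/public/v2/api/policy",
    json=spark_user_policy,
    auth=RANGER_AUTH,
)
response.raise_for_status()
print("Created policy:", response.json().get("id"))
```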

After completing these steps, you can use the HWC secure access mode to securely read Hive data from Spark.
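
For example, in a Spark session launched with the HWC assembly and the secure access properties sketched earlier, a read might look like the following. The database and table names are placeholders, and the hive.sql call shown is available in recent HWC releases; older releases use executeQuery instead.

```python
# Assumes a pyspark session that was launched with the HWC assembly jar and
# Python module available, and with the secure access properties shown earlier.
from pyspark_llap import HiveWarehouseSession

hive = HiveWarehouseSession.session(spark).build()

# Read a Hive table; HWC stages the result files through the Ranger-secured
# staging location (s3a://s3-hwc/stagingHWC/). The database and table names
# below are placeholders.
df = hive.sql("SELECT * FROM sales_db.transactions LIMIT 10")
df.show()
```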