Introduction to Ranger RAZ in Cloudera Base on premises clusters

Learn how to support Amazon S3-compatible object stores through the Ranger Remote Authorization Service (RAZ).

Cloudera Base on premises defaults to HDFS and natively supports the Ozone object store. Previously, support for Amazon S3-compatible object stores was limited when running workloads such as Hive and Spark, and not all workloads on those stores were supported from an authorization perspective. The resulting inconsistent policy framework made it difficult for the Policy Administrator to apply corporate policies that secure sensitive data, and managing data access across teams and individuals became an architectural challenge.

Ranger RAZ resolves this challenge by using Apache Ranger policies to authorize access to Amazon S3-compatible object storage, in the same way that they authorize access to HDFS files.

Ranger RAZ allows Ranger access control policies to be applied to external object stores. This removes the need to maintain a separate set of policies for the external object stores. Currently, this applies to the following components:
  • Hive or Impala executors
  • Spark executors
  • Data Explorer, for accessing the user's home directory and other accessible directories, or for transferring files

The RAZ server obtains cloud credentials from IDBroker during initialization. RAZ provides client tokens for accessing a cloud storage object (a file or a directory) and enforces access control using Cloudera cluster identities through Ranger policies. The following architectural diagram shows how RAZ integrates and interacts with other components in a RAZ-enabled Amazon S3 environment:
Figure 1. RAZ architecture
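
The authorization flow described above can be illustrated with a small, self-contained Python sketch. It is a toy model and not the RAZ API: the `Policy` class, the `is_allowed` and `issue_token` functions, the `STORAGE_SECRET` value, and the token format are all hypothetical names chosen for illustration. The sketch only shows the general idea that a storage credential stays on the server side, and that a client receives a short-lived token only when a policy allows the requested access for its cluster identity.

```python
import hashlib
import hmac
import time
from dataclasses import dataclass

# Hypothetical stand-in for the storage credential held by the server.
# In this toy model the client never sees it.
STORAGE_SECRET = b"demo-secret"


@dataclass(frozen=True)
class Policy:
    path_prefix: str     # object-store path prefix the policy covers
    users: frozenset     # cluster identities the policy applies to
    accesses: frozenset  # permitted actions, for example {"read", "write"}


def is_allowed(policies, user, path, access):
    """Return True if any policy grants `access` on `path` to `user`."""
    return any(
        path.startswith(p.path_prefix)
        and user in p.users
        and access in p.accesses
        for p in policies
    )


def issue_token(policies, user, path, access, ttl_seconds=300, now=None):
    """Return a short-lived signed token, or None when access is denied."""
    if not is_allowed(policies, user, path, access):
        return None
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    message = f"{user}|{path}|{access}|{expires}".encode()
    signature = hmac.new(STORAGE_SECRET, message, hashlib.sha256).hexdigest()
    return f"{expires}:{signature}"
```

In this sketch, a request from an identity that no policy covers returns `None` instead of a token, so the decision is made centrally rather than by each client.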


RAZ supports use cases that require access control on files or directories, including the following examples:
  • Per-user home directories.
  • Data engineering (Spark) efforts that require access to cloud storage objects and directories.
  • Data warehouse queries (Hive, Impala, or Iceberg) that use external tables.
  • Access to Ranger's rich access control policies, such as date-based access revocation and user, group, or role-based controls, along with corresponding audits.
  • Tag-based access control using the classification propagation feature that originates from directories.
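
As a rough illustration of the policy features listed above (user, group, or role-based controls, date-based access revocation, and audits), the following Python sketch evaluates a request against a list of policies. It is a simplified model under stated assumptions, not Ranger's policy engine or data model: the `Policy` fields, the `evaluate` function, and the audit record layout are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Policy:
    path_prefix: str                    # directory the policy covers
    users: frozenset = frozenset()      # individual identities
    groups: frozenset = frozenset()     # group memberships that match
    roles: frozenset = frozenset()      # roles that match
    valid_until: Optional[date] = None  # access is revoked after this date


def evaluate(policies, user, groups, roles, path, today, audit_log):
    """Decide access for one request and append an audit record."""
    allowed = False
    for p in policies:
        if not path.startswith(p.path_prefix):
            continue  # policy does not cover this path
        if p.valid_until is not None and today > p.valid_until:
            continue  # policy has expired (date-based revocation)
        if user in p.users or bool(groups & p.groups) or bool(roles & p.roles):
            allowed = True
            break
    # Every decision is recorded, whether access was allowed or denied.
    audit_log.append(
        {"user": user, "path": path, "date": today.isoformat(), "allowed": allowed}
    )
    return allowed
```

For example, a policy that grants a hypothetical `etl` group access to a project directory with `valid_until=date(2024, 12, 31)` would allow a member's request on 1 June 2024 and deny the same request on 1 January 2025, with both decisions appearing in `audit_log`.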