Configuring Ranger RMS (Hive-HDFS / Hive-OZONE ACL Sync)

Ranger Resource Mapping Server (RMS) should be fully configured after installation. This topic provides further information about RMS configuration settings and workflows.

Important configuration information - Hive Ozone

  • Ranger RMS enables OZONE/HDFS access through Ranger Hive policies. Ranger RMS must be configured with the names of the OZONE, HDFS, and Hive services (also known as repos). Your installation may contain multiple Ranger services for OZONE, HDFS, and Hive; you can view them in the Ranger Admin web UI. RMS ACL sync is designed to work on one specific pair of HDFS<->Hive and OZONE<->Hive Ranger services, so identify those service names before you install Ranger RMS and configure them during the installation. The default service names are cm_hdfs for Ranger HDFS, cm_ozone for Ranger OZONE, and cm_hive for Ranger Hive.

  • Before starting the Ranger RMS installation, ensure that the Hive service identified above grants the rangerrms user select access to all tables in all databases in the default (no-zone) policies, as well as in all security zones for the Hive service.

  • If you use a custom Kerberos principal or user, ensure that the Hive service identified above grants the custom user (for example, rangerrmsfoo0) select access to all tables in all databases in the default (no-zone) policies, as well as in all security zones for the Hive service.

  • By default, Ranger RMS tracks only external tables in Hive. To configure Ranger RMS to also track managed Hive tables, add the following configuration setting to Ranger RMS.

    ranger-rms.HMS.map.managed.tables=true
  • To enable RMS for HDFS authorization, go to Cloudera Manager, select HDFS > Configuration > HDFS Service Advanced Configuration Snippet (Safety Valve) for ranger-hdfs-security.xml, then confirm the following settings:

    ranger.plugin.hdfs.chained.services = cm_hive
    ranger.plugin.hdfs.chained.services.cm_hive.impl = org.apache.ranger.chainedplugin.hdfs.hive.RangerHdfsHiveChainedPlugin
  • To enable RMS for OZONE authorization, go to Cloudera Manager, select OZONE > Configuration > OZONE Manager Advanced Configuration Snippet (Safety Valve) for ozone-conf/ranger-ozone-security.xml, then use +Add to add the following properties:

    ranger.plugin.ozone.chained.services = cm_hive
    ranger.plugin.ozone.chained.services.cm_hive.impl = org.apache.ranger.chainedplugin.ozone.hive.RangerOzoneHiveChainedPlugin
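
The two pairs of chained-service properties above follow the same pattern. As a minimal sketch, the following Python helper builds them for either plugin. The helper name chained_properties is hypothetical, and the idea that a non-default Hive service name is substituted into the ".impl" property key is an assumption based on the cm_hive examples above, not something this topic states.

```python
# Hypothetical helper: builds the chained-plugin properties shown above.
# Only the property names and class names for cm_hive come from the text;
# substituting another Hive service name into the key is an assumption.
CHAINED_IMPL = {
    "hdfs": "org.apache.ranger.chainedplugin.hdfs.hive.RangerHdfsHiveChainedPlugin",
    "ozone": "org.apache.ranger.chainedplugin.ozone.hive.RangerOzoneHiveChainedPlugin",
}


def chained_properties(plugin: str, hive_service: str = "cm_hive") -> dict:
    """Return the safety-valve properties for the given plugin ("hdfs" or "ozone")."""
    if plugin not in CHAINED_IMPL:
        raise ValueError(f"unsupported plugin: {plugin!r}")
    prefix = f"ranger.plugin.{plugin}.chained.services"
    return {
        prefix: hive_service,
        f"{prefix}.{hive_service}.impl": CHAINED_IMPL[plugin],
    }


if __name__ == "__main__":
    for name, value in chained_properties("ozone").items():
        print(f"{name} = {value}")
```

The printed lines can be compared against the values entered in the Cloudera Manager safety valve to catch typing mistakes in the long class names.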

Advanced configurations

HDFS plugin side configurations

  • ranger.plugin.hdfs.mapping.hive.authorize.with.only.chained.policies
    • true: Enforce strict Sentry semantics.
    • false: If there is no applicable Hive policy, let HDFS determine access.
    • Default setting: true
  • ranger.plugin.hdfs.accesstype.mapping.read
    • A comma-separated list of hive access types that HDFS "read" maps to.
    • Default setting: select
  • ranger.plugin.hdfs.accesstype.mapping.write
    • A comma-separated list of hive access types that HDFS "write" maps to.
    • Default setting: update,alter
  • ranger.plugin.hdfs.accesstype.mapping.execute
    • A comma-separated list of hive access types that HDFS "execute" maps to.
    • Default setting: _any
  • ranger.plugin.hdfs.mapping.source.download.interval
    • The time in milliseconds between mappings download requests from the HDFS Ranger plugin to RMS.

    • Default setting: 30 seconds (30000 milliseconds)

      By default, the RMS plugin checks for new mappings to download every 30 seconds, based on this configuration. If your mapping data (stored in the hdfs_cm_hive_resource_mapping.json file) is large, for example approximately 360 MB, performing this operation every 30 seconds can put excessive load on the NameNode. With performance logs enabled, you can observe that, in this example, saveToCache takes approximately 11 seconds and loadFromCache takes approximately 7 seconds. The caching process therefore takes approximately 18 to 19 seconds in total, as shown in the following example performance logs:

      DEBUG org.apache.ranger.perf.resourcemapping.init: [PERF] RangerMappingRefresher.loadFromCache(serviceName=cm_hive): 7449
      DEBUG org.apache.ranger.perf.resourcemapping.init: [PERF] RangerMappingRefresher.saveToCache(serviceName=cm_hive): 11787

      In this case, set the interval between RMS mapping downloads to at least twice the caching time: 18 * 2 = 36 seconds. A more conservative value is 45 seconds. In this way, you can tune the RMS configuration to optimize performance of the plugin in the NameNode.
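
The sizing rule above can be sketched in Python as follows. This is an illustration only: the function name suggested_interval_ms and the safety_factor parameter are hypothetical, the regular expression assumes the log line format shown in the example logs, and the logged numbers are assumed to be milliseconds, as the text implies.

```python
import re

# Matches the [PERF] lines shown in the example logs above and captures
# the operation name and its duration (assumed to be milliseconds).
PERF_LINE = re.compile(
    r"RangerMappingRefresher\.(loadFromCache|saveToCache)"
    r"\(serviceName=[^)]*\): (\d+)"
)


def suggested_interval_ms(log_text: str, safety_factor: float = 2.0) -> int:
    """Return (worst load time + worst save time) * safety_factor, in ms."""
    worst = {}
    for operation, millis in PERF_LINE.findall(log_text):
        worst[operation] = max(worst.get(operation, 0), int(millis))
    return int(sum(worst.values()) * safety_factor)


SAMPLE_LOG = """\
DEBUG org.apache.ranger.perf.resourcemapping.init: [PERF] RangerMappingRefresher.loadFromCache(serviceName=cm_hive): 7449
DEBUG org.apache.ranger.perf.resourcemapping.init: [PERF] RangerMappingRefresher.saveToCache(serviceName=cm_hive): 11787
"""

if __name__ == "__main__":
    print(suggested_interval_ms(SAMPLE_LOG))  # 38472
```

With the sample log lines from this topic, the sketch yields 38472 milliseconds (about 38.5 seconds), which sits between the 36-second minimum and the more conservative 45-second value.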

Hive service configuration

  • ranger.plugin.audit.excluder.users
    • This configuration, added in the Hive service-configs, lists the users whose access to Hive or Hive Metastore does not generate audit records. There may be a large number of audit records created when "rangerrms" makes requests to the Hive Metastore when downloading Hive table data. By adding the "rangerrms" user to the comma-separated list of users in this configuration, such audit records will not be generated.
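
Because the value is a comma-separated list, adding "rangerrms" should preserve any users that are already excluded. A minimal sketch of that edit, using a hypothetical helper name add_excluded_user (not part of Ranger):

```python
def add_excluded_user(current_value: str, user: str = "rangerrms") -> str:
    """Append user to a comma-separated list unless it is already present."""
    users = [u.strip() for u in current_value.split(",") if u.strip()]
    if user not in users:
        users.append(user)
    return ",".join(users)


if __name__ == "__main__":
    # "hue" and "impala" are placeholder users for illustration only.
    print(add_excluded_user("hue, impala"))
```

The helper leaves the list unchanged if the user is already in it, so it is safe to apply more than once.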

OZONE plugin side configurations

  • ranger.plugin.ozone.mapping.source.download.interval
    • The time in milliseconds between mappings download requests from the OZONE Ranger plugin to RMS.
    • Default setting: 30 seconds (30000 milliseconds)
  • ranger.plugin.ozone.privileged.user.names
    • Default setting: admin,dpprofiler,hue,beacon,hive,impala
  • ranger.plugin.ozone.service.names
    • Default setting: hive,impala

OZONE to HIVE access type mapping for Table and Database

    Ozone ACL    HIVE ACL For Table    HIVE ACL For Database
    read         select                _any
    write        update                update
    create       create                create
    list         select                select
    delete       drop                  drop
    read_acl     select                select
    write_acl    update                update

  • ranger.plugin.ozone.accesstype.mapping.<ozone_acl>
    • This configuration maps an Ozone access type to Hive table access types. Replace <ozone_acl> with one of the Ozone access types listed in the table “OZONE to HIVE access type mapping for Table and Database”.
    • For the default value of each Ozone ACL, see the “HIVE ACL For Table” column.
    • For example:

      ranger.plugin.ozone.accesstype.mapping.read = select

  • ranger.plugin.ozone.db.accesstype.mapping.<ozone_acl>
    • This configuration maps an Ozone access type to Hive database access types. Replace <ozone_acl> with one of the Ozone access types listed in the table “OZONE to HIVE access type mapping for Table and Database”.
    • For the default value of each Ozone ACL, see the “HIVE ACL For Database” column.
    • For example:

      ranger.plugin.ozone.db.accesstype.mapping.read = _any
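
To show how the mapping table relates to the two property families, the following sketch expands the table defaults into explicit property settings. The dictionary mirrors the table in this topic; the function name default_mapping_properties is hypothetical, and the sketch only formats property names rather than calling any Ranger API.

```python
# Defaults copied from the "OZONE to HIVE access type mapping" table:
# ozone_acl -> (Hive ACL for table, Hive ACL for database)
OZONE_TO_HIVE = {
    "read": ("select", "_any"),
    "write": ("update", "update"),
    "create": ("create", "create"),
    "list": ("select", "select"),
    "delete": ("drop", "drop"),
    "read_acl": ("select", "select"),
    "write_acl": ("update", "update"),
}


def default_mapping_properties() -> dict:
    """Expand the table into table-level and database-level properties."""
    props = {}
    for ozone_acl, (table_acl, db_acl) in OZONE_TO_HIVE.items():
        props[f"ranger.plugin.ozone.accesstype.mapping.{ozone_acl}"] = table_acl
        props[f"ranger.plugin.ozone.db.accesstype.mapping.{ozone_acl}"] = db_acl
    return props


if __name__ == "__main__":
    for name, value in sorted(default_mapping_properties().items()):
        print(f"{name} = {value}")
```

To override a default, only the property for the affected Ozone ACL needs to be set; the remaining entries keep the values from the table.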

RMS side configurations

  • ranger-rms.HMS.source.service.name
    • The Ranger HDFS service name (default: cm_hdfs).
  • ranger-rms.HMS.target.service.name
    • The Ranger Hive service name (default: cm_hive).
  • ranger-rms.HMS.map.managed.tables
    • true: Track managed and external tables.
    • false: Track only external tables.
    • Default setting: false
  • ranger-rms.polling.notifications.frequency.ms
    • The time in milliseconds between polls from RMS to HMS for changes to tables.
    • Default setting: 30 seconds (30000 milliseconds)
  • ranger-rms.HMS.source.service.name.ozone
    • The Ranger OZONE service name.
    • Default setting: cm_ozone
  • ranger-rms.supported.uri.scheme
    • A comma-separated list of URI schemes supported by RMS.
    • Default setting: hdfs,ofs,o3fs
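
As an illustration of what the supported URI scheme list means for a table location, the sketch below checks a location against the default list. This topic does not describe how RMS itself evaluates locations, so the function is_supported_location and the sample locations are assumptions for illustration only.

```python
from urllib.parse import urlparse

# Default value of ranger-rms.supported.uri.scheme, from the text above.
DEFAULT_SCHEMES = "hdfs,ofs,o3fs"


def is_supported_location(location: str, supported: str = DEFAULT_SCHEMES) -> bool:
    """Return True if the location's URI scheme is in the comma-separated list."""
    schemes = {s.strip().lower() for s in supported.split(",") if s.strip()}
    return urlparse(location).scheme.lower() in schemes


if __name__ == "__main__":
    # Hypothetical table locations.
    for location in ("hdfs://ns1/warehouse/t1", "s3a://bucket/warehouse/t1"):
        print(location, is_supported_location(location))
```

A check like this can help predict which Hive table locations fall within the configured scheme list before changing the property.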