Sentry to Ranger replication for Hive external tables

When you create or edit a Hive external table replication policy in Cloudera Private Cloud Base Replication Manager, you can choose to migrate the Sentry policies for Hive objects, Impala data, and URLs that are being replicated. Replication Manager converts the Sentry policies to Ranger policies for the migrated data in the target cluster.

To migrate Sentry policies to Ranger policies using Hive external table replication policies, the source cluster must run Cloudera Manager version 6.3.1 or higher and the target cluster must run Cloudera Manager version 7.1.1 or higher.
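
The version requirement above can be expressed as a simple pre-flight check. The following is a minimal sketch, not part of Replication Manager: the function names are hypothetical, and it assumes plain dotted version strings such as 7.1.1 (it does not handle hotfix suffixes).

```python
# Minimum Cloudera Manager versions stated in this topic.
MIN_SOURCE = (6, 3, 1)
MIN_TARGET = (7, 1, 1)


def parse_version(version):
    """Convert a dotted version string such as '7.1.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum_versions(source_version, target_version):
    """Return True when both clusters satisfy the documented minimums.

    Hypothetical helper: tuples compare numerically, element by element,
    so '6.10.0' is correctly treated as newer than '6.3.1'.
    """
    return (parse_version(source_version) >= MIN_SOURCE
            and parse_version(target_version) >= MIN_TARGET)
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating 6.10.0 as older than 6.3.1.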

Replication Manager performs the following tasks automatically during the replication job run to migrate Sentry policies in the source cluster to Ranger policies in the target cluster:
  1. Exports each Sentry policy as a single JSON file using the authzmigrator tool. The JSON file contains a list of resources, such as URI, database, table, or column, and the policies that apply to each resource.
  2. Copies the exported Sentry policies to the target cluster using the DistCp tool.
  3. Filters the copied Sentry policies to those related to the replication job, and then ingests them into Ranger using the authzmigrator tool through the Ranger REST API endpoint. To filter the policies, Replication Manager uses a filter expression that Cloudera Manager passes to the authzmigrator tool.
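
The filtering in step 3 can be pictured with the following minimal sketch. It is illustrative only: the layout of the exported data, the field names, and the filter_policies function are all assumptions made for this example, and do not reflect the actual authzmigrator JSON format or its filter expression syntax.

```python
def filter_policies(exported, replicated_databases, include_urls=True):
    """Keep only the exported entries that relate to the replication job.

    Hypothetical model: each entry is a dict with a 'resource' describing a
    URI, database, table, or column, and the 'policies' that apply to it.
    """
    kept = []
    for entry in exported:
        resource = entry["resource"]
        if resource["type"] == "uri":
            # URL permissions are imported only when that option is chosen.
            if include_urls:
                kept.append(entry)
        elif resource.get("database") in replicated_databases:
            kept.append(entry)
    return kept


# Hypothetical exported data for illustration.
exported = [
    {"resource": {"type": "table", "database": "sales", "table": "orders"},
     "policies": ["select"]},
    {"resource": {"type": "database", "database": "hr"},
     "policies": ["all"]},
    {"resource": {"type": "uri", "uri": "hdfs://ns0/data/sales"},
     "policies": ["all"]},
]

hive_and_urls = filter_policies(exported, {"sales"})
hive_only = filter_policies(exported, {"sales"}, include_urls=False)
```

In this sketch, the include_urls flag loosely mirrors the choice between importing both Hive object and URL permissions and importing only Hive object permissions.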

Modify properties on Sentry-Ranger Migration tab

Starting with Cloudera Manager versions 7.7.1 CHF18 and 7.11.3 CHF5, you can modify the properties in the authorization-migration-site.xml file on the Sentry-Ranger Migration tab when you create or edit a Hive external table replication policy.

The Sentry-Ranger Migration tab appears in the Hive external table replication policy wizard after you choose either the "If Sentry permissions were exported from the CDH cluster, import both Hive object and URL permissions" option or the "If Sentry permissions were exported from the CDH cluster, import only Hive object permissions" option in the General > Permissions field.

You can enter one or more key-value pairs to add a new property or to override an existing property in the authorization-migration-site.xml file. You can add:

  • key-value pairs for the properties to use during the Sentry export process on the source cluster in the Sentry export authorization-migration-site.xml extra properties field.
  • key-value pairs for the properties to use during the Ranger import process on the target cluster in the Ranger import authorization-migration-site.xml extra properties field.
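
The add-or-override behavior described above can be modeled as a dictionary merge. The following is a minimal sketch under stated assumptions: the function names and the existing property value are hypothetical, and the Hadoop-style <configuration>/<property> layout is assumed for the file rather than confirmed by this topic.

```python
import xml.etree.ElementTree as ET


def merge_properties(existing, extra):
    """Return a new dict in which 'extra' adds new keys and overrides old ones."""
    merged = dict(existing)
    merged.update(extra)
    return merged


def to_site_xml(props):
    """Serialize properties in a Hadoop-style layout (assumed for this file)."""
    root = ET.Element("configuration")
    for name, value in sorted(props.items()):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical existing content of authorization-migration-site.xml.
existing = {"authorization.migration.skip.owner.policy": "false"}

# Key-value pairs entered in an extra properties field.
extra = {
    "authorization.migration.skip.owner.policy": "true",  # overrides
    "authorization.migration.role.permissions": "true",   # adds
}

merged = merge_properties(existing, extra)
site_xml = to_site_xml(merged)
```

The merge leaves the original dictionary untouched, which makes it easy to compare the effective properties with the defaults.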

For example, suppose you want to use the URL prefix specified in the authorization.migration.destination.location.prefix parameter in the authorization-migration-site.xml file, skip the Sentry policies with Owner privileges during migration, and inform Replication Manager that the Sentry and Ranger policies use role-based permissions. To do this, perform the following steps:

  1. Create or edit a Hive external table replication policy on the Cloudera Manager > Replication > Replication Policies page.
  2. Choose either the "If Sentry permissions were exported from the CDH cluster, import both Hive object and URL permissions" option or the "If Sentry permissions were exported from the CDH cluster, import only Hive object permissions" option in the General > Permissions field.
  3. Enter the following key-value pairs in the Sentry-Ranger Migration > Sentry export authorization-migration-site.xml extra properties field:
    • authorization.migration.role.permissions = true

      When set to true, the parameter informs Replication Manager that the Sentry policies use roleBasedPermissions and that it must use role-based permissions during the Sentry export process.

    • authorization.migration.skip.owner.policy = true

      When set to true, Replication Manager skips the Sentry policies with Owner privileges during migration.

  4. Enter the following key-value pairs in the Sentry-Ranger Migration > Ranger import authorization-migration-site.xml extra properties field:
    • authorization.migration.destination.location.prefix = [***DESTINATION LOCATION PREFIX***]

      Enter the destination location prefix that your environment requires. For example, if you are migrating Sentry policies from a CDH source cluster to a target Cloudera Private Cloud Base cluster, the prefix must match the CDP cluster’s nameservice. In this instance, if the rootPath parameter is hdfs://[***CDP NAMESERVICE***], then you must enter authorization.migration.destination.location.prefix=hdfs://[***CDP NAMESERVICE***].

    • authorization.migration.url.ignore.scheme = ([***ENTER COMMA-SEPARATED PREFIXES TO USE DURING MIGRATION. FOR EXAMPLE, S3, FILE ***])

      The authorization.migration.url.ignore.scheme property depends on two other properties in the authorization-migration-site.xml file: authorization.migration.translate.url.privileges and authorization.migration.destination.location.prefix.

      If the authorization-migration-site.xml file contains authorization.migration.translate.url.privileges = true and authorization.migration.destination.location.prefix = hdfs://ns1, and the authorization.migration.url.ignore.scheme property is not set, the prefixes of all the URL policies are replaced with hdfs://ns1 during the import process. For example, the URL file:///opt/somevalue becomes hdfs://ns1/opt/somevalue after the import process.

      If you set authorization.migration.url.ignore.scheme = s3,file on the Sentry-Ranger Migration tab, the file:///opt/somevalue URL is not updated because its scheme is file. Therefore, the URL remains file:///opt/somevalue after the import process.

    • authorization.migration.role.permissions = true

      When set to true, the parameter informs Replication Manager that the Ranger policies must use roleBasedPermissions during the Ranger import process.
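
The interaction between authorization.migration.translate.url.privileges, authorization.migration.destination.location.prefix, and authorization.migration.url.ignore.scheme described in step 4 can be sketched as follows. This is an illustrative model, not the authzmigrator implementation: the function names are hypothetical, and the sketch assumes that the replaced "prefix" of a URL is its scheme and authority.

```python
from urllib.parse import urlsplit


def parse_schemes(value):
    """Split a comma-separated property value such as 's3,file' into a set."""
    return {s.strip().lower() for s in value.split(",") if s.strip()}


def translate_url(url, destination_prefix, ignore_scheme="", translate=True):
    """Model how a URL policy location might change during the import.

    Assumption: the scheme and authority are replaced by the destination
    prefix, unless translation is off or the scheme is in the ignore list.
    """
    if not translate:
        return url
    parts = urlsplit(url)
    if parts.scheme.lower() in parse_schemes(ignore_scheme):
        return url  # scheme is ignored, so the URL is left as is
    return destination_prefix.rstrip("/") + parts.path


# ignore.scheme not set: the prefix is replaced.
rewritten = translate_url("file:///opt/somevalue", "hdfs://ns1")

# ignore.scheme = s3,file: the URL is left unchanged.
unchanged = translate_url("file:///opt/somevalue", "hdfs://ns1",
                          ignore_scheme="s3,file")
```

With the values from the example above, rewritten is hdfs://ns1/opt/somevalue and unchanged is file:///opt/somevalue.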