Configuring Sentry Policy File Authorization Using Cloudera Manager

Configuring User to Group Mappings

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

Hadoop Groups

  1. Go to the Hive service.
  2. Click the Configuration tab.
  3. Under the Service-Wide category, go to the Sentry section.
  4. Set the Sentry User to Group Mapping Class property to org.apache.sentry.provider.file.HadoopGroupResourceAuthorizationProvider.
  5. Click Save Changes.
  6. Restart the Hive service.

Local Groups

  1. Define local groups in the [users] section of the Policy File. For example:
    [users]
    user1 = group1, group2, group3
    user2 = group2, group3
  2. Modify Sentry configuration as follows:
    1. Go to the Hive service.
    2. Click the Configuration tab.
    3. Under the Service-Wide category, go to the Sentry section.
    4. Set the Sentry User to Group Mapping Class property to org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider.
    5. Click Save Changes.
    6. Restart the Hive service.

Enabling URIs for Per-DB Policy Files

To enable URIs in per-DB policy files, add the following string to the Java configuration options for HiveServer2 during startup:
-Dsentry.allow.uri.db.policyfile=true

Using User-Defined Functions with HiveServer2

The ADD JAR command does not work with HiveServer2 and the Beeline client when Beeline runs on a different host. As an alternative to ADD JAR, use Hive's auxiliary paths functionality as described in the following steps:
  1. Copy the UDF JAR file to HDFS.
  2. Copy the JAR file to the host on which HiveServer2 is running. Save the JARs to any directory you choose, and make a note of the path.
  3. In the Cloudera Manager Admin Console, go to the Hive service.
  4. Click the Configuration tab.
  5. Expand the Service-Wide > Advanced categories.
  6. Configure the Hive Auxiliary JARs Directory property with the HiveServer2 host path from Step 2.
  7. Click Save Changes. The JARs are added to the HIVE_AUX_JARS_PATH environment variable.
  8. Redeploy the Hive client configuration.
    1. In the Cloudera Manager Admin Console, go to the Hive service.
    2. From the Actions menu at the top right of the service page, select Deploy Client Configuration.
    3. Click Deploy Client Configuration.
  9. Restart the Hive service. If the Hive Auxiliary JARs Directory property is configured but the directory does not exist, HiveServer2 will not start.
  10. Grant privileges on the JAR files to the roles that require access. You can use the Hive SQL GRANT statement to do so. For example, to grant privileges on the add.jar file:
    GRANT ALL ON URI 'hdfs:///tmp/add.jar' TO ROLE EXAMPLE_ROLE
  11. Run the CREATE FUNCTION command and point to the JAR from Hive. For example:
    CREATE FUNCTION addfunc AS 'com.example.hiveserver2.udf.add' USING JAR 'hdfs:///tmp/add.jar'
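
Steps 1 and 2 above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the auxiliary directory /opt/hive/aux-jars is a hypothetical path (use whatever directory you set in the Hive Auxiliary JARs Directory property), the JAR name and HDFS path follow the add.jar example above, and the script is assumed to run on the HiveServer2 host. The run helper and the DRY_RUN variable are illustrative names, not part of Hive or Cloudera Manager.

```shell
#!/usr/bin/env bash
# Sketch of steps 1 and 2: stage a UDF JAR on HDFS and on the HiveServer2 host.
# AUX_DIR is a hypothetical path; add.jar and /tmp/add.jar follow the example above.
JAR="${JAR:-add.jar}"
HDFS_TARGET="${HDFS_TARGET:-/tmp/add.jar}"
AUX_DIR="${AUX_DIR:-/opt/hive/aux-jars}"

# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

stage_udf_jar() {
  # Step 1: copy the JAR to HDFS (-f overwrites an existing copy).
  run hdfs dfs -put -f "$JAR" "$HDFS_TARGET"
  # Step 2: copy the JAR into the auxiliary JARs directory on this host.
  run mkdir -p "$AUX_DIR"
  run cp "$JAR" "$AUX_DIR/"
}
```

Calling DRY_RUN=1 stage_udf_jar prints the three commands without executing them, which is a convenient way to review the paths before running stage_udf_jar for real.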

Enabling Policy File Authorization for Hive

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

  1. Ensure the Prerequisites have been satisfied.
  2. The Hive warehouse directory (/user/hive/warehouse or any path you specify as hive.metastore.warehouse.dir in your hive-site.xml) must be owned by the Hive user and group.
    • Permissions on the warehouse directory must be set as follows (see following Note for caveats):
      • 771 on the directory itself (for example, /user/hive/warehouse)
      • 771 on all subdirectories (for example, /user/hive/warehouse/mysubdir)
      • All files and subdirectories should be owned by hive:hive
      For example:
      $ sudo -u hdfs hdfs dfs -chmod -R 771 /user/hive/warehouse
      $ sudo -u hdfs hdfs dfs -chown -R hive:hive /user/hive/warehouse
  3. Disable impersonation for HiveServer2:
    1. Go to the Hive service.
    2. Click the Configuration tab.
    3. Under the HiveServer2 role group, uncheck the HiveServer2 Enable Impersonation property, and click Save Changes.
  4. Create the Sentry policy file, sentry-provider.ini, as an HDFS file.
  5. Enable the Hive user to submit MapReduce jobs.
    1. Go to the MapReduce service.
    2. Click the Configuration tab.
    3. Under a TaskTracker role group go to the Security category.
    4. Set the Minimum User ID for Job Submission property to zero (the default is 1000) and click Save Changes.
    5. Repeat substeps 1 through 4 for every TaskTracker role group for the MapReduce service that is associated with Hive, if more than one exists.
    6. Restart the MapReduce service.
  6. Enable the Hive user to submit YARN jobs.
    1. Go to the YARN service.
    2. Click the Configuration tab.
    3. Under a NodeManager role group go to the Security category.
    4. Ensure the Allowed System Users property includes the hive user. If not, add hive and click Save Changes.
    5. Repeat substeps 1 through 4 for every NodeManager role group for the YARN service that is associated with Hive, if more than one exists.
    6. Restart the YARN service.
  7. Go to the Hive service.
  8. Click the Configuration tab.
  9. Under the Service-Wide category, go to the Policy File Based Sentry section.
  10. Check Enable Sentry Authorization Using Policy Files, then click Save Changes.
  11. Restart the cluster and HiveServer2. A restart is required after changing these values, whether or not you use Cloudera Manager.
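
Step 4 above (creating sentry-provider.ini as an HDFS file) might look like the following sketch. The group, role, server, and database names are hypothetical placeholders, and the HDFS directory /user/hive/sentry is an assumption; substitute the location your Hive service is configured to read. The upload runs only when the hdfs command is available on the host.

```shell
#!/usr/bin/env bash
# Sketch: write a minimal Sentry policy file locally, then upload it to HDFS.
# Group, role, server, and database names, and POLICY_DIR, are hypothetical.
POLICY_FILE="${POLICY_FILE:-sentry-provider.ini}"
POLICY_DIR="${POLICY_DIR:-/user/hive/sentry}"

cat > "$POLICY_FILE" <<'EOF'
[groups]
# group = comma-separated roles
analysts = analyst_role

[roles]
# role = comma-separated privileges
analyst_role = server=server1->db=sales->table=*->action=select
EOF

# Upload only where the hdfs CLI is available (for example, a cluster gateway host).
if command -v hdfs >/dev/null 2>&1; then
  sudo -u hdfs hdfs dfs -mkdir -p "$POLICY_DIR"
  sudo -u hdfs hdfs dfs -put -f "$POLICY_FILE" "$POLICY_DIR/"
fi
```

Keeping a local copy of the file makes it easy to review or version the policy before it is uploaded.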

Configuring Group Access to the Hive Metastore

Minimum Required Role: Configurator (also provided by Cluster Administrator, Full Administrator)

You can configure the Hive Metastore to reject connections from users not listed in the Hive group proxy list (in HDFS). If you do not configure this override, the Hive Metastore will use the value in the core-site HDFS configuration. To configure the Hive group proxy list:
  1. Go to the Hive service.
  2. Click the Configuration tab.
  3. Click the Proxy category.
  4. In the Hive Metastore Access Control and Proxy User Groups Override property, specify a list of groups whose users are allowed to access the Hive Metastore. Unless you specify the wildcard ("*"), a warning appears if the list does not include hive and, when the Impala service is configured, impala.
  5. Click Save Changes.
  6. Restart the Hive service.

Enabling Policy File Authorization for Impala

For a cluster managed by Cloudera Manager, perform the following steps to enable policy file authorization for Impala.
  1. Enable Sentry's policy file based authorization for Hive. For details, see Enabling Policy File Authorization for Hive.
  2. Go to the Cloudera Manager Admin Console and navigate to the Impala service.
  3. Click the Configuration tab.
  4. Under the Service-Wide category, go to the Policy File Based Sentry section.
  5. Check Enable Sentry Authorization Using Policy Files, then click Save Changes.
  6. Restart the Impala service.
For more details, see Starting the impalad Daemon with Sentry Authorization Enabled.

Enabling Sentry Authorization for Solr

Minimum Required Role: Full Administrator

  1. Ensure the following requirements are satisfied:
    • Cloudera Search 1.1.1 or higher or CDH 5 or higher.
    • A secure Hadoop cluster.
  2. Create the policy file sentry-provider.ini as an HDFS file. When you create the policy file, follow the instructions in the Policy File section in Configuring Sentry for Search (CDH 4) or Search Authentication. The file must be owned by the solr user in the solr group, with permissions set to 600. By default, Cloudera Manager assumes the policy file is in the HDFS location /user/solr/sentry. To configure the location:
    1. Go to the Solr service.
    2. Click the Configuration tab.
    3. Under the Service-Wide category, select Sentry and modify the path in the Sentry Global Policy File property.
    4. Click Save Changes.
  3. Under the Service-Wide category, go to the Policy File Based Sentry section.
  4. Check Enable Sentry Authorization Using Policy Files, then click Save Changes.
  5. Restart the Solr service.

For more details, see Enabling Sentry Authorization for Search.
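
The ownership and permission requirements in step 2 above can be sketched with standard hdfs dfs commands. This is a minimal illustration that assumes the default location /user/solr/sentry, a local sentry-provider.ini that the hdfs user can read, and sudo access to the hdfs user. The run helper and the DRY_RUN variable are illustrative names, not part of Solr or Cloudera Manager.

```shell
#!/usr/bin/env bash
# Sketch: place the Solr policy file in HDFS with solr:solr ownership and mode 600.
# POLICY_DIR is the default location named above; POLICY_FILE is a local copy.
POLICY_FILE="${POLICY_FILE:-sentry-provider.ini}"
POLICY_DIR="${POLICY_DIR:-/user/solr/sentry}"

# Print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

place_solr_policy() {
  run sudo -u hdfs hdfs dfs -mkdir -p "$POLICY_DIR"
  run sudo -u hdfs hdfs dfs -put -f "$POLICY_FILE" "$POLICY_DIR/"
  # The file must be owned by the solr user in the solr group.
  run sudo -u hdfs hdfs dfs -chown solr:solr "$POLICY_DIR/$POLICY_FILE"
  # Restrict access to the owner only.
  run sudo -u hdfs hdfs dfs -chmod 600 "$POLICY_DIR/$POLICY_FILE"
}
```

Run DRY_RUN=1 place_solr_policy first to review the commands, then place_solr_policy to apply them.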

Configuring Sentry to Enable BDR Replication

Cloudera recommends the following steps when configuring Sentry with data replication enabled.

  • Manage group membership outside of Sentry (as OS groups, LDAP groups, and so on are typically managed), and handle replication of group membership outside of Cloudera Manager as well.
  • In Cloudera Manager, set up HDFS replication for the Sentry policy files of the databases being replicated (the databases themselves are replicated separately using Hive replication).
  • On the source cluster:
    • Use a separate Sentry policy file for every database.
    • Avoid placing any group or role information (except for server admin information) in the global Sentry policy file. This avoids manual replication or merging with the global file on the target cluster.
    • To avoid manual fix-up of URI privileges, ensure that the URIs for the data are the same on both the source and target clusters.
  • On the target cluster:
    • In the global Sentry policy file, manually add the database name to database policy file mapping entries for the databases being replicated.
    • Manually copy the server admin information from the global Sentry policy file on the source cluster to the global policy file on the target cluster.
    • For the databases being replicated, avoid adding more privileges (adding tables specific to the target cluster may sometimes require extra privileges to allow access to those tables). If target-cluster-specific privileges absolutely must be added for a database, add them to the global Sentry policy file on the target cluster, because the per-database files are overwritten periodically with the source versions during scheduled replication.
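
As an illustration of the database mapping entries on the target cluster, the sketch below appends a [databases] section to a local working copy of the global policy file. The database name sales and the HDFS URI are hypothetical placeholders. The sketch also assumes the file does not already contain a [databases] section; if it does, add the mapping line under the existing section instead.

```shell
#!/usr/bin/env bash
# Sketch: add a database-to-policy-file mapping to a local copy of the global
# Sentry policy file. The database name and HDFS URI are hypothetical.
GLOBAL_POLICY="${GLOBAL_POLICY:-sentry-provider.ini}"

cat >> "$GLOBAL_POLICY" <<'EOF'

[databases]
# database name = per-database policy file
sales = hdfs://nameservice1/user/hive/sentry/sales.ini
EOF
```

After editing, upload the updated global policy file back to its HDFS location on the target cluster.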