Enabling Sentry Service Authorization
This topic describes how to enable the Sentry service with Cloudera Director.
Prerequisites
- Cloudera Director 1.1.x
- CDH 5.1.x (or higher) managed by Cloudera Manager 5.1.x (or higher).
- Kerberos authentication implemented on your cluster.
Setting Up the Sentry Service Using the Cloudera Director CLI
This method requires you to send configuration files that the Cloudera Director server can use to deploy clusters. See Submitting a Cluster Configuration File for more details. Make sure you add SENTRY to the array of services to be launched. This is specified in the configuration file as:
services: [HDFS, YARN, ZOOKEEPER, HIVE, OOZIE, HUE, IMPALA, SENTRY]
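Before submitting the file, a quick check that SENTRY really is in the services array can save a failed deployment. The following is a minimal sketch, not part of Cloudera Director: the function name has_sentry_service and the file name cluster.conf are hypothetical, and the check assumes the services array is written on a single line, as in the example above.

```shell
# Hypothetical helper (not a Cloudera Director command): exits 0 if the given
# configuration file contains a single-line services array that lists SENTRY.
has_sentry_service() {
  # Match "services: [ ... SENTRY ... ]" where SENTRY is a whole array entry,
  # that is, preceded by "[", a space, or a comma and followed by "]", a space,
  # or a comma.
  grep -Eq 'services:[[:space:]]*\[(.*[ ,])?SENTRY[] ,]' "$1"
}

# Example use; cluster.conf is an assumed file name:
# has_sentry_service cluster.conf || echo "SENTRY is missing from services" >&2
```

A multi-line services array would not be detected by this pattern, so treat a negative result as a prompt to inspect the file rather than as proof.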
To specify a database, use the databases setting as follows:
cluster {
    ...
    databases {
        SENTRY: {
            type: mysql
            host: sentry.db.example.com
            port: 3306
            user: <database_username>
            password: <database_password>
            name: <database_name>
        }
    }
}
The Sentry service also requires the following custom configuration for the MapReduce, YARN, HDFS, Hive, and Impala services:
- MapReduce: Set the Minimum User ID for Job Submission property to zero (the default is 1000) for every TaskTracker role group that is associated with Hive.
MAPREDUCE {
    TASKTRACKER {
        taskcontroller_min_user_id: 0
    }
}
- YARN: Ensure that the Allowed System Users property, for every NodeManager role group that is associated with Hive, includes the hive user.
YARN {
    NODEMANAGER {
        container_executor_allowed_system_users: hive, impala, hue
    }
}
- HDFS: Enable HDFS extended ACLs.
HDFS {
    dfs_permissions: true
    dfs_namenode_acls_enabled: true
}
With Cloudera Manager 5.3 and CDH 5.3, you can enable synchronization of HDFS and Sentry permissions for HDFS files that are part of Hive tables. For details on enabling this feature using Cloudera Manager, see Synchronizing HDFS ACLs and Sentry Permissions.
- Hive: Make sure Sentry policy file authorization has been disabled for Hive.
HIVE {
    sentry_enabled: false
}
- Impala: Make sure Sentry policy file authorization has been disabled for Impala.
IMPALA {
    sentry_enabled: false
}
Set Permissions on the Hive Warehouse
Once setup is complete, configure the following permissions on the Hive warehouse. For Sentry authorization to work correctly, the Hive warehouse directory (/user/hive/warehouse or any path you specify as hive.metastore.warehouse.dir in your hive-site.xml) must be owned by the Hive user and group.
- Permissions on the warehouse directory must be set as follows:
- 771 on the directory itself (for example, /user/hive/warehouse)
- 771 on all subdirectories (for example, /user/hive/warehouse/mysubdir)
- All files and subdirectories must be owned by hive:hive
$ sudo -u hdfs hdfs dfs -chmod -R 771 /user/hive/warehouse
$ sudo -u hdfs hdfs dfs -chown -R hive:hive /user/hive/warehouse
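To confirm that the permissions and ownership took effect, the output of hdfs dfs -ls can be filtered for entries that deviate from 771 and hive:hive. This is a minimal sketch under stated assumptions: the function name check_warehouse_listing is hypothetical, the default /user/hive/warehouse path is assumed, and the filter relies on the usual listing layout (permissions, replication, owner, group, size, date, time, path) with paths that contain no spaces.

```shell
# Hypothetical helper: reads an `hdfs dfs -ls` listing on stdin and prints the
# path of every entry whose mode is not rwxrwx--x (771) or whose owner and
# group are not hive:hive. Header lines such as "Found 3 items" are skipped
# because they have fewer than eight fields.
check_warehouse_listing() {
  awk 'NF >= 8 && (substr($1, 2, 9) != "rwxrwx--x" || $3 != "hive" || $4 != "hive") { print $8 }'
}

# Example use against a live cluster; -ls -d covers the directory itself and
# -ls -R covers everything beneath it:
# { sudo -u hdfs hdfs dfs -ls -d /user/hive/warehouse
#   sudo -u hdfs hdfs dfs -ls -R /user/hive/warehouse; } | check_warehouse_listing
```

No output means every listed entry matches the required mode and ownership. Comparing only the nine permission characters keeps the check valid when a trailing + marks an entry that carries extended ACLs.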
Setting Up the Sentry Service Using the Cloudera Director API
You can use the Cloudera Director API to set up Sentry. Define the ClusterTemplate to include Sentry as a service, along with the configurations specified above, but in JSON format.
Set permissions on the Hive warehouse as described above.
Related Links
For detailed instructions on adding and configuring the Sentry service, see Installing and Upgrading the Sentry Service and Configuring the Sentry Service.
Examples of using GRANT and REVOKE statements to enforce permissions with Sentry are available at Hive SQL Syntax.