Authorization

CDP utilizes Apache Ranger to implement a common authorization policy framework across services and tracks user activity within those services. Services across a cluster interface with Ranger using a plugin framework. When a service uses Ranger policies, the service loads a plugin module that manages the authorization decisions, local service auditing, and syncing policy updates from the Ranger Admin interface.

Ranger Admin UI and Ranger Service plugins

For clusters that have multiple users and production, or production-like availability requirements, you must enable Ranger high availability. The two components that must be configured in each running service are the Ranger Admin UI and the Ranger plugins. Ranger high availability can be implemented with or without a load balancer. A load balancer can be utilized to provide a common accessible Ranger Admin UI host name for operators and administrators. When Ranger high availability is enabled without a load balancer, the Ranger plugins are configured to be aware of all Ranger Admin instances.

In the case of one or more Ranger Admin instances going offline, Ranger plugins keep a local cache of downloaded policy so the authorization decisions continue. The Ranger plugin automatically tracks changes to these policies and downloads updates when the Ranger Admin instances return to service.

For information about Ranger high availability, see Configuring Apache Ranger High Availability.

Ranger policy synchronization

In a complex disaster recovery environment, authorization policies must be synchronized across both clusters. This ensures that the workloads seamlessly move across disaster recovery cluster pairs in the event of a failure. Policy synchronization is currently implemented through export and import operations with the Ranger Admin API, or through external management of policy configurations that are pushed into the Ranger Admin service for each disaster recovery cluster pair.

Because HDFS high availability requires unique Name Service identifiers for all clusters, care must be taken when crafting Ranger policies to account for embedded HDFS service names. In particular, HDFS policies and Hadoop SQL policies can utilize HDFS path URIs, where Name Service identifiers can be found. When exporting and importing policies across clusters, review and convert policy exports are required to account for these name differences.