Apache Ranger authorization
Learn how to integrate an Apache Ranger service running in a Cloudera Private Cloud Base cluster with an Apache Kafka cluster that is deployed using Cloudera Streams Messaging - Kubernetes Operator.
Apache Kafka clusters deployed with Cloudera Streams Messaging - Kubernetes Operator can integrate with Apache Ranger. Ranger is a framework to enable, monitor, and manage comprehensive data security. Specifically, you can use Ranger to authorize access requests made to Kafka. The Ranger service that you integrate with Kafka must run in a Cloudera Private Cloud Base cluster.
To provide authorization for various services, Ranger uses a plugin architecture. Ranger plugins are lightweight Java plugins developed for specific components and services that run as part of the target component's JVM process. Ranger plugins pull policies from Ranger, evaluate incoming requests to provide authorization, and also capture and push requests as audit events to different audit destinations. To provide authorization for Kafka, Ranger uses the Ranger Kafka plugin. The Ranger Kafka plugin is shipped as part of the Cloudera Runtime parcel.
Integrating your Cloudera Streams Messaging - Kubernetes Operator Kafka clusters with Ranger requires that you complete multiple configuration tasks. The following provides an overview of the process.
- Creating a custom Kafka image that includes the Ranger Kafka plugin
This step involves building a new Kafka image that includes the Ranger Kafka plugin. Your new image will be based on the default Kafka image shipped with Cloudera Streams Messaging - Kubernetes Operator. The Ranger Kafka plugin is extracted from a Cloudera Runtime parcel.
- Configuring Ranger
This step involves creating a user in the Kerberos Key Distribution Center (KDC) of the Cloudera Private Cloud Base cluster as well as various configuration tasks that you complete in the Ranger Admin Web UI.
- Creating Ranger plugin configuration files
This step involves creating various configuration files required for the Ranger Kafka plugin to function. These files store settings such as how to reach Ranger and HDFS, the TLS truststore, authentication credentials, and so on.
- Optional: Configuring Ranger group authorization
Ranger group authorization enables you to add groups to policies in Ranger. Users that are part of a group can be authorized based on the permissions of the group. User group memberships are defined in an LDAP server. This step involves setting up LDAP group mapping properties for the Ranger Kafka plugin.
- Deploying a Ranger-integrated Kafka cluster
This step involves deploying a new Kafka cluster using the custom image you built as well as deploying the various configuration files you created for the Ranger Kafka plugin.
Limitations
Configuring Ranger authorization for Kafka requires that additional persistent storage is attached to each Kafka broker, and these volumes must be unique per broker. Because of how such volumes can be defined in Cloudera Streams Messaging - Kubernetes Operator, you must create a separate KafkaNodePool for each broker. As a result, scaling an existing KafkaNodePool does not work, because each KafkaNodePool must be limited to a single replica. If you want to scale your Kafka cluster, you must define another KafkaNodePool for the new broker, as shown in the sketch below.
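For illustration, the following is a minimal sketch of a single-replica KafkaNodePool; the cluster name, pool name, and storage values are assumptions, not values prescribed by this guide.

```bash
# Sketch with hypothetical names: each broker gets its own single-replica pool.
kubectl apply -f - <<'EOF'
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaNodePool
metadata:
  name: broker-0                    # create broker-1, broker-2, ... the same way
  labels:
    strimzi.io/cluster: my-cluster  # must match the name of your Kafka resource
spec:
  replicas: 1                       # must remain 1; scale by adding new pools
  roles:
    - broker
  storage:
    type: jbod
    volumes:
      - id: 0
        type: persistent-claim
        size: 100Gi
        deleteClaim: false
EOF
```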
Supported Cloudera Private Cloud Base versions
Integration with Ranger is only supported for specific Cloudera Private Cloud Base versions. Supported versions of Cloudera Private Cloud Base are as follows.
| Version | Ranger Kafka plugin version |
|---|---|
| 7.1.9 (any SP or CHF) | 7.1.9.1015 |
Prerequisites
- Ensure that the Strimzi Cluster Operator is installed and running. See Installation.
- You have a Cloudera Private Cloud Base cluster with the following.
- The cluster version is supported.
- The cluster must be secure. Both TLS/SSL (channel encryption) and Kerberos (authentication) must be enabled.
- Optional: If you use HDFS or Solr to store Ranger audit data, HDFS or Solr must be installed on the cluster.
- Access to a registry where you can upload a container image is required. The registry must also be accessible by your Kubernetes cluster.
- Java 8 or higher is installed on the host that is used to generate the plugin configuration files.
Creating a custom Kafka image that includes the Ranger Kafka plugin
The Ranger Kafka plugin, which performs authorization in Kafka's JVM, requires multiple JARs to function correctly. As a result, you need to download a Cloudera Runtime parcel, extract the Ranger Kafka plugin, and build a custom Kafka image containing the plugin. The image that you create is used to deploy your Kafka cluster that integrates with Ranger.
Access to docker or an equivalent utility that you can use to build, pull, and push images is required. The following steps use docker. Replace the commands where necessary.
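The following is a hedged sketch of the general flow; the parcel name, plugin directory, image names, and registry are placeholders rather than values from this guide.

```bash
# Sketch only: all bracketed values are placeholders.
# 1. Extract the Ranger Kafka plugin from the downloaded Cloudera Runtime parcel.
tar -xf [***CLOUDERA RUNTIME PARCEL***]
# 2. Build a new image based on the default Kafka image shipped with the operator.
cat > Dockerfile <<'EOF'
FROM [***DEFAULT CLOUDERA KAFKA IMAGE***]
COPY [***EXTRACTED RANGER KAFKA PLUGIN DIRECTORY***]/ /opt/kafka/libs/
EOF
docker build -t [***REGISTRY***]/kafka-ranger:[***TAG***] .
# 3. Push the image to a registry accessible by your Kubernetes cluster.
docker push [***REGISTRY***]/kafka-ranger:[***TAG***]
```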
Configuring Ranger
To enable an external Kafka cluster to work with Ranger, you must set up various users and policies in Ranger and in your Cloudera Private Cloud Base cluster.
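The Ranger Admin Web UI tasks cannot be scripted here, but creating the plugin user in the KDC might look like the following sketch, assuming an MIT Kerberos KDC; the principal name is a placeholder and the exact procedure depends on your KDC.

```bash
# Sketch for an MIT Kerberos KDC; run on the KDC host.
kadmin.local -q "addprinc -randkey [***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]"
```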
Creating Ranger Kafka plugin configuration files
The Ranger Kafka plugin requires a number of configuration files that store settings such as how to reach Ranger and HDFS, the TLS truststore, authentication credentials, and so on. Some of these files can be saved from the Cloudera Private Cloud Base cluster; others must be created manually.
- ranger_truststore.jks
- Save the truststore of the Ranger Admin server in JKS format.
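If the truststore is not readily available, one way to assemble it is to fetch the Ranger Admin TLS certificate and import it with keytool. This is a hedged sketch; the host, port, and password are placeholders.

```bash
# Fetch the Ranger Admin certificate and import it into a JKS truststore.
openssl s_client -connect [***RANGER ADMIN HOST***]:[***RANGER ADMIN PORT***] </dev/null \
  | openssl x509 -out ranger_admin.pem
keytool -importcert -noprompt -alias rangeradmin \
  -file ranger_admin.pem \
  -keystore ranger_truststore.jks -storetype JKS \
  -storepass "[***TRUSTSTORE PASSWORD***]"
```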
- krb5.conf
- Save the Kerberos client config from a Cloudera Private Cloud Base cluster host.
- kafka_plugin.keytab
- Save the keytab of the previously created [***PLUGIN PRINCIPAL***] user.
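On an MIT Kerberos KDC, exporting the keytab might look like the following sketch. Note that ktadd regenerates the principal's keys by default; -norandkey preserves them but is only available in kadmin.local.

```bash
# Sketch for an MIT Kerberos KDC; run on the KDC host.
kadmin.local -q "ktadd -norandkey -k kafka_plugin.keytab [***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]"
```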
- jaas.conf
- Save the example below and edit the principal property.

```
ranger.KafkaServer {
    com.sun.security.auth.module.Krb5LoginModule required
    doNotPrompt=true
    useKeyTab=true
    storeKey=true
    keyTab="/mnt/ranger-kafka-plugin/conf_sensitive/kafka_plugin.keytab"
    principal="[***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]";
};
```
- ranger-kafka-security.xml
- Save the following example.
```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Required user defined configuration block start -->
  <property>
    <name>ranger.plugin.kafka.policy.rest.url</name>
    <value>[***RANGER REST URL***]</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.service.name</name>
    <value>[***KAFKA SERVICE NAME***]</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.access.cluster.name</name>
    <value>[***KAFKA ACCESS CLUSTER NAME***]</value>
  </property>
  <!-- Required user defined configuration block end -->
  <!-- Required hardcoded configuration block start -->
  <property>
    <name>ranger.plugin.kafka.policy.cache.dir</name>
    <value>/mnt/ranger-kafka-plugin/cache</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.policy.source.impl</name>
    <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.policy.rest.ssl.config.file</name>
    <value>/opt/kafka/libs/ranger-kafka-plugin-impl/conf/ranger-kafka-policymgr-ssl.xml</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.disable.cache.if.servicenotfound</name>
    <value>false</value>
  </property>
  <!-- Required hardcoded configuration block end -->
</configuration>
```
Substitute the variables in this example as follows.
- [***RANGER REST URL***] ‐ The Ranger Admin server base URL. If Ranger Admin High Availability is configured, you can provide a comma-separated list of Ranger Admin base URLs. You can also configure a load balancer URL if a load balancer was installed with Ranger Admin High Availability.
- [***KAFKA SERVICE NAME***] ‐ The Kafka service name as defined earlier in the Ranger Admin Web UI.
- [***KAFKA ACCESS CLUSTER NAME***] ‐ The cluster name to be displayed in Ranger audit logs. This configuration is optional and can be removed. It has no effect other than appearing as a field in Ranger audit logs. Cloudera recommends that you set this to [***KAFKA CLUSTER NAME***].[***NAMESPACE***].
- ranger-kafka-policymgr-ssl.xml
- Save the following example. This file only contains hardcoded properties. You do not need to make any changes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Required hardcoded configuration block start -->
  <property>
    <name>xasecure.policymgr.clientssl.truststore</name>
    <value>/opt/kafka/libs/ranger-kafka-plugin-impl/conf/ranger_truststore.jks</value>
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
    <value>jceks://file/opt/kafka/libs/ranger-kafka-plugin-impl/conf/rangerpluginssl.jceks</value>
  </property>
  <!-- Required hardcoded configuration block end -->
</configuration>
```
- ranger-kafka-audit.xml
- Save the following example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Required user defined configuration block start -->
  <property>
    <name>xasecure.audit.is.enabled</name>
    <value>true</value>
  </property>
  <!-- HDFS audit block start -->
  <property>
    <name>xasecure.audit.destination.hdfs</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.dir</name>
    <value>hdfs://[***ACTIVE HDFS NAMENODE HOST***]:[***HDFS NAMENODE PORT***]/[***AUDIT PATH ON HDFS***]</value>
  </property>
  <!-- HDFS audit block end -->
  <!-- Solr audit block start -->
  <property>
    <name>xasecure.audit.destination.solr</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.zookeepers</name>
    <value>[***ZOOKEEPER CONNECTION STRING***]</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.collection</name>
    <value>[***RANGER AUDIT SOLR COLLECTION***]</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.principal</name>
    <value>[***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]</value>
  </property>
  <!-- Solr audit block end -->
  <!-- Required user defined configuration block end -->
  <!-- Required hardcoded configuration block start -->
  <!-- HDFS audit block start -->
  <property>
    <name>xasecure.audit.destination.hdfs.batch.filespool.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.batch.filespool.dir</name>
    <value>/mnt/ranger-kafka-plugin/audit/hdfs/spool</value>
  </property>
  <!-- HDFS audit block end -->
  <!-- Solr audit block start -->
  <property>
    <name>xasecure.audit.destination.solr.force.use.inmemory.jaas.config</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.loginModuleName</name>
    <value>com.sun.security.auth.module.Krb5LoginModule</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.loginModuleControlFlag</name>
    <value>required</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.useKeyTab</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.storeKey</name>
    <value>false</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.serviceName</name>
    <value>solr</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.keyTab</name>
    <value>/mnt/ranger-kafka-plugin/conf_sensitive/kafka_plugin.keytab</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.batch.filespool.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
    <value>/mnt/ranger-kafka-plugin/audit/solr/spool</value>
  </property>
  <!-- Solr audit block end -->
  <!-- Required hardcoded configuration block end -->
  <!-- Recommended but not required configuration block start -->
  <property>
    <name>xasecure.audit.provider.summary.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.metrics</name>
    <value>true</value>
  </property>
  <!-- Recommended but not required configuration block end -->
</configuration>
```
Substitute the variables in this example as follows.
- [***ACTIVE HDFS NAMENODE HOST***] ‐ The hostname of the active HDFS NameNode.
- [***HDFS NAMENODE PORT***] ‐ The port on which the NameNode serves the HDFS RPC protocol (8020 by default).
- [***AUDIT PATH ON HDFS***] ‐ The HDFS directory where audit logs should be stored.
- [***ZOOKEEPER CONNECTION STRING***] ‐ The ZooKeeper connection string as defined by ZooKeeper Sessions. The chroot suffix must be the znode of the Solr service. For example, host1:port1,host2:port2/solr-infra.
- [***RANGER AUDIT SOLR COLLECTION***] ‐ The Solr collection where Ranger audit logs are stored. Optional; defaults to ranger_audits if not configured.
- [***PLUGIN PRINCIPAL***] ‐ The primary part of the Ranger Kafka plugin principal.
- [***KERBEROS REALM***] ‐ The Kerberos realm of the Ranger Kafka plugin principal.
- core-site.xml
- Save the following example. This file only contains hardcoded properties. You do not need to make any changes.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Required hardcoded configuration block start -->
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
  <!-- Required hardcoded configuration block end -->
</configuration>
```
- rangerpluginssl.jceks and hadoop-credstore-password
- Get the Ranger Admin server truststore password. Define and export the following shell variables so that the commands below can access them:
  - TRUSTSTORE_PASSWORD: The truststore password obtained above.
  - HADOOP_CREDSTORE_PASSWORD: A user-defined password that is used to encrypt the truststore password.

```bash
java -cp "ranger-kafka-plugin/install/lib/*" org.apache.ranger.credentialapi.buildks create sslTrustStore \
  -value "${TRUSTSTORE_PASSWORD}" \
  -provider "jceks://file/$(pwd)/conf/rangerpluginssl.jceks" \
  -storetype "jceks"
echo -n "${HADOOP_CREDSTORE_PASSWORD}" > conf/hadoop-credstore-password
```
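To sanity-check the result, you can list the credential store with keytool; this assumes the store is protected by the HADOOP_CREDSTORE_PASSWORD value, which is why that same password is also written to hadoop-credstore-password.

```bash
# List the jceks credential store to verify that the sslTrustStore entry exists.
keytool -list -keystore conf/rangerpluginssl.jceks -storetype jceks \
  -storepass "${HADOOP_CREDSTORE_PASSWORD}"
```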
Configuring Ranger group authorization
Ranger makes it possible to authorize users based on the group that they are in. If you want to use group authorization, you must create a Secret containing LDAP data and extend your core-site.xml with required properties. This task is optional when configuring Ranger authorization for Kafka.
Ranger group authorization can be useful if you have many users and do not want to add them to policies one by one. Instead, you can utilize group authorization, which enables you to add groups to policies in Ranger.
If a user is a member of a group, they can be authorized based on the permissions of that group. The Ranger Kafka plugin running inside the Kafka broker pod is responsible for determining which groups the user belongs to. When configuring Ranger authorization, you can optionally set up LDAP group mapping, in which case the Ranger Kafka plugin connects to an LDAP server, looks up the end user that made the request to Kafka, and finds the groups that the user is a member of.
This is an optional step when configuring Ranger authorization. If you do not configure group mapping, user permissions cannot be determined based on group membership in Ranger, but other authorization methods (such as user and role authorization) still work as expected. Other group mappings are not supported in Cloudera Streams Messaging - Kubernetes Operator.
The following example demonstrates how you can configure your Kafka cluster to use LDAP group mapping.
- A running LDAP server is required.
- The LDAP TLS certificate is available to you.
- A bind user and password that Kafka can use to connect to the LDAP server for user and group lookup are available to you.
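For illustration, the following sketch shows how the Secret containing LDAP data might be created; the resource name, key names, and values are placeholder assumptions, not names required by the operator. The core-site.xml extension then references this data through the standard Hadoop LdapGroupsMapping properties (hadoop.security.group.mapping set to org.apache.hadoop.security.LdapGroupsMapping, plus the hadoop.security.group.mapping.ldap.* connection settings).

```bash
# Hypothetical names: a Secret holding the LDAP bind password and TLS
# certificate that the Ranger Kafka plugin uses for group lookups.
kubectl create secret generic ranger-ldap \
  --namespace [***NAMESPACE***] \
  --from-file=ldap-cert.pem=[***PATH TO LDAP TLS CERTIFICATE***] \
  --from-literal=bind-password=[***LDAP BIND PASSWORD***]
```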
Deploying a Ranger-integrated Kafka cluster
Once your custom image is ready and all configuration files are prepared, you can deploy a Kafka cluster that integrates with Ranger. To do this, you deploy the Ranger Kafka plugin configuration files together with your Kafka cluster. The configuration files are mounted in the Kafka pods as volumes, making them available to the Ranger Kafka plugin. The files containing sensitive data are deployed as a Secret; other files are deployed as a ConfigMap.
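As an illustration, creating the two resources might look like the following sketch. The resource names are placeholders, and the exact split between Secret and ConfigMap shown here is an assumption inferred from which files contain credentials.

```bash
# Hypothetical resource names; run in the directory containing the files.
kubectl create secret generic ranger-kafka-plugin-sensitive \
  --namespace [***NAMESPACE***] \
  --from-file=kafka_plugin.keytab \
  --from-file=jaas.conf \
  --from-file=hadoop-credstore-password
kubectl create configmap ranger-kafka-plugin-conf \
  --namespace [***NAMESPACE***] \
  --from-file=ranger-kafka-security.xml \
  --from-file=ranger-kafka-policymgr-ssl.xml \
  --from-file=ranger-kafka-audit.xml \
  --from-file=core-site.xml \
  --from-file=krb5.conf \
  --from-file=ranger_truststore.jks \
  --from-file=rangerpluginssl.jceks
```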