Apache Ranger authorization

Learn how to integrate an Apache Ranger service running in a Cloudera Private Cloud Base cluster with an Apache Kafka cluster that is deployed using Cloudera Streams Messaging - Kubernetes Operator.

Apache Kafka clusters deployed with Cloudera Streams Messaging - Kubernetes Operator can integrate with Apache Ranger. Ranger is a framework to enable, monitor, and manage comprehensive data security. Specifically, you can use Ranger to authorize access requests made to Kafka. The Ranger service that you integrate with Kafka must run in a Cloudera Private Cloud Base cluster.

To provide authorization for various services, Ranger uses a plugin architecture. Ranger Plugins are lightweight Java plugins developed for specific components and services that run as part of the target component’s JVM process. Ranger plugins pull policies from Ranger, evaluate incoming requests providing authorization, and also capture and push requests as audit events to different audit destinations. To provide authorization for Kafka, Ranger uses the Ranger Kafka plugin. The Ranger Kafka plugin is shipped as part of the Cloudera Runtime parcel.

Integrating your Cloudera Streams Messaging - Kubernetes Operator Kafka clusters with Ranger requires that you complete multiple configuration tasks. The following provides an overview of the process.

  1. Creating a custom Kafka image that includes the Ranger Kafka plugin

    This step involves building a new Kafka image that includes the Ranger Kafka plugin. Your new image is based on the default Kafka image shipped with Cloudera Streams Messaging - Kubernetes Operator. The Ranger Kafka plugin is extracted from a Cloudera Runtime parcel.

  2. Configuring Ranger

    This step involves creating a user in the Kerberos Key Distribution Center (KDC) of the Cloudera Private Cloud Base cluster as well as various configuration tasks that you complete in the Ranger Admin Web UI.

  3. Creating Ranger plugin configuration files

    This step involves creating various configuration files required for the Ranger Kafka plugin to function. These files store settings such as how to reach Ranger, HDFS, the TLS truststore, authentication credentials, and so on.

  4. Optional: Configuring Ranger group authorization

    Ranger group authorization enables you to add groups to policies in Ranger. Users that are part of a group can be authorized based on the permissions of the group. User group memberships are defined in an LDAP server. This step involves setting up LDAP group mapping properties for the Ranger Kafka plugin.

  5. Deploying a Ranger-integrated Kafka cluster

    This step involves deploying a new Kafka cluster using the custom image you built as well as deploying the various configuration files you created for the Ranger Kafka plugin.

Limitations

Configuring Ranger authorization for Kafka requires that additional persistent storage is attached to each Kafka broker. These volumes must be unique per broker. Because of how these volumes can be defined in Cloudera Streams Messaging - Kubernetes Operator, you must create a separate KafkaNodePool for each broker. As a result, scaling KafkaNodePools does not work, because each KafkaNodePool must be limited to a single replica. If you want to scale your Kafka cluster, you must define an additional KafkaNodePool for each new broker.
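
For illustration, the following is a hedged sketch that assumes a cluster named [***KAFKA CLUSTER NAME***] with three brokers, each backed by its own single-replica KafkaNodePool. Scaling out means applying an additional node pool rather than increasing the replica count of an existing one.

# List the node pools of the cluster; each broker has its own single-replica pool,
# for example [***KAFKA CLUSTER NAME***]-0, [***KAFKA CLUSTER NAME***]-1, and [***KAFKA CLUSTER NAME***]-2.
kubectl get kafkanodepools --namespace [***NAMESPACE***] \
  --selector strimzi.io/cluster=[***KAFKA CLUSTER NAME***]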

Supported Cloudera Private Cloud Base versions

Integration with Ranger is only supported for specific Cloudera Private Cloud Base versions. Supported versions of Cloudera Private Cloud Base are as follows.

Table 1. Supported Cloudera Private Cloud Base versions for Ranger authorization in Cloudera Streams Messaging - Kubernetes Operator
Version               | Ranger Kafka plugin version
7.1.9 (any SP or CHF) | 7.1.9.1015

Prerequisites

  • Ensure that the Strimzi Cluster Operator is installed and running. See Installation.
  • You have a Cloudera Private Cloud Base cluster with the following.
    • The cluster version is supported.
    • The cluster must be secure. Both TLS/SSL (channel encryption) and Kerberos (authentication) must be enabled.
    • Optional: If you use HDFS or Solr to store Ranger audit data, HDFS or Solr must be installed on the cluster.
  • You have access to a registry where you can upload a container image. The registry must also be accessible by your Kubernetes cluster.
  • Java 8 or higher is installed on the host that you use to generate the plugin configuration files.

Creating a custom Kafka image that includes the Ranger Kafka plugin

The Ranger Kafka plugin, which performs authorization inside Kafka’s JVM, needs multiple JARs to function correctly. As a result, you need to download a Cloudera Runtime parcel, extract the Ranger Kafka plugin, and build a custom Kafka image containing the plugin. The image that you create is used to deploy your Kafka cluster that integrates with Ranger.

Access to docker or an equivalent utility that you can use to build, pull, and push images is required. The following steps use docker. Replace the commands as necessary if you use a different utility.

  1. Download a supported version of the Cloudera Runtime RHEL9 parcel.
    You can download the parcel from the Cloudera Archive.
    https://archive.cloudera.com/p/cdh7/[***VERSION***]/parcels/

    Downloading the parcel requires authentication. Use your Cloudera credentials. For more information, see Cloudera Runtime Download Information.
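
    For example, you can fetch the parcel with curl. This is a hedged sketch: [***CLOUDERA USERNAME***], [***CLOUDERA PASSWORD***], and [***PARCEL FILE***] are placeholders for your Cloudera credentials and the exact RHEL 9 parcel file name shown in the archive listing for your selected version.
    # Download the RHEL 9 parcel using HTTP basic authentication.
    curl --user "[***CLOUDERA USERNAME***]:[***CLOUDERA PASSWORD***]" \
      --remote-name \
      "https://archive.cloudera.com/p/cdh7/[***VERSION***]/parcels/[***PARCEL FILE***]"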

  2. Extract the Ranger Kafka plugin from the downloaded parcel into the ranger-kafka-plugin folder.
    tar xf [***PARCEL FILE***] --wildcards 'CDH-*/lib/ranger-kafka-plugin/' 'CDH-*/jars'
    cp -RL CDH-*/lib/ranger-kafka-plugin .
    
  3. Verify that the Ranger Kafka plugin symlinks are resolved successfully.
    [[ "$(find ranger-kafka-plugin/ -type l | wc -l)" -eq "0" ]] && echo "Ranger plugin symbolic links resolved successfully." || echo "Error! Symbolic link resolution failed while extracting Ranger plugin."
  4. Create a file named Dockerfile with the following contents.
    FROM container.repository.cloudera.com/cloudera/kafka:0.43.0.1.2.0-b54-kafka-3.8.0.1.2
    USER root
    COPY --chmod=644 ranger-kafka-plugin/lib/ranger-*.jar /opt/kafka/libs/
    COPY --chmod=644 ranger-kafka-plugin/lib/ranger-kafka-plugin-impl /opt/kafka/libs/ranger-kafka-plugin-impl/
    RUN find /opt/kafka/libs/ranger-kafka-plugin-impl/ -type d -exec chmod +x {} \;
    RUN ln -s /mnt/ranger-kafka-plugin/conf /opt/kafka/libs/ranger-kafka-plugin-impl/conf 
    USER kafka
    
  5. Build and tag the image using the Dockerfile you created.
    docker build --tag [***YOUR REGISTRY***]/[***IMAGE NAME***]:[***TAG***] .
  6. Push the image to a registry.
    docker image push [***YOUR REGISTRY***]/[***IMAGE NAME***]:[***TAG***]

    Take note of the registry, image name, and image tag. These are needed later on when deploying your Kafka cluster in Kubernetes.

A custom Kafka image that contains an appropriate version of the Ranger Kafka plugin is available in a registry of your choosing. You can use this image to deploy a Kafka cluster that integrates with Ranger.

Configuring Ranger

To enable an external Kafka cluster to work with Ranger, you must set up various users and policies in Ranger and in your Cloudera Private Cloud Base cluster.

  1. Create a Ranger Kafka plugin user.
    The external Ranger Kafka plugin must be able to authenticate to various Cloudera services. Therefore, a user must exist in the Kerberos Key Distribution Center (KDC) of the Cloudera Private Cloud Base cluster that the Ranger Kafka plugin can use to authenticate itself.

    Create a Kerberos principal and generate a keytab for it if one does not yet exist. These instructions refer to the primary part of this principal as [***PLUGIN PRINCIPAL***]. The realm of this principal is referred to as [***KERBEROS REALM***] where needed.
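
    If your KDC is an MIT Kerberos KDC, you can create the principal and export its keytab with kadmin, as in the following hedged sketch. [***ADMIN PRINCIPAL***] is a placeholder for an administrative principal; adjust the commands for other KDC types such as Active Directory or FreeIPA.
    # Create the plugin principal with a random key, then export its keytab.
    kadmin -p [***ADMIN PRINCIPAL***] -q "addprinc -randkey [***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]"
    kadmin -p [***ADMIN PRINCIPAL***] -q "ktadd -k kafka_plugin.keytab [***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]"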

  2. Ensure that the [***PLUGIN PRINCIPAL***] user exists in Ranger.
    Verification is required because users that you create in the KDC might not be synchronized to Ranger immediately.
  3. Log in to the Ranger Admin Web UI.
  4. Create a resource-based service in Ranger for the Kafka cluster.
    Although it is possible to reuse an existing Kafka resource-based service, Cloudera recommends creating a new one. Configure the required properties as follows.
    • Service Name ‐ The name of the resource-based service in Ranger. This is an arbitrary string and must match the [***KAFKA SERVICE NAME***] value that you set later in ranger-kafka-security.xml. It also appears in the audit log, distinguishing the external Kafka cluster from, for example, the Cloudera Private Cloud Base Kafka service if installed.

    • Username ‐ This is a legacy parameter. It is added by default to the policies, but otherwise has no effect.

    • Password ‐ This is a legacy parameter. It has no effect.

    • ZooKeeper Connect String ‐ This is a legacy parameter. It has no effect. Leave as default.

    • Add the following name and value pairs to Add New Configurations.

      • policy.download.auth.users: [***PLUGIN PRINCIPAL***]
      • tag.download.auth.users: [***PLUGIN PRINCIPAL***]
  5. Create Ranger users for the Strimzi users.
    The Strimzi operators as well as Kafka itself have users that are unknown to the Cloudera Private Cloud Base cluster. Create the following users in Ranger. [***KAFKA CLUSTER NAME***] is the name of your Kafka cluster running in Kubernetes. The name is an arbitrary string and is specified later on, when you create your Kafka resource.
    1. CN=cluster-operator,O=io.strimzi
    2. CN=[***KAFKA CLUSTER NAME***]-kafka,O=io.strimzi
    3. CN=[***KAFKA CLUSTER NAME***]-entity-user-operator,O=io.strimzi
    4. CN=[***KAFKA CLUSTER NAME***]-entity-topic-operator,O=io.strimzi
    5. CN=[***KAFKA CLUSTER NAME***]-kafka-exporter,O=io.strimzi
    6. CN=[***KAFKA CLUSTER NAME***]-cruise-control,O=io.strimzi
  6. Set up Kafka policies.
    1. Grant each Strimzi user permissions to all Kafka resources and actions in all Kafka policies.
      To do this, go to Service Manager > Resource Policies > [***KAFKA SERVICE NAME***] and add the Strimzi users to the general policies with all action types granted.
    2. Set up permissions for your Kubernetes Kafka cluster’s end users.
      These users might be managed outside of the Cloudera Private Cloud Base cluster and you might have to define them in Ranger. Add these users to the Kafka policies according to your requirements.
  7. Configure HDFS policies in Ranger so that the [***PLUGIN PRINCIPAL***] user has read, write, and execute permissions on [***AUDIT PATH ON HDFS***] or one of its parent folders.
    The audit path is defined later as the value of xasecure.audit.destination.hdfs.dir in ranger-kafka-audit.xml. It is possible that in Ranger you cannot choose the specific folder, only one of its parent folders.
  8. Configure Solr policies in Ranger so that the [***PLUGIN PRINCIPAL***] user has query and update permissions on the Ranger audit Solr collection.
    The Ranger audit Solr collection is referred to as [***RANGER AUDIT SOLR COLLECTION***]. It is defined later as the value of xasecure.audit.destination.solr.collection in ranger-kafka-audit.xml.

Creating Ranger Kafka plugin configuration files

The Ranger Kafka plugin requires a number of configuration files which store settings such as how to reach Ranger, HDFS, the TLS truststore, authentication credentials, and so on. Some of these files can be saved from the Cloudera Private Cloud Base cluster, some must be created manually.

Save or create the following configuration files to a subfolder named conf.
ranger_truststore.jks
Save the truststore of the Ranger Admin server in JKS format.
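If you have the Ranger Admin TLS certificate as a PEM file rather than a ready-made truststore, you can build the truststore yourself. The following is a hedged sketch: ranger_admin.cer and [***TRUSTSTORE PASSWORD***] are assumptions, and the password you set here is the truststore password that you later store in rangerpluginssl.jceks.
# Import the Ranger Admin certificate into a new JKS truststore in the conf folder.
keytool -importcert -noprompt \
  -alias ranger-admin \
  -file ranger_admin.cer \
  -keystore conf/ranger_truststore.jks \
  -storetype JKS \
  -storepass "[***TRUSTSTORE PASSWORD***]"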
krb5.conf
Save the Kerberos client config from a Cloudera Private Cloud Base cluster host.
kafka_plugin.keytab
Save the keytab of the previously created [***PLUGIN PRINCIPAL***] user.
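Both krb5.conf and the keytab can typically be copied from the Cloudera Private Cloud Base cluster. The following hedged sketch assumes SSH access to a cluster host ([***CLUSTER HOST***]) and that the keytab was exported to the root home directory on that host; adjust users and paths as needed.
# Copy the Kerberos client configuration and the plugin keytab into the conf folder.
scp root@[***CLUSTER HOST***]:/etc/krb5.conf conf/krb5.conf
scp root@[***CLUSTER HOST***]:kafka_plugin.keytab conf/kafka_plugin.keytab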
jaas.conf
Save the example below and edit the principal property.
ranger.KafkaServer { 
   com.sun.security.auth.module.Krb5LoginModule required
   doNotPrompt=true
   useKeyTab=true
   storeKey=true
   keyTab="/mnt/ranger-kafka-plugin/conf_sensitive/kafka_plugin.keytab"
   principal="[***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]";
};
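Optionally, you can verify at this point that the keytab and the Kerberos client configuration work together. This hedged check assumes the MIT Kerberos kinit and klist utilities are available on the host where you prepared the files.
# Authenticate with the plugin keytab against the KDC defined in conf/krb5.conf.
KRB5_CONFIG=conf/krb5.conf kinit -kt conf/kafka_plugin.keytab [***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]
klist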
ranger-kafka-security.xml
Save the following example.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Required user defined configuration block start -->
  <property>
    <name>ranger.plugin.kafka.policy.rest.url</name>
    <value>[***RANGER REST URL***]</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.service.name</name>
    <value>[***KAFKA SERVICE NAME***]</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.access.cluster.name</name>
    <value>[***KAFKA ACCESS CLUSTER NAME***]</value>
  </property>
  <!-- Required user defined configuration block end -->

  <!-- Required hardcoded configuration block start -->
  <property>
    <name>ranger.plugin.kafka.policy.cache.dir</name>
    <value>/mnt/ranger-kafka-plugin/cache</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.policy.source.impl</name>
    <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.policy.rest.ssl.config.file</name>
    <value>/opt/kafka/libs/ranger-kafka-plugin-impl/conf/ranger-kafka-policymgr-ssl.xml</value>
  </property>
  <property>
    <name>ranger.plugin.kafka.disable.cache.if.servicenotfound</name>
    <value>false</value>
  </property>
  <!-- Required hardcoded configuration block end -->
</configuration>
Substitute the variables in this example as follows.
  • [***RANGER REST URL***] ‐ The Ranger Admin server base URL. A comma-separated list of Ranger Admin base URLs can be provided here if Ranger Admin High Availability is configured. You can also configure a load balancer URL if a load balancer was installed with Ranger Admin High Availability.

  • [***KAFKA SERVICE NAME***] ‐ The Kafka service name as defined earlier in Ranger Admin UI.

  • [***KAFKA ACCESS CLUSTER NAME***] ‐ Cluster name to be displayed in Ranger audit logs. This is an optional config and can be removed. It has no effect other than a field in Ranger audit logs. Cloudera recommends that you set this to [***KAFKA CLUSTER NAME***].[***NAMESPACE***].

ranger-kafka-policymgr-ssl.xml
Save the following example. This file only contains hardcoded properties. You do not need to make any changes.
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Required hardcoded configuration block start -->
  <property>
    <name>xasecure.policymgr.clientssl.truststore</name>
     <value>/opt/kafka/libs/ranger-kafka-plugin-impl/conf/ranger_truststore.jks</value>
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
    <value>jceks://file/opt/kafka/libs/ranger-kafka-plugin-impl/conf/rangerpluginssl.jceks</value>
  </property>
  <!-- Required hardcoded configuration block end -->
</configuration>
ranger-kafka-audit.xml
Save the following example.
<?xml version="1.0" encoding="UTF-8"?>

<configuration>
  <!-- Required user defined configuration block start -->
  <property>
    <name>xasecure.audit.is.enabled</name>
    <value>true</value>
  </property>
  <!-- HDFS audit block start -->
  <property>
    <name>xasecure.audit.destination.hdfs</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.dir</name>
    <value>hdfs://[***ACTIVE HDFS NAMENODE HOST***]:[***HDFS NAMENODE PORT***]/[***AUDIT PATH ON HDFS***]</value>
  </property>
  <!-- HDFS audit block end -->
  <!-- Solr audit block start -->
  <property>
    <name>xasecure.audit.destination.solr</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.zookeepers</name>
    <value>[***ZOOKEEPER CONNECTION STRING***]</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.collection</name>
    <value>[***RANGER AUDIT SOLR COLLECTION***]</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.principal</name>
    <value>[***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]</value>
  </property>
  <!-- Solr audit block end -->
  <!-- Required user defined configuration block end -->

  <!-- Required hardcoded configuration block start -->
  <!-- HDFS audit block start -->
  <property>
    <name>xasecure.audit.destination.hdfs.batch.filespool.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.hdfs.batch.filespool.dir</name>
    <value>/mnt/ranger-kafka-plugin/audit/hdfs/spool</value>
  </property>
  <!-- HDFS audit block end -->
  <!-- Solr audit block start -->
  <property>
    <name>xasecure.audit.destination.solr.force.use.inmemory.jaas.config</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.loginModuleName</name>
    <value>com.sun.security.auth.module.Krb5LoginModule</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.loginModuleControlFlag</name>
    <value>required</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.useKeyTab</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.storeKey</name>
    <value>false</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.serviceName</name>
    <value>solr</value>
  </property>
  <property>
    <name>xasecure.audit.jaas.Client.option.keyTab</name>
    <value>/mnt/ranger-kafka-plugin/conf_sensitive/kafka_plugin.keytab</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.batch.filespool.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
    <value>/mnt/ranger-kafka-plugin/audit/solr/spool</value>
  </property>
  <!-- Solr audit block end -->
  <!-- Required hardcoded configuration block end -->

  <!-- Recommended but not required configuration block start -->
  <property>
    <name>xasecure.audit.provider.summary.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.metrics</name>
    <value>true</value>
  </property>
  <!-- Recommended but not required configuration block end -->
</configuration>

Substitute the variables in this example as follows.

  • [***ACTIVE HDFS NAMENODE HOST***] ‐ The hostname of the active HDFS NameNode.

  • [***HDFS NAMENODE PORT***] ‐ The port where the NameNode runs the HDFS protocol. You can find the port in Cloudera Manager > HDFS > Configuration > NameNode Port.

  • [***AUDIT PATH ON HDFS***] ‐ The HDFS directory where audit logs should be stored.

  • [***ZOOKEEPER CONNECTION STRING***] ‐ The ZooKeeper connection string as defined by ZooKeeper Sessions. The chroot suffix should be the Znode of the Solr service. You can find the Znode in Cloudera Manager > Solr > Configuration > ZooKeeper Znode. For example, host1:port1,host2:port2/solr-infra.

  • [***RANGER AUDIT SOLR COLLECTION***] ‐ The Solr collection where Ranger audit logs are stored. Optional, defaults to ranger_audits if not configured.

  • [***PLUGIN PRINCIPAL***] ‐ The primary part of the Ranger Kafka plugin principal.

  • [***KERBEROS REALM***] ‐ The Kerberos realm of the Ranger Kafka plugin principal.
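
As an optional hedged check, you can verify from a Cloudera Private Cloud Base host that the plugin principal can write to the audit location. This assumes the HDFS client and the plugin keytab are available on that host.
# Authenticate as the plugin principal, then create the audit directory if it does not yet exist.
kinit -kt kafka_plugin.keytab [***PLUGIN PRINCIPAL***]@[***KERBEROS REALM***]
hdfs dfs -mkdir -p hdfs://[***ACTIVE HDFS NAMENODE HOST***]:[***HDFS NAMENODE PORT***]/[***AUDIT PATH ON HDFS***]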

core-site.xml
Save the following example. This file only contains hardcoded properties. You do not need to make any changes.
<?xml version="1.0" encoding="UTF-8"?>

<configuration>
  <!-- Required hardcoded configuration block start -->
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>
  <!-- Required hardcoded configuration block end -->
</configuration>
rangerpluginssl.jceks and hadoop-credstore-password
Get the Ranger Admin server truststore password. Define and export the following shell variables so that the commands to be run can access them:
  • TRUSTSTORE_PASSWORD: The truststore password obtained above

  • HADOOP_CREDSTORE_PASSWORD: A user-defined password that is used to encrypt the truststore password

java -cp "ranger-kafka-plugin/install/lib/*" org.apache.ranger.credentialapi.buildks create sslTrustStore -value "${TRUSTSTORE_PASSWORD}" -provider "jceks://file/$(pwd)/conf/rangerpluginssl.jceks" -storetype "jceks"
echo -n $HADOOP_CREDSTORE_PASSWORD > conf/hadoop-credstore-password
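
At this point, the conf folder is expected to contain all of the files described above. The following listing is a quick sanity check; the expected file names are those used later when you create the ConfigMap and Secret.
ls conf/
# Expected files:
# core-site.xml  hadoop-credstore-password  jaas.conf  kafka_plugin.keytab  krb5.conf
# ranger-kafka-audit.xml  ranger-kafka-policymgr-ssl.xml  ranger-kafka-security.xml
# ranger_truststore.jks  rangerpluginssl.jceks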

Configuring Ranger group authorization

Ranger makes it possible to authorize users based on the group that they are in. If you want to use group authorization, you must create a Secret containing LDAP data and extend your core-site.xml with required properties. This task is optional when configuring Ranger authorization for Kafka.

Ranger group authorization can be useful if you have a number of users but do not want to add them one by one to the policies. In this case, you can use group authorization, which enables you to add groups to policies in Ranger.

If a user is a member of a group, they can be authorized based on the permissions of that group. The Ranger Kafka plugin running inside the Kafka broker pod is responsible for determining which groups the user belongs to. When configuring Ranger authorization, you can optionally set up LDAP group mapping, in which case the Ranger Kafka plugin connects to an LDAP server, looks up the end user that made the request to Kafka, and finds the groups that the user is a member of.

This is an optional step when configuring Ranger authorization. If you do not configure group mapping, user permissions cannot be determined based on group membership in Ranger, but other authorization methods (such as user and role authorization) still work as expected. Group mapping methods other than LDAP are not supported in Cloudera Streams Messaging - Kubernetes Operator.

The following example demonstrates how you can configure your Kafka cluster to use LDAP group mapping.

  • A running LDAP server is required.

  • The LDAP TLS certificate is available to you.

  • A bind user and password that Kafka can use to connect to the LDAP server for user and group lookup is available to you.

  1. Create additional LDAP resources in Kubernetes.
    Connection to LDAP requires some additional information (including sensitive data), such as a bind user password, a truststore, and a truststore password. You must create a Kubernetes Secret that contains this data. You mount this Secret to Kafka broker pods in a later step. Mounting the Secret makes the data available to the Kafka Ranger plugin. The plugin uses this data to connect to the LDAP server.
    1. Create the following three files containing LDAP data.
      • [***LDAP TRUSTSTORE FILE***] ‐ Truststore file containing the LDAP server certificate in JKS format.

      • [***LDAP TRUSTSTORE PASSWORD FILE***] ‐ File containing the password of the LDAP truststore.

      • [***LDAP BIND USER PASSWORD FILE***] ‐ File containing the password of the bind user, which is used to connect to the LDAP server.

    2. Create a Secret using the files you created.
      kubectl create secret generic [***LDAP CONFIG SECRET***] \
        --namespace [***NAMESPACE***] \
        --from-file=[***LDAP TRUSTSTORE FILE***] \
        --from-file=[***LDAP TRUSTSTORE PASSWORD FILE***] \
        --from-file=[***LDAP BIND USER PASSWORD FILE***]
  2. Configure core-site.xml.
    The Ranger plugin searches for LDAP group lookup specific configuration in the core-site.xml file. Extend your core-site.xml file with LDAP related properties.
    <property>
        <name>hadoop.security.group.mapping</name>
        <value>org.apache.hadoop.security.LdapGroupsMapping</value>
      </property>
      <property>
        <name>hadoop.security.group.mapping.ldap.url</name>
        <value>[***LDAP URL***]</value>
      </property>
      <property>
        <name>hadoop.security.group.mapping.ldap.bind.user</name>
        <value>[***LDAP BIND USER***]</value>
      </property>
      <property>
        <name>hadoop.security.group.mapping.ldap.bind.password.file</name>
        <value>/mnt/ldap/[***LDAP BIND USER PASSWORD FILE***]</value>
      </property>
      <property>
        <name>hadoop.security.group.mapping.ldap.ssl</name>
        <value>true</value>
      </property>
      <property>
        <name>hadoop.security.group.mapping.ldap.ssl.truststore</name>
        <value>/mnt/ldap/[***LDAP TRUSTSTORE FILE***]</value>
      </property>
      <property>
        <name>hadoop.security.group.mapping.ldap.ssl.truststore.password.file</name>
        <value>/mnt/ldap/[***LDAP TRUSTSTORE PASSWORD FILE***]</value>
      </property>
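    Optionally, you can confirm the LDAP URL, bind user, and bind password with an ldapsearch query before deploying. This is a hedged sketch: the ldapsearch client, [***LDAP SEARCH BASE***], and a PEM copy of the LDAP CA certificate ([***LDAP CA CERT PEM***]) are assumptions and are not part of the plugin configuration itself.
    # Bind to the LDAP server over TLS and run a minimal base-scope search to confirm connectivity.
    LDAPTLS_CACERT=[***LDAP CA CERT PEM***] ldapsearch \
      -H [***LDAP URL***] \
      -D "[***LDAP BIND USER***]" \
      -w "$(cat [***LDAP BIND USER PASSWORD FILE***])" \
      -b "[***LDAP SEARCH BASE***]" \
      -s base "(objectClass=*)" dn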
    

Deploying a Ranger-integrated Kafka cluster

Once your custom image is ready and all configuration files are prepared, you can deploy a Kafka cluster that integrates with Ranger. To do this, you deploy the Ranger Kafka plugin configuration files together with your Kafka cluster. The configuration files are mounted in the Kafka pods as volumes, making them available to the Ranger Kafka plugin. The files containing sensitive data are deployed as a Secret. Other files are deployed as a ConfigMap.

  1. Create a ConfigMap using the Ranger Kafka plugin configuration files that do not include any sensitive data.
    kubectl create configmap [***KAFKA CLUSTER NAME***]-ranger-plugin-config \
      --namespace [***NAMESPACE***] \
      --from-file=conf/core-site.xml \
      --from-file=conf/jaas.conf \
      --from-file=conf/krb5.conf \
      --from-file=conf/ranger-kafka-audit.xml \
      --from-file=conf/ranger-kafka-policymgr-ssl.xml \
      --from-file=conf/ranger-kafka-security.xml \
      --from-file=conf/ranger_truststore.jks \
      --from-file=conf/rangerpluginssl.jceks
    
  2. Create a Secret using the Ranger Kafka plugin configuration files that include sensitive data.
    kubectl create secret generic [***KAFKA CLUSTER NAME***]-ranger-plugin-config-sensitive \
      --namespace [***NAMESPACE***] \
      --from-file=conf/hadoop-credstore-password \
      --from-file=conf/kafka_plugin.keytab
  3. Create a YAML configuration containing your PersistentVolumeClaim and KafkaNodePool resources.
    The Ranger Kafka plugin needs persistent storage to function correctly during temporary connectivity issues. Each Kafka broker requires three distinct PersistentVolumeClaims (PVCs) for this purpose. The names of the PVCs must match the PVC names mounted in the KafkaNodePool resources.
    The required size of these volumes depends on your environment.
    • The policy cache size depends on how many Ranger policies, roles, and tags you have defined.
    • The audit file spool stores audit logs locally if the Ranger Kafka plugin cannot reach the audit destination.
    • The file spool volume size depends on how many audit logs are generated and how long you want to retain audit logs in the Kafka pod in case of connectivity issues to the audit target.

    The following snippet contains the necessary Kubernetes resource configurations (PVC and KafkaNodePool) for a single broker. If you have more than one broker, define a similar configuration for each one.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: [***KAFKA CLUSTER NAME***]-[***BROKER ID***]-policy-cache
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: [***REQUIRED SIZE***]
        limits:
          storage: [***MAXIMUM SIZE***]
    ---
    
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: [***KAFKA CLUSTER NAME***]-[***BROKER ID***]-hdfs-filespool
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: [***REQUIRED SIZE***]
        limits:
          storage: [***MAXIMUM SIZE***]
    ---
    
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: [***KAFKA CLUSTER NAME***]-[***BROKER ID***]-solr-filespool
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: [***REQUIRED SIZE***]
        limits:
          storage: [***MAXIMUM SIZE***]
    ---
    
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaNodePool
    metadata:
      name: [***KAFKA CLUSTER NAME***]-[***BROKER ID***]
      labels:
        strimzi.io/cluster: [***KAFKA CLUSTER NAME***]
    spec:
      replicas: 1
      roles:
        - broker
    ...
      template:
        pod:
          volumes:
            - name: ranger-plugin-conf
              configMap: 
                name: [***KAFKA CLUSTER NAME***]-ranger-plugin-config
            - name: ranger-plugin-sensitive
              secret:
                secretName: [***KAFKA CLUSTER NAME***]-ranger-plugin-config-sensitive
            - name: ranger-plugin-policy-cache
              persistentVolumeClaim:
                claimName: [***KAFKA CLUSTER NAME***]-[***BROKER ID***]-policy-cache
            - name: ranger-plugin-hdfs-filespool
              persistentVolumeClaim:
                claimName: [***KAFKA CLUSTER NAME***]-[***BROKER ID***]-hdfs-filespool
            - name: ranger-plugin-solr-filespool
              persistentVolumeClaim:
                claimName: [***KAFKA CLUSTER NAME***]-[***BROKER ID***]-solr-filespool
        kafkaContainer:
          env:
            - name: HADOOP_CREDSTORE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: [***KAFKA CLUSTER NAME***]-ranger-plugin-config-sensitive
                  key: hadoop-credstore-password
          volumeMounts:
            - name: ranger-plugin-conf
              mountPath: /mnt/ranger-kafka-plugin/conf
            - name: ranger-plugin-sensitive
              mountPath: /mnt/ranger-kafka-plugin/conf_sensitive
            - name: ranger-plugin-policy-cache
              mountPath: /mnt/ranger-kafka-plugin/cache
            - name: ranger-plugin-hdfs-filespool
              mountPath: /mnt/ranger-kafka-plugin/audit/hdfs/spool
            - name: ranger-plugin-solr-filespool
              mountPath: /mnt/ranger-kafka-plugin/audit/solr/spool
    
    • [***KAFKA CLUSTER NAME***] ‐ Kafka cluster name.

    • [***BROKER ID***] ‐ A unique broker ID within the same Kafka cluster.

  4. Optional: If you configured Ranger group authorization, specify additional properties in your KafkaNodePool resources.
    If you completed configuration for Ranger group authorization, you must mount [***LDAP CONFIG SECRET***] as a volume and set the KAFKA_OPTS environment variable in your KafkaNodePool resources.

    Mounting [***LDAP CONFIG SECRET***] mounts the Secret in the Kafka pods and makes sensitive LDAP configuration available for the Ranger Kafka plugin. Setting the KAFKA_OPTS environment variable is required so that the group mapping functionality in the Ranger plugin works correctly with recent Java versions.

    #...
    kind: KafkaNodePool
    spec:
      template:
        pod:
          volumes:
            - name: ldap-config
              secret:
                secretName: [***LDAP CONFIG SECRET***]
        kafkaContainer:
          env:
            - name: KAFKA_OPTS
              value: "--add-exports java.naming/com.sun.jndi.ldap=ALL-UNNAMED"
          volumeMounts:
            - name: ldap-config
              mountPath: /mnt/ldap
  5. Deploy your PersistentVolumeClaim and KafkaNodePool resources.
    This creates the volumes and the KafkaNodePools that will be used by the Kafka cluster you deploy in a later step. If you created more than a single configuration file, ensure that you deploy all of them.
    kubectl apply --file [***YAML CONFIG***] --namespace [***NAMESPACE***]
  6. Ensure that the status of each PVC is Bound.
    kubectl get pvc --namespace [***NAMESPACE***]
  7. Create a YAML configuration that contains your Kafka resource.
    #...
    kind: Kafka
    metadata:
      name: [***KAFKA CLUSTER NAME***]
      annotations:
        strimzi.io/node-pools: enabled
    spec:
      kafka:
        image: [***YOUR REGISTRY***]/[***IMAGE NAME***]:[***TAG***]
        jvmOptions:
          javaSystemProperties:
            - name: java.security.krb5.conf
              value: /opt/kafka/libs/ranger-kafka-plugin-impl/conf/krb5.conf
            - name: java.security.auth.login.config
              value: /opt/kafka/libs/ranger-kafka-plugin-impl/conf/jaas.conf
        authorization:
          type: custom
          authorizerClass: org.apache.ranger.authorization.kafka.authorizer.RangerKafkaAuthorizer
        config:
          ranger.jaas.context: ranger
    
    • [***KAFKA CLUSTER NAME***] ‐ The name of your Kafka cluster. This must be the same name that you used when setting up the Strimzi users in Ranger.

    • [***YOUR REGISTRY***]/[***IMAGE NAME***]:[***TAG***] ‐ The registry, name, and tag of the custom image you built and pushed earlier, which contains Kafka and the Ranger Kafka plugin artifacts.

  8. Deploy the Kafka resource.
    kubectl apply --file [***YAML CONFIG***] --namespace [***NAMESPACE***]
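
    After applying the resource, you can verify that the cluster becomes ready and that the Ranger Kafka plugin is active. The following is a hedged sketch; [***KAFKA BROKER POD***] is a placeholder for one of the broker pod names, and the grep filter is a broad match rather than an exact log message.
    # Wait for the Kafka cluster to report the Ready condition, then check the broker pods.
    kubectl wait kafka/[***KAFKA CLUSTER NAME***] --for=condition=Ready --timeout=600s --namespace [***NAMESPACE***]
    kubectl get pods --namespace [***NAMESPACE***] --selector strimzi.io/cluster=[***KAFKA CLUSTER NAME***]
    # Optionally scan a broker log for Ranger-related entries.
    kubectl logs [***KAFKA BROKER POD***] --namespace [***NAMESPACE***] | grep -i ranger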