Extracting HDFS native permissions

Learn how to use the HDFS Permissions Export utility to extract HDFS native POSIX permissions from a source cluster running CDH or HDP. The extracted HDFS permissions are then used to create Ranger S3 policies that can be used on the target CDP cluster.

The HDFS Permissions Export utility exports the permissions to a .csv file in the following pipe-delimited format, with the entries sorted in ascending order:
Syntax:
"/resource/path"|"username"|"groupname"|"permission"
where,
  • "/resource/path" refers to resource entities
  • "username" refers to user entries
  • "groupname" refers to group entries
  • "permission" refers to the HDFS permission entries
Example:
"/dir1"|"hdfs"|"supergroup, public"|"read, execute"
"/dir1/dir1"||"supergroup, public"|"read, execute"
"/dir1/dir1"|"hdfs"||"read, write, execute"

Before you begin:
  • The source and target clusters of the migration must be running the following supported versions:
    • Source cluster: HDP 2.6.5, HDP 3.1.5, CDH 5.16, CDH 6.3
    • Target cluster: CDP 7.2.15 or higher
  • Secure Copy (SCP) the Ranger Admin keytab file from the source cluster to your local system.
  • Run the klist command to view the list of Kerberos principals available in the keytab (${RANGER_ADMIN_KEYTAB_PATH}), and then run the kinit command to authenticate as the required principal:
    klist -kt ${RANGER_ADMIN_KEYTAB_PATH}
    
    kinit -kt ${RANGER_ADMIN_KEYTAB_PATH} rangeradmin/_HOST@REALM
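
  For example, with the keytab copied to /tmp/rangeradmin.keytab and a hypothetical source host and realm (substitute your own values):
    # Copy the keytab from the source cluster; the keytab location shown is an assumption.
    scp root@source-host.example.com:/etc/security/keytabs/rangeradmin.keytab /tmp/

    # Inspect the principals in the keytab, then authenticate as one of them.
    klist -kt /tmp/rangeradmin.keytab
    kinit -kt /tmp/rangeradmin.keytab rangeradmin/source-host.example.com@EXAMPLE.COM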

Steps:
  1. Download the HDFSPermissionsExportTool-<version>.jar file from the Cloudera Archive.
  2. Run the following command to extract HDFS permissions from the source cluster.
    hadoop jar ./<path>/HDFSPermissionsExportTool-<version>.jar hdfs_namenode_host:hdfs_port source_directories
    where,
    • <path> refers to the location of the downloaded .jar file.
    • hdfs_namenode_host:hdfs_port refers to the URI of the active NameNode and the HDFS NameNode port. For example, active-namenode.example.com:8020. If you are unsure which NameNode is active, see the sketch after this list.
    • source_directories refers to the root directory from which you want the permissions scan to begin. For example, "/" or "/dir1". You can also specify multiple directories separated by spaces, for example, "/hbase /yarn".
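    A minimal sketch for locating the active NameNode on an HA cluster; the NameNode ID nn1 is an assumption, so substitute the IDs configured for your nameservice:
      hdfs getconf -namenodes            # lists the NameNode hosts configured for the cluster
      hdfs haadmin -getServiceState nn1  # prints "active" or "standby" for the given NameNode ID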
    For example:
    hadoop jar ./target/HDFSPermissionsExportTool-<version>.jar active-namenode.example.com:8020 /user /ranger /hbase

    This creates an HDFS_Permissions_Export_<timestamp>.csv file that contains the required HDFS permissions.
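
    To confirm that the export succeeded, you can preview the file with standard tools, for example:
      head -5 HDFS_Permissions_Export_<timestamp>.csv   # show the first few permission entries
      wc -l HDFS_Permissions_Export_<timestamp>.csv     # count the exported entries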

  3. Use the scp command to copy the HDFS_Permissions_Export_<timestamp>.csv file from the source cluster to the migration node of the target cluster, as shown in the example below.
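    A minimal example, assuming a hypothetical migration node host name and a /tmp destination (substitute your own values):
      scp HDFS_Permissions_Export_<timestamp>.csv user@migration-node.example.com:/tmp/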

Next, convert the extracted HDFS native permissions into Ranger HDFS policies, and then transform them into Ranger S3 policies.