Backing up ranger_audits Solr collection

Before upgrading to Cloudera 7.3.2.0 or higher releases, back up the existing ranger_audits Solr collection and its schema. This is a critical pre-upgrade step.

When upgrading to Cloudera 7.3.2.0 and higher releases that update the Ranger Audits Solr schema, Cloudera Manager runs an internal service command to apply the schema changes safely.

Although the command is non-destructive, Cloudera strongly recommends creating a backup of the Ranger Audits Solr collection before starting the upgrade, so that you can quickly restore the audit data if an error occurs during the upgrade.

  • The Solr service must be functional and in a running state.
  • You must have access to the Solr host, for example, https://<solr-server-host>:<solr-server-port>, where the ranger_audits collection is located.
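  • Kerberos authentication must be enabled, and you must authenticate with the appropriate rangeradmin principal from the ranger.keytab file. To locate the keytab, identify the principal, and authenticate, run the following commands: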
    ps -ef | grep rangeradmin | awk '{split($0, array,"-cp"); print array[2]}' | cut -d: -f1
    # Note the path printed above; this is RANGER_ADMIN_CONF_DIR
    
    klist -kt <RANGER_ADMIN_CONF_DIR>/../ranger.keytab
    # Note the RANGER_ADMIN_PRINCIPAL; by default, it starts with 'rangeradmin'
    
    kinit -kt <RANGER_ADMIN_CONF_DIR>/../ranger.keytab <RANGER_ADMIN_PRINCIPAL>
    # No output is returned on successful authentication
  • The Ranger user associated with the rangeradmin principal must have adequate permissions defined in the Ranger Solr policy. This is required if the Ranger plugin is enabled, as it allows the execution of Solr REST API operations for tasks such as collection backups and schema updates.
  • You must confirm that the Ranger plugin auditing in Solr is working.
  • You must have a backup location, with enough space, to store the exported schema JSON file and logs.
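The commands in the procedure that follows reference several shell variables (SOLR_HTTP_SCHEME, SOLR_HOST, SOLR_PORT, COLLECTION, BACKUP_NAME, LOCATION) without defining them. A minimal sketch of setting them in the current shell session might look like the following. Every value is a hypothetical placeholder rather than a Cloudera default, and SOLR_BASE_URL is a helper name introduced only for this illustration.

```shell
# Hypothetical example values; replace each one with the value for your cluster.
SOLR_HTTP_SCHEME="https"                         # use "http" if TLS is not enabled for Solr
SOLR_HOST="solr-host.example.com"                # placeholder for <solr-server-host>
SOLR_PORT="8985"                                 # placeholder for <solr-server-port>
COLLECTION="ranger_audits"                       # collection named in this procedure
BACKUP_NAME="ranger_audits_backup_$(date +%F)"   # assumed naming scheme, dated like the schema file
LOCATION="/path/to/backup/location"              # placeholder; a path the Solr service can write to

# Helper for illustration only: the base URL the curl commands are built from.
SOLR_BASE_URL="${SOLR_HTTP_SCHEME}://${SOLR_HOST}:${SOLR_PORT}/solr"
echo "Solr base URL: ${SOLR_BASE_URL}"
```

Set these in the same shell session in which you run kinit and the curl commands, so that the variable expansions resolve as expected.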
  1. SSH to the cluster node and authenticate by using the rangeradmin principal.
    kinit -kt <RANGER_ADMIN_CONF_DIR>/../ranger.keytab <RANGER_ADMIN_PRINCIPAL>
  2. Export the current schema of the ranger_audits collection to the local file system.
    curl -k --negotiate -u : "${SOLR_HTTP_SCHEME}://${SOLR_HOST}:${SOLR_PORT}/solr/ranger_audits/schema" > ~/ranger_audits_schema_backup_$(date +%F).json
  3. Verify the backup schema file. Optionally, open the JSON to check whether it contains field definitions and types.
    ls -lh ~/ranger_audits_schema_backup_$(date +%F).json
  4. Perform a full backup of the ranger_audits collection and associated configurations stored in Solr.
    curl -k --negotiate -u : "${SOLR_HTTP_SCHEME}://${SOLR_HOST}:${SOLR_PORT}/solr/admin/collections" \
      --get \
      --data-urlencode "action=BACKUP" \
      --data-urlencode "collection=${COLLECTION}" \
      --data-urlencode "name=${BACKUP_NAME}" \
      --data-urlencode "location=${LOCATION}"
  5. Store the backup in a safe, version-controlled location, for example, a shared drive or backup server, and record the filename and date.
  6. Optional: Track the backup progress or status.
    curl -k --negotiate -u : "${SOLR_HTTP_SCHEME}://${SOLR_HOST}:${SOLR_PORT}/solr/admin/collections" \
      --get \
      --data-urlencode "action=BACKUPSTATUS" \
      --data-urlencode "requestid=${BACKUP_NAME}"
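Step 3 above suggests checking that the exported schema file contains field definitions and types. A file listing alone does not show this, because a failed request (for example, an authentication error page) can still produce a non-empty file. The following is a rough sketch of such a check, assuming the export contains "fieldTypes" and "fields" keys as a Solr Schema API response normally does. The function name check_schema_backup is hypothetical and is not part of any Cloudera or Solr tooling.

```shell
# Sketch only: returns 0 if the file is non-empty and mentions both the
# "fieldTypes" and "fields" keys; returns 1 otherwise.
# It does not validate the JSON syntax or the schema contents.
# Usage: check_schema_backup ~/ranger_audits_schema_backup_$(date +%F).json
check_schema_backup() {
  local file="$1"
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  grep -q '"fieldTypes"' "$file" || { echo "no fieldTypes key in: $file" >&2; return 1; }
  grep -q '"fields"' "$file" || { echo "no fields key in: $file" >&2; return 1; }
  echo "schema backup looks plausible: $file"
}
```

A non-zero return value suggests re-running the export in step 2 after confirming that the Kerberos ticket is still valid.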
You must verify schema updates after upgrading to Cloudera 7.3.2.0 or higher releases. For more information, see Ranger audit Solr collection - Verifying schema updates.

If the upgrade fails, restore the Ranger Audits Solr schema and collection to their previous state. For more information, see Restore Ranger audit Solr schema and collection.