Setting up incremental backups and restores for Solr collections

Know how to set up incremental backsups and restores for the Solr collections.

Solr facilitates incremental backup and restoration processes, enabling the optimization of storage utilization and the reduction of data transmission latencies. In contrast to conventional snapshots generated through the solrctl utility, which produce non-incremental datasets, the Solr Collections REST API monitors data variations to ensure that only modifications subsequent to the preceding backup are replicated.

You must verify that your target environment has a valid storage directory configured on a supported file system or cloud storage protocol before executing backup or restore API operations.

  1. Initiate the incremental backup process by executing the action=BACKUP command using the Solr Collections API.
    curl -k --negotiate -u : "https://<hostname>:8995/solr/admin/collections?action=BACKUP&name=<backup_name>&collection=<collection_name>&location=</path/to/backup/into>&async=<unique_command_id_for_tracking>"
  2. Monitor the status of the asynchronous backup command by using the tracking ID.
    curl -k --negotiate -u : "https://<hostname>:8995/solr/admin/collections?action=REQUESTSTATUS&requestid=<unique_command_id_for_tracking>"
  3. Verify that the response includes a completed state to confirm that the files are safely stored in your target destination.
    "status":{
      "state":"completed",
      "msg":"found [<unique_command_id_for_tracking>] in completed tasks"
    }
    
  4. Restore data into a collection from your saved checkpoints by executing the action=RESTORE command.
    curl -k --negotiate -u : "https://<hostname>:8995/solr/admin/collections?action=RESTORE&name=<backup_name>&collection=<collection_to_restore_into>&location=</path/of/the/backup>&async=<unique_command_id_for_tracking>"

    By default, this command restores the backup version that contains the highest identification number.

  5. Optional: Restore a specific historical backup version instead of the latest variant by performing the following substeps:
    1. Retrieve the list of all available historical backup versions within your target directory.
      curl -k --negotiate -u : "https://<hostname>:8995/solr/admin/collections?action=LISTBACKUP&name=<backup_name>&location=</path/of/the/backup>"
    2. Append the backupId parameter to your restore command to target a specific version, such as version 0.
      curl -k --negotiate -u : "https://<hostname>:8995/solr/admin/collections?action=RESTORE&name=<backup_name>&collection=<collection_to_restore_into>&location=</path/of/the/backup>&backupId=0&async=<unique_command_id_for_tracking>"
  6. Monitor the progress of your asynchronous restore operation by tracking your request identification number.
    curl -k --negotiate -u : "https://<hostname>:8995/solr/admin/collections?action=REQUESTSTATUS&requestid=<unique_command_id_for_tracking>"
The Solr cluster populates the target collection with data from the specified backup snapshot once the request status transitions to a completed state.