Restoring a collection

You can restore a Solr collection from a backup stored on either a remote cluster or the local cluster using the solrctl utility. You must pass a unique request identifier as part of the restore command in the solrctl utility while initiating the restore operation for tracking the process.

If you are using a secure (Kerberos-enabled) cluster, specify your jaas.conf file by adding the following parameter to each command:
-jaas [***/PATH/TO/JAAS.CONF***]
If TLS is enabled for the Solr service, specify the truststore and password by using the ZKCLI_JVM_FLAGS environment variable before you begin the procedure:
  1. If you are restoring from a backup stored on a remote cluster, copy the backup from the remote cluster to the local cluster. If you are restoring from a local backup, skip this step.

    Run the following commands on the cluster to which you want to restore the collection:

    hdfs dfs -mkdir -p [***PATH/TO/RESTORE/STAGING***]hadoop distcp [***PROTOCOL***]://[***NAMENODE***]:[***PORT***]/[***PATH/TO/BACKUP***] [***/PATH/TO/RESTORE-STAGING***]

    For example:

    HDFS protocol:
    hadoop distcp hdfs:// /path/to/restore-staging
    WebHDFS protocol:
    hadoop distcp webhdfs:// /path/to/restore-staging
  2. Start the restore procedure. Run the following command:
    solrctl collection --restore [***NAME_OF_THE_RESTORED_COLLECTION***] -l [***BACKUP_LOCATION***] -b [***NAME_OF_THE_SNAPSHOT_TO_BE_RESTORED***] -i [***REQUEST_ID***]

    Make sure that you use a unique [***REQUEST_ID***] each time you run this command.

    For example:

    solrctl collection --restore tweets -l /path/to/restore-staging -b tweets-202103281043 -i restore-tweets
  3. Monitor the status of the restore operation. Run the following command periodically:
    solrctl collection --request-status [***REQUEST_ID***]

    Look for <str name="state"> in the output. For example (emphasis added):

    solrctl collection --request-status restore-tweets
     <?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"> <int name="status"> 0</int> <int name="QTime"> 1</int> </lst> \
    <lst name="status"> <str name="state"> completed</str> <str name="msg"> found restore-tweets in completed tasks</str> </lst> </response>

    The state parameter can be one of the following:

    • running: The restore operation is running.
    • completed: The restore operation is complete.
    • failed: The restore operation failed.
    • notfound: The specified [***REQUEST_ID***] does not exist.