Backing up a collection from local file system
Back up Solr collections to a shared file system to minimize data loss caused by accidental or malicious administrative actions. Learn how to create backup of a collection from local file system (FS).
If you use local FS to store backups, each Solr host stores its backup directory locally, that is, server X contains the backup directory snapshots.shard1, server Y contains snapshots.shard2 and you need to copy those to a shared location in order to be able to restore them later. Because of this, Cloudera recommends to target backups to a shared file system, even if your Solr collection uses local FS.
jaas.conf
file by adding the following parameter
to each
command:--jaas [***/PATH/TO/JAAS.CONF***]
- Optional:
Create a snapshot. On a host running Solr
Server, run the following command:
solrctl collection --create-snapshot [***USER_DEFINED_NAME_OF_THE_SNAPSHOT***] -c [***NAME_OF_THE_COLLECTION_TO_BE_BACKED_UP***]
This step is optional. You can back up this snapshot by specifying the [***USER_DEFINED_NAME_OF_THE_SNAPSHOT***] as the value ofcommitName
parameter. If you do not create and specify a snapshot, the backup exports the index state corresponding to the current latest finished commit.For example, to create a snapshot for a collection named
tweets
:solrctl collection --create-snapshot tweets-$(date +%Y%m%d%H%M) -c tweets Successfully created snapshot with name tweets-202103281043 for collection tweets
-
Create the backup. The destination directory must exist and be writable by the
Solr superuser (
solr
by default).- To back up a snapshot, use the following
command:
wherecurl -k --negotiate -u : 'http://[***HOST***]:8983/solr/admin/collections?action=BACKUP&name=[***BACKUP_NAME***]&commitName=[***SNAPSHOT_NAME***]&collection=[***COLLECTION_NAME***]&location=***BACKUP_LOCATION***'
- [***HOST***]
- is a host name or IP address valid in your environment.
location=***BACKUP_LOCATION***
- specifies the directory (for example,
/tmp
) of the backup target defined insolr.xml
where the backup is to be stored. If you have defined a HDFS target backup repository, the backup is stored on HDFS at ***BACKUP_LOCATION*** name=[***BACKUP_NAME***]
- specifies the name of the backup - the backup is created in the subdirectory [***BACKUP_NAME***] of the backup repository directory ***BACKUP_LOCATION***.
commitName=[***SNAPSHOT_NAME***]
- is the name of the snapshot you want to back up
collection=[***COLLECTION_NAME***]
- specifies the collection that you want to back up.
For example:
The example URL targets one (any one) of the Solr servers and creates a backup of the entire collection.curl -k --negotiate -u : 'http://host1.example.com:8983/solr/admin/collections?action=BACKUP&name=mybackup&commitName=tweets-202103281043&collection=tweets&location=/tmp'
- To back up the current state of the
index:
curl -k --negotiate -u : 'http://[***HOST***]:8983/solr/admin/collections?action=BACKUP&name=[***BACKUP_NAME***]&collection=tweets&location=/tmp'
- [***HOST***]
- is a host name or IP address valid in your environment.
location=***BACKUP_LOCATION***
- specifies the directory (for example,
/tmp
) of the backup target defined insolr.xml
where the backup is to be stored. If you have defined a HDFS target backup repository, the backup is stored on HDFS at ***BACKUP_LOCATION*** name=[***BACKUP_NAME***]
- specifies the name of the backup - the backup is created in the subdirectory [***BACKUP_NAME***] of the backup repository directory ***BACKUP_LOCATION***.
collection=[***COLLECTION_NAME***]
- specifies the collection that you want to back up.
For example:
The example URL targets one (any one) of the Solr servers and creates a backup of the entire collection.curl -k --negotiate -u : 'http://host1.example.com:8983/solr/admin/collections?action=BACKUP&name=mybackup&collection=tweets&location=/tmp'
<?xml version="1.0" encoding="UTF-8"?> <response> <lst name="responseHeader"><int name="status">0</int><int name="QTime">3636</int></lst> </response>
After completing a backup, the data is stored in the standard backup format:
- To back up a snapshot, use the following
command:
-
To check the backup files, run the following command:
hdfs dfs -ls /tmp/mybackup
Found 4 items -rw-rw-rw- 2 solr supergroup 181 2021-01-13 21:33 /tmp/mybackup/backup.properties drwxrwxrwx - solr supergroup 0 2021-01-13 21:33 /tmp/mybackup/snapshot.shard1 drwxrwxrwx - solr supergroup 0 2021-01-13 21:33 /tmp/mybackup/snapshot.shard2 drwxrwxrwx - solr supergroup 0 2021-01-13 21:33 /tmp/mybackup/zk_backup
-
Delete the snapshot after exporting:
solrctl collection --delete-snapshot [***NAME_OF_THE_SNAPSHOT_TO_BE_DELETED***] -c [***COLLECTION_NAME***]
For example:
solrctl collection --delete-snapshot tweets-202103281043 -c tweets