Cloudera Search Backup and Restore Command Reference
Use the following commands to create snapshots, back up, and restore Solr collections.
Create a snapshot
Command:
solrctl collection --create-snapshot <snapshotName> -c <collectionName>
Description: Creates a named snapshot for the specified collection.
Delete a snapshot
Command:
solrctl collection --delete-snapshot <snapshotName> -c <collectionName>
Description: Deletes the specified snapshot for the specified collection.
Describe a snapshot
Command:
solrctl collection --describe-snapshot <snapshotName> -c <collectionName>
Description: Provides detailed information about a snapshot for the specified collection.
List all snapshots
Command:
solrctl collection --list-snapshots <collectionName>
Description: Lists all snapshots for the specified collection.
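The snapshot commands above can be combined into a short sequence. The following is a minimal sketch, not an official procedure: the function name snapshot_lifecycle and the example collection and snapshot names are hypothetical, and only the solrctl options documented above are used.

```shell
# Sketch only: snapshot_lifecycle is a hypothetical helper, not part of solrctl.
# Arguments: <collectionName> <snapshotName>
snapshot_lifecycle() {
  collection="$1"
  snapshot="$2"

  # Create the snapshot, then confirm it is listed and inspect its details.
  solrctl collection --create-snapshot "$snapshot" -c "$collection" || return 1
  solrctl collection --list-snapshots "$collection" || return 1
  solrctl collection --describe-snapshot "$snapshot" -c "$collection"
}

# Example call (hypothetical names):
#   snapshot_lifecycle tweets tweets-2016-10-19
```

When the snapshot is no longer needed, it can be removed with the --delete-snapshot command described above.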
Prepare snapshot for export to a remote cluster
Command:
solrctl collection --prepare-snapshot-export <snapshotName> -c <collectionName> -d <destDir>
Description: Prepares the snapshot for export to a remote cluster. If you are exporting the snapshot to the local cluster, you do not need to run this command. This command generates collection metadata as well as information about the Lucene index files corresponding to the snapshot.
The destination HDFS directory path (specified by the -d option) must exist on the local cluster before you run this command. Make sure that the Solr superuser (solr by default) has permission to write to this directory.
If you are running the snapshot export command on a remote cluster, specify the HDFS protocol (such as WebHDFS or HFTP) to be used for accessing the Lucene index files corresponding to the snapshot on the source cluster. To do so, use the -p option, which expects a fully qualified URI for the root filesystem on the source cluster, for example webhdfs://namenode.example.com:20101/.
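As an illustration of the prerequisites above, the following sketch creates the destination directory, gives the Solr superuser ownership of it, and then prepares the snapshot for export. The function name prepare_export is hypothetical. The sketch assumes that the Solr superuser is solr, that you run it as a user allowed to create HDFS directories and change their ownership, and that -p can be appended after the other options.

```shell
# Sketch only: prepare_export is a hypothetical helper, not part of solrctl.
# Arguments: <collectionName> <snapshotName> <destDir> [sourceFsUri]
prepare_export() {
  collection="$1"
  snapshot="$2"
  dest_dir="$3"
  source_fs_uri="${4:-}"   # optional value for -p, e.g. webhdfs://namenode.example.com:20101/

  # The destination directory must exist and be writable by the Solr
  # superuser (assumed here to be "solr").
  hdfs dfs -mkdir -p "$dest_dir" || return 1
  hdfs dfs -chown solr "$dest_dir" || return 1

  if [ -n "$source_fs_uri" ]; then
    solrctl collection --prepare-snapshot-export "$snapshot" \
      -c "$collection" -d "$dest_dir" -p "$source_fs_uri"
  else
    solrctl collection --prepare-snapshot-export "$snapshot" \
      -c "$collection" -d "$dest_dir"
  fi
}

# Example call (hypothetical names):
#   prepare_export tweets tweets-2016-10-19 /solr-backup/tweets-2016-10-19
```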
Export snapshot to local cluster
Command:
solrctl collection --export-snapshot <snapshotName> -c <collectionName> -d <destDir>
Description: Creates a backup copy of the Solr collection metadata as well as the associated Lucene index files at the specified location. The -d configuration option specifies the directory path where this backup copy is to be created. This directory must exist before exporting the snapshot, and the Solr superuser must be able to write to it.
Export snapshot to remote cluster
Command:
solrctl collection --export-snapshot <snapshotName> -s <sourceDir> -d <destDir>
Description: Creates a backup copy of the Solr collection snapshot, which includes collection metadata as well as Lucene index files at the specified location. The -d configuration option specifies the directory path where this backup copy is to be created.
Make sure that you prepare the snapshot for export before exporting it to a remote cluster.
You can run this command on either the source or destination cluster, depending on your environment and the DistCp utility requirements. If the destination cluster does not have the solrctl utility, you must run the command on the source cluster. The exported snapshot state can then be copied using standard tools, such as DistCp.
The source and destination directory paths (specified by the -s and -d options, respectively) must be specified relative to the cluster from which you are running the command. Directories on the local cluster are formatted as /path/to/dir, and directories on the remote cluster are formatted as <protocol>://<namenode>:<port>/path/to/dir.
For example:
- Local path: /solr-backup/tweets-2016-10-19
- Remote HDFS path: hdfs://nn01.example.com:8020/solr-backup/tweets-2016-10-19
- Remote WebHDFS path: webhdfs://nn01.example.com:20101/solr-backup/tweets-2016-10-19
The source directory (specified by the -s option) is the directory containing the output of the solrctl collection --prepare-snapshot-export command. The destination directory (specified by the -d option) must exist on the destination cluster before you run this command.
If your cluster is secured (Kerberos-enabled), initialize your Kerberos credentials by using kinit before executing this command.
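The requirement that the destination directory already exists can be checked before the export starts. The following is a minimal sketch under stated assumptions: the function name export_snapshot_remote is hypothetical, and it assumes that the hdfs client on the current cluster can reach the destination path (for example, through a webhdfs:// URI).

```shell
# Sketch only: export_snapshot_remote is a hypothetical helper, not part of solrctl.
# Arguments: <snapshotName> <sourceDir> <destDir>
# On a Kerberos-enabled cluster, run kinit before calling this function.
export_snapshot_remote() {
  snapshot="$1"
  source_dir="$2"   # output directory of --prepare-snapshot-export
  dest_dir="$3"     # must already exist on the destination cluster

  # Stop early if the destination directory is missing.
  if ! hdfs dfs -test -d "$dest_dir"; then
    echo "destination directory does not exist: $dest_dir" >&2
    return 1
  fi

  solrctl collection --export-snapshot "$snapshot" -s "$source_dir" -d "$dest_dir"
}

# Example call from the source cluster (hypothetical paths):
#   export_snapshot_remote tweets-2016-10-19 \
#     /solr-backup/tweets-2016-10-19 \
#     webhdfs://nn01.example.com:20101/solr-backup/tweets-2016-10-19
```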
Restore from a local snapshot
Command:
solrctl collection --restore <restoreCollectionName> -l <backupLocation> -b <snapshotName> -i <requestId>
Description: Restores the state of a previously created backup as a new Solr collection. Run this command on the cluster on which you want to restore the backup.
The -l configuration option specifies the local HDFS directory where the backup is stored. If the backup is stored on a remote cluster, you must copy it to the local cluster before restoring it. The Solr superuser (solr by default) must have permission to read from this directory.
The -b configuration option specifies the name of the backup to be restored.
Because the restore operation can take a long time to complete depending on the size of the exported snapshot, it is run asynchronously. The -i configuration parameter specifies a unique identifier for tracking the operation. For more information, see Check the status of an operation.
The optional -a configuration option enables the autoAddReplicas feature for the new Solr collection.
The optional -c configuration option specifies the configName for the new Solr collection. If this option is not specified, the configName of the original collection at the time of backup is used. If the specified configName does not exist, the restore operation creates a new configuration from the backup.
The optional -r configuration option specifies the replication factor for the new Solr collection. If this option is not specified, the replication factor of the original collection at the time of backup is used.
The optional -m configuration option specifies the maximum number of replicas (maxShardsPerNode) to create on each Solr Server. If this option is not specified, the maxShardsPerNode configuration of the original collection at the time of backup is used.
If your cluster is secured (Kerberos-enabled), initialize your Kerberos credentials using kinit before running this command.
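The required and optional restore options described above can be wrapped as follows. This is a minimal sketch, not an official wrapper: the function name restore_collection and the example values are hypothetical, and any optional flags (such as -r) are simply passed through to solrctl unchanged.

```shell
# Sketch only: restore_collection is a hypothetical helper, not part of solrctl.
# Arguments: <restoreCollectionName> <backupLocation> <snapshotName> <requestId> [options]
# On a Kerberos-enabled cluster, run kinit before calling this function.
restore_collection() {
  if [ "$#" -lt 4 ]; then
    echo "usage: restore_collection <name> <backupLocation> <snapshotName> <requestId> [options]" >&2
    return 1
  fi
  new_collection="$1"
  backup_location="$2"   # local HDFS directory that holds the backup
  snapshot="$3"          # name of the backup to restore
  request_id="$4"        # unique identifier used to track the operation
  shift 4

  # Remaining arguments (for example: -r 2) are passed through unchanged.
  solrctl collection --restore "$new_collection" \
    -l "$backup_location" -b "$snapshot" -i "$request_id" "$@"
}

# Example call (hypothetical names), overriding the replication factor:
#   restore_collection tweets-restored /solr-backup tweets-2016-10-19 restore-tweets-1 -r 2
```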
Check the status of an operation
Command:
solrctl collection --request-status <requestId>
Description: Displays the status of the specified operation. The status can be one of the following:
- running: The restore operation is running.
- completed: The restore operation is complete.
- failed: The restore operation failed.
- notfound: The specified requestID is not found.
If your cluster is secured (Kerberos-enabled), initialize your Kerberos credentials (using kinit) before running this command.
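Because the restore runs asynchronously, a script typically needs to poll until the request leaves the running state. The following is a minimal sketch: the function name wait_for_request and the default polling interval are hypothetical, and it assumes that one of the status words listed above appears somewhere in the command output, whose exact format may differ in your release.

```shell
# Sketch only: wait_for_request is a hypothetical helper, not part of solrctl.
# Arguments: <requestId> [pollIntervalSeconds]
# Assumption: the status word (running, completed, failed, notfound)
# appears in the output of --request-status.
wait_for_request() {
  request_id="$1"
  interval="${2:-10}"   # seconds between checks (hypothetical default)

  while :; do
    status_output="$(solrctl collection --request-status "$request_id")"
    case "$status_output" in
      *completed*) echo "completed"; return 0 ;;
      *failed*)    echo "failed";    return 1 ;;
      *notfound*)  echo "notfound";  return 2 ;;
      *running*)   sleep "$interval" ;;
      *)           echo "unknown";   return 3 ;;   # unrecognized output
    esac
  done
}

# Example call (hypothetical request ID):
#   wait_for_request restore-tweets-1 30
```

The function prints the final status and returns a nonzero exit code for anything other than completed, so it can be used directly in a conditional.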