Backing Up and Restoring Cloudera Search

CDH 5.9 and higher include a backup/restore mechanism primarily designed to provide disaster recovery capability for Apache Solr. You can create a backup of a Solr collection and restore from this backup if the index is corrupted due to a software bug, or if an administrator accidentally or maliciously deletes a collection or a subset of documents. This procedure can also be used as part of a cluster migration (for example, if you are migrating to a cloud environment), or to recover from a failed upgrade.

At a high level, the steps to back up a Solr collection are as follows:

  1. Create a snapshot.
  2. If you are exporting the snapshot to a remote cluster, prepare the snapshot for export.
  3. Export the snapshot to either the local cluster or a remote one. This step uses the Hadoop DistCp utility.

The backup operation uses the native Solr snapshot capability to capture a point-in-time, consistent state of index data for a specified Solr collection. You can then use the Hadoop DistCp utility to copy the index files and the associated metadata for that snapshot to a specified location in HDFS or a cloud object store (for example, Amazon S3).

The Solr snapshot mechanism is based on the Apache Lucene IndexDeletionPolicy abstraction, which enables applications such as Cloudera Search to manage the lifecycle of specific index commits. A Solr snapshot assigns a user-specified name to the latest hard-committed state. After the snapshot is created, the Lucene index files associated with the commit are retained until the snapshot is explicitly deleted. The index files associated with the snapshot are preserved regardless of document updates and deletions, segment merges during index optimization, and so on.

Creating a snapshot is fast because the operation only records snapshot metadata; it does not copy the associated index files. However, a snapshot does carry storage overhead proportional to the size of the index, because the index files it references are retained until the snapshot is deleted.

Solr snapshots can help you recover from some scenarios, such as accidental or malicious data modification or deletion. They are insufficient to address others, such as index corruption and accidental or malicious administrative action (for example, deleting a collection, changing collection configuration, and so on). To address these situations, export snapshots regularly and before performing non-trivial administrative operations such as changing the schema, splitting shards, or deleting replicas.
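
For example, regular exports can be scheduled with cron. The following sketch assumes a wrapper script (the path, schedule, and log location are hypothetical) that runs the snapshot and export commands described in Backing Up a Solr Collection below:

  # Hypothetical crontab entry: export a fresh backup of the tweets collection
  # every night at 01:30 using a wrapper script that you maintain.
  30 1 * * * /usr/local/bin/backup-tweets-collection.sh >> /var/log/solr-backup.log 2>&1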

Exporting a snapshot exports the collection metadata as well as the corresponding Lucene index files. This operation internally uses the Hadoop DistCp utility to copy the Lucene index files and metadata, which creates a full backup at the specified location. After the backup is created, the original Solr snapshot can be safely deleted if necessary.

Prerequisites

Before you begin backing up Cloudera Search, make sure that solr.xml contains the following configuration section:

  <backup>
    <repository name="hdfs" class="org.apache.solr.core.backup.repository.HdfsBackupRepository" default="true">
      <str name="solr.hdfs.home">${solr.hdfs.home:}</str>
      <str name="solr.hdfs.confdir">${solr.hdfs.confdir:}</str>
      <str name="solr.hdfs.security.kerberos.enabled">${solr.hdfs.security.kerberos.enabled:false}</str>
      <str name="solr.hdfs.security.kerberos.keytabfile">${solr.hdfs.security.kerberos.keytabfile:}</str>
      <str name="solr.hdfs.security.kerberos.principal">${solr.hdfs.security.kerberos.principal:}</str>
      <str name="solr.hdfs.permissions.umask-mode">${solr.hdfs.permissions.umask-mode:000}</str>
    </repository>
  </backup>

New installations of CDH 5.9 and higher include this section. For upgrades, you must manually add it and restart the Solr service.

To edit the solr.xml file:

  1. Download solr.xml from ZooKeeper:
    solrctl cluster --get-solrxml $HOME/solr.xml
  2. Edit the file locally and add the <backup> section at the end of the solr.xml file before the closing </solr> tag.
  3. Upload the modified file to ZooKeeper:
    solrctl cluster --put-solrxml $HOME/solr.xml
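
To confirm that the <backup> section was applied, you can download solr.xml again and inspect it. A quick sketch (the temporary file path is arbitrary):

  solrctl cluster --get-solrxml /tmp/solr-verify.xml
  grep -A 2 "<backup>" /tmp/solr-verify.xml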

Sentry Configuration

The backup and restore mechanism introduces several new Solr APIs. The following table shows the Sentry authorization privileges required for each action. For more information on configuring Sentry privileges, see Configuring Sentry Authorization for Cloudera Search.

Solr Snapshot Sentry Permissions

  Required Privileges                             Actions
  admin=collections->action=UPDATE                CREATESNAPSHOT, DELETESNAPSHOT, RESTORE
  collection=<collectionName>->action=UPDATE
  admin=collections->action=QUERY                 LISTSNAPSHOTS, BACKUP
  collection=<collectionName>->action=QUERY
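
As a hedged illustration, if your deployment manages these privileges with the solrctl sentry subcommand, the grants might look like the following. The role name, group name, and collection name are examples only; adapt them to your Sentry configuration:

  # Hypothetical role and group; the tweets collection is used as an example.
  solrctl sentry --create-role search-backup-admin
  solrctl sentry --add-role-group search-backup-admin backup-admins
  # Privileges required for CREATESNAPSHOT, DELETESNAPSHOT, and RESTORE:
  solrctl sentry --grant-privilege search-backup-admin 'admin=collections->action=UPDATE'
  solrctl sentry --grant-privilege search-backup-admin 'collection=tweets->action=UPDATE'
  # Privileges required for LISTSNAPSHOTS and BACKUP:
  solrctl sentry --grant-privilege search-backup-admin 'admin=collections->action=QUERY'
  solrctl sentry --grant-privilege search-backup-admin 'collection=tweets->action=QUERY'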

Backing Up a Solr Collection

Use the following procedure to back up a Solr collection. For more information on the commands used, see Cloudera Search Backup and Restore Command Reference.

If you are using a secure (Kerberos-enabled) cluster, specify your jaas.conf file by adding the following parameter to each command:

--jaas /path/to/jaas.conf

If TLS is enabled for the Solr service, specify the truststore and password using the ZKCLI_JVM_FLAGS environment variable before you begin the procedure:

export ZKCLI_JVM_FLAGS="-Djavax.net.ssl.trustStore=/path/to/truststore \
-Djavax.net.ssl.trustStorePassword=trustStorePassword"
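
For example, on a Kerberos-enabled cluster a snapshot listing might be invoked as follows (a sketch; solrctl general options such as --jaas are given before the subcommand):

  solrctl --jaas /path/to/jaas.conf collection --list-snapshots tweets
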
  1. Create a snapshot. On a host running Solr Server, run the following command:
    solrctl collection --create-snapshot <snapshotName> -c <collectionName>

    For example, to create a snapshot for a collection named tweets:

    solrctl collection --create-snapshot tweets-$(date +%Y%m%d%H%M) -c tweets
    Successfully created snapshot with name tweets-201803281043 for collection tweets
  2. If you are backing up the Solr collection to a remote cluster, prepare the snapshot for export. If you are backing up the Solr collection to the local cluster, skip this step.
    solrctl collection --prepare-snapshot-export <snapshotName> -c <collectionName> -d <destDir>

    The destination HDFS directory path (specified by the -d option) must exist on the local cluster before you run this command. Make sure that the Solr superuser (solr by default) has permission to write to this directory.

    For example:

    hdfs dfs -mkdir -p /path/to/backup-staging/tweets-201803281043
    hdfs dfs -chown :solr /path/to/backup-staging/tweets-201803281043
    solrctl collection --prepare-snapshot-export tweets-201803281043 -c tweets \
    -d /path/to/backup-staging/tweets-201803281043
  3. Export the snapshot. This step uses the DistCp utility to back up the collection metadata as well as the corresponding index files. The destination directory must exist and be writable by the Solr superuser (solr by default). To export the snapshot to a remote cluster, run the following command:
    solrctl collection --export-snapshot <snapshotName> -s <sourceDir> -d <protocol>://<namenode>:<port>/<destDir>

    For example:

    • HDFS protocol:
      solrctl collection --export-snapshot tweets-201803281043 -s /path/to/backup-staging/tweets-201803281043 \
      -d hdfs://nn01.example.com:8020/path/to/backups
    • WebHDFS protocol:
      solrctl collection --export-snapshot tweets-201803281043 -s /path/to/backup-staging/tweets-201803281043 \
      -d webhdfs://nn01.example.com:20101/path/to/backups

    To export the snapshot to the local cluster, run the following command:

    solrctl collection --export-snapshot <snapshotName> -c <collectionName> -d <destDir>

    For example:

    solrctl collection --export-snapshot tweets-201803281043 -c tweets -d /path/to/backups/
  4. Delete the snapshot:
    solrctl collection --delete-snapshot <snapshotName> -c <collectionName>

    For example:

    solrctl collection --delete-snapshot tweets-201803281043 -c tweets
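
Putting the steps together, a minimal backup script for a remote export might look like the following sketch. The collection name, staging path, and destination URI are assumptions; adapt them to your environment:

  #!/bin/bash
  # Sketch only: back up the tweets collection to a remote HDFS cluster.
  set -e
  COLLECTION=tweets
  SNAPSHOT=${COLLECTION}-$(date +%Y%m%d%H%M)
  STAGING=/path/to/backup-staging/${SNAPSHOT}
  DEST=hdfs://nn01.example.com:8020/path/to/backups

  # 1. Create the snapshot.
  solrctl collection --create-snapshot ${SNAPSHOT} -c ${COLLECTION}

  # 2. Prepare the snapshot for export to the remote cluster.
  hdfs dfs -mkdir -p ${STAGING}
  hdfs dfs -chown :solr ${STAGING}
  solrctl collection --prepare-snapshot-export ${SNAPSHOT} -c ${COLLECTION} -d ${STAGING}

  # 3. Export the snapshot (uses DistCp internally).
  solrctl collection --export-snapshot ${SNAPSHOT} -s ${STAGING} -d ${DEST}

  # 4. Delete the snapshot now that the backup exists.
  solrctl collection --delete-snapshot ${SNAPSHOT} -c ${COLLECTION}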

Restoring a Solr Collection

Use the following procedure to restore a Solr collection. For more information on the commands used, see Cloudera Search Backup and Restore Command Reference.

If you are using a secure (Kerberos-enabled) cluster, specify your jaas.conf file by adding the following parameter to each command:

--jaas /path/to/jaas.conf

If TLS is enabled for the Solr service, specify the truststore and password by using the ZKCLI_JVM_FLAGS environment variable before you begin the procedure:

export ZKCLI_JVM_FLAGS="-Djavax.net.ssl.trustStore=/path/to/truststore \
-Djavax.net.ssl.trustStorePassword=trustStorePassword"
  1. If you are restoring from a backup stored on a remote cluster, copy the backup from the remote cluster to the local cluster. If you are restoring from a local backup, skip this step.

    Run the following commands on the cluster to which you want to restore the collection:

    hdfs dfs -mkdir -p /path/to/restore-staging
    hadoop distcp <protocol>://<namenode>:<port>/path/to/backup /path/to/restore-staging

    For example:

    • HDFS protocol:
      hadoop distcp hdfs://nn01.example.com:8020/path/to/backups/tweets-201803281043 /path/to/restore-staging
    • WebHDFS protocol:
      hadoop distcp webhdfs://nn01.example.com:20101/path/to/backups/tweets-201803281043 /path/to/restore-staging
  2. Start the restore procedure. Run the following command:
    solrctl collection --restore <restoreCollectionName> -l <backupLocation> -b <snapshotName> -i <requestId>

     Make sure that you use a unique <requestId> each time you run this command. For example:

    solrctl collection --restore tweets -l /path/to/restore-staging -b tweets-201803281043 -i restore-tweets
  3. Monitor the status of the restore operation. Run the following command periodically:
    solrctl collection --request-status <requestId>

     Look for the <str name="state"> element in the output. For example:

    solrctl collection --request-status restore-tweets
     <?xml version="1.0" encoding="UTF-8"?>
     <response>
       <lst name="responseHeader">
         <int name="status">0</int>
         <int name="QTime">1</int>
       </lst>
       <lst name="status">
         <str name="state">completed</str>
         <str name="msg">found restore-tweets in completed tasks</str>
       </lst>
     </response>

    The state parameter can be one of the following:

    • running: The restore operation is running.
    • completed: The restore operation is complete.
    • failed: The restore operation failed.
     • notfound: The specified <requestId> does not exist.
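
Because the restore runs asynchronously, it can be convenient to poll the status until the state is no longer running. A small sketch (the request ID and polling interval are examples):

  REQUEST_ID=restore-tweets
  # Poll every 30 seconds while the reported state is still "running".
  while solrctl collection --request-status ${REQUEST_ID} | grep 'name="state"' | grep -q running; do
    sleep 30
  done
  # Print the final status (completed, failed, or notfound).
  solrctl collection --request-status ${REQUEST_ID}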

Cloudera Search Backup and Restore Command Reference

Use the following commands to create snapshots, back up, and restore Solr collections. If you are using a secure (Kerberos-enabled) cluster, specify your jaas.conf file by adding the following parameter to the command:

--jaas /path/to/jaas.conf

If TLS is enabled for the Solr service, specify the truststore and password by using the ZKCLI_JVM_FLAGS environment variable:

export ZKCLI_JVM_FLAGS="-Djavax.net.ssl.trustStore=/path/to/truststore \
-Djavax.net.ssl.trustStorePassword=trustStorePassword"

Create a snapshot

Command: solrctl collection --create-snapshot <snapshotName> -c <collectionName>

Description: Creates a named snapshot for the specified collection.

Delete a snapshot

Command: solrctl collection --delete-snapshot <snapshotName> -c <collectionName>

Description: Deletes the specified snapshot for the specified collection.

Describe a snapshot

Command: solrctl collection --describe-snapshot <snapshotName> -c <collectionName>

Description: Provides detailed information about a snapshot for the specified collection.

List all snapshots

Command: solrctl collection --list-snapshots <collectionName>

Description: Lists all snapshots for the specified collection.
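
For example, to list the snapshots for the tweets collection and then inspect one of them:

  solrctl collection --list-snapshots tweets
  solrctl collection --describe-snapshot tweets-201803281043 -c tweets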

Prepare snapshot for export to a remote cluster

Command: solrctl collection --prepare-snapshot-export <snapshotName> -c <collectionName> -d <destDir>

Description: Prepares the snapshot for export to a remote cluster. If you are exporting the snapshot to the local cluster, you do not need to run this command. This command generates collection metadata as well as information about the Lucene index files corresponding to the snapshot.

The destination HDFS directory path (specified by the -d option) must exist on the local cluster before you run this command. Make sure that the Solr superuser (solr by default) has permission to write to this directory.

If you are running the snapshot export command on a remote cluster, specify the HDFS protocol (such as WebHDFS or HFTP) to be used for accessing the Lucene index files corresponding to the snapshot on the source cluster. For more information about the Hadoop DistCp utility, see Copying Cluster Data Using DistCp. This configuration is driven by the -p option, which expects a fully qualified URI for the root filesystem on the source cluster, for example webhdfs://namenode.example.com:20101/.
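
For example, a hedged sketch of preparing a snapshot so that the export can later be run from the remote cluster, assuming the source cluster exposes WebHDFS at nn01.example.com:20101:

  solrctl collection --prepare-snapshot-export tweets-201803281043 -c tweets \
  -d /path/to/backup-staging/tweets-201803281043 \
  -p webhdfs://nn01.example.com:20101/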

Export snapshot to local cluster

Command: solrctl collection --export-snapshot <snapshotName> -c <collectionName> -d <destDir>

Description: Creates a backup copy of the Solr collection metadata as well as the associated Lucene index files at the specified location. The -d configuration option specifies the directory path where this backup copy is created. This directory must exist before you export the snapshot, and the Solr superuser (solr by default) must be able to write to it.

Export snapshot to remote cluster

Command: solrctl collection --export-snapshot <snapshotName> -s <sourceDir> -d <destDir>

Description: Creates a backup copy of the Solr collection snapshot, which includes the collection metadata as well as the Lucene index files, at the specified location. The -d configuration option specifies the directory path where this backup copy is created.

Make sure that you prepare the snapshot for export before exporting it to a remote cluster.

You can run this command on either the source or destination cluster, depending on your environment and the DistCp utility requirements. For more information, see Copying Cluster Data Using DistCp. If the destination cluster does not have the solrctl utility, you must run the command on the source cluster. The exported snapshot state can then be copied using standard tools, such as DistCp.

The source and destination directory paths (specified by the -s and -d options, respectively) must be specified relative to the cluster from which you are running the command. Directories on the local cluster are formatted as /path/to/dir, and directories on the remote cluster are formatted as <protocol>://<namenode>:<port>/path/to/dir. For example:

  • Local path: /solr-backup/tweets-2016-10-19
  • Remote HDFS path: hdfs://nn01.example.com:8020/solr-backup/tweets-2016-10-19
  • Remote WebHDFS path: webhdfs://nn01.example.com:20101/solr-backup/tweets-2016-10-19

The source directory (specified by the -s option) is the directory containing the output of the solrctl collection --prepare-snapshot-export command. The destination directory (specified by the -d option) must exist on the destination cluster before you run this command.
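
For example, a hedged sketch of one possible invocation from the destination cluster, assuming the prepared snapshot resides on the remote source cluster and its WebHDFS endpoint is reachable:

  solrctl collection --export-snapshot tweets-201803281043 \
  -s webhdfs://nn01.example.com:20101/path/to/backup-staging/tweets-201803281043 \
  -d /path/to/backups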

If your cluster is secured (Kerberos-enabled), initialize your Kerberos credentials by using kinit before executing this command.

Restore from a local snapshot

Command: solrctl collection --restore <restoreCollectionName> -l <backupLocation> -b <snapshotName> -i <requestId>

Description: Restores a previously created backup as a new Solr collection. Run this command on the cluster on which you want to restore the backup.

The -l configuration option specifies the local HDFS directory where the backup is stored. If the backup is stored on a remote cluster, you must copy it to the local cluster before restoring it. The Solr superuser (solr by default) must have permission to read from this directory.

The -b configuration option specifies the name of the backup to be restored.

Because the restore operation can take a long time to complete, depending on the size of the exported snapshot, it runs asynchronously. The -i configuration parameter specifies a unique identifier for tracking the operation. For more information, see Check the status of an operation.

The optional -a configuration option enables the autoAddReplicas feature for the new Solr collection.

The optional -c configuration option specifies the configName for the new Solr collection. If this option is not specified, the configName of the original collection at the time of backup is used. If the specified configName does not exist, the restore operation creates a new configuration from the backup.

The optional -r configuration option specifies the replication factor for the new Solr collection. If this option is not specified, the replication factor of the original collection at the time of backup is used.

The optional -m configuration option specifies the maximum number of replicas (maxShardsPerNode) to create on each Solr Server. If this option is not specified, the maxShardsPerNode configuration of the original collection at the time of backup is used.
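
For example, a sketch that restores a backup as a new collection while overriding some of these options; the collection name, configName, and values shown are examples only:

  solrctl collection --restore tweets-restored -l /path/to/restore-staging \
  -b tweets-201803281043 -i restore-tweets-2 -a -c tweetsConf -r 2 -m 3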

If your cluster is secured (Kerberos-enabled), initialize your Kerberos credentials using kinit before running this command.

Check the status of an operation

Command: solrctl collection --request-status <requestId>

Description: Displays the status of the specified operation. The status can be one of the following:

  • running: The restore operation is running.
  • completed: The restore operation is complete.
  • failed: The restore operation failed.
  • notfound: The specified requestId is not found.

If your cluster is secured (Kerberos-enabled), initialize your Kerberos credentials (using kinit) before running this command.