Nested topic test

This topic covers collection management using nested topics. That may not be idiomatic DITA, and it may or may not be a good idea. It also exercises rellinks.

Generate collection configuration using configs

You must create a collection configuration before you create a Solr collection. The configuration files are created in ZooKeeper based on existing templates using the ConfigSets API. Learn how to create one using configs.

Configs are named configuration sets that you can reference when creating collections.

You can manage configuration objects directly using the solrctl config command, which is a wrapper script for the Solr ConfigSets API.

solrctl config --create <name> <baseConfig> [-p <name>=<value>]
  1. If you are using Kerberos, kinit as a user with permission to create the collection configuration:
    kinit solradmin@EXAMPLE.COM

    Replace EXAMPLE.COM with your Kerberos realm name.

  2. To generate configuration files for a collection, run the following command:
    solrctl config --create <configName> <baseConfig> -p immutable=false
    where
    <configName>
    is the user-specified name of the config
    <baseConfig>
    is the name of an existing config template
    To list all available config templates, use the solrctl instancedir --list command.
    -p <name>=<value>
    Overrides a <baseConfig> setting. The only config property that you can override is immutable, so the possible options are -p immutable=true and -p immutable=false. If you are copying an immutable config, such as a template, use -p immutable=false to make sure that you can edit the new config.
    For example, to create the configuration logs_config based on managedTemplate:
    solrctl config --create logs_config managedTemplate -p immutable=false
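
The procedure above can be wrapped in a small shell helper. The following is a minimal sketch, not part of solrctl: the function name build_config_create_cmd is hypothetical, and it only prints the command so that you can review it before running it yourself. It assumes, as stated above, that immutable is the only property you can override.

```shell
# Hypothetical helper (not part of solrctl): prints, but does not run,
# a "solrctl config --create" command.
# Arguments: <configName> <baseConfig> [immutable: true|false, default false]
build_config_create_cmd() {
  local config_name="$1"
  local base_config="$2"
  local immutable="${3:-false}"

  if [ -z "$config_name" ] || [ -z "$base_config" ]; then
    echo "usage: build_config_create_cmd <configName> <baseConfig> [true|false]" >&2
    return 1
  fi

  # immutable is the only overridable property, so accept only true or false.
  case "$immutable" in
    true|false) ;;
    *) echo "immutable must be true or false" >&2; return 1 ;;
  esac

  echo "solrctl config --create $config_name $base_config -p immutable=$immutable"
}

# Example values taken from the procedure above.
build_config_create_cmd logs_config managedTemplate
```

Printing the command instead of running it keeps the sketch safe to try on a host without Solr. To execute the result, copy the printed line into your shell after you have run kinit.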

Create a Solr collection

Learn how to create a collection so that you can start indexing data with Solr.

  • If you have enabled Ranger for authorization, you must have Solr Admin permission to be able to create collections.
  • Before you can create a Solr collection, you must generate a collection configuration using either a config or an instance directory.
  1. If you are using Kerberos, kinit as a user with permission to create the collection:
    kinit solradmin@EXAMPLE.COM

    Replace EXAMPLE.COM with your Kerberos realm name.

  2. On a host running a Solr server, make sure that the SOLR_ZK_ENSEMBLE environment variable is set in /etc/solr/conf/solr-env.sh. For example:
    cat /etc/solr/conf/solr-env.sh
    export SOLR_ZK_ENSEMBLE=zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181/solr

    This is automatically set on hosts with a Solr Server or Gateway role in Cloudera Manager.

  3. Create a new collection using the following command:
    solrctl collection --create <collectionName> -s <numShards> -c <collectionConfName>

    where

    <collectionName> User-defined name of the collection.
    <numShards> The number of shards you want to split your collection into.
    <collectionConfName> The name of an existing collection configuration.
    For example:
    solrctl collection --create logs -s 3 -c logs_config
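
Steps 2 and 3 above can be combined into a pre-flight check. The following is a minimal sketch under stated assumptions: the function names require_zk_ensemble and build_collection_create_cmd are hypothetical and not part of solrctl, and the second function only prints the command rather than running it.

```shell
# Hypothetical pre-flight check: succeeds only if SOLR_ZK_ENSEMBLE is non-empty.
require_zk_ensemble() {
  if [ -z "${SOLR_ZK_ENSEMBLE:-}" ]; then
    echo "SOLR_ZK_ENSEMBLE is not set; check /etc/solr/conf/solr-env.sh" >&2
    return 1
  fi
}

# Hypothetical helper: prints, but does not run, the collection create command.
# Arguments: <collectionName> <numShards> <collectionConfName>
build_collection_create_cmd() {
  local collection="$1"
  local shards="$2"
  local conf="$3"

  # Reject an empty value, anything containing a non-digit, and a literal 0.
  case "$shards" in
    ''|*[!0-9]*|0) echo "numShards must be a positive integer" >&2; return 1 ;;
  esac

  echo "solrctl collection --create $collection -s $shards -c $conf"
}

# Example values taken from the procedure above.
SOLR_ZK_ENSEMBLE="zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181/solr"
require_zk_ensemble && build_collection_create_cmd logs 3 logs_config
```

On a host managed by Cloudera Manager, the variable is normally already set, so the assignment in the example is only there to make the sketch self-contained.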

Backing up a collection from HDFS

You can back up Solr collections to your local cluster or a remote cluster using the solrctl utility to minimize data loss caused by accidental or malicious administrative actions. Learn how to create, prepare, and export the Solr collection snapshot to create a backup of the Solr collection.

If you are using a secure (Kerberos-enabled) cluster, specify your jaas.conf file by adding the following parameter to each command:
--jaas [***/PATH/TO/JAAS.CONF***]

If TLS is enabled for the Solr service, specify the truststore and password using the ZKCLI_JVM_FLAGS environment variable before you begin the procedure:

export ZKCLI_JVM_FLAGS="-Djavax.net.ssl.trustStore=[***/PATH/TO/TRUSTSTORE***] \
-Djavax.net.ssl.trustStorePassword=[***TRUST_STORE_PASSWORD***]"
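
As an illustration of the TLS setting above, the following minimal sketch composes the ZKCLI_JVM_FLAGS value from two inputs. The function name build_zkcli_jvm_flags, the truststore path, and the password are hypothetical placeholders; substitute the values for your own cluster.

```shell
# Hypothetical helper: composes the value for ZKCLI_JVM_FLAGS.
# Arguments: <truststorePath> <truststorePassword>
build_zkcli_jvm_flags() {
  local truststore="$1"
  local password="$2"
  printf '%s %s\n' \
    "-Djavax.net.ssl.trustStore=$truststore" \
    "-Djavax.net.ssl.trustStorePassword=$password"
}

# Placeholder values, assumed for illustration only.
ZKCLI_JVM_FLAGS="$(build_zkcli_jvm_flags /path/to/truststore.jks changeit)"
export ZKCLI_JVM_FLAGS
echo "$ZKCLI_JVM_FLAGS"
```

Passing a password as a command-line argument can expose it in shell history, so in practice you may prefer to read it from a protected file.
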
  1. Create a snapshot. On a host running Solr Server, run the following command:
    solrctl collection --create-snapshot [***USER_DEFINED_NAME_OF_THE_SNAPSHOT***] -c [***NAME_OF_THE_COLLECTION_TO_BE_BACKED_UP***]

    For example, to create a snapshot for a collection named tweets:

    solrctl collection --create-snapshot tweets-$(date +%Y%m%d%H%M) -c tweets
    Successfully created snapshot with name tweets-202103281043 for collection tweets
  2. If you are backing up the Solr collection to a remote cluster, prepare the snapshot for export. If you are backing up the Solr collection to the local cluster, skip this step.
    solrctl collection --prepare-snapshot-export [***NAME_OF_THE_SNAPSHOT_TO_BE_EXPORTED***] -c [***COLLECTION_NAME***] -d [***DESTINATION_DIRECTORY***]

    The destination HDFS directory path ([***DESTINATION_DIRECTORY***], specified by the -d option) must exist on the local cluster before you run this command. Make sure that the Solr superuser (solr by default) has permission to write to this directory.

    For example:

    hdfs dfs -mkdir -p /path/to/backup-staging/tweets-202103281043
    hdfs dfs -chown :solr /path/to/backup-staging/tweets-202103281043
    solrctl collection --prepare-snapshot-export tweets-202103281043 -c tweets \
    -d /path/to/backup-staging/tweets-202103281043
  3. Export the snapshot. This step uses the DistCp utility to back up the collection metadata as well as the corresponding index files. The destination directory must exist and be writable by the Solr superuser (solr by default).
    To export the snapshot to a remote cluster, run the following command:
    solrctl collection --export-snapshot [***NAME_OF_THE_SNAPSHOT_TO_BE_EXPORTED***] -s [***SOURCE_DIRECTORY***] -d [***PROTOCOL***]://[***NAMENODE***]:[***PORT***]/[***DESTINATION_DIRECTORY***]

    For example:

    HDFS protocol:
    solrctl collection --export-snapshot tweets-202103281043 -s /path/to/backup-staging/tweets-202103281043 \
    -d hdfs://nn01.example.com:8020/path/to/backups
    WebHDFS protocol:
    solrctl collection --export-snapshot tweets-202103281043 -s /path/to/backup-staging/tweets-202103281043 \
    -d webhdfs://nn01.example.com:20101/path/to/backups

    To export the snapshot to the local cluster, run the following command:

    solrctl collection --export-snapshot [***NAME_OF_THE_SNAPSHOT_TO_BE_EXPORTED***] -c [***COLLECTION_NAME***] -d [***DESTINATION_DIRECTORY***]

    For example:

    solrctl collection --export-snapshot tweets-202103281043 -c tweets -d /path/to/backups/
  4. Delete the snapshot after exporting:
    solrctl collection --delete-snapshot [***NAME_OF_THE_SNAPSHOT_TO_BE_DELETED***] -c [***COLLECTION_NAME***]

    For example:

    solrctl collection --delete-snapshot tweets-202103281043 -c tweets
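
The local-cluster variant of the procedure above (steps 1, 3, and 4) can be sketched as a dry-run plan. This is a minimal sketch, not a definitive backup script: the function name print_local_backup_plan is hypothetical, and it only prints the solrctl commands from this topic in order so that you can review them before running them. Step 2 is omitted because it applies only to remote backups.

```shell
# Hypothetical dry-run helper: prints the local-cluster backup commands in order.
# Arguments: <collectionName> <snapshotName> <destinationDirectory>
print_local_backup_plan() {
  local collection="$1"
  local snapshot="$2"
  local dest="$3"

  if [ -z "$collection" ] || [ -z "$snapshot" ] || [ -z "$dest" ]; then
    echo "usage: print_local_backup_plan <collection> <snapshot> <destDir>" >&2
    return 1
  fi

  # Step 1: create the snapshot.
  echo "solrctl collection --create-snapshot $snapshot -c $collection"
  # Step 3: export the snapshot to a directory on the local cluster.
  echo "solrctl collection --export-snapshot $snapshot -c $collection -d $dest"
  # Step 4: delete the snapshot after exporting.
  echo "solrctl collection --delete-snapshot $snapshot -c $collection"
}

# Example values taken from the procedure above.
print_local_backup_plan tweets tweets-202103281043 /path/to/backups/
```

On a Kerberos-enabled cluster, remember that each printed command also needs the --jaas parameter described at the start of this topic.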