Managing collections in Search

In Cloudera Search, a collection is a repository for indexing and querying documents. A collection typically holds documents of the same type that share a similar schema.

To start using Solr and indexing data, you must configure a collection to hold the index.

A collection requires the following configuration files:
  • solrconfig.xml
  • schema.xml
  • Any additional files referenced in the XML files

The solrconfig.xml file contains all of the Solr settings for a given collection, and the schema.xml file specifies the schema that Solr uses when indexing documents. For more details on how to configure a collection, see SchemaXml.

A typical deployment workflow with solrctl consists of the following steps:
  1. Establishing a configuration.
    • If using configs, creating a config object from a template.
    • If using instance directories, generating an instance directory and uploading it to ZooKeeper.
  2. Creating a collection associated with the name of the config or instance directory.
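
The two steps above can be sketched as a small shell script. This is a minimal illustration, not a definitive procedure: the names logs_config and logs, the shard count, and the run helper are hypothetical, and the managedTemplate base config and the immutable=false property are assumed to be available in your release. The script only prints the solrctl commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical names and values; replace them with your own.
CONFIG_NAME="logs_config"
COLLECTION_NAME="logs"
NUM_SHARDS=2

# Dry run by default: commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print a command, and execute it only when DRY_RUN=0.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Step 1, using configs: create a config object from a template.
# Assumes a template named managedTemplate exists in your deployment.
run solrctl config --create "$CONFIG_NAME" managedTemplate -p immutable=false

# Step 1, using instance directories instead: generate a local instance
# directory, edit its files as needed, then upload it to ZooKeeper.
# run solrctl instancedir --generate "$HOME/$CONFIG_NAME"
# run solrctl instancedir --create "$CONFIG_NAME" "$HOME/$CONFIG_NAME"

# Step 2: create a collection associated with the config name.
run solrctl collection --create "$COLLECTION_NAME" -s "$NUM_SHARDS" -c "$CONFIG_NAME"
```

Running the script as is prints each command prefixed with "+" so that it can be reviewed first. Running it with DRY_RUN=0 on a host where solrctl is configured executes the commands.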

Collections are managed using the solrctl command-line utility.