Create a collection for tweets
In this part of the Cloudera Search tutorial, you create a collection for tweets.
- On a host with Solr Server installed, make sure that the
SOLR_ZK_ENSEMBLE
environment variable is set in/etc/solr/conf/solr-env.sh
.For example:cat /etc/solr/conf/solr-env.sh
export SOLR_ZK_ENSEMBLE=zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181/solr
This is automatically set on hosts with a Solr Server or Gateway role in Cloudera Manager.
-
If you are using Kerberos,
kinit
as the user that has privileges to create the collection.For example:kinit solr@EXAMPLE.COM
Replace solr@EXAMPLE.COM with your user name and Kerberos realm name respectively.
-
Generate the configuration files for the collection, including the tweet-specific
managed-schema
:solrctl instancedir --generate $HOME/cloudera_tutorial_tweets_config cp /opt/cloudera/parcels/CDH/share/doc/search*/search-crunch/solr/collection1/conf/schema.xml $HOME/cloudera_tutorial_tweets_config/conf/managed_schema
To the overwrite confirmation prompt:
type 'cp: overwrite 'cloudera_tutorial_tweets_config/conf/managed-schema'?
y
'. - Upload the configuration to ZooKeeper:
solrctl config --upload cloudera_tutorial_tweets_config $HOME/cloudera_tutorial_tweets_config
- Create a new collection with two shards (specified by the
-s
parameter) using the named configuration (specified by the-c
parameter):solrctl collection --create cloudera_tutorial_tweets -s 2 -c cloudera_tutorial_tweets_config
- Verify that the collection is live. Open the Solr admin web interface in a browser by
accessing the relevant URL:
-
https://search01.example.com:8985/solr/#/~cloud
If you have Kerberos authentication enabled on your cluster, enter the credentials for thesolr@EXAMPLE.COM
principal when prompted. Replace search01.example.com with the name of any host running the Solr Server process. Look for thecloudera_tutorial_tweets
collection to verify that it exists. -
- Prepare the configuration for use with
MapReduce:
cp -r $HOME/cloudera_tutorial_tweets_config $HOME/cloudera_tutorial_tweets_mr_config