Create a Collection for Tweets
- On a host with Solr Server installed, make sure that the SOLR_ZK_ENSEMBLE environment variable is set in /etc/solr/conf/solr-env.sh. For example:
  cat /etc/solr/conf/solr-env.sh
  export SOLR_ZK_ENSEMBLE=zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181/solr
  This is automatically set on hosts with a Solr Server or Gateway role in Cloudera Manager.
- If you are using Kerberos, kinit as the user that has privileges to create the collection:
  kinit solr@EXAMPLE.COM
  Replace EXAMPLE.COM with your Kerberos realm name.
- Generate the configuration files for the collection, including the tweet-specific schema.xml:
  solrctl instancedir --generate $HOME/cloudera_tutorial_tweets_config
  cp /opt/cloudera/parcels/CDH/share/doc/search*/search-crunch/solr/collection1/conf/schema.xml $HOME/cloudera_tutorial_tweets_config/conf
- Upload the configuration to ZooKeeper:
  - Security Enabled:
    solrctl --jaas $HOME/jaas.conf instancedir --create cloudera_tutorial_tweets_config $HOME/cloudera_tutorial_tweets_config
  - Security Disabled:
    solrctl instancedir --create cloudera_tutorial_tweets_config $HOME/cloudera_tutorial_tweets_config
- Create a new collection with two shards (specified by the -s parameter) using the named configuration (specified by the -c parameter):
  solrctl collection --create cloudera_tutorial_tweets -s 2 -c cloudera_tutorial_tweets_config
- Verify that the collection is live. Open the Solr admin web interface in a browser by accessing the relevant URL:
  - TLS Enabled:
    https://search01.example.com:8985/solr/#/~cloud
  - TLS Disabled:
    http://search01.example.com:8983/solr/#/~cloud
  If you have Kerberos authentication enabled on your cluster, enter the credentials for the solr@EXAMPLE.COM principal when prompted. Replace search01.example.com with the name of any host running the Solr Server process. Look for the cloudera_tutorial_tweets collection to verify that it exists.
- Prepare the configuration for use with MapReduce:
  cp -r $HOME/cloudera_tutorial_tweets_config $HOME/cloudera_tutorial_tweets_mr_config
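
The command-line steps above can be collected into one script. The following is a minimal sketch, not part of the original procedure, and it covers only the security-disabled variant. For a secured cluster you would still need to kinit first and pass --jaas $HOME/jaas.conf to the instancedir --create call, as described above. The names DRY_RUN, CONFIG_DIR, MR_CONFIG_DIR, SCHEMA_SRC, run, and create_tweets_collection are hypothetical and exist only for this illustration. By default the script just prints each command so that you can review it before anything is executed.

```shell
#!/bin/sh
# Sketch of the non-secure collection setup, with a dry-run switch.
# All variable and function names below are illustrative assumptions.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only, 0 = execute them
CONFIG_DIR="${CONFIG_DIR:-$HOME/cloudera_tutorial_tweets_config}"
MR_CONFIG_DIR="${MR_CONFIG_DIR:-$HOME/cloudera_tutorial_tweets_mr_config}"
SCHEMA_SRC="${SCHEMA_SRC:-/opt/cloudera/parcels/CDH/share/doc/search*/search-crunch/solr/collection1/conf/schema.xml}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

create_tweets_collection() {
  # Generate the configuration files for the collection.
  run solrctl instancedir --generate "$CONFIG_DIR" || return 1
  # Copy the tweet-specific schema.xml.
  # SCHEMA_SRC is intentionally unquoted so that the search* glob can expand.
  run cp $SCHEMA_SRC "$CONFIG_DIR/conf" || return 1
  # Upload the configuration to ZooKeeper (security-disabled form).
  run solrctl instancedir --create cloudera_tutorial_tweets_config "$CONFIG_DIR" || return 1
  # Create the collection with two shards using the uploaded configuration.
  run solrctl collection --create cloudera_tutorial_tweets -s 2 -c cloudera_tutorial_tweets_config || return 1
  # Prepare a copy of the configuration for use with MapReduce.
  run cp -r "$CONFIG_DIR" "$MR_CONFIG_DIR"
}
```

To use the sketch, source the file and call create_tweets_collection to review the printed commands. Then set DRY_RUN=0 and call the function again on a host where SOLR_ZK_ENSEMBLE is set. The browser check in the Solr admin web interface remains a manual step.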