What's New in Cloudera Search

Learn about the new features of Cloudera Search in Cloudera Runtime 7.1.6.

Single-step metadata transition

The solr-upgrade.sh script now validates and transforms all of your configuration metadata in a single step. You no longer need to run the script separately for each collection configuration set and then copy the files manually to the destination directory. In addition to converting configuration files to the format of the upgrade target, the script produces a set of *_validation.html files covering all steps of the migration, which you can inspect if the script stops on an error. After fixing the incompatibility in the input folder, rerun the script. It overwrites the contents of the output directory on each run.

Local file system support

Cloudera Search now supports using the local file system instead of HDFS. For more information, see Local file system support.

SolrCell reintroduced

SolrCell is reintroduced to Cloudera Search with limited functionality: because its JAR is not added to the Solr server classpath, SolrCell can only be used with Morphlines.

Crunch Indexer socket timeout is configurable

The Tomcat HTTP client library uses a socket timeout of 10 minutes. Spark Crunch Indexer does not override this value, so if indexing a single batch takes more than 10 minutes, the entire indexing job fails. This is most likely to happen when the morphlines contain DeleteByQuery requests.

As of this release, you can configure the socket timeout for the connection in the morphline file by adding the solrClientSocketTimeout parameter to the solrLocator command.

For example:

SOLR_LOCATOR : {
  collection : test_collection
  zkHost : "zookeeper1.example.com:2181/solr"
  # 20 minutes in milliseconds
  solrClientSocketTimeout : 1200000
  # Max number of documents to pass per RPC from morphline to Solr Server
  # batchSize : 10000
}
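
The following is a minimal sketch of how a locator with this setting might sit in a complete morphline file, assuming the common layout in which a loadSolr command references the locator through variable substitution. The morphline id, the importCommands patterns, and the placeholder for parsing commands are illustrative assumptions, not part of this release note. Adjust them to match your own morphline.

```
# Locator definition: 20 minutes = 20 * 60 * 1000 = 1200000 milliseconds
SOLR_LOCATOR : {
  collection : test_collection
  zkHost : "zookeeper1.example.com:2181/solr"
  solrClientSocketTimeout : 1200000
}

morphlines : [
  {
    # Hypothetical morphline id, chosen for illustration only
    id : morphline1
    # Assumed import patterns; narrow them to the commands you actually use
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      # Your parsing and transformation commands go here
      {
        # loadSolr picks up the socket timeout through the referenced locator
        loadSolr {
          solrLocator : ${SOLR_LOCATOR}
        }
      }
    ]
  }
]
```

Defining the locator once at the top of the file and referencing it with ${SOLR_LOCATOR} keeps the timeout in a single place if several commands share the same Solr connection settings.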