What's New in Cloudera Search

Learn about the new features of Cloudera Search in Cloudera Runtime 7.2.8.

Local file system support

Cloudera Search now supports using the local file system instead of HDFS. For more information, see Local file system support.

SolrCell reintroduced

SolrCell is reintroduced to Cloudera Search with limited functionality: because its JAR is not added to the Solr server classpath, it can only be used with Morphlines.

Crunch Indexer socket timeout is configurable

The HTTP client library uses a socket timeout of 10 minutes. Spark Crunch Indexer does not override this value, so if indexing a single batch takes more than 10 minutes, the entire indexing job fails. This is most likely to happen when the morphlines contain DeleteByQuery requests.

As of this release, you can configure the socket timeout for the connection in the morphline file by adding the solrClientSocketTimeout parameter, specified in milliseconds, to the solrLocator command.

For example:

  collection : test_collection
  zkHost : "zookeeper1.example.com:2181/solr"
  # 20 minutes in milliseconds
  solrClientSocketTimeout : 1200000
  # Max number of documents to pass per RPC from morphline to Solr Server
  # batchSize : 10000
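
To show where these settings usually sit, the following is a minimal sketch of a complete morphline file that reuses the values above. The SOLR_LOCATOR variable name, the morphline id, and the importCommands patterns are assumptions based on common Kite Morphlines conventions and are not taken from this release note; adjust them to match your own morphline.

  # Assumed convention: define the locator once and reference it by substitution
  SOLR_LOCATOR : {
    collection : test_collection
    zkHost : "zookeeper1.example.com:2181/solr"
    # 20 minutes in milliseconds
    solrClientSocketTimeout : 1200000
  }

  morphlines : [
    {
      # Hypothetical morphline id
      id : morphline1
      importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
      commands : [
        # Load each record into Solr using the locator defined above
        { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
      ]
    }
  ]

Defining the locator once and referencing it with ${SOLR_LOCATOR} means that any command taking a solrLocator parameter picks up the same timeout.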