Known Issues in Apache Solr

Learn about the known issues in Solr, the impact or changes to the functionality, and the workaround.

Known Issues

Changing the default value of Client Connection Registry HBase configuration parameter causes HBase MRIT job to fail

If the value of the HBase configuration property Client Connection Registry is changed from the default ZooKeeper Quorum to Master Registry then the Yarn job started by HBase MRIT fails with a similar error message:

Caused by: org.apache.hadoop.hbase.exceptions.MasterRegistryFetchException: Exception making rpc to masters [,22001,-1]
        at org.apache.hadoop.hbase.client.MasterRegistry.lambda$groupCall$1(
        at org.apache.hadoop.hbase.util.FutureUtils.lambda$addListener$0(
        at java.util.concurrent.CompletableFuture.uniWhenComplete(
        at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(
        at java.util.concurrent.CompletableFuture.whenComplete(
        at org.apache.hadoop.hbase.util.FutureUtils.addListener(
        at org.apache.hadoop.hbase.client.MasterRegistry.groupCall(
        at org.apache.hadoop.hbase.client.MasterRegistry.getMetaRegionLocations(
        at org.apache.hadoop.hbase.client.ConnectionImplementation.locateMeta(
        at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(
        at org.apache.hadoop.hbase.client.ConnectionImplementation.relocateRegion(
        at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegionInMeta(
        at org.apache.hadoop.hbase.client.ConnectionImplementation.locateRegion(
        at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(
        ... 21 more
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed contacting masters after 1 attempts.
Exceptions: Call to failed on local exception: java.lang.RuntimeException: Found no valid authentication method from options
        at org.apache.hadoop.hbase.client.MasterRegistry.lambda$groupCall$1(
        ... 35 more 
Add the following line to the MRIT command line:
-D 'hbase.client.registry.impl=org.apache.hadoop.hbase.client.ZKConnectionRegistry'
Solr does not support rolling upgrade to release 7.2.18 or lower

Solr supports rolling upgrades from release 7.2.18 and higher. Upgrading from a lower version means that all the Solr Server instances are shut down, parcels upgraded and activated and then the Solr Servers are started again. This causes a service interruption of several minutes, the actual value depending on cluster size.

Services like Atlas and Ranger that depend on Solr, may face issues because of this service interruption.

Cannot create multiple heap dump files because of file name error
Heap dump generation fails with a similar error message:
java.lang.OutOfMemoryError: Java heap space
Dumping heap to /data/tmp/solr_solr-SOLR_SERVER-fc9dacc265fabfc500b92112712505e3_pid{{PID}}.hprof ...
Unable to create /data/tmp/solr_solr-SOLR_SERVER-fc9dacc265fabfc500b92112712505e3_pid{{PID}}.hprof: File exists
The cause of the problem is that {{PID}} does not get substituted during dump file creation with an actual process ID and because of that, a generic file name is generated. This causes the next dump file creation to fail, as the existing file with the same name cannot be overwritten.
You need to manually delete the existing dump file.
Solr coreAdmin status throws Null Pointer Exception

You get a Null Pointer Exception with a similar stacktrace:

Caused by: java.lang.NullPointerException
    at org.apache.solr.core.SolrCore.getInstancePath(
    at org.apache.solr.handler.admin.CoreAdminOperation.getCoreStatus(
    at org.apache.solr.handler.admin.StatusOp.execute(
    at org.apache.solr.handler.admin.CoreAdminOperation.execute(
This is caused by an error in handling solr admin core STATUS after collections are rebuilt.
Restart the Solr server.
Applications fail because of mixed authentication methods within dependency chain of services
Using different types of authentication methods within a dependency chain, for example, configuring your indexer tool to authenticate using Kerberos and configuring your Solr Server to use LDAP for authentication may cause your application to time out and eventually fail.
Make sure that all services in a dependency chain use the same type of authentication.
API calls fail with error when used with alias, but work with collection name
API calls fail with a similar error message when used with an alias, but they work when made using the collection name:
[   ] o.a.h.s.t.d.w.DelegationTokenAuthenticationFilter Authentication exception: User: is not allowed to impersonate
  [c:RTOTagMetaOdd s:shard3 r:core_node11 x:RTOTagMetaOdd_shard3_replica_n8] o.a.h.s.t.d.w.DelegationTokenAuthenticationFilter Authentication exception: User: is not allowed to impersonate
Make sure there is a replica of the collection on every host.
CrunchIndexerTool does not work out of the box if /tmp is mounted noexec mode
When you try to run CrunchIndexerTool with the /tmp directory mounted in noexec mode, It throws a snappy-related error.

Create a separate directory for snappy temp files which is mounted with EXEC privileges and set this directory as the value of the org.xerial.snappy.tempdir java property as a driver java option.

For example:

export myDriverJarDir=/opt/cloudera/parcels/CDH//lib/solr/contrib/crunch;export myDependencyJarDir=/opt/cloudera/parcels/CDH//lib/search/lib/search-crunch;export myDriverJar=$(find $myDriverJarDir -maxdepth 1 -name 'search-crunch-*.jar' ! -name '*-job.jar' ! -name '*-sources.jar');export myDependencyJarFiles=$(find $myDependencyJarDir -name '*.jar' | sort | tr '\n' ',' | head -c -1);export myDependencyJarPaths=$(find $myDependencyJarDir -name '*.jar' | sort | tr '\n' ':' | head -c -1);export HADOOP_CONF_DIR=;spark-submit --master local --deploy-mode client --driver-library-path /opt/cloudera/parcels/CDH//lib/hadoop/lib/native/ --jars $myDependencyJarFiles --driver-java-options ' -Dorg.xerial.snappy.tempdir=/home/systest/tmp ' --class org.apache.solr.crunch.CrunchIndexerTool $myDriverJar --input-file-format=avroParquet --input-file-reader-schema search-parquetfile/parquet-schema.avsc --morphline-file /tmp/mrTestBase.conf --pipeline-type spark --chatty hdfs://[***HOSTNAME***]:8020/tmp/parquetfileparsertest-input
Apache Tika upgrade may break morphlines indexing
The upgrade of Apache Tika from 1.27 to 2.3.0 brought potentially breaking changes for morphlines indexing. Duplicate/triplicate keys names were removed and certain parser class names were changed (For example, org.apache.tika.parser.jpeg.JpegParser changed to org.apache.tika.parser.image.JpegParser).
To avoid morphline commands failing after the upgrade, do the following:
  • Check if key name changes affect your morphlines. For more information, see Removed duplicate/triplicate keys in Migrating to Tika 2.0.0.
  • Check if the name of any parser you use has changed. For more information, see the Apache Tika API documentation.
Update your morphlines if necessary.
CDPD-28432: HBase Lily indexer REST port does not support SSL
When using the --http argument for the hbase-indexer command line tool to invoke Lily indexer through REST API, you can add/list/remove indexers with any user without the need for authentication. Keeping the default true value for the hbaseindexer.httpserver.disabled environment parameter switches off the REST interface, so no one can use the --http argument when using the hbase-indexer command line tool. This also means that users need to authenticate as an hbase user in order to use the hbase-indexer tool.
CDH-77598: Indexing fails with socketTimeout

Starting from CDH 6.0, the HTTP client library used by Solr has a default socket timeout of 10 minutes. Because of this, if a single request sent from an indexer executor to Solr takes more than 10 minutes to be serviced, the indexing process fails with a timeout error.

This timeout has been raised to 24 hours. Nevertheless, there still may be use cases where even this extended timeout period proves insufficient.

If your MapreduceIndexerTool or HBaseMapreduceIndexerTool batch indexing jobs fail with a timeout error during the go-live (Live merge, MERGEINDEXES) phase (This means the merge takes longer than 24 hours).

Use the --go-live-timeout option where the timeout can be specified in milliseconds.

CDPD-12450: CrunchIndexerTool Indexing fails with socketTimeout
The http client library uses a socket timeout of 10 minutes. The Spark Crunch Indexer does not override this value, and in case a single batch takes more than 10 minutes, the entire indexing job fails. This can happen especially if the morphlines contain DeleteByQuery requests.
Try the following workarounds:
  • Check the batch size of your indexing job. Sending too large batches to Solr might increase the time needed on the Solr server to process the incoming batch.
  • If your indexing job uses deleteByQuery requests, consider using deleteById wherever possible as deleteByQuery involves a complex locking mechanism on the Solr side which makes processing the requests slower.
  • Check the number of executors for your Spark Crunch Indexer job. Too many executors can overload the Solr service. You can configure the number of executors by using the --mappers parameter
  • Check that your Solr installation is correctly sized to accommodate the indexing load, making sure that the number of Solr servers and the number of shards in your target collection are adequate.
  • The socket timeout for the connection can be configured in the morphline file. Add the solrClientSocketTimeout parameter to the solrLocator command


      collection : test_collection 
      zkHost : "zookeeper1.example.corp:2181/solr" 
    # 10 minutes in milliseconds 
      solrClientSocketTimeout: 600000  
      # Max number of documents to pass per RPC from morphline to Solr Server
      # batchSize : 10000
CDPD-29289: HBaseMapReduceIndexerTool fails with socketTimeout
The http client library uses a socket timeout of 10 minutes. The HBase Indexer does not override this value, and in case a single batch takes more than 10 minutes, the entire indexing job fails.
You can overwrite the default 600000 millisecond (10 minute) socket timeout in HBase indexer using the --solr-client-socket-timeout optional argument for the direct writing mode (when the value of the --reducers optional argument is set to 0 and mappers directly send the data to the live Solr).
Lucene index handling limitation
The Lucene index can only be upgraded by one major version. Solr 8 will not open an index that was created with Solr 6 or earlier.
There is no workaround, you need to reindex collections.
CDH-22190: CrunchIndexerTool which includes Spark indexer requires specific input file format specifications
If the --input-file-format option is specified with CrunchIndexerTool, then its argument must be text, avro, or avroParquet, rather than a fully qualified class name.
CDH-26856: Field value class guessing and Automatic schema field addition are not supported with the MapReduceIndexerTool nor with the HBaseMapReduceIndexerTool
The MapReduceIndexerTool and the HBaseMapReduceIndexerTool can be used with a Managed Schema created via NRT indexing of documents or via the Solr Schema API. However, neither tool supports adding fields automatically to the schema during ingest.
Define the schema before running the MapReduceIndexerTool or HBaseMapReduceIndexerTool. In non-schemaless mode, define in the schema using the schema.xml file. In schemaless mode, either define the schema using the Solr Schema API or index sample documents using NRT indexing before invoking the tools. In either case, Cloudera recommends that you verify that the schema is what you expect, using the List Fields API command.
Users with insufficient Solr permissions may encounter a blank Solr Web Admin UI
Users who are not authorized to use the Solr Admin UI are not given a page explaining that access is denied to them, instead they receive a blank Admin UI with no information.
CDH-15441: Using MapReduceIndexerTool or HBaseMapReduceIndexerTool multiple times may produce duplicate entries in a collection
Repeatedly running the MapReduceIndexerTool on the same set of input files can result in duplicate entries in the Solr collection. This occurs because the tool can only insert documents and cannot update or delete existing Solr documents. This issue does not apply to the HBaseMapReduceIndexerTool unless it is run with more than zero reducers.
To avoid this issue, use HBaseMapReduceIndexerTool with zero reducers. This must be done without Kerberos.
CDH-58694: Deleting collections might fail if hosts are unavailable
It is possible to delete a collection when hosts that host some of the collection are unavailable. After such a deletion, if the previously unavailable hosts are brought back online, the deleted collection may be restored.
Ensure all hosts are online before deleting collections.

Unsupported features

The following Solr features are currently not supported in Cloudera Data Platform:
  • Panel with security info in admin UI's dashboard
  • Incremental backup mode
  • Schema Designer UI
  • Package Management System
  • HTTP/2
  • Solr SQL/JDBC
  • Graph Traversal
  • Cross Data Center Replication (CDCR)
  • SolrCloud Autoscaling
  • HDFS Federation
  • Saving search results
  • Solr contrib modules

    (Spark, MapReduce, and Lily HBase indexers are not contrib modules but part of Cloudera's distribution of Solr itself, therefore they are supported)


Enabling blockcache writing may result in unusable indexes
It is possible to create indexes with solr.hdfs.blockcache.write.enabled set to true. Such indexes may appear corrupt to readers, and reading these indexes may irrecoverably corrupt them. Because of this, blockcache writing is disabled by default.
Default Solr core names cannot be changed
Although it is technically possible to give user-defined Solr core names during core creation, it is to be avoided in the context of Cloudera's distribution of Apache Solr. Cloudera Manager expects core names in the default "collection_shardX_replicaY" format. Altering core names results in Cloudera Manager being unable to fetch Solr metrics for the given core and this may corrupt data collection for co-located core, or even shard, and server level charts.
Lucene index handling limitation
The Lucene index can only be upgraded by one major version. Solr 8 will not open an index that was created with Solr 6 or earlier. Because of this, you need to reindex collections that were created with Solr 6 or earlier.