Distributed Requests

When a Solr node receives a search request, the request is routed behind the scenes to a replica of a shard that is part of the collection being searched.

The chosen replica acts as an aggregator: it creates internal requests to randomly chosen replicas of every shard in the collection, coordinates the responses, issues any subsequent internal requests as needed (for example, to refine facets values, or request additional stored fields), and constructs the final response for the client.

Limiting Which Shards are Queried

While one of the advantages of using SolrCloud is the ability to query very large collections distributed among various shards, in some cases you may know that you are only interested in results from a subset of your shards. You have the option of searching over all of your data or just parts of it.

Querying all shards for a collection should look familiar; it’s as though SolrCloud didn’t even come into play:


If, on the other hand, you wanted to search just one shard, you can specify that shard by its logical ID, as in:


If you want to search a group of shard Ids, you can specify them together:


In both of the above examples, the shard Id(s) will be used to pick a random replica of that shard.

Alternatively, you can specify the explicit replicas you wish to use in place of a shard Ids:


Or you can specify a list of replicas to choose from for a single shard (for load balancing purposes) by using the pipe symbol (|):


And of course, you can specify a list of shards (separated by commas) each defined by a list of replicas (seperated by pipes). In this example, 2 shards are queried, the first being a random replica from shard1, the second being a random replica from the explicit pipe delimited list:


Configuring the ShardHandlerFactory

You can directly configure aspects of the concurrency and thread-pooling used within distributed search in Solr. This allows for finer grained control and you can tune it to target your own specific requirements. The default configuration favors throughput over latency.

To configure the standard search handler, provide a configuration like this in solrconfig.xml:

<requestHandler name="/select" class="solr.SearchHandler">
  <!-- other params go here -->
  <shardHandler class="HttpShardHandlerFactory">
    <int name="socketTimeOut">1000</int>
    <int name="connTimeOut">5000</int>

The parameters that can be specified are as follows:


The amount of time in ms that a socket is allowed to wait. The default is 0, where the operating system’s default will be used.


The amount of time in ms that is accepted for binding / connecting a socket. The default is 0, where the operating system’s default will be used.


The maximum number of concurrent connections that is made to each individual shard in a distributed search. The default is 100000.


The total maximum number of concurrent connections in distributed searches. The default is 100000


The retained lowest limit on the number of threads used in coordinating distributed search. The default is 0.


The maximum number of threads used for coordinating distributed search. The default is Integer.MAX_VALUE.


The amount of time in seconds to wait for before threads are scaled back in response to a reduction in load. The default is 5.


If specified, the thread pool will use a backing queue instead of a direct handoff buffer. High throughput systems will want to configure this to be a direct hand off (with -1). Systems that desire better latency will want to configure a reasonable size of queue to handle variations in requests. The default is -1.


Chooses the JVM specifics dealing with fair policy queuing, if enabled distributed searches will be handled in a First in First out fashion at a cost to throughput. If disabled throughput will be favored over latency. The default is false.

Configuring statsCache (Distributed IDF)

Document and term statistics are needed in order to calculate relevancy. Solr provides four implementations out of the box when it comes to document stats calculation:

  • LocalStatsCache: This only uses local term and document statistics to compute relevance. In cases with uniform term distribution across shards, this works reasonably well. This option is the default if no <statsCache> is configured.

  • ExactStatsCache: This implementation uses global values (across the collection) for document frequency.

  • ExactSharedStatsCache: This is exactly like the exact stats cache in its functionality but the global stats are reused for subsequent requests with the same terms.

  • LRUStatsCache: This implementation uses an LRU cache to hold global stats, which are shared between requests.

The implementation can be selected by setting <statsCache> in solrconfig.xml. For example, the following line makes Solr use the ExactStatsCache implementation:

<statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>

Avoiding Distributed Deadlock

Each shard serves top-level query requests and then makes sub-requests to all of the other shards. Care should be taken to ensure that the max number of threads serving HTTP requests is greater than the possible number of requests from both top-level clients and other shards. If this is not the case, the configuration may result in a distributed deadlock.

For example, a deadlock might occur in the case of two shards, each with just a single thread to service HTTP requests. Both threads could receive a top-level request concurrently, and make sub-requests to each other. Because there are no more remaining threads to service requests, the incoming requests will be blocked until the other pending requests are finished, but they will not finish since they are waiting for the sub-requests. By ensuring that Solr is configured to handle a sufficient number of threads, you can avoid deadlock situations like this.

preferLocalShards Parameter

Deprecated, use shards.preference=replica.location:local instead (see below).

shards.preference Parameter

Solr allows you to pass an optional string parameter named shards.preference to indicate that a distributed query should sort the available replicas in the given order of precedence within each shard.

The syntax is: shards.preference=property:value. The order of the properties and the values are significant: the first one is the primary sort, the second is secondary, etc.

shards.preference only works for distributed queries, i.e., queries targeting multiple shards. Single shard scenarios are not supported.

The properties that can be specified are as follows:


One or more replica types that are preferred. Any combination of PULL, TLOG and NRT is allowed.


One or more replica locations that are preferred.

A location starts with http://hostname:port. Matching is done for the given string as a prefix, so it’s possible to e.g., leave out the port.

A special value local may be used to denote any local replica running on the same Solr instance as the one handling the query. This is useful when a query requests many fields or large fields to be returned per document because it avoids moving large amounts of data over the network when it is available locally. In addition, this feature can be useful for minimizing the impact of a problematic replica with degraded performance, as it reduces the likelihood that the degraded replica will be hit by other healthy replicas.

The value of replica.location:local diminishes as the number of shards (that have no locally-available replicas) in a collection increases because the query controller will have to direct the query to non-local replicas for most of the shards.

In other words, this feature is mostly useful for optimizing queries directed towards collections with a small number of shards and many replicas.

Also, this option should only be used if you are load balancing requests across all nodes that host replicas for the collection you are querying, as Solr’s CloudSolrClient will do. If not load-balancing, this feature can introduce a hotspot in the cluster since queries won’t be evenly distributed across the cluster.


  • Prefer PULL replicas: shards.preference=replica.type:PULL

  • Prefer PULL replicas, or TLOG replicas if PULL replicas not available: shards.preference=replica.type:PULL,replica.type:TLOG

  • Prefer any local replicas: shards.preference=replica.location:local

  • Prefer any replicas on a host called "server1" with "server2" as the secondary option: shards.preference=replica.location:http://server1,replica.location:http://server2

  • Prefer PULL replicas if available, otherwise TLOG replicas, and local ones among those: shards.preference=replica.type:PULL,replica.type:TLOG,replica.location:local

  • Prefer local replicas, and among them PULL replicas when available TLOG otherwise: shards.preference=replica.location:local,replica.type:PULL,replica.type:TLOG

Note that if you provide the settings in a query string, they need to be properly URL-encoded.