Distributed Requests

When a Solr node receives a search request, the request is routed behind the scenes to a replica of a shard that is part of the collection being searched.

The chosen replica acts as an aggregator: it creates internal requests to randomly chosen replicas of every shard in the collection, coordinates the responses, issues any subsequent internal requests as needed (for example, to refine facets values, or request additional stored fields), and constructs the final response for the client.

Limiting Which Shards are Queried

While one of the advantages of using SolrCloud is the ability to query very large collections distributed across various shards, in some cases you may have configured Solr so you know you are only interested in results from a specific subset of shards. You have the option of searching over all of your data or just parts of it.

A query across all shards for a collection is simply a query that does not define a shards parameter:


If you want to search just one shard, use the shards parameter to specify the shard by its logical ID, as in:


If you want to search a group of shards, you can specify each shard separated by a comma in one request:


In both of the above examples, while only the specific shards are queried, any random replica of the shard will get the request.

Alternatively, you can specify a list of replicas you wish to use in place of a shard IDs by separating the replica IDs with commas:


Or you can specify a list of replicas to choose from for a single shard (for load balancing purposes) by using the pipe symbol (|) between different replica IDs:


Finally, you can specify a list of shards (separated by commas) each defined by a list of replicas (seperated by pipes).

In the following example, 2 shards are queried, the first being a random replica from shard1, the second being a random replica from the explicit pipe delimited list:


Configuring the ShardHandlerFactory

For finer-grained control, you can directly configure and tune aspects of the concurrency and thread-pooling used within distributed search in Solr. The default configuration favors throughput over latency.

This is done by defining a shardHandlerFactory in the configuration for your search handler.

To add a shardHandlerFactory to the standard search handler, provide a configuration in solrconfig.xml, as in this example:

<requestHandler name="/select" class="solr.SearchHandler">
  <!-- other params go here -->
  <shardHandlerFactory class="HttpShardHandlerFactory">
    <int name="socketTimeOut">1000</int>
    <int name="connTimeOut">5000</int>

HttpShardHandlerFactory is the only ShardHandlerFactory implementation included out of the box with Solr, It accepts the following parameters:

The amount of time in ms that a socket is allowed to wait. The default is 0, where the operating system’s default will be used.
The amount of time in ms that is accepted for binding / connecting a socket. The default is 0, where the operating system’s default will be used.
The maximum number of concurrent connections that is made to each individual shard in a distributed search. The default is 100000.
The retained lowest limit on the number of threads used in coordinating distributed search. The default is 0.
The maximum number of threads used for coordinating distributed search. The default is Integer.MAX_VALUE.
The amount of time in seconds to wait for before threads are scaled back in response to a reduction in load. The default is 5.
If specified, the thread pool will use a backing queue instead of a direct handoff buffer. High throughput systems will want to configure this to be a direct hand off (with -1). Systems that desire better latency will want to configure a reasonable size of queue to handle variations in requests. The default is -1.
Chooses the JVM specifics dealing with fair policy queuing, if enabled distributed searches will be handled in a First in First out fashion at a cost to throughput. If disabled throughput will be favored over latency. The default is false.

If specified, this lists limits what nodes can be requested in the shards request parameter.

In SolrCloud mode this whitelist is automatically configured to include all live nodes in the cluster.

In standalone mode the whitelist defaults to empty (sharding not allowed).

If you need to disable this feature for backwards compatibility, you can set the system property solr.disable.shardsWhitelist=true. The value of this parameter is a comma separated list of the nodes that will be whitelisted, i.e.:,

In SolrCloud mode, if at least one node is included in the whitelist, then the live_nodes will no longer be used as source for the list. This means that if you need to do a cross-cluster request using the shards parameter in SolrCloud mode (in addition to regular within-cluster requests), you’ll need to add all nodes (local cluster + remote nodes) to the whitelist.

Configuring statsCache (Distributed IDF)

Document and term statistics are needed in order to calculate relevancy. Solr provides four implementations out of the box when it comes to document stats calculation:

  • LocalStatsCache: This only uses local term and document statistics to compute relevance. In cases with uniform term distribution across shards, this works reasonably well. This option is the default if no <statsCache> is configured.
  • ExactStatsCache: This implementation uses global values (across the collection) for document frequency.
  • ExactSharedStatsCache: This is exactly like the exact stats cache in its functionality but the global stats are reused for subsequent requests with the same terms.
  • LRUStatsCache: This implementation uses an LRU cache to hold global stats, which are shared between requests.

The implementation can be selected by setting <statsCache> in solrconfig.xml. For example, the following line makes Solr use the ExactStatsCache implementation:

<statsCache class="org.apache.solr.search.stats.ExactStatsCache"/>

Avoiding Distributed Deadlock

Each shard serves top-level query requests and then makes sub-requests to all of the other shards. Care should be taken to ensure that the max number of threads serving HTTP requests is greater than the possible number of requests from both top-level clients and other shards. If this is not the case, the configuration may result in a distributed deadlock.

For example, a deadlock might occur in the case of two shards, each with just a single thread to service HTTP requests. Both threads could receive a top-level request concurrently, and make sub-requests to each other. Because there are no more remaining threads to service requests, the incoming requests will be blocked until the other pending requests are finished, but they will not finish since they are waiting for the sub-requests. By ensuring that Solr is configured to handle a sufficient number of threads, you can avoid deadlock situations like this.

preferLocalShards Parameter

Deprecated, use shards.preference=replica.location:local instead (see below).

shards.preference Parameter

Solr allows you to pass an optional string parameter named shards.preference to indicate that a distributed query should sort the available replicas in the given order of precedence within each shard.

The syntax is: shards.preference=property:value. The order of the properties and the values are significant: the first one is the primary sort, the second is secondary, etc.

shards.preference is supported for single shard scenarios when using the SolrJ clients.

The properties that can be specified are as follows:

One or more replica types that are preferred. Any combination of PULL, TLOG and NRT is allowed.

One or more replica locations that are preferred.

A location starts with http://hostname:port. Matching is done for the given string as a prefix, so it’s possible to e.g., leave out the port.

A special value local may be used to denote any local replica running on the same Solr instance as the one handling the query. This is useful when a query requests many fields or large fields to be returned per document because it avoids moving large amounts of data over the network when it is available locally. In addition, this feature can be useful for minimizing the impact of a problematic replica with degraded performance, as it reduces the likelihood that the degraded replica will be hit by other healthy replicas.

The value of replica.location:local diminishes as the number of shards (that have no locally-available replicas) in a collection increases because the query controller will have to direct the query to non-local replicas for most of the shards.

In other words, this feature is mostly useful for optimizing queries directed towards collections with a small number of shards and many replicas.

Also, this option should only be used if you are load balancing requests across all nodes that host replicas for the collection you are querying, as Solr’s CloudSolrClient will do. If not load-balancing, this feature can introduce a hotspot in the cluster since queries won’t be evenly distributed across the cluster.


Applied after sorting by inherent replica attributes, this property defines a fallback ordering among sets of preference-equivalent replicas; if specified, only one value may be specified for this property, and it must be specified last.

random, the default, randomly shuffles replicas for each request. This distributes requests evenly, but can result in sub-optimal cache usage for shards with replication factor > 1.

stable:dividend:_paramName_ parses an integer from the value associated with the given parameter name; this integer is used as the dividend (mod equivalent replica count) to determine (via list rotation) order of preference among equivalent replicas.

stable[:hash[:_paramName_]] the string value associated with the given parameter name is hashed to a dividend that is used to determine replica preference order (analogous to the explicit dividend property above); paramName defaults to q if not specified, providing stable routing keyed to the string value of the "main query". Note that this may be inappropriate for some use cases (e.g., static main queries that leverage parameter substitution)


  • Prefer stable routing (keyed to client "sessionId" param) among otherwise equivalent replicas: shards.preference=replica.base:stable:hash:sessionId&sessionId=abc123
  • Prefer PULL replicas: shards.preference=replica.type:PULL
  • Prefer PULL replicas, or TLOG replicas if PULL replicas not available: shards.preference=replica.type:PULL,replica.type:TLOG
  • Prefer any local replicas: shards.preference=replica.location:local
  • Prefer any replicas on a host called "server1" with "server2" as the secondary option: shards.preference=replica.location:http://server1,replica.location:http://server2
  • Prefer PULL replicas if available, otherwise TLOG replicas, and local ones among those: shards.preference=replica.type:PULL,replica.type:TLOG,replica.location:local
  • Prefer local replicas, and among them PULL replicas when available TLOG otherwise: shards.preference=replica.location:local,replica.type:PULL,replica.type:TLOG

Note that if you provide the settings in a query string, they need to be properly URL-encoded.