Solr Cluster Types

A Solr cluster is a group of servers (nodes) that each run Solr.

There are two general modes of operating a cluster of Solr nodes. One mode provides central coordination of the Solr nodes (SolrCloud Mode), while the other allows you to operate a cluster without this central coordination (User-Managed Mode).

Both modes share general concepts, but ultimately differ in how those concepts are reflected in functionality and features.

First let’s cover a few general concepts and then outline the differences between the two modes.

Cluster Concepts

Shards

In both cluster modes, a single logical index can be split across nodes as shards. Each shard contains a subset of the overall index.

The number of shards dictates the theoretical limit to the number of documents that can be indexed to Solr. It also determines the amount of parallelization possible for an individual search request.

Replicas

To provide some failover, each shard can be copied as a replica. A replica has the same configuration as its shard and as any other replicas for the same index.

It’s possible to have replicas without having created shards. In this case, each replica would be a full copy of the entire index, instead of being only a copy of a part of the entire index.

The number of replicas determines the level of fault tolerance the entire cluster has in the event of a node failure. It also dictates the theoretical limit on the number of concurrent search requests that can be processed under heavy load.

Leaders

Once replicas have been created, a leader must be identified. The responsibility of the leader is to be a source-of-truth for each replica. When updates are made to the index, they are first processed by the leader and then by each replica (the exact mechanism for how this happens varies).

The replicas which are not leaders are followers.

Cores

Each replica, whether it is a leader or a follower, is called a core. Multiple cores can be hosted on any one node.

SolrCloud Mode

SolrCloud mode (also called "SolrCloud") uses Apache ZooKeeper to provide the centralized cluster management that is its main feature. ZooKeeper tracks each node of the cluster and the state of each core on each node.

In this mode, configuration files are stored in ZooKeeper and not on the file system of each node. When configuration changes are made, they must be uploaded to ZooKeeper, which in turn makes sure each node knows changes have been made.

SolrCloud introduces an additional concept, a collection. A collection is the entire group of cores that represent an index: the logical shards and the physical replicas for each shard. All cores in a collection share the same configuration (schema, solrconfig.xml, etc.). This further centralizes cluster management, as operations can be performed on the entire collection at one time.

When changes are made to configurations, a single command to reload the collection automatically reloads each individual core that is a member of the collection.
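As an illustrative sketch, a reload can be issued with SolrJ, Solr's Java client. The collection name and ZooKeeper address below are placeholders rather than values from this guide:

    import java.util.List;
    import java.util.Optional;

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class ReloadCollection {
        public static void main(String[] args) throws Exception {
            // Connect via ZooKeeper; the client discovers the Solr nodes itself.
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    List.of("localhost:2181"), Optional.empty()).build()) {
                // One request; Solr reloads every core that belongs to the collection.
                CollectionAdminRequest.reloadCollection("techproducts").process(client);
            }
        }
    }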

Sharding is handled automatically, simply by telling Solr during collection creation how many shards you’d like the collection to have. Index updates are then generally balanced between each shard automatically. Some degree of control over what documents are stored in which shards is also available, if needed.
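A minimal SolrJ sketch of collection creation, assuming a configset named "books_config" has already been uploaded to ZooKeeper (the collection name and the counts are likewise illustrative):

    import java.util.List;
    import java.util.Optional;

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class CreateCollection {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    List.of("localhost:2181"), Optional.empty()).build()) {
                // 2 shards, 2 replicas per shard: 4 cores spread across the cluster.
                CollectionAdminRequest
                        .createCollection("books", "books_config", 2, 2)
                        .process(client);
            }
        }
    }

From that point on, the default router hashes each document's unique key to decide which shard stores it.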

ZooKeeper also enables load balancing and failover. Incoming requests, whether indexing requests or user queries, can be sent to any node of the cluster, and that node uses the cluster state kept in ZooKeeper to route the request to an appropriate replica of each shard.
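In practice a client needs only the ZooKeeper address, not the address of any particular Solr node. A sketch with SolrJ's ZooKeeper-aware client, where the collection and field names are placeholders:

    import java.util.List;
    import java.util.Optional;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class CloudRequests {
        public static void main(String[] args) throws Exception {
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    List.of("localhost:2181"), Optional.empty()).build()) {
                // Indexing: the document is routed to the leader of its shard.
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "book-1");
                doc.addField("title_s", "SolrCloud Basics");
                client.add("books", doc);
                client.commit("books");

                // Querying: the request fans out to one replica of each shard.
                long hits = client.query("books", new SolrQuery("*:*"))
                        .getResults().getNumFound();
                System.out.println("Matching documents: " + hits);
            }
        }
    }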

In SolrCloud, the leader is flexible, with built-in mechanisms for automatic leader election in case of failure in the leader. This means another core can become the leader, and from that point forward it is the source-of-truth for all replicas.

As long as one replica of each relevant shard is available, a user query or indexing request can still be satisfied when running in SolrCloud mode.

User-Managed Mode

In Solr’s user-managed mode, the cluster coordination activities that SolrCloud delegates to ZooKeeper must be performed manually or with local scripts.

If the corpus of documents is too large for a single-sharded index, the logic to create shards is entirely left to the user. There are no automated or programmatic ways for Solr to create shards during indexing.

Routing documents to shards is handled manually, either with a simple hashing scheme or with a round-robin list that sends each new document to the next shard in turn. Updates to existing documents must be sent to the shard that already holds the document, or duplicates could result.
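As an illustration of the hashing approach, the sketch below picks a shard by hashing the document's id. Every URL, core name, and field here is hypothetical; the important property is that the mapping is deterministic:

    import java.util.List;

    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class ManualRouting {
        // Base URL of the leader core for each shard (hypothetical hosts).
        static final List<String> SHARDS = List.of(
                "http://solr1:8983/solr/books_shard1",
                "http://solr2:8983/solr/books_shard2");

        public static void main(String[] args) throws Exception {
            String id = "book-1";
            // The same id always hashes to the same shard, so an update
            // overwrites the original document instead of duplicating it
            // in another shard.
            int shard = Math.floorMod(id.hashCode(), SHARDS.size());

            try (HttpSolrClient client =
                    new HttpSolrClient.Builder(SHARDS.get(shard)).build()) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", id);
                doc.addField("title_s", "Solr Basics");
                client.add(doc);
                client.commit();
            }
        }
    }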

In user-managed mode, the concepts of leader and follower become critical. Identifying which node will host the leader and which node(s) will host the followers dictates how each node is configured. In this mode, all index updates are sent to the leader only. Once the leader has completed indexing, each follower requests the index changes and copies them from the leader.
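This relationship is configured in each core's solrconfig.xml through the replication handler. A sketch of the two sides, with the URL and poll interval as placeholder values (these are the leader/follower element names used by recent Solr releases; older releases used different terminology):

    <!-- On the leader core: publish a new index version after each commit -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="leader">
        <str name="replicateAfter">commit</str>
      </lst>
    </requestHandler>

    <!-- On each follower core: poll the leader and copy any new segments -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="follower">
        <str name="leaderUrl">http://solr1:8983/solr/books_shard1/replication</str>
        <str name="pollInterval">00:00:60</str>
      </lst>
    </requestHandler>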

Load balancing is achieved with an external tool or process, unless request traffic can be managed by the leader or one of its replicas alone.

If the leader goes down, there is no built-in failover mechanism. A replica could continue to serve queries if the queries were specifically directed to it. Changing a replica to serve as the leader would require changing solrconfig.xml configurations on all replicas and reloading each core.

User-managed mode has no concept of a collection, so for all intents and purposes each Solr node is distinct from the others. Only a handful of configuration parameters keep the nodes from behaving as fully independent entities.