Circuit Breakers

Solr’s circuit breaker infrastructure allows prevention of actions that can cause a node to go beyond its capacity or to go down. The premise of circuit breakers is to ensure a higher quality of service and only accept request loads that are serviceable in the current resource configuration.

When To Use Circuit Breakers

Circuit breakers should be used when the user wishes to trade request throughput for a higher Solr stability. If circuit breakers are enabled, requests may be rejected under the condition of high node duress with an appropriate HTTP error code (typically 503).

It is up to the client to handle this error and potentially build a retrial logic as this should ideally be a transient situation.

Circuit Breaker Configurations

All circuit breaker configurations are listed in the <circuitBreaker> tags in solrconfig.xml as shown below:

<circuitBreaker class="solr.CircuitBreakerManager" enabled="true">
  <!-- All specific configs in this section -->
</circuitBreaker>

The enabled attribute controls the global activation/deactivation of circuit breakers. If this flag is disabled, all circuit breakers will be disabled globally. Per circuit breaker configurations are specified in their respective sections later.

This attribute acts as the highest authority and global controller of circuit breakers. For using specific circuit breakers, each one needs to be individually enabled in addition to this flag being enabled.

CircuitBreakerManager is the default manager for all circuit breakers that should be defined in the tag unless the user wishes to use a custom implementation.

Currently Supported Circuit Breakers

JVM Heap Usage

This circuit breaker tracks JVM heap memory usage and rejects incoming search requests with a 503 error code if the heap usage exceeds a configured percentage of maximum heap allocated to the JVM (-Xmx). The main configuration for this circuit breaker is controlling the threshold percentage at which the breaker will trip.

Configuration for JVM heap usage based circuit breaker:

<str name="memEnabled">true</str>

Note that this configuration will be overridden by the global circuit breaker flag — if circuit breakers are disabled, this flag will not help you.

The triggering threshold is defined as a percentage of the max heap allocated to the JVM. It is controlled by the below configuration:

<str name="memThreshold">75</str>

It does not logically make sense to have a threshold below 50% and above 95% of the max heap allocated to the JVM. Hence, the range of valid values for this parameter is [50, 95], both inclusive.

Consider the following example:

JVM has been allocated a maximum heap of 5GB (-Xmx) and memoryCircuitBreakerThresholdPct is set to 75. In this scenario, the heap usage at which the circuit breaker will trip is 3.75GB.

CPU Utilization

This circuit breaker tracks CPU utilization and triggers if the average CPU utilization over the last one minute exceeds a configurable threshold. Note that the value used in computation is over the last one minute — so a sudden spike in traffic that goes down might still cause the circuit breaker to trigger for a short while before it resolves and updates the value. For more details of the calculation, please see https://en.wikipedia.org/wiki/Load_(computing)

Configuration for CPU utilization based circuit breaker:

<str name="cpuEnabled">true</str>

Note that this configuration will be overridden by the global circuit breaker flag — if circuit breakers are disabled, this flag will not help you.

The triggering threshold is defined in units of CPU utilization. The configuration to control this is as below:

<str name="cpuThreshold">75</str>

Performance Considerations

It is worth noting that while JVM or CPU circuit breakers do not add any noticeable overhead per query, having too many circuit breakers checked for a single request can cause a performance overhead.

In addition, it is a good practice to exponentially back off while retrying requests on a busy node.