Distributed Solr Tracing

Solr includes a general tracing framework based on OpenTracing that can be used to trace the lifecycle of a request for performance monitoring. Tracing data can be sent to an arbitrary backend such as Jaeger, Zipkin, or Datadog. Currently, only Jaeger is supported out of the box.

A sampled trace of a distributed query request, as displayed in Jaeger, looks like this:

Figure 1. Tracing of a Solr query

Setup Tracer

TracerConfigurator is a class that provides an instance of io.opentracing.Tracer based on the configuration in solr.xml. For example, JaegerTracerConfigurator provides a JaegerTracer instance to a Solr node.

A TracerConfigurator setup looks like this:

<solr>
  <tracerConfig name="tracerConfig" class="org.apache.solr.jaeger.JaegerTracerConfigurator">
    <str name="agentHost">localhost</str>
    <int name="agentPort">5775</int>
    <bool name="logSpans">true</bool>
    <int name="flushInterval">1000</int>
    <int name="maxQueueSize">10000</int>
  </tracerConfig>
</solr>

If the <tracerConfig> element is absent, TracerConfigurator will try to pick up the Tracer instance registered in io.opentracing.util.GlobalTracer. As a result, some backends such as Datadog are supported out of the box, because datadog-java-agent uses a Java agent to register a Tracer in io.opentracing.util.GlobalTracer.

Configuring Sample Rate

By default, only 0.1% of requests are sampled. This ensures that tracing does not affect the performance of a node.
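To give a feel for what a 0.1% rate means in practice, here is a minimal sketch of percentage-based sampling. It is an illustration only and does not claim to reproduce Solr's internal sampler; the class name PercentageSampler and its methods are hypothetical.

```java
import java.util.Random;

// Hypothetical illustration of percentage-based sampling; not Solr's own code.
final class PercentageSampler {
    private final double samplePercentage; // 0.0 (never) to 100.0 (always)
    private final Random random;

    PercentageSampler(double samplePercentage, Random random) {
        if (Double.isNaN(samplePercentage) || samplePercentage < 0.0 || samplePercentage > 100.0) {
            throw new IllegalArgumentException("samplePercentage must be between 0 and 100");
        }
        this.samplePercentage = samplePercentage;
        this.random = random;
    }

    // Returns true when the current request should be traced.
    boolean shouldSample() {
        // nextDouble() is uniform in [0, 1), so the scaled value lies in [0, 100).
        return random.nextDouble() * 100.0 < samplePercentage;
    }

    public static void main(String[] args) {
        PercentageSampler sampler = new PercentageSampler(0.1, new Random());
        int total = 1_000_000;
        int sampled = 0;
        for (int i = 0; i < total; i++) {
            if (sampler.shouldSample()) {
                sampled++;
            }
        }
        // At 0.1%, roughly one request in a thousand is traced.
        System.out.println("sampled " + sampled + " of " + total + " requests");
    }
}
```

At the default rate, about one request in a thousand produces a trace, so a low-traffic node may need a higher rate before any traces show up in the backend.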

The rate can be changed on the fly (without restarting Solr nodes) by setting a new sample rate as a cluster property. For example, the call below sets the sample rate to 100%:

/admin/collections?action=CLUSTERPROP&name=samplePercentage&val=100
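The call above can also be issued from code. The following sketch builds the same CLUSTERPROP URL and, optionally, sends it with the JDK's built-in HTTP client (Java 11 or later). The class name SampleRateUpdater, its methods, and the base URL http://localhost:8983/solr are assumptions made for illustration; substitute the address of one of your own nodes.

```java
import java.io.IOException;
import java.math.BigDecimal;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper for illustration; not part of Solr.
final class SampleRateUpdater {

    // Builds the CLUSTERPROP URL that sets samplePercentage (0 to 100).
    static URI buildUri(String solrBaseUrl, double percentage) {
        if (Double.isNaN(percentage) || percentage < 0.0 || percentage > 100.0) {
            throw new IllegalArgumentException("percentage must be between 0 and 100");
        }
        // Drop a trailing slash so the path separator is not doubled.
        String base = solrBaseUrl.endsWith("/")
                ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
                : solrBaseUrl;
        // Render 100.0 as "100" and 0.1 as "0.1".
        String val = BigDecimal.valueOf(percentage).stripTrailingZeros().toPlainString();
        return URI.create(base
                + "/admin/collections?action=CLUSTERPROP&name=samplePercentage&val=" + val);
    }

    // Sends the request to a running node and returns the HTTP status code.
    static int apply(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        return response.statusCode();
    }

    public static void main(String[] args) {
        // Assumed base URL. This only prints the URL; call apply(...) to send it.
        System.out.println(buildUri("http://localhost:8983/solr", 100));
    }
}
```

Keeping URL construction separate from the network call makes it easy to check the request before sending it to a live cluster.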

Jaeger Tracer Configurator

The contrib/jaegertracer-configurator module provides a default implementation for setting up a Jaeger tracer. Note that all libraries of jaegertracer-configurator must be included in the classpath of every node. The Jaeger tracer can then be set up in solr.xml like this:

<tracerConfig name="tracerConfig" class="org.apache.solr.jaeger.JaegerTracerConfigurator">
  <str name="agentHost">localhost</str>
  <int name="agentPort">5775</int>
  <bool name="logSpans">true</bool>
  <int name="flushInterval">1000</int>
  <int name="maxQueueSize">10000</int>
</tracerConfig>

JaegerTracerConfigurator accepts the following parameters:

Parameter      Type    Required  Default  Description
agentHost      string  Yes       -        The host of the Jaeger backend
agentPort      int     Yes       -        The port of the Jaeger backend
logSpans       bool    No        true     Whether the tracer should also log the spans
flushInterval  int     No        5000     The tracer’s flush interval (ms)
maxQueueSize   int     No        10000    The tracer’s maximum queue size

Other parameters that are not listed above can be configured using system properties or environment variables. The full list is given in the Jaeger-README.
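As a rough sketch of how such a setting might be resolved, the code below looks a name up first as a JVM system property and then as an environment variable. The precedence order is an assumption that should be checked against the Jaeger-README. JAEGER_SAMPLER_TYPE is used as an example setting name, and the class TracerSettingLookup is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration of a property lookup; not the Jaeger client's own code.
final class TracerSettingLookup {

    // Returns the system property if set, else the environment variable, else the fallback.
    // ASSUMPTION: system properties take precedence over environment variables.
    static String resolve(String name, Properties systemProps,
                          Map<String, String> env, String fallback) {
        String value = systemProps.getProperty(name);
        if (value == null) {
            value = env.get(name);
        }
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        // Simulated inputs: the environment sets one value, a -D flag overrides it.
        Map<String, String> env = new HashMap<>();
        env.put("JAEGER_SAMPLER_TYPE", "probabilistic");
        Properties props = new Properties();
        props.setProperty("JAEGER_SAMPLER_TYPE", "const");

        System.out.println(resolve("JAEGER_SAMPLER_TYPE", props, env, "(unset)"));
        // Against the real JVM state, pass System.getProperties() and System.getenv().
    }
}
```

In a real deployment the value would come from a JVM flag such as -DJAEGER_SAMPLER_TYPE=const or from an exported environment variable of the same name.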