Metrics Reporting and Monitoring

Solr supports both a pull-based Prometheus-formatted API and an OTLP push exporter for collecting detailed performance-oriented metrics throughout the lifecycle of Solr services and their various components.

All metrics natively include attributes/labels, providing users with powerful ways to aggregate metrics in their preferred backend, as well as descriptions to help understand what each metric represents.

Internally, this feature uses OpenTelemetry, which measures events with metric instruments. For more information on instruments, see the OpenTelemetry documentation on Metric Instruments.

Solr metrics provide raw data that must be aggregated and calculated by monitoring backends (Prometheus, Grafana, etc.). Counters can be used to calculate rates and averages over time windows. Histograms provide raw bucket data that backends use to calculate percentiles (p50, p75, p95, p99, p999), averages, and other statistical measures. Solr delegates these calculations to your monitoring system for better flexibility and reduced load on Solr. See Performance Statistics Reference for examples with PromQL on how to calculate these rates and percentiles.
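To make this division of labor concrete, the following Python sketch shows the kind of arithmetic a backend performs on raw counter samples and cumulative histogram buckets. It is an illustration only: the function names and sample values are hypothetical, and real backends such as Prometheus handle more edge cases (staleness, multiple resets, native histograms) than this sketch does.

```python
from typing import List, Tuple


def counter_rate(v_start: float, v_end: float, seconds: float) -> float:
    """Per-second rate between two samples of a monotonically increasing counter."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    # A counter that decreased was reset (for example, Solr was restarted),
    # so the end value is treated as the whole increase.
    increase = v_end - v_start if v_end >= v_start else v_end
    return increase / seconds


def histogram_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate quantile q (0..1) from cumulative (upper_bound, count) buckets.

    Uses linear interpolation inside the bucket that contains the target rank.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(buckets)
    if not ordered or ordered[-1][1] == 0:
        return float("nan")
    rank = q * ordered[-1][1]
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in ordered:
        if count >= rank:
            in_bucket = count - prev_count
            # The +Inf bucket has no finite upper bound to interpolate towards.
            if bound == float("inf") or in_bucket == 0:
                return prev_bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical samples: a counter scraped twice, 60 seconds apart, and one
# snapshot of cumulative latency buckets (upper bound in seconds, count).
requests_per_second = counter_rate(100, 400, 60)
latency_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (float("inf"), 100)]
p95_seconds = histogram_quantile(0.95, latency_buckets)
```

Because the estimate interpolates within a bucket, its accuracy depends on how finely the bucket boundaries are spaced around the quantile of interest.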

A metric appears only after it has been recorded at least once. If the triggering event never occurs, the metric will be missing from the output.

Metric Registries

Internally, Solr categorizes metrics into registries that group related metrics together. While users don’t need to understand these registries and should focus on the attributes attached to metrics for aggregation, being aware of them can be helpful for knowing what metrics are available and their base set of attributes.

Metrics are maintained and accumulated throughout all lifecycle phases of components, from process startup until shutdown. For example, metrics for a particular SolrCore are tracked through multiple load, unload, and rename operations, and are only deleted when a core is explicitly deleted. However, metrics are not persisted across process restarts; restarting Solr will discard all collected metrics.

These are the major groups of metrics that are collected:

Node / CoreContainer Registry

The Node Registry records metrics at the process level that are not specific to any core. Metric names are prefixed with solr_node_ and include the following information:

  • handler requests (count, timing): collections, info, admin, configsets, etc.

  • number of cores (permanent, unloaded)

Core (SolrCore) Registry

The Core (SolrCore) Registry includes all core-level metrics with one registry for each core. All core metrics are prefixed with solr_core_ in the name. In addition to the prefix, all core metrics have the following standard set of core attributes attached for aggregation:

In cloud mode:

core

Name of the core.

collection

Name of the collection.

replica_type

The type of replica. This can be NRT/TLOG/PULL.

shard

The name of the shard.

In standalone mode, only the core attribute is attached to the metrics.
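As an illustration of how these attributes support aggregation, the Python sketch below sums a per-core metric by one of its labels. The sample text and its values are made up for the example, and the parser is deliberately simplified: it assumes label values contain no escaped quotes or closing braces, which a full Prometheus text-format parser would need to handle.

```python
import re
from collections import defaultdict
from typing import Dict

# Simplified patterns for lines such as: metric_name{a="x",b="y"} 12.5
SAMPLE_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def sum_by_label(text: str, metric: str, label: str) -> Dict[str, float]:
    """Sum all samples of `metric`, grouped by the value of `label`."""
    totals: Dict[str, float] = defaultdict(float)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL.findall(match.group("labels")))
        if label in labels:
            totals[labels[label]] += float(match.group("value"))
    return dict(totals)


# Hypothetical scrape output; the values are invented for this example.
SAMPLE = """\
# TYPE solr_core_requests_total counter
solr_core_requests_total{collection="foo",core="foo_shard1_replica_n1",replica_type="NRT",shard="shard1"} 120
solr_core_requests_total{collection="foo",core="foo_shard2_replica_n2",replica_type="NRT",shard="shard2"} 80
solr_core_requests_total{collection="bar",core="bar_shard1_replica_n1",replica_type="TLOG",shard="shard1"} 15
"""

per_collection = sum_by_label(SAMPLE, "solr_core_requests_total", "collection")
```

In practice a monitoring backend performs this grouping for you; the sketch only shows what "aggregate by collection" means for the raw samples.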

Some examples of metrics available from the core registry:

  • All common RequestHandlers report request timers/counters, timeouts, and errors. Handlers that process distributed shard requests include a boolean internal attribute for each type of distributed request, differentiating between external client requests and internal requests.

  • Index-level events: meters for minor/major merges, number of merged documents, number of deleted documents, and number of flushes.

  • Shard replication and transaction log replay on replicas.

RequestHandlers can be configured to roll up core-level metrics to the node level in addition to reporting them per core. This is useful when you have a large number of cores per node and are interested in aggregate metrics per node.

These rolled-up metrics are prefixed with solr_node_, include the handler attribute, and omit the standard core attributes.

This is configured by adding <bool name="aggregateNodeLevelMetricsEnabled">true</bool> to a RequestHandler configuration in your solrconfig.xml, for example:

<requestHandler name="/select" class="solr.SearchHandler">
    <!-- default values for query parameters can be specified, these
         will be overridden by parameters in the request
      -->
    <lst name="defaults">
        <str name="echoParams">explicit</str>
        <int name="rows">10</int>
    </lst>

    <bool name="aggregateNodeLevelMetricsEnabled">true</bool>
</requestHandler>

JVM Registry

The JVM Registry gathers metrics from the JVM using the OpenTelemetry instrumentation library with JFR and JMX. See the runtime-telemetry-java17 documentation for more information on available JVM metrics.

JVM metrics are enabled by default but can be disabled by setting either the system property -Dsolr.metrics.jvm.enabled=false or the environment variable SOLR_METRICS_JVM_ENABLED=false.

Overseer Registry

The Overseer Registry is initialized when running in SolrCloud mode and includes the following information:

  • Size of the Overseer queues (collection work queue and cluster state update queue)

Core Level Metrics

Index Merge Metrics

These metrics are collected under the INDEX category and track flush operations (documents being written to disk) and merge operations (segments on disk being merged).

Merge metrics distinguish between "minor" and "major" merges, since merges with fewer documents are typically more frequent. The merge_type label on the metric indicates which kind of merge was recorded. The threshold at which a merge becomes large enough to be considered major is configurable and defaults to 524,288 documents.

Metrics collection for index merges can be configured in the <metrics> section of <indexConfig> in solrconfig.xml, as shown below:

<config>
  ...
  <indexConfig>
    <metrics>
      <long name="majorMergeDocs">524288</long>
    </metrics>
    ...
  </indexConfig>
...
</config>

Metrics API

The /admin/metrics endpoint provides access to all metrics and returns Prometheus format by default. You can also request a format explicitly with wt=prometheus for Prometheus format or wt=openmetrics for OpenMetrics format. More information on the data models is provided in the sections below.

Prometheus

See Prometheus Data Model documentation for more information on its data model.

This endpoint can be used to pull/scrape metrics to a Prometheus server or any Prometheus-compatible backend directly from Solr.

Prometheus Setup

The prometheus-config.yml file needs to be configured for a Prometheus server to scrape and collect metrics. A basic configuration for SolrCloud mode is as follows:

scrape_configs:
  - job_name: 'solr'
    metrics_path: "/solr/admin/metrics"
    static_configs:
      - targets: ['localhost:8983', 'localhost:7574']

OpenMetrics

OpenMetrics format is available from the /admin/metrics endpoint by providing the wt=openmetrics parameter or by passing the Accept header application/openmetrics-text;version=1.0.0. OpenMetrics is an extension of the Prometheus format that adds additional metadata and exemplars.

See OpenMetrics Spec documentation for more information.

OpenMetrics can be used to pull/scrape metrics to a Prometheus server or any OpenMetrics-compatible backend directly from Solr.
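For ad hoc inspection outside of a Prometheus server, the Accept header approach can be sketched in Python with the standard library. The helper names and the default base URL are assumptions for illustration; adjust the host, port, and any authentication to match your deployment.

```python
import urllib.request

# Accept header value for OpenMetrics, as given in the text above.
OPENMETRICS_ACCEPT = "application/openmetrics-text;version=1.0.0"


def openmetrics_request(base_url: str = "http://localhost:8983/solr") -> urllib.request.Request:
    """Build a request for /admin/metrics that asks for OpenMetrics output."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/admin/metrics",
        headers={"Accept": OPENMETRICS_ACCEPT},
    )


def fetch_openmetrics(base_url: str = "http://localhost:8983/solr", timeout: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(openmetrics_request(base_url), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling fetch_openmetrics() requires a running Solr node at the given address; openmetrics_request() only builds the request and can be inspected without a network connection.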

Prometheus setup with exemplars

OpenMetrics includes exemplars that provide additional information and allow users to leverage Solr’s OpenTelemetry distributed tracing module and metrics in a cohesive view for correlating traces and metrics.

Distributed tracing must be enabled to see exemplars. Exemplars will never appear in OpenMetrics format otherwise. You can then scrape OpenMetrics format to a Prometheus server or OpenMetrics-compatible backend.

A basic prometheus-config.yml configuration for a Prometheus server in SolrCloud mode that collects exemplars is as follows:

scrape_configs:
  - job_name: 'solr'
    metrics_path: "/solr/admin/metrics"
    static_configs:
      - targets: ['localhost:8983', 'localhost:7574']
    params:
      wt: ['openmetrics']
    scrape_protocols:
      - OpenMetricsText1.0.0

The Prometheus server must also be started with the command-line parameter --enable-feature=exemplar-storage to collect exemplars from OpenMetrics.

If you are using Grafana, follow the Introduction to exemplars guide to connect your Prometheus data source and see exemplars on Grafana panels.

API Filtering

A fixed set of parameters is available to filter metrics by either metric name or base core labels. You can combine these parameters to return only the specific metrics you need.

All parameters can be specified with more than one value in a request; multiple values should be separated by a comma.

name

Optional

Default: none

The metric name to filter on.

category

Optional

Default: none

The category label to filter on.

core

Optional

Default: none

The core name to filter on. More than one core can be specified in a request; multiple cores should be separated by a comma.

collection

Optional

Default: none

The collection name to filter on. This attribute is only filterable in SolrCloud mode.

shard

Optional

Default: none

The shard name to filter on. This attribute is only filterable in SolrCloud mode.

replica_type

Optional

Default: none

The replica type to filter on. Valid values are NRT, TLOG, or PULL. This attribute is only filterable in SolrCloud mode.

Examples

Request only metrics from the foobar collection:

http://localhost:8983/solr/admin/metrics?collection=foobar

Request only the metrics with a category label of QUERY or UPDATE:

http://localhost:8983/solr/admin/metrics?category=QUERY,UPDATE

Request only solr_core_requests_total metrics from the foobar_shard1_replica_n1 core:

http://localhost:8983/solr/admin/metrics?name=solr_core_requests_total&core=foobar_shard1_replica_n1

Request only the solr_core_index_size_bytes (core index size) metrics from the foo and bar collections:

http://localhost:8983/solr/admin/metrics?name=solr_core_index_size_bytes&collection=foo,bar
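If you build these URLs programmatically, a small helper along the following lines can join multiple values with commas as described above. The function name metrics_url and the FILTER_PARAMS tuple are hypothetical; only the parameter names themselves come from the list above.

```python
from typing import Iterable, Union
from urllib.parse import urlencode

# Filter parameters documented for the /admin/metrics endpoint.
FILTER_PARAMS = ("name", "category", "core", "collection", "shard", "replica_type")


def metrics_url(base_url: str, **filters: Union[str, Iterable[str]]) -> str:
    """Build an /admin/metrics URL; multiple values are comma-separated."""
    params = {}
    for key, values in filters.items():
        if key not in FILTER_PARAMS:
            raise ValueError(f"unsupported filter parameter: {key}")
        if isinstance(values, str):
            values = [values]  # accept a single value as well as a list
        params[key] = ",".join(values)
    url = base_url.rstrip("/") + "/admin/metrics"
    if params:
        # Keep commas literal so the query string matches the examples above.
        url += "?" + urlencode(params, safe=",")
    return url


example = metrics_url(
    "http://localhost:8983/solr",
    name="solr_core_index_size_bytes",
    collection=["foo", "bar"],
)
```

Rejecting unknown parameter names up front is a design choice of this sketch, intended to catch typos such as "replicaType" before a request is sent.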

OTLP

For users who do not use or support pulling metrics in Prometheus format with the /admin/metrics API, Solr also supports pushing metrics natively with OTLP, which is a vendor-agnostic protocol for pushing metrics via gRPC or HTTP.

OTLP is widely supported by many tools, vendors, and pipelines. See the OpenTelemetry vendors list for more details on available and compatible options.

OTLP properties

Solr’s internal OTLP exporter is disabled by default and is packaged with the OpenTelemetry module.

The module can be enabled with either the system property -Dsolr.modules=opentelemetry or the environment variable SOLR_MODULES=opentelemetry, similar to distributed tracing.

The OTLP exporter can be configured with the supported system properties below. These can also be set as environment variables by following these mapping rules:

  • Replace . with _

  • Convert camelCase to UPPER_SNAKE_CASE

  • Make all letters uppercase

solr.metrics.otlpExporterEnabled

Optional

Default: false

Boolean value to enable or disable the OTLP metrics exporter.

solr.metrics.otlpExporterProtocol

Optional

Default: grpc

OTLP protocol to use for pushing metrics. Available options are grpc, http, or none (disabled).

solr.metrics.otlpExporterInterval

Optional

Default: 60000

The interval in milliseconds for how frequently metrics are pushed via OTLP.

solr.metrics.otlpGrpcExporterEndpoint

Optional

Default: http://localhost:4317

Endpoint to send OTLP metrics to using the gRPC protocol.

solr.metrics.otlpHttpExporterEndpoint

Optional

Default: http://localhost:4318/v1/metrics

Endpoint to send OTLP metrics to using the HTTP protocol.
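The mapping rules from system property to environment variable can be sketched in Python as follows. The function name property_to_env is hypothetical, and how Solr itself treats runs of consecutive capital letters is not covered by the rules above, so this sketch only splits at a lowercase-or-digit to uppercase boundary.

```python
import re


def property_to_env(prop: str) -> str:
    """Convert a Solr system property name to its environment variable form."""
    # Insert "_" at each camelCase boundary (lowercase or digit, then uppercase).
    snake = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", prop)
    # Replace "." with "_" and make all letters uppercase.
    return snake.replace(".", "_").upper()


otlp_properties = [
    "solr.metrics.otlpExporterEnabled",
    "solr.metrics.otlpExporterProtocol",
    "solr.metrics.otlpExporterInterval",
    "solr.metrics.otlpGrpcExporterEndpoint",
    "solr.metrics.otlpHttpExporterEndpoint",
]
env_names = {prop: property_to_env(prop) for prop in otlp_properties}
```

Under these rules, solr.metrics.otlpExporterEnabled becomes SOLR_METRICS_OTLP_EXPORTER_ENABLED, in the same way that solr.modules corresponds to SOLR_MODULES.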

OpenTelemetry Collector setup

The OpenTelemetry Collector is a powerful process that allows users to decouple their metrics pipeline and route to their preferred backend. It natively supports metrics being pushed to it via OTLP and/or scraping the /admin/metrics Prometheus endpoint supported by Solr. You can push both metrics and traces to the collector via OTLP as a single pipeline.

A simple setup to route metrics from Solr → OpenTelemetry Collector → Prometheus can be configured with the following OpenTelemetry Collector configuration file:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  prometheus:
    endpoint: 0.0.0.0:9464
    send_timestamps: true
    enable_open_metrics: true

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]

You can then request the metrics in Prometheus format from the collector:

curl 'localhost:9464/metrics'

Or request OpenMetrics format to also see exemplars by passing the Accept header:

curl 'localhost:9464/metrics' -H 'Accept: application/openmetrics-text; version=1.0.0'