This module contains the logic to plug machine-learned ranking modules into Solr.
In information retrieval systems, Learning to Rank is used to re-rank the top X retrieved documents using trained machine learning models. The hope is that sophisticated models can make more nuanced ranking decisions than standard ranking functions like TF-IDF or BM25.
This module allows a reranking model to be plugged directly into Solr, so users can build their own learning to rank systems and access the rich matching features readily available in Solr. It also provides tools for feature engineering and feature extraction.
A Learning to Rank model is plugged into the ranking through a QParserPlugin. The plugin reads from the request the model (an instance of LTRScoringModel) to use for that request, plus other parameters. The plugin then generates a rescorer that encapsulates the given model and uses it to rescore and rerank the documents (by using an LTRScoringQuery).
A model is applied to each document through a Query, the LTRScoringQuery. Like a normal query, the learned model produces a new score for each reranked document.
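The rescoring step described above can be sketched in plain Java. This is a minimal illustration, not Solr's implementation: the names RerankSketch, Hit and rerank are hypothetical, the model is reduced to a function from document id to score, and the sketch assumes that documents outside the top N keep their original score and order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Illustrative only: none of these names are part of Solr's API. */
final class RerankSketch {

    /** A document id paired with its current score. */
    static final class Hit {
        final String id;
        final double score;

        Hit(String id, double score) {
            this.id = id;
            this.score = score;
        }

        double score() {
            return score;
        }

        @Override
        public String toString() {
            return id + ":" + score;
        }
    }

    /**
     * Re-scores the first topN hits with the model and sorts them by the new score,
     * highest first. Hits beyond topN keep their original score and order (an assumption
     * of this sketch).
     */
    static List<Hit> rerank(List<Hit> hits, int topN, ToDoubleFunction<String> model) {
        int n = Math.max(0, Math.min(topN, hits.size()));
        List<Hit> head = new ArrayList<>();
        for (Hit hit : hits.subList(0, n)) {
            head.add(new Hit(hit.id, model.applyAsDouble(hit.id)));
        }
        head.sort(Comparator.comparingDouble(Hit::score).reversed());
        List<Hit> result = new ArrayList<>(head);
        result.addAll(hits.subList(n, hits.size()));
        return result;
    }

    public static void main(String[] args) {
        List<Hit> firstPass = List.of(new Hit("a", 3.0), new Hit("b", 2.0), new Hit("c", 1.0));
        // Hypothetical model: a fixed lookup standing in for a trained model.
        ToDoubleFunction<String> model = id -> id.equals("b") ? 0.9 : 0.1;
        System.out.println(rerank(firstPass, 2, model));
    }
}
```

Only the top N documents are passed to the model, which keeps the cost of an expensive model bounded regardless of how many documents the first-pass query matches.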
An LTRScoringQuery is created by providing an instance of LTRScoringModel. An instance of LTRScoringModel defines how to combine the features in order to create a new score for a document. A new learning to rank model is plugged into the framework by extending LTRScoringModel (see the model-related classes for examples).
LTRScoringQuery takes care of computing the values of all the features (see Feature) and then delegates the final score generation to the LTRScoringModel by calling its scoring method.
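The split of responsibilities described above (feature values are computed first, then the model combines them) can be illustrated with a small sketch. The classes ScoringModelSketch and LinearModelSketch and the method signature below are hypothetical stand-ins, not Solr's classes; a weighted sum is used only because it is the simplest way to combine features.

```java
/** Illustrative only: a stand-in for the model abstraction, not Solr's actual class. */
abstract class ScoringModelSketch {
    /** Combines one document's feature values into a single score. */
    abstract float score(float[] featureValues);
}

/** A hypothetical linear model: the score is the weighted sum of the feature values. */
final class LinearModelSketch extends ScoringModelSketch {
    private final float[] weights;

    LinearModelSketch(float[] weights) {
        this.weights = weights.clone();
    }

    @Override
    float score(float[] featureValues) {
        if (featureValues.length != weights.length) {
            throw new IllegalArgumentException(
                "expected " + weights.length + " feature values, got " + featureValues.length);
        }
        float sum = 0f;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * featureValues[i];
        }
        return sum;
    }
}
```

A different model type would extend the same abstract class and change only how the values are combined; the code that computes feature values would not need to change.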
A Feature produces a particular value for each document, so it is modeled as a Query. The package org.apache.solr.ltr.feature contains several examples of features. One benefit of extending the Query object is that an existing Query can be reused as a feature. Features for a document can also be returned in the response by using the FeatureTransformer (a DocTransformer).
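As a rough illustration of "one value per document per feature", the sketch below computes a set of named feature values for a single document, which is also the kind of data a feature-logging transformer would attach to a response. FeatureSketch and FeatureExtractionSketch are hypothetical names, the document is simplified to a map of field values, and the two example features are invented for the demonstration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: a stand-in for a feature, not Solr's Feature class. */
interface FeatureSketch {
    /** Returns this feature's value for one document, given as field name to field value. */
    float valueFor(Map<String, Object> doc);
}

final class FeatureExtractionSketch {

    /** Computes every named feature for a document, keeping the order in which features were defined. */
    static Map<String, Float> extract(Map<String, FeatureSketch> features, Map<String, Object> doc) {
        Map<String, Float> values = new LinkedHashMap<>();
        for (Map.Entry<String, FeatureSketch> entry : features.entrySet()) {
            values.put(entry.getKey(), entry.getValue().valueFor(doc));
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, FeatureSketch> features = new LinkedHashMap<>();
        // Hypothetical features: title length, and whether "popularity" exceeds a threshold.
        features.put("titleLength", d -> ((String) d.get("title")).length());
        features.put("isPopular", d -> ((Integer) d.get("popularity")) > 100 ? 1f : 0f);
        Map<String, Object> doc = Map.of("title", "Apache Solr", "popularity", 250);
        System.out.println(extract(features, doc));
    }
}
```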
org.apache.solr.ltr.store contains all the logic to store all the features and the models.
Models are registered into a unique model store, and each model specifies a particular feature store, which contains a particular subset of features.
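The relationship between models and feature stores can be pictured with a small in-memory registry. This is only a sketch under assumptions: StoreSketch and its methods are hypothetical, and the rules that model names are unique and that a model's feature store must already exist are choices made for this example, not behavior confirmed by the text above.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: an in-memory stand-in for the feature and model stores. */
final class StoreSketch {
    // Feature store name -> names of the features it holds.
    private final Map<String, List<String>> featureStores = new HashMap<>();
    // Model name -> name of the feature store that the model uses.
    private final Map<String, String> modelToFeatureStore = new HashMap<>();

    void addFeatureStore(String storeName, List<String> featureNames) {
        featureStores.put(storeName, List.copyOf(featureNames));
    }

    /** Registers a model. Assumption: model names are unique and the feature store must exist. */
    void registerModel(String modelName, String featureStoreName) {
        if (!featureStores.containsKey(featureStoreName)) {
            throw new IllegalArgumentException("unknown feature store: " + featureStoreName);
        }
        if (modelToFeatureStore.putIfAbsent(modelName, featureStoreName) != null) {
            throw new IllegalArgumentException("model already registered: " + modelName);
        }
    }

    /** Returns the names of the features available to the given model. */
    List<String> featuresFor(String modelName) {
        String storeName = modelToFeatureStore.get(modelName);
        if (storeName == null) {
            throw new IllegalArgumentException("unknown model: " + modelName);
        }
        return featureStores.get(storeName);
    }
}
```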
Features and models can be managed through a REST API, provided by the ManagedResource implementations that encapsulate the feature and model stores.
This package contains the main logic for performing the reranking using a Learning to Rank model.
Contains Feature related classes
Contains Model related classes
A normalizer normalizes the value of a feature.
APIs and implementations of DocTransformer for modifying documents in Solr request responses
Contains feature and model store related classes.
ManagedResource implementations that encapsulate the feature and the model stores.