This module contains the logic to plug machine-learned ranking modules into Solr.
In information retrieval systems, Learning to Rank is used to re-rank the top X retrieved documents using trained machine learning models. The hope is that sophisticated models can make more nuanced ranking decisions than standard ranking functions like TF-IDF or BM25.
This module allows a reranking model to be plugged directly into Solr, enabling users to easily build their own learning to rank systems and access the rich matching features readily available in Solr. It also provides tools to perform feature engineering and feature extraction.
Code structure
A Learning to Rank model is plugged into the ranking through the LTRQParserPlugin, a QParserPlugin. The plugin reads from the request the model (an instance of LTRScoringModel) used to perform the request, plus other parameters. The plugin generates an LTRQuery: a particular RankQuery that encapsulates the given model and uses it to rescore and rerank the documents (by using an LTRRescorer).
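The rescoring step described above can be sketched in plain Java. This is only an illustration, not the Solr implementation: it assumes the first-pass results are already sorted and that documents beyond the top N keep their original order. ScoredDoc, rerankTopN, and the model function are hypothetical names that do not exist in the Solr API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

public class RerankSketch {
    /** Hypothetical holder for a document id and its current score. */
    public static final class ScoredDoc {
        public final String id;
        public final double score;

        public ScoredDoc(String id, double score) {
            this.id = id;
            this.score = score;
        }
    }

    /**
     * Rescores the first topN documents with the model and sorts them by the
     * new score (descending). The remaining documents are appended unchanged.
     */
    public static List<ScoredDoc> rerankTopN(
            List<ScoredDoc> firstPass, int topN, ToDoubleFunction<String> model) {
        int n = Math.max(0, Math.min(topN, firstPass.size()));
        List<ScoredDoc> result = new ArrayList<>();
        for (ScoredDoc doc : firstPass.subList(0, n)) {
            // Assumption: the model can score a document given only its id.
            result.add(new ScoredDoc(doc.id, model.applyAsDouble(doc.id)));
        }
        result.sort((a, b) -> Double.compare(b.score, a.score));
        result.addAll(firstPass.subList(n, firstPass.size()));
        return result;
    }
}
```

Only the head of the result list is touched by the model, which is why reranking is usually limited to a small top X: the model cost is paid per reranked document.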
A model is applied to each document through an LTRScoringQuery, a subclass of Query. Like a normal query, the learned model produces a new score for each reranked document.
An LTRScoringQuery is created by providing an instance of LTRScoringModel. An instance of LTRScoringModel defines how to combine the features in order to create a new score for a document. A new learning to rank model is plugged into the framework by extending LTRScoringModel (see for example MultipleAdditiveTreesModel and LinearModel).
The LTRScoringQuery takes care of computing the values of all the features (see Feature) and then delegates the final score generation to the LTRScoringModel by calling the method LTRScoringModel.score(float[] modelFeatureValuesNormalized).
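To make the delegation concrete, here is a minimal sketch of a model that combines normalized feature values into one score with a weighted sum. ModelSketch and LinearModelSketch are hypothetical stand-ins written for this example. They do not extend the real LTRScoringModel, whose constructor and other abstract methods are not shown here. Only the score(float[] modelFeatureValuesNormalized) signature is taken from the text above.

```java
public class ScoringSketch {
    /** Hypothetical stand-in for the role LTRScoringModel plays. */
    public abstract static class ModelSketch {
        /** Combines the normalized feature values into a single score. */
        public abstract float score(float[] modelFeatureValuesNormalized);
    }

    /** A weighted sum of the feature values (one weight per feature). */
    public static class LinearModelSketch extends ModelSketch {
        private final float[] weights;

        public LinearModelSketch(float[] weights) {
            this.weights = weights.clone();
        }

        @Override
        public float score(float[] modelFeatureValuesNormalized) {
            if (modelFeatureValuesNormalized.length != weights.length) {
                throw new IllegalArgumentException(
                    "expected " + weights.length + " feature values");
            }
            float sum = 0f;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * modelFeatureValuesNormalized[i];
            }
            return sum;
        }
    }
}
```

The caller is responsible for passing the feature values in the same order as the weights. The sketch checks only the length, not the order.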
A Feature produces a particular value for each document, so it is modeled as a Query. The package org.apache.solr.ltr.feature contains several examples of features. One benefit of extending the Query object is that we can reuse a Query as a feature; see for example SolrFeature.
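The idea that a feature yields one value per document can be illustrated without Lucene. The sketch below deliberately does not use the Query class. It models a feature as a plain function over a document's fields and shows how a list of features becomes the float[] handed to a model. DocFeature, extract, titleLength, and fieldValue are hypothetical names, and the field names used with them are assumptions.

```java
import java.util.List;
import java.util.Map;

public class FeatureSketch {
    /** Hypothetical per-document feature: maps a document to one value. */
    public interface DocFeature {
        float valueFor(Map<String, Object> doc);
    }

    /** Evaluates every feature on the document, preserving feature order. */
    public static float[] extract(List<DocFeature> features, Map<String, Object> doc) {
        float[] values = new float[features.size()];
        for (int i = 0; i < values.length; i++) {
            values[i] = features.get(i).valueFor(doc);
        }
        return values;
    }

    /** Feature: number of characters in the "title" field (0 if absent). */
    public static DocFeature titleLength() {
        return doc -> (float) String.valueOf(doc.getOrDefault("title", "")).length();
    }

    /** Feature: numeric value of the given field (0 if absent or not a number). */
    public static DocFeature fieldValue(String field) {
        return doc -> {
            Object value = doc.get(field);
            return value instanceof Number ? ((Number) value).floatValue() : 0f;
        };
    }
}
```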
Features for a document can also be returned in the response by using the FeatureTransformer (a DocTransformer) provided by LTRFeatureLoggerTransformerFactory.
org.apache.solr.ltr.store contains all the logic to store the features and the models. Models are registered into a unique ModelStore, and each model specifies a particular FeatureStore that will contain a particular subset of features.
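The relationship between the single model store and the named feature stores can be sketched with two maps. This is an illustration of the bookkeeping only, under the assumption that models and feature stores are identified by name. StoreSketch and its methods are hypothetical and do not mirror the actual ModelStore or FeatureStore classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StoreSketch {
    // Feature store name -> names of the features it contains.
    private final Map<String, List<String>> featureStores = new HashMap<>();
    // Model name -> name of the feature store the model reads from.
    private final Map<String, String> modelToFeatureStore = new HashMap<>();

    public void addFeature(String storeName, String featureName) {
        featureStores.computeIfAbsent(storeName, k -> new ArrayList<>()).add(featureName);
    }

    /** Registers a model; it must point at an existing feature store. */
    public void addModel(String modelName, String featureStoreName) {
        if (!featureStores.containsKey(featureStoreName)) {
            throw new IllegalArgumentException("unknown feature store: " + featureStoreName);
        }
        if (modelToFeatureStore.containsKey(modelName)) {
            throw new IllegalArgumentException("model already registered: " + modelName);
        }
        modelToFeatureStore.put(modelName, featureStoreName);
    }

    /** Returns the features available to the given model. */
    public List<String> featuresFor(String modelName) {
        String storeName = modelToFeatureStore.get(modelName);
        if (storeName == null) {
            throw new IllegalArgumentException("unknown model: " + modelName);
        }
        return new ArrayList<>(featureStores.get(storeName));
    }
}
```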
Features and models can be managed through a REST API, provided by the Managed Resources ManagedFeatureStore and ManagedModelStore.
Package | Description |
---|---|
org.apache.solr.ltr | This package contains the main logic for performing the reranking using a Learning to Rank model. |
org.apache.solr.ltr.feature | Contains Feature related classes. |
org.apache.solr.ltr.model | Contains Model related classes. |
org.apache.solr.ltr.norm | A normalizer normalizes the value of a feature. |
org.apache.solr.ltr.response.transform | APIs and implementations of DocTransformer for modifying documents in Solr request responses. |
org.apache.solr.ltr.search | APIs and classes for parsing and processing search requests. |
org.apache.solr.ltr.store | Contains feature and model store related classes. |
org.apache.solr.ltr.store.rest | Contains the ManagedResource classes that encapsulate the feature and the model stores. |
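
The org.apache.solr.ltr.norm entry in the table says that a normalizer normalizes the value of a feature. One common way to do that is min-max scaling, sketched below. MinMaxSketch is a hypothetical class written for this example, not a class from the package, and the choice of min-max scaling is an assumption made for illustration.

```java
public class MinMaxSketch {
    private final float min;
    private final float max;

    public MinMaxSketch(float min, float max) {
        // Reject max <= min (and NaN) to avoid dividing by zero or a negative range.
        if (!(max > min)) {
            throw new IllegalArgumentException("max must be greater than min");
        }
        this.min = min;
        this.max = max;
    }

    /** Maps min to 0 and max to 1; values outside [min, max] are not clamped. */
    public float normalize(float value) {
        return (value - min) / (max - min);
    }
}
```

Normalized values of this kind are what a model would receive in the modelFeatureValuesNormalized array.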