Apache Solr Search Server: Learning to Rank Contrib

This module contains the logic to plug machine-learned ranking models into Solr.

In information retrieval systems, Learning to Rank is used to re-rank the top X retrieved documents using trained machine learning models. The hope is that sophisticated models can make more nuanced ranking decisions than standard ranking functions like TF-IDF or BM25.

This module lets you plug a reranking model directly into Solr, so that users can build their own learning to rank systems and use the rich matching features readily available in Solr. It also provides tools for feature engineering and feature extraction.

Code structure

A Learning to Rank model is plugged into ranking through LTRQParserPlugin, a QParserPlugin. The plugin reads from the request the model to use (an instance of LTRScoringModel), plus other parameters. It then generates an LTRQuery: a RankQuery that encapsulates the given model and uses it to rescore and rerank the documents (by means of an LTRRescorer).
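
On the request side, the model and the other parameters are typically passed as local params of the rq parameter. The sketch below only assembles such a query string, without contacting Solr. The class name LtrRequestSketch, the model name myModel, the collection name techproducts, and the reRankDocs value of 100 are assumptions chosen for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper: builds a select URL that asks the "ltr" query parser
// to rerank the top documents with a previously uploaded model.
public class LtrRequestSketch {

    // Builds the local-params string read by the ltr query parser.
    static String rerankQuery(String modelName, int reRankDocs) {
        return "{!ltr model=" + modelName + " reRankDocs=" + reRankDocs + "}";
    }

    // URL-encodes each name and value and joins them with '&'.
    static String toQueryString(Map<String, String> params) {
        return params.entrySet().stream()
            .map(e -> encode(e.getKey()) + "=" + encode(e.getValue()))
            .collect(Collectors.joining("&"));
    }

    private static String encode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "test");                        // first-pass query
        params.put("rq", rerankQuery("myModel", 100));  // rerank the top 100 hits
        params.put("fl", "id,score");
        // "techproducts" is an assumed collection name.
        System.out.println("/solr/techproducts/select?" + toQueryString(params));
    }
}
```

The first-pass query in q still selects and orders the candidate documents; only the top reRankDocs of them are rescored by the model.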

The model is applied to each document through an LTRScoringQuery, a subclass of Query. Like a normal query, the learned model produces a new score for each reranked document.

An LTRScoringQuery is created from an instance of LTRScoringModel, which defines how to combine the features into a new score for a document. A new learning to rank model is plugged into the framework by extending LTRScoringModel (see, for example, MultipleAdditiveTreesModel and LinearModel).

The LTRScoringQuery takes care of computing the values of all the features (see Feature) and then delegates the final score generation to the LTRScoringModel by calling LTRScoringModel.score(float[] modelFeatureValuesNormalized).
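
To make the division of labour concrete, here is a minimal, dependency-free sketch of what a linear model's scoring step could look like: feature values are normalized first, and the model then combines them into one score. The class LinearScoringSketch, the min-max bounds, and the weights are assumptions for illustration; this is not the code of LinearModel or of any normalizer in this module.

```java
// Hypothetical stand-in for a model's scoring step; it does not extend
// LTRScoringModel and has no Solr dependency.
public class LinearScoringSketch {

    // Min-max normalization of one raw feature value; assumes max > min.
    static float minMaxNormalize(float value, float min, float max) {
        return (value - min) / (max - min);
    }

    // Weighted sum of already-normalized feature values, mirroring the
    // shape of score(float[] modelFeatureValuesNormalized).
    static float score(float[] weights, float[] modelFeatureValuesNormalized) {
        if (weights.length != modelFeatureValuesNormalized.length) {
            throw new IllegalArgumentException("one weight per feature is required");
        }
        float score = 0f;
        for (int i = 0; i < weights.length; i++) {
            score += weights[i] * modelFeatureValuesNormalized[i];
        }
        return score;
    }

    public static void main(String[] args) {
        // Assumed raw values: the first feature ranges over [0, 10],
        // the second is used as-is.
        float[] normalized = { minMaxNormalize(5f, 0f, 10f), 0.2f };
        float[] weights = { 1.0f, 2.0f };
        System.out.println(score(weights, normalized));
    }
}
```

A tree-based model would keep the same method shape and replace only the body of score with a traversal of its trees.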

A Feature produces a particular value for each document, so it is modeled as a Query. The package org.apache.solr.ltr.feature contains several examples of features. One benefit of extending the Query object is that an existing Query can be reused as a feature; see, for example, SolrFeature. Feature values for a document can also be returned in the response by using the FeatureTransformer (a DocTransformer) provided by LTRFeatureLoggerTransformerFactory.
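
When feature values are returned in the response, a client usually has to turn the logged string back into numbers, for example to build training data. The sketch below assumes the logged value is a comma-separated list of name=value pairs; that layout, the class name FeatureLogParserSketch, and the feature names are assumptions and should be checked against the actual response of your Solr version.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical client-side helper for parsing logged feature values.
public class FeatureLogParserSketch {

    // Parses "name=value,name=value" into an insertion-ordered map.
    // Pairs without '=' are skipped rather than treated as errors.
    static Map<String, Float> parse(String logged) {
        Map<String, Float> values = new LinkedHashMap<>();
        if (logged == null || logged.isEmpty()) {
            return values;
        }
        for (String pair : logged.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = pair.substring(0, eq).trim();
            float value = Float.parseFloat(pair.substring(eq + 1).trim());
            values.put(name, value);
        }
        return values;
    }

    public static void main(String[] args) {
        // Assumed example of a logged feature string.
        Map<String, Float> features = parse("originalScore=0.5,isBook=1.0");
        features.forEach((name, value) -> System.out.println(name + " -> " + value));
    }
}
```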

The package org.apache.solr.ltr.store contains the logic to store features and models. Models are registered into a single ModelStore, and each model specifies the FeatureStore that contains the subset of features it uses.

Features and models can be managed through a REST API, provided by the Managed Resources ManagedFeatureStore and ManagedModelStore.
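
As an illustration of the REST workflow, the sketch below builds (but does not send) an HTTP PUT request that would upload a one-feature definition to a feature store. The base URL, the collection name techproducts, the feature name originalScore, and the class FeatureStoreUploadSketch are assumptions; the endpoint path follows the usual /schema/feature-store and /schema/model-store layout and should be verified against your Solr installation.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: prepares a PUT request for the managed feature or
// model store. Requires Java 11+ for java.net.http.
public class FeatureStoreUploadSketch {

    // A single feature that exposes the original (first-pass) score.
    static final String FEATURES_JSON =
        "[{\"name\":\"originalScore\","
        + "\"class\":\"org.apache.solr.ltr.feature.OriginalScoreFeature\","
        + "\"params\":{}}]";

    // resource is expected to be "feature-store" or "model-store".
    static HttpRequest buildPut(String baseUrl, String collection,
                                String resource, String json) {
        URI uri = URI.create(baseUrl + "/" + collection + "/schema/" + resource);
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildPut(
            "http://localhost:8983/solr", "techproducts", "feature-store", FEATURES_JSON);
        // Sending is left out on purpose; with a running Solr it would be:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(request.method() + " " + request.uri());
    }
}
```

Features are normally uploaded before any model that refers to them, because a model names the features it combines.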

Packages

Package                                 Description
org.apache.solr.ltr                     Main logic for performing the reranking using a Learning to Rank model.
org.apache.solr.ltr.feature             Feature-related classes.
org.apache.solr.ltr.model               Model-related classes.
org.apache.solr.ltr.norm                Normalizers, which normalize the value of a feature.
org.apache.solr.ltr.response.transform  APIs and implementations of DocTransformer for modifying documents in Solr request responses.
org.apache.solr.ltr.search              APIs and classes for parsing and processing search requests.
org.apache.solr.ltr.store               Feature store and model store related classes.
org.apache.solr.ltr.store.rest          ManagedResource classes that encapsulate the feature and model stores.