Relevance is the degree to which a query response satisfies a user who is searching for information.
The relevance of a query response depends on the context in which the query was performed. A single search application may be used in different contexts by users with different needs and expectations. For example, a search engine of climate data might be used by a university researcher studying long-term climate trends, a farmer interested in calculating the likely date of the last frost of spring, a civil engineer interested in rainfall patterns and the frequency of floods, and a college student planning a vacation to a region and wondering what to pack. Because the motivations of these users vary, the relevance of any particular response to a query will vary as well.
How comprehensive should query responses be? Like relevance in general, the answer to this question depends on the context of a search. The cost of not finding a particular document in response to a query is high in some contexts, such as a legal e-discovery search in response to a subpoena, and quite low in others, such as a search for a cake recipe on a Web site with dozens or hundreds of cake recipes. When configuring Solr, you should weigh comprehensiveness against other factors such as timeliness and ease-of-use.
The e-discovery and recipe examples demonstrate the importance of two concepts related to relevance:
Precision is the percentage of documents in the returned results that are relevant.
Recall is the percentage of relevant results returned out of all relevant results in the system. Obtaining perfect recall is trivial: simply return every document in the collection for every query.
Returning to the examples above, it’s important for an e-discovery search application to have 100% recall returning all the documents that are relevant to a subpoena. It’s far less important that a recipe application offer this degree of precision, however. In some cases, returning too many results in casual contexts could overwhelm users. In some contexts, returning fewer results that have a higher likelihood of relevance may be the best approach.
Using the concepts of precision and recall, it’s possible to quantify relevance across users and queries for a collection of documents. A perfect system would have 100% precision and 100% recall for every user and every query. In other words, it would retrieve all the relevant documents and nothing else. In practical terms, when talking about precision and recall in real systems, it is common to focus on precision and recall at a certain number of results, the most common (and useful) being ten results.
Through faceting, query filters, and other search components, a Solr application can be configured with the flexibility to help users fine-tune their searches in order to return the most relevant results for users. That is, Solr can be configured to balance precision and recall to meet the needs of a particular user community.
The configuration of a Solr application should take into account:
the needs of the application’s various users (which can include ease of use and speed of response, in addition to strictly informational needs)
the categories that are meaningful to these users in their various contexts (e.g., dates, product categories, or regions)
any inherent relevance of documents (e.g., it might make sense to ensure that an official product description or FAQ is always returned near the top of the search results)
whether or not the age of documents matters significantly (in some contexts, the most recent documents might always be the most important)
Keeping all these factors in mind, it’s often helpful in the planning stages of a Solr deployment to sketch out the types of responses you think the search application should return for sample queries. Once the application is up and running, you can employ a series of testing methodologies, such as focus groups, in-house testing, TREC tests, and A/B testing to fine tune the configuration of the application to best meet the needs of its users.
For more information about relevance, see Grant Ingersoll’s blog post Debugging Search Application Relevance Issues.