Terms Component
The Terms Component provides access to the indexed terms in a field and the number of documents that match each term. This can be useful for building an auto-suggest feature or any other feature that operates at the term level instead of the search or document level. Retrieving terms in index order is very fast since the implementation directly uses Lucene’s TermEnum to iterate over the term dictionary.
In a sense, this search component provides fast field-faceting over the whole index, not restricted by the base query or any filters. The document frequencies returned are the number of documents that match the term, including any documents that have been marked for deletion but not yet removed from the index.
Configuring the Terms Component
Terms Component is one of the default search components
and does not need to be defined in solrconfig.xml.
The definition is equivalent to:
<searchComponent name="terms" class="solr.TermsComponent"/>Using the Terms Component in a Request Handler
Solr comes with an implicit request handler definition /terms, which enables (only) Terms component.
If you want to enable Terms component when using another Request Handler, terms=true parameter needs to be passed during the request or be set in the handler’s defaults.
Terms Component Parameters
The parameters below allow you to control what terms are returned. You can also configure any of these with the request handler if you’d like to set them permanently. Or, you can add them to the query request. These parameters are:
- terms
- 
Optional Default: falseIf set to true, enables the Terms Component.Example: terms=true
- terms.fl
- 
Required Default: none Specifies the field from which to retrieve terms. This parameter is required if terms=true. This parameter can be specified multiple times to retrieve terms for different fields.Example: terms.fl=titleExample: terms.fl=title&terms=true&terms.fl=name&terms.list=cable,sony
<response>
   <lst name="responseHeader">
      <bool name="zkConnected">true</bool>
      <int name="status">0</int>
      <int name="QTime">4</int>
   </lst>
   <lst name="terms">
      <lst name="name">
         <int name="cable">8661</int>
         <int name="sony">21</int>
      </lst>
      <lst name="title">
         <int name="cable">11387</int>
         <int name="sony">3921</int>
      </lst>
   </lst>
</response>- terms.list
- 
Optional Default: none Fetches the document frequency for a comma-delimited list of terms. Terms are always returned in index order. If terms.ttfis set totrue, also returns their total term frequency. If multipleterms.flare defined, these statistics will be returned for each term in each requested field.Example: terms.list=termA,termB,termCWhen terms.listis specified, then terms are always sorted byindex. Exceptterms.ttf, none of other terms parameters are supported whenterms.listis specified.
- terms.limit
- 
Optional Default: 10Specifies the maximum number of terms to return. If the limit is set to a number less than 0, then no maximum limit is enforced. Although this is not required, either this parameter orterms.uppermust be defined.Example: terms.limit=20
- terms.lower
- 
Optional Default: see description Specifies the term at which to start. If not specified, the empty string is used, causing Solr to start at the beginning of the field. Example: terms.lower=orange
- terms.lower.incl
- 
Optional Default: trueIf set to true, includes the lower-bound term (specified withterms.lowerin the result set.Example: terms.lower.incl=false
- terms.mincount
- 
Optional Default: 1Specifies the minimum document frequency to return in order for a term to be included in a query response. Results are inclusive of the mincount (that is, >= mincount). Example: terms.mincount=5
- terms.maxcount
- 
Optional Default: -1Specifies the maximum document frequency a term must have in order to be included in a query response. The default setting is -1, which sets no upper bound. Results are inclusive of the maxcount (that is, <= maxcount).Example: terms.maxcount=25
- terms.prefix
- 
Optional Default: none Restricts matches to terms that begin with the specified string. Example: terms.prefix=inter
- terms.raw
- 
Optional Default: falseIf set to true, returns the raw characters of the indexed term, regardless of whether it is human-readable. For instance, the indexed form of numeric numbers is not human-readable.Example: terms.raw=true
- terms.regex
- 
Optional Default: none Restricts matches to terms that match the regular expression. Example: terms.regex=.*pedist
- terms.regex.flag
- 
Optional Default: none Defines a Java regex flag to use when evaluating the regular expression defined with terms.regex. See http://docs.oracle.com/javase/tutorial/essential/regex/pattern.html for details of each flag. Valid options are:- 
case_insensitive
- 
comments
- 
multiline
- 
literal
- 
dotall
- 
unicode_case
- 
canon_eq
- 
unix_linesExample: terms.regex.flag=case_insensitive
 
- 
- terms.stats
- 
Optional Default: falseIf true, include index statistics in the results. Currently returns only the number of documents for a collection. When combined withterms.listit provides enough information to compute inverse document frequency (IDF) for a list of terms.
- terms.sort
- 
Optional Default: countDefines how to sort the terms returned. Valid options are count, which sorts by the term frequency, with the highest term frequency first, orindex, which sorts in index order.Example: terms.sort=index
- terms.ttf
- 
Optional Default: falseIf set to true, returns bothdf(docFreq) andttf(totalTermFreq) statistics for each requested term interms.list. In this case, the response format is:XML: <lst name="terms"> <lst name="field"> <lst name="termA"> <long name="df">22</long> <long name="ttf">73</long> </lst> </lst> </lst>JSON: { "terms": { "field": [ "termA", { "df": 22, "ttf": 73 } ] } }
- terms.upper
- 
Optional Default: none Specifies the term to stop at. Although this parameter is not required, either this parameter or terms.limitmust be defined.Example: terms.upper=plum
- terms.upper.incl
- 
Optional Default: falseIf set to true, the upper bound term is included in the result set. Example: terms.upper.incl=true
The response to a terms request is a list of the terms and their document frequency values.
You may also be interested in the TermsComponent javadoc.
Terms Component Examples
All of the following sample queries work with Solr’s “bin/solr start -e techproducts” example.
Get Top 10 Terms
This query requests the first ten terms in the name field:
http://localhost:8983/solr/techproducts/terms?terms.fl=name&wt=xmlResults:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">2</int>
  </lst>
  <lst name="terms">
    <lst name="name">
      <int name="one">5</int>
      <int name="184">3</int>
      <int name="1gb">3</int>
      <int name="3200">3</int>
      <int name="400">3</int>
      <int name="ddr">3</int>
      <int name="gb">3</int>
      <int name="ipod">3</int>
      <int name="memory">3</int>
      <int name="pc">3</int>
    </lst>
  </lst>
</response>Get First 10 Terms Starting with Letter 'a'
This query requests the first ten terms in the name field, in index order (instead of the top 10 results by document count):
http://localhost:8983/solr/techproducts/terms?terms.fl=name&terms.lower=a&terms.sort=index&wt=xmlResults:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="terms">
    <lst name="name">
      <int name="a">1</int>
      <int name="all">1</int>
      <int name="apple">1</int>
      <int name="asus">1</int>
      <int name="ata">1</int>
      <int name="ati">1</int>
      <int name="belkin">1</int>
      <int name="black">1</int>
      <int name="british">1</int>
      <int name="cable">1</int>
    </lst>
  </lst>
</response>Using Terms Component in a Request Handler
This query augments a regular search with terms information.
http://localhost:8983/solr/techproducts/select?q=corsair&fl=id,name&rows=1&echoParams=none&wt=xml&terms=true&terms.fl=nameResults (notice that the term counts are not affected by the query):
<response>
<lst name="responseHeader">
  <int name="status">0</int>
  <int name="QTime">1</int>
</lst>
<result name="response" numFound="2" start="0" numFoundExact="true">
  <doc>
    <str name="id">VS1GB400C3</str>
    <str name="name">CORSAIR ValueSelect 1GB 184-Pin DDR SDRAM Unbuffered DDR 400 (PC 3200) System Memory - Retail</str></doc>
</result>
<lst name="terms">
  <lst name="name">
    <int name="one">5</int>
    <int name="184">3</int>
    <int name="1gb">3</int>
    <int name="3200">3</int>
    <int name="400">3</int>
    <int name="ddr">3</int>
    <int name="gb">3</int>
    <int name="ipod">3</int>
    <int name="memory">3</int>
    <int name="pc">3</int>
  </lst>
</lst>
</response>SolrJ Invocation
    SolrQuery query = new SolrQuery();
    query.setRequestHandler("/terms");
    query.setTerms(true);
    query.setTermsLimit(5);
    query.setTermsLower("s");
    query.setTermsPrefix("s");
    query.addTermsField("terms_s");
    query.setTermsMinCount(1);
    QueryRequest request = new QueryRequest(query);
    List<Term> terms = request.process(getSolrClient()).getTermsResponse().getTerms("terms_s");Using the Terms Component for an Auto-Suggest Feature
If the Suggester doesn’t suit your needs, you can use the Terms component in Solr to build a similar feature for your own search application. Simply submit a query specifying whatever characters the user has typed so far as a prefix. For example, if the user has typed "at", the search engine’s interface would submit the following query:
http://localhost:8983/solr/techproducts/terms?terms.fl=name&terms.prefix=at&wt=xmlResult:
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="terms">
    <lst name="name">
      <int name="ata">1</int>
      <int name="ati">1</int>
    </lst>
  </lst>
</response>You can use the parameter omitHeader=true to omit the response header from the query response, like in this example, which also returns the response in JSON format:
http://localhost:8983/solr/techproducts/terms?terms.fl=name&terms.prefix=at&omitHeader=trueResult:
{
  "terms": {
    "name": [
      "ata",
      1,
      "ati",
      1
    ]
  }
}