public class PhrasesIdentificationComponent extends SearchComponent
A component that can be used in isolation, or in conjunction with QueryComponent, to identify 
 and score "phrases" found in the input string, based on shingles in indexed fields.
 
 The most common way to use this component is in conjunction with fields that use 
 ShingleFilterFactory on both the index and query analyzers.  
 An example field type configuration would be something like this...
 
 <fieldType name="phrases" class="solr.TextField" positionIncrementGap="100">
   <analyzer type="index">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="3" outputUnigrams="true"/>
   </analyzer>
   <analyzer type="query">
     <tokenizer class="solr.StandardTokenizerFactory"/>
     <filter class="solr.LowerCaseFilterFactory"/>
     <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="7" outputUnigramsIfNoShingles="true" outputUnigrams="true"/>
   </analyzer>
 </fieldType>
 
 
 ...where the query analyzer's maxShingleSize="7" determines the maximum 
 possible phrase length that can be heuristically deduced, and the index analyzer's 
 maxShingleSize="3" determines the accuracy of the phrases identified.  The larger the 
 indexed maxShingleSize, the higher the accuracy.  Both analyzers must include 
 minShingleSize="2" and outputUnigrams="true".
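
 To make the shingle settings above concrete, the following is a minimal sketch of the token stream such an analyzer roughly produces. It is plain Java and does not use Lucene: the class `ShingleSketch` and its `shingles` method are hypothetical names, and whitespace splitting plus lowercasing is only a crude stand-in for StandardTokenizerFactory and LowerCaseFilterFactory.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this is NOT Lucene's ShingleFilter.
public class ShingleSketch {

    /**
     * Returns, for each token position, the unigram (if requested) followed by
     * every shingle of minSize..maxSize adjacent tokens starting at that position.
     */
    public static List<String> shingles(String text, int minSize, int maxSize,
                                        boolean outputUnigrams) {
        List<String> out = new ArrayList<>();
        // Crude stand-in for tokenizing and lowercasing.
        String normalized = text.trim().toLowerCase(Locale.ROOT);
        if (normalized.isEmpty()) {
            return out;
        }
        String[] tokens = normalized.split("\\s+");
        for (int start = 0; start < tokens.length; start++) {
            if (outputUnigrams) {
                out.add(tokens[start]);
            }
            for (int size = minSize; size <= maxSize && start + size <= tokens.length; size++) {
                out.add(String.join(" ", Arrays.copyOfRange(tokens, start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Mirrors the index analyzer settings: minShingleSize=2, maxShingleSize=3, outputUnigrams=true
        System.out.println(shingles("Brown fox jumped", 2, 3, true));
        // prints [brown, brown fox, brown fox jumped, fox, fox jumped, jumped]
    }
}
```

 With an indexed maxShingleSize of 3, no indexed term is longer than three words, which is why longer candidate phrases from the query analyzer can only be deduced heuristically.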
 
 With a field type like this, one or more fields can be specified (with weights) via a 
 phrases.fields param to request that this component identify possible phrases in the 
 input q param, or in an alternative phrases.q override param.  The identified
 phrases will include their scores relative to each field specified, as well as an overall weighted score based
 on the field weights provided by the client.  Higher score values indicate greater confidence in the 
 phrase.
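
 As an illustration of the request params described above, here is a minimal sketch that assembles a query string in plain Java. Several things in it are assumptions rather than facts from this page: the field names `title_phrases` and `body_phrases` are hypothetical, the `field^weight` syntax is assumed to follow the convention used by other weighted Solr field params, and `phrases=true` assumes that the value of COMPONENT_NAME is "phrases" and that it acts as the enabling request param.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; it only builds a URL query string and does not call Solr.
public class PhrasesRequestSketch {

    /** URL-encodes each key/value pair and joins them with '&', preserving insertion order. */
    public static String buildQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Example params; every value here is an assumption made for illustration. */
    public static Map<String, String> exampleParams() {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "brown fox jumped");                              // input to analyze
        params.put("phrases", "true");                                    // assumed enabling param
        params.put("phrases.fields", "title_phrases^2 body_phrases");     // hypothetical weighted fields
        return params;
    }

    public static void main(String[] args) {
        System.out.println(buildQueryString(exampleParams()));
    }
}
```

 To analyze a string other than the main query, a `phrases.q` entry could be added to the same map.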
 
 NOTE: In a distributed request, this component uses a single phase (piggybacking on the 
 ShardRequest.PURPOSE_GET_TOP_IDS request generated by QueryComponent, if it is in use) to 
 collect all field and shingle stats.  No "refinement" requests are used.
 
| Modifier and Type | Class and Description | 
|---|---|
| static class | PhrasesIdentificationComponent.Phrase: Model the data known about a single (candidate) Phrase -- which may or may not be indexed |
| static class | PhrasesIdentificationComponent.PhrasesContextData: Simple container for all request options and data this component needs to store in the Request Context |

Nested classes/interfaces inherited from interface SolrInfoBean: SolrInfoBean.Category, SolrInfoBean.Group

| Modifier and Type | Field and Description |
|---|---|
| static String | COMPONENT_NAME: Name, also used as a request param to identify whether the user query concerns this component |
| static String | PHRASE_ANALYSIS_FIELD | 
| static String | PHRASE_FIELDS | 
| static String | PHRASE_INDEX_MAXLEN | 
| static String | PHRASE_INPUT | 
| static String | PHRASE_QUERY_MAXLEN | 
| static String | PHRASE_SUMMARY_POST | 
| static String | PHRASE_SUMMARY_PRE | 
| static int | SHARD_PURPOSE: The only shard purpose that will cause this component to do work & return data during a shard request |

Fields inherited from class SearchComponent: metricNames, registry, standard_components

| Constructor and Description |
|---|
| PhrasesIdentificationComponent() |

| Modifier and Type | Method and Description |
|---|---|
| int | distributedProcess(ResponseBuilder rb): Process for a distributed search. |
| void | finishStage(ResponseBuilder rb): Called after all responses have been received for this stage. |
| String | getDescription(): Simple one or two line description |
| static int | getMaxShingleSize(Analyzer analyzer): Helper method, public for testing purposes only. |
| void | prepare(ResponseBuilder rb): Prepare the response. |
| void | process(ResponseBuilder rb): Process the request for this component |

Methods inherited from class SearchComponent: getCategory, getMetricNames, getMetricRegistry, getName, handleResponses, init, modifyRequest, setName

Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface SolrInfoBean: getMetricsSnapshot, registerMetricName

public static final int SHARD_PURPOSE
public static final String COMPONENT_NAME
public static final String PHRASE_INPUT
public static final String PHRASE_FIELDS
public static final String PHRASE_ANALYSIS_FIELD
public static final String PHRASE_SUMMARY_PRE
public static final String PHRASE_SUMMARY_POST
public static final String PHRASE_INDEX_MAXLEN
public static final String PHRASE_QUERY_MAXLEN
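
public void prepare(ResponseBuilder rb) throws IOException
Description copied from class: SearchComponent
Prepare the response. Guaranteed to be called before any SearchComponent.process(org.apache.solr.handler.component.ResponseBuilder) method.
 Called for every incoming request.
 The place to do initialization that is request dependent.
Specified by: prepare in class SearchComponent
Parameters: rb - The ResponseBuilder
Throws: IOException - If there is a low-level I/O error.

public int distributedProcess(ResponseBuilder rb)
Description copied from class: SearchComponent
Process for a distributed search.
Overrides: distributedProcess in class SearchComponent

public void finishStage(ResponseBuilder rb)
Description copied from class: SearchComponent
Called after all responses have been received for this stage.
Overrides: finishStage in class SearchComponent

public void process(ResponseBuilder rb) throws IOException
Description copied from class: SearchComponent
Process the request for this component
Specified by: process in class SearchComponent
Parameters: rb - The ResponseBuilder
Throws: IOException - If there is a low-level I/O error.

public String getDescription()
Description copied from interface: SolrInfoBean
Simple one or two line description
Specified by: getDescription in interface SolrInfoBean
Specified by: getDescription in class SearchComponent

public static int getMaxShingleSize(Analyzer analyzer)
Helper method, public for testing purposes only.
Given an analyzer, inspects it to determine if:
- it is a TokenizerChain
- it contains exactly one instance of ShingleFilterFactory

 If these conditions are met, then this method returns the maxShingleSize 
 in effect for this analyzer, otherwise returns -1.

Parameters: analyzer - An analyzer to inspect
Returns: maxShingleSize if available

Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.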