public static final class PhrasesIdentificationComponent.Phrase extends Object
Modifier and Type | Method | Description |
---|---|---|
static List<PhrasesIdentificationComponent.Phrase> | extractPhrases(String input, SchemaField analysisField, int maxIndexedPositionLength, int maxQueryPositionLength) | Factory method for constructing a list of Phrases given the specified input and using the analyzer for the specified field. |
static List<NamedList<Object>> | formatShardResponse(List<PhrasesIdentificationComponent.Phrase> phrases) | Formats the phrases in a form suitable for returning in a shard response. |
long | getConjunctionDocCount(String field) | Returns the number of documents that contain all of the getIndividualIndexedTerms() that make up this Phrase, in the specified field. |
NamedList | getDetails() | |
long | getDocFreq(String field) | Returns the number of documents that contain this (indexed) Phrase as a term in the specified field. |
double | getFieldScore(String field) | Returns the score for this Phrase in the given field. |
List<PhrasesIdentificationComponent.Phrase> | getIndexedSuperPhrases() | Returns all phrases larger than this phrase, which fully include this phrase, and are indexed. |
List<PhrasesIdentificationComponent.Phrase> | getIndividualIndexedTerms() | Returns the list of "individual" (ie: getPositionLength()==1) terms. |
List<PhrasesIdentificationComponent.Phrase> | getLargestIndexedSubPhrases() | Returns the list of (overlapping) sub phrases that have the largest possible size based on the effective value of PhrasesIdentificationComponent.PhrasesContextData.maxIndexedPositionLength. |
int | getOffsetEnd() | |
int | getOffsetStart() | |
int | getPositionEnd() | NOTE: positions start at '1' |
int | getPositionLength() | |
BitSet | getPositionsBitSet() | Each set bit identifies a position filled by this Phrase. |
int | getPositionStart() | NOTE: positions start at '1' |
CharSequence | getSubSequence() | The characters from the original input that correspond with this Phrase. |
double | getTotalScore() | Returns the overall score for this Phrase. |
long | getTTF(String field) | Returns the total term frequency (TTF) of this (indexed) Phrase as a term in the specified field. |
static void | populateScores(List<PhrasesIdentificationComponent.Phrase> phrases, Map<String,Double> fieldWeights, int maxIndexedPositionLength, int maxQueryPositionLength) | Public for testing purposes. |
static void | populateScores(PhrasesIdentificationComponent.PhrasesContextData contextData) | Uses the previously populated stats to populate each Phrase with its scores for the specified fields, and its overall (weighted) total score. |
static void | populateStats(List<PhrasesIdentificationComponent.Phrase> phrases, Collection<String> fieldNames, SolrIndexSearcher searcher) | Populates the phrases with stats from the local index for the specified fields. |
static void | populateStats(List<PhrasesIdentificationComponent.Phrase> phrases, List<NamedList<Object>> shardData) | Populates the phrases with (merged) stats from a remote shard. |
String | toString() | |
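public static List<PhrasesIdentificationComponent.Phrase> extractPhrases(String input, SchemaField analysisField, int maxIndexedPositionLength, int maxQueryPositionLength)
Factory method for constructing a list of Phrases given the specified input and using the analyzer for the specified field.
The maxIndexedPositionLength and maxQueryPositionLength provided *must* match the effective values used by the respective analyzers.
public static List<NamedList<Object>> formatShardResponse(List<PhrasesIdentificationComponent.Phrase> phrases)
Formats the phrases in a form suitable for returning in a shard response.
See Also: populateStats(List,List)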
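To make the idea behind extractPhrases concrete, the following is a minimal stand-alone sketch of enumerating candidate phrases from an input string. It is not the Solr implementation: whitespace splitting stands in for the field's analyzer, the class PhraseEnumerationSketch and its Candidate type are hypothetical, and a single maxPositionLength parameter stands in for the two limits the real method accepts. Positions start at 1, as the position getters note.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Solr.
final class PhraseEnumerationSketch {

  /** A candidate phrase: its text, 1-based start position, and length in positions. */
  static final class Candidate {
    final String text;
    final int positionStart;
    final int positionLength;

    Candidate(String text, int positionStart, int positionLength) {
      this.text = text;
      this.positionStart = positionStart;
      this.positionLength = positionLength;
    }

    @Override
    public String toString() {
      return "'" + text + "' start=" + positionStart + " length=" + positionLength;
    }
  }

  /**
   * Enumerates every contiguous run of tokens that is at most maxPositionLength tokens long.
   * Assumption: whitespace tokenization replaces real text analysis.
   */
  static List<Candidate> enumerate(String input, int maxPositionLength) {
    List<Candidate> out = new ArrayList<>();
    String trimmed = input.trim();
    if (trimmed.isEmpty()) {
      return out;
    }
    String[] tokens = trimmed.split("\\s+");
    for (int start = 0; start < tokens.length; start++) {
      StringBuilder text = new StringBuilder();
      for (int len = 1; len <= maxPositionLength && start + len <= tokens.length; len++) {
        if (len > 1) {
          text.append(' ');
        }
        text.append(tokens[start + len - 1]);
        out.add(new Candidate(text.toString(), start + 1, len));
      }
    }
    return out;
  }

  public static void main(String[] args) {
    for (Candidate c : enumerate("brown fox jumped", 2)) {
      System.out.println(c);
    }
  }
}
```

With a limit of 2, the three-token input yields five candidates: three single terms and two overlapping two-term phrases.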
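public static void populateStats(List<PhrasesIdentificationComponent.Phrase> phrases, List<NamedList<Object>> shardData)
Populates the phrases with (merged) stats from a remote shard.
public static void populateStats(List<PhrasesIdentificationComponent.Phrase> phrases, Collection<String> fieldNames, SolrIndexSearcher searcher) throws IOException
Populates the phrases with stats from the local index for the specified fields.
Throws: IOException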
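One of the per-field stats that must be populated before it can be read is the conjunction doc count, defined in this class as the number of documents that contain all of the individual indexed terms of a phrase. The sketch below illustrates that definition with plain Java sets. It is only an illustration under stated assumptions: the class ConjunctionCountSketch, the in-memory postings map, and the integer document ids are hypothetical, and the real component reads these numbers from the index rather than from a map.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not how Solr computes the statistic internally.
final class ConjunctionCountSketch {

  /**
   * Counts the document ids that appear in the posting set of every given term.
   * A term with no postings matches no documents, so the result is then 0.
   */
  static long conjunctionDocCount(Map<String, Set<Integer>> postings, List<String> terms) {
    if (terms.isEmpty()) {
      return 0L;
    }
    Set<Integer> none = Collections.emptySet();
    Set<Integer> common = new HashSet<>(postings.getOrDefault(terms.get(0), none));
    for (String term : terms.subList(1, terms.size())) {
      common.retainAll(postings.getOrDefault(term, none));
    }
    return common.size();
  }

  public static void main(String[] args) {
    Map<String, Set<Integer>> postings = new HashMap<>();
    postings.put("brown", new HashSet<>(Arrays.asList(1, 2, 3)));
    postings.put("fox", new HashSet<>(Arrays.asList(2, 3, 4)));
    // Documents 2 and 3 contain both terms.
    System.out.println(conjunctionDocCount(postings, Arrays.asList("brown", "fox")));
  }
}
```

Note that this count says nothing about adjacency: a document is counted when it contains every term anywhere in the field, which is what distinguishes it from the doc frequency of the phrase itself.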
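public static void populateScores(PhrasesIdentificationComponent.PhrasesContextData contextData)
Uses the previously populated stats to populate each Phrase with its scores for the specified fields, and its overall (weighted) total score.
public static void populateScores(List<PhrasesIdentificationComponent.Phrase> phrases, Map<String,Double> fieldWeights, int maxIndexedPositionLength, int maxQueryPositionLength)
Public for testing purposes.
See Also: populateScores(PhrasesIdentificationComponent.PhrasesContextData)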
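This page does not state how per-field scores are combined into the weighted total score. Purely as an illustration of what a weighted combination driven by a Map<String,Double> of field weights can look like, the sketch below computes a weighted average. The averaging formula, the class WeightedScoreSketch, and the handling of missing fields are all assumptions, not the component's actual scoring logic.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the real scoring formula is not documented on this page.
final class WeightedScoreSketch {

  /**
   * Combines per-field scores into one value as a weighted average.
   * Fields that have a weight but no score are skipped (an assumption of this sketch).
   */
  static double totalScore(Map<String, Double> fieldScores, Map<String, Double> fieldWeights) {
    double weightedSum = 0.0;
    double weightSum = 0.0;
    for (Map.Entry<String, Double> e : fieldWeights.entrySet()) {
      Double score = fieldScores.get(e.getKey());
      if (score == null) {
        continue;
      }
      weightedSum += e.getValue() * score;
      weightSum += e.getValue();
    }
    return weightSum == 0.0 ? 0.0 : weightedSum / weightSum;
  }

  public static void main(String[] args) {
    Map<String, Double> scores = new HashMap<>();
    scores.put("title", 0.8);
    scores.put("body", 0.4);
    Map<String, Double> weights = new HashMap<>();
    weights.put("title", 3.0);
    weights.put("body", 1.0);
    // (3.0 * 0.8 + 1.0 * 0.4) / 4.0, approximately 0.7
    System.out.println(totalScore(scores, weights));
  }
}
```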
public NamedList getDetails()
public CharSequence getSubSequence()
public List<PhrasesIdentificationComponent.Phrase> getIndividualIndexedTerms()
Returns the list of "individual" (ie: getPositionLength()==1) terms.
NOTE: Indexed phrases of length 1 are the (sole) individual terms of themselves.
public List<PhrasesIdentificationComponent.Phrase> getLargestIndexedSubPhrases()
Returns the list of (overlapping) sub phrases that have the largest possible size based on the effective value of PhrasesIdentificationComponent.PhrasesContextData.maxIndexedPositionLength.
NOTE: Indexed phrases of length less than the max indexed length are the (sole) largest sub-phrases of themselves.
public List<PhrasesIdentificationComponent.Phrase> getIndexedSuperPhrases()
Returns all phrases larger than this phrase, which fully include this phrase, and are indexed.
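The "largest (overlapping) sub phrases" described for getLargestIndexedSubPhrases can be pictured as a sliding window over a phrase's terms. The sketch below shows that idea on plain string lists. It is a simplified model, not the Solr code: the class LargestSubPhrasesSketch and its method are hypothetical, and terms are represented as strings rather than Phrase objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; models phrases as lists of term strings.
final class LargestSubPhrasesSketch {

  /**
   * Returns the overlapping windows of exactly maxIndexedLength terms.
   * A phrase no longer than maxIndexedLength is its own sole largest sub-phrase.
   */
  static List<List<String>> largestSubPhrases(List<String> terms, int maxIndexedLength) {
    List<List<String>> out = new ArrayList<>();
    if (terms.isEmpty() || maxIndexedLength < 1) {
      return out;
    }
    if (terms.size() <= maxIndexedLength) {
      out.add(new ArrayList<>(terms));
      return out;
    }
    for (int start = 0; start + maxIndexedLength <= terms.size(); start++) {
      out.add(new ArrayList<>(terms.subList(start, start + maxIndexedLength)));
    }
    return out;
  }

  public static void main(String[] args) {
    // Four terms with a max indexed length of 2 give three overlapping windows.
    System.out.println(largestSubPhrases(Arrays.asList("the", "brown", "fox", "jumped"), 2));
    // A single term is returned as its own sole largest sub-phrase.
    System.out.println(largestSubPhrases(Arrays.asList("fox"), 2));
  }
}
```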
public int getPositionStart()
public int getPositionEnd()
public int getPositionLength()
public BitSet getPositionsBitSet()
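As a small illustration of the position model (positions start at 1, and each set bit marks a position the phrase fills), the sketch below builds a java.util.BitSet from a start position and a position length and uses it to test whether two phrases overlap. The class PositionsBitSetSketch is hypothetical, and the overlap check is only one possible use of such a bit set, not something this page documents.

```java
import java.util.BitSet;

// Hypothetical illustration only; mirrors the documented "one bit per filled position" idea.
final class PositionsBitSetSketch {

  /** Builds a bit set with one bit set for each position in [positionStart, positionStart + positionLength). */
  static BitSet positions(int positionStart, int positionLength) {
    if (positionStart < 1 || positionLength < 1) {
      throw new IllegalArgumentException("positions start at 1 and length must be positive");
    }
    BitSet bits = new BitSet(positionStart + positionLength);
    bits.set(positionStart, positionStart + positionLength); // upper bound is exclusive
    return bits;
  }

  /** Two phrases overlap when they share at least one position. */
  static boolean overlaps(BitSet a, BitSet b) {
    return a.intersects(b);
  }

  public static void main(String[] args) {
    BitSet brownFox = positions(2, 2); // fills positions 2 and 3
    BitSet fox = positions(3, 1);      // fills position 3
    BitSet jumped = positions(4, 1);   // fills position 4
    System.out.println(brownFox);                   // prints {2, 3}
    System.out.println(overlaps(brownFox, fox));    // prints true
    System.out.println(overlaps(brownFox, jumped)); // prints false
  }
}
```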
public int getOffsetStart()
public int getOffsetEnd()
public double getTotalScore()
public double getFieldScore(String field)
public long getTTF(String field)
Returns the total term frequency (TTF) of this (indexed) Phrase as a term in the specified field.
NOTE: behavior of calling this method is undefined unless one of the populateStats methods has been called with this field.
public long getConjunctionDocCount(String field)
Returns the number of documents that contain all of the getIndividualIndexedTerms() that make up this Phrase, in the specified field.
NOTE: behavior of calling this method is undefined unless one of the populateStats methods has been called with this field.
public long getDocFreq(String field)
Returns the number of documents that contain this (indexed) Phrase as a term in the specified field.
NOTE: behavior of calling this method is undefined unless one of the populateStats methods has been called with this field.
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.