Class UnInvertedField
- java.lang.Object
  - org.apache.solr.uninverting.DocTermOrds
    - org.apache.solr.search.facet.UnInvertedField
- All Implemented Interfaces: org.apache.lucene.util.Accountable
public class UnInvertedField extends DocTermOrds
Final form of the un-inverted field: each document points to a list of term numbers that are contained in that document. Term numbers are in sorted order, and are encoded as variable-length deltas from the previous term number. Real term numbers start at 2 since 0 and 1 are reserved. A term number of 0 signals the end of the termNumber list.
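The delta encoding described above can be sketched as follows. This is a minimal illustration, not the class's actual code: the class name `TermNumberDeltaSketch`, the constant `TNUM_OFFSET`, and the exact byte layout (7-bit groups, most significant first, high bit as a continuation flag) are assumptions made for the example. Each stored value is the delta plus 2, so the reserved values 0 and 1 never appear as a real entry.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and byte layout are assumptions, not Solr's implementation.
class TermNumberDeltaSketch {
  // Stored values are shifted by 2 so that 0 (end of list) and 1 (reserved) stay free.
  static final int TNUM_OFFSET = 2;

  // Encodes sorted, distinct term numbers as offset deltas, terminated by a 0 byte.
  static byte[] encode(int[] sortedTermNums) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    int prev = 0;
    for (int tnum : sortedTermNums) {
      writeVInt(out, tnum - prev + TNUM_OFFSET);
      prev = tnum;
    }
    out.write(0); // terminator
    return out.toByteArray();
  }

  // Writes 7 bits per byte, most significant group first; the high bit means "more bytes follow".
  static void writeVInt(ByteArrayOutputStream out, int value) {
    boolean started = false;
    for (int shift = 28; shift > 0; shift -= 7) {
      int group = (value >>> shift) & 0x7F;
      if (group != 0 || started) {
        out.write(group | 0x80);
        started = true;
      }
    }
    out.write(value & 0x7F);
  }

  // Decodes the list back into absolute term numbers, stopping at the 0 terminator.
  static List<Integer> decode(byte[] data) {
    List<Integer> result = new ArrayList<>();
    int pos = 0;
    int tnum = 0;
    while (true) {
      int value = 0;
      int b;
      do {
        b = data[pos++] & 0xFF;
        value = (value << 7) | (b & 0x7F);
      } while ((b & 0x80) != 0);
      if (value == 0) {
        break; // end of the termNumber list
      }
      tnum += value - TNUM_OFFSET;
      result.add(tnum);
    }
    return result;
  }
}
```

Because small deltas fit in a single byte, a document whose terms are close together in sort order costs roughly one byte per term in this scheme.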
There is a single int[maxDoc()] which either contains a pointer into a byte[] for the termNumber lists, or directly contains the termNumber list if it fits in the 4 bytes of an integer. If the first byte in the integer is 1, the next 3 bytes are a pointer into a byte[] where the termNumber list starts.
There are actually 256 byte arrays, to compensate for the fact that the pointers into the byte arrays are only 3 bytes long. The correct byte array for a document is a function of its id.
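The per-document index entry and the choice among the 256 byte arrays can be sketched as follows. This is a hedged illustration only: the class and method names are hypothetical, "first byte" is assumed to mean the low-order byte of the int, and selecting the array from bits 16 to 23 of the document id is an assumption about what "a function of its id" means.

```java
// Hypothetical sketch: names, byte order, and the array-selection function are assumptions.
class DocIndexEntrySketch {
  // An entry is treated as a pointer when its low-order byte is the reserved marker 1.
  static boolean isPointer(int entry) {
    return (entry & 0xFF) == 1;
  }

  // Packs a 3-byte offset above the marker byte.
  static int makePointer(int offset) {
    if (offset < 0 || offset > 0xFFFFFF) {
      throw new IllegalArgumentException("offset must fit in 3 bytes: " + offset);
    }
    return (offset << 8) | 1;
  }

  // Recovers the 3-byte offset into the selected byte array.
  static int pointerOffset(int entry) {
    return entry >>> 8;
  }

  // Picks one of 256 byte arrays from the document id (assumed: bits 16..23).
  static int whichArray(int docId) {
    return (docId >>> 16) & 0xFF;
  }
}
```

Under these assumptions, each byte array only has to be addressable with 24 bits, while the 256 arrays together cover far more list data than a single 3-byte pointer could reach.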
To save space and speed up faceting, any term that matches enough documents is not un-inverted: it is skipped while the un-inverted field structure is built, and its count is instead computed with a set intersection during faceting.
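The set-intersection fallback for frequent terms can be illustrated with plain bit sets. This is a simplified sketch under stated assumptions: `java.util.BitSet` stands in for Solr's document sets, the class and method names are hypothetical, and the threshold test (document frequency strictly greater than a `maxTermDocFreq` limit) is assumed rather than taken from the text above.

```java
import java.util.BitSet;

// Hypothetical sketch: BitSet stands in for Solr's document sets.
class BigTermCountSketch {
  // Assumed rule: a term is "big" when it matches more documents than the limit.
  static boolean isBigTerm(int docFreq, int maxTermDocFreq) {
    return docFreq > maxTermDocFreq;
  }

  // Facet count for a big term: size of (documents containing the term) AND (documents in the result set).
  static int countByIntersection(BitSet termDocs, BitSet baseDocs) {
    BitSet both = (BitSet) termDocs.clone(); // clone so the caller's set is not modified
    both.and(baseDocs);
    return both.cardinality();
  }
}
```

The trade-off is that a very frequent term would otherwise add an entry to almost every document's term list, whereas one intersection per such term costs time proportional to the set size only when faceting runs.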
To further save memory, the terms (the actual string values) are not all stored in memory, but a TermIndex is used to convert term numbers to term values only for the terms needed after faceting has completed. Only every 128th term value is stored, along with its corresponding term number, and this is used as an index to find the closest term and iterate until the desired number is hit (very much like Lucene's own internal term index).
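The sampled term index can be sketched as follows. This is an illustration under assumptions, not the TermIndex implementation: the class name is hypothetical, a sorted in-memory `List<String>` stands in for the on-disk term dictionary, and a binary search stands in for seeking a terms enumerator to a stored term value. Only every 128th term is kept in the `sampled` list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: a sorted list stands in for the on-disk term dictionary.
class SampledTermIndexSketch {
  static final int INTERVAL_BITS = 7;           // 2^7 = 128 terms per sample
  static final int INTERVAL = 1 << INTERVAL_BITS;

  private final List<String> dictionary;        // stand-in for the full term dictionary
  private final List<String> sampled = new ArrayList<>(); // every 128th term value

  SampledTermIndexSketch(List<String> sortedTerms) {
    this.dictionary = sortedTerms;
    for (int ord = 0; ord < sortedTerms.size(); ord += INTERVAL) {
      sampled.add(sortedTerms.get(ord));
    }
  }

  int sampledCount() {
    return sampled.size();
  }

  // Converts a term number to its value: jump to the closest sampled term, then step forward.
  String lookup(int ord) {
    int slot = ord >>> INTERVAL_BITS;
    String anchor = sampled.get(slot);
    // Stand-in for seeking the enumerator to the sampled term by value.
    int pos = Collections.binarySearch(dictionary, anchor);
    int remaining = ord - (slot << INTERVAL_BITS);
    while (remaining-- > 0) {
      pos++; // stand-in for advancing the enumerator by one term
    }
    return dictionary.get(pos);
  }
}
```

With this layout a lookup steps over at most 127 terms after the seek, while the in-memory index holds under 1% of the term values.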
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface | UnInvertedField.Callback |
class | UnInvertedField.DocToTerm |
-
Field Summary
-
Fields inherited from class org.apache.solr.uninverting.DocTermOrds
checkForDocValues, DEFAULT_INDEX_INTERVAL_BITS, field, index, indexedTermsArray, maxTermDocFreq, numTermsInField, ordBase, phase1_time, postingsEnum, prefix, sizeOfIndexedStrings, termInstances, tnums, total_time
-
Constructor Summary
Constructors
Constructor | Description
UnInvertedField(String field, SolrIndexSearcher searcher) |
-
Method Summary
Modifier and Type | Method | Description
static UnInvertedField | checkUnInvertedField(String field, SolrIndexSearcher searcher) |
void | collectDocs(org.apache.solr.search.facet.FacetFieldProcessorByArrayUIF processor) |
void | collectDocsGeneric(org.apache.solr.search.facet.FacetFieldProcessorByArrayUIF processor) |
int | getNumTerms() |
static UnInvertedField | getUnInvertedField(String field, SolrIndexSearcher searcher) |
long | memSize() |
protected void | setActualDocFreq(int termNum, int docFreq) | Invoked during DocTermOrds.uninvert(org.apache.lucene.index.LeafReader,Bits,BytesRef) to record the document frequency for each uninverted term.
String | toString() |
protected void | visitTerm(org.apache.lucene.index.TermsEnum te, int termNum) | Called for each term in the field being uninverted.
-
Methods inherited from class org.apache.solr.uninverting.DocTermOrds
getOrdTermsEnum, isEmpty, iterator, lookupTerm, numTerms, ramBytesUsed, uninvert
-
Constructor Detail
-
UnInvertedField
public UnInvertedField(String field, SolrIndexSearcher searcher) throws IOException
- Throws:
IOException
-
Method Detail
-
visitTerm
protected void visitTerm(org.apache.lucene.index.TermsEnum te, int termNum) throws IOException
Called for each term in the field being uninverted. Collects maxTermCounts for all bigTerms as well as storing them in bigTerms.
- Overrides:
visitTerm in class DocTermOrds
- Parameters:
te - positioned at the current term.
termNum - the ID/pointer/ordinal of the current term. Monotonically increasing between calls.
- Throws:
IOException
-
setActualDocFreq
protected void setActualDocFreq(int termNum, int docFreq)
Description copied from class: DocTermOrds
Invoked during DocTermOrds.uninvert(org.apache.lucene.index.LeafReader,Bits,BytesRef) to record the document frequency for each uninverted term.
- Overrides:
setActualDocFreq in class DocTermOrds
-
memSize
public long memSize()
-
getNumTerms
public int getNumTerms()
-
collectDocs
public void collectDocs(org.apache.solr.search.facet.FacetFieldProcessorByArrayUIF processor) throws IOException
- Throws:
IOException
-
collectDocsGeneric
public void collectDocsGeneric(org.apache.solr.search.facet.FacetFieldProcessorByArrayUIF processor) throws IOException
- Throws:
IOException
-
getUnInvertedField
public static UnInvertedField getUnInvertedField(String field, SolrIndexSearcher searcher) throws IOException
- Throws:
IOException
-
checkUnInvertedField
public static UnInvertedField checkUnInvertedField(String field, SolrIndexSearcher searcher) throws IOException
- Throws:
IOException
-