Class LegacyNumericUtils
To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
This class generates terms to achieve this: First, the numerical integer values need to be
converted to bytes. To do so, integer values (32-bit or 64-bit) are made unsigned and their bits are
converted to ASCII chars, 7 bits per char. The resulting byte[] is sortable like the original
integer value (even using UTF-8 sort order). Each value is also prefixed (in the first char) by
the shift value (number of bits removed) used during encoding.
For easy usage, the trie algorithm is implemented for indexing inside LegacyNumericTokenStream, which can index int, long, float, and double. For querying, LegacyNumericRangeQuery implements the query part for the same data types.
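The encoding steps described above can be sketched in plain Java without any Lucene dependency. This is a minimal illustration, not the library's source: the class name PrefixCodedSketch, the helper compareBytes, and the value 0x20 used for the shift prefix are assumptions made for the example.

```java
import java.util.Arrays;

public class PrefixCodedSketch {
    // Assumed value for illustration only; consult LegacyNumericUtils.SHIFT_START_LONG.
    static final int SHIFT_START_LONG = 0x20;

    /** Encodes val with the lowest shift bits removed: one shift byte, then 7 bits per byte. */
    static byte[] encodeLong(long val, int shift) {
        if (shift < 0 || shift > 63) {
            throw new IllegalArgumentException("shift must be 0..63, got " + shift);
        }
        int nChars = (63 - shift) / 7 + 1;              // 7-bit groups needed for the remaining bits
        byte[] out = new byte[nChars + 1];              // one extra leading byte for the shift
        out[0] = (byte) (SHIFT_START_LONG + shift);
        long sortableBits = val ^ 0x8000000000000000L;  // flip the sign bit: signed order becomes unsigned order
        sortableBits >>>= shift;                        // strip the low-order bits
        for (int i = nChars; i > 0; i--) {
            out[i] = (byte) (sortableBits & 0x7f);      // every byte stays in the ASCII range
            sortableBits >>>= 7;
        }
        return out;
    }

    /** Unsigned lexicographic comparison, the order in which terms would be sorted. */
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        long[] values = {-5L, 0L, 3L, 1_000_000L};
        for (long v : values) {
            System.out.println(v + " -> " + Arrays.toString(encodeLong(v, 0)));
        }
    }
}
```

Because the sign bit is flipped before the bits are split into 7-bit groups, comparing two encoded arrays byte by byte gives the same order as comparing the original long values at the same shift.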
- Since:
- 2.9; the API changed in a non-backwards-compatible way in 4.0
- NOTE: This API is for internal purposes only and might change in incompatible ways in the next release.
-
Nested Class Summary
Nested Classes (Modifier and Type, Class: Description)
- static class LegacyNumericUtils.IntRangeBuilder: Deprecated.
- static class LegacyNumericUtils.LongRangeBuilder: Deprecated.
-
Field Summary
Fields (Modifier and Type, Field: Description)
- static final int BUF_SIZE_INT: Deprecated. The maximum term length (used for byte[] buffer size) for encoding int values.
- static final int BUF_SIZE_LONG: Deprecated. The maximum term length (used for byte[] buffer size) for encoding long values.
- static final int PRECISION_STEP_DEFAULT: Deprecated. The default precision step used by LegacyLongField, LegacyDoubleField, LegacyNumericTokenStream, LegacyNumericRangeQuery.
- static final int PRECISION_STEP_DEFAULT_32: Deprecated. The default precision step used by LegacyIntField and LegacyFloatField.
- static final byte SHIFT_START_INT: Deprecated. Integers are stored at lower precision by shifting off lower bits.
- static final byte SHIFT_START_LONG: Deprecated. Longs are stored at lower precision by shifting off lower bits.
-
Method Summary
Methods (Modifier and Type, Method: Description)
- static org.apache.lucene.index.TermsEnum filterPrefixCodedInts(org.apache.lucene.index.TermsEnum termsEnum): Deprecated. Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of 0.
- static org.apache.lucene.index.TermsEnum filterPrefixCodedLongs(org.apache.lucene.index.TermsEnum termsEnum): Deprecated. Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of 0.
- static Integer getMaxInt(org.apache.lucene.index.Terms terms): Deprecated. Returns the maximum int value indexed into this numeric field or null if no terms exist.
- static Long getMaxLong(org.apache.lucene.index.Terms terms): Deprecated. Returns the maximum long value indexed into this numeric field or null if no terms exist.
- static Integer getMinInt(org.apache.lucene.index.Terms terms): Deprecated. Returns the minimum int value indexed into this numeric field or null if no terms exist.
- static Long getMinLong(org.apache.lucene.index.Terms terms): Deprecated. Returns the minimum long value indexed into this numeric field or null if no terms exist.
- static int getPrefixCodedIntShift(org.apache.lucene.util.BytesRef val): Deprecated. Returns the shift value from a prefix encoded int.
- static int getPrefixCodedLongShift(org.apache.lucene.util.BytesRef val): Deprecated. Returns the shift value from a prefix encoded long.
- static void intToPrefixCoded(int val, int shift, org.apache.lucene.util.BytesRefBuilder bytes): Deprecated. Returns prefix coded bits after reducing the precision by shift bits.
- static void longToPrefixCoded(long val, int shift, org.apache.lucene.util.BytesRefBuilder bytes): Deprecated. Returns prefix coded bits after reducing the precision by shift bits.
- static int prefixCodedToInt(org.apache.lucene.util.BytesRef val): Deprecated. Returns an int from prefixCoded bytes.
- static long prefixCodedToLong(org.apache.lucene.util.BytesRef val): Deprecated. Returns a long from prefixCoded bytes.
- static void splitIntRange(LegacyNumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound): Deprecated. Splits an int range recursively.
- static void splitLongRange(LegacyNumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound): Deprecated. Splits a long range recursively.
-
Field Details
-
PRECISION_STEP_DEFAULT
public static final int PRECISION_STEP_DEFAULT
Deprecated.
The default precision step used by LegacyLongField, LegacyDoubleField, LegacyNumericTokenStream, LegacyNumericRangeQuery.
-
PRECISION_STEP_DEFAULT_32
public static final int PRECISION_STEP_DEFAULT_32
Deprecated.
The default precision step used by LegacyIntField and LegacyFloatField.
-
SHIFT_START_LONG
public static final byte SHIFT_START_LONG
Deprecated.
Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_LONG+shift in the first byte.
-
BUF_SIZE_LONG
public static final int BUF_SIZE_LONG
Deprecated.
The maximum term length (used for byte[] buffer size) for encoding long values.
-
SHIFT_START_INT
public static final byte SHIFT_START_INT
Deprecated.
Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT+shift in the first byte.
-
BUF_SIZE_INT
public static final int BUF_SIZE_INT
Deprecated.
The maximum term length (used for byte[] buffer size) for encoding int values.
-
-
Method Details
-
longToPrefixCoded
public static void longToPrefixCoded(long val, int shift, org.apache.lucene.util.BytesRefBuilder bytes)
Deprecated.
Returns prefix coded bits after reducing the precision by shift bits. This method is used by LegacyNumericTokenStream. After encoding, bytes.offset will always be 0.
- Parameters:
- val - the numeric value
- shift - how many bits to strip from the right
- bytes - will contain the encoded value
-
intToPrefixCoded
public static void intToPrefixCoded(int val, int shift, org.apache.lucene.util.BytesRefBuilder bytes)
Deprecated.
Returns prefix coded bits after reducing the precision by shift bits. This method is used by LegacyNumericTokenStream. After encoding, bytes.offset will always be 0.
- Parameters:
- val - the numeric value
- shift - how many bits to strip from the right
- bytes - will contain the encoded value
-
getPrefixCodedLongShift
public static int getPrefixCodedLongShift(org.apache.lucene.util.BytesRef val)
Deprecated.
Returns the shift value from a prefix encoded long.
- Throws:
- NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
-
getPrefixCodedIntShift
public static int getPrefixCodedIntShift(org.apache.lucene.util.BytesRef val)
Deprecated.
Returns the shift value from a prefix encoded int.
- Throws:
- NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
-
prefixCodedToLong
public static long prefixCodedToLong(org.apache.lucene.util.BytesRef val)
Deprecated.
Returns a long from prefixCoded bytes. Rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.
- Throws:
- NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
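To illustrate why the rightmost bits come back as zero for lower precision codes, here is a small self-contained round trip in plain Java. It is a sketch under assumptions, not Lucene's implementation: the class PrefixDecodeSketch, its methods, and the value 0x20 for the shift prefix are hypothetical and chosen for the example only.

```java
public class PrefixDecodeSketch {
    // Assumed value for illustration only; consult LegacyNumericUtils.SHIFT_START_LONG.
    static final int SHIFT_START_LONG = 0x20;

    /** Encodes val after dropping the lowest shift bits (shift byte first, then 7 bits per byte). */
    static byte[] encode(long val, int shift) {
        int nChars = (63 - shift) / 7 + 1;
        byte[] out = new byte[nChars + 1];
        out[0] = (byte) (SHIFT_START_LONG + shift);
        long bits = (val ^ 0x8000000000000000L) >>> shift;
        for (int i = nChars; i > 0; i--) {
            out[i] = (byte) (bits & 0x7f);
            bits >>>= 7;
        }
        return out;
    }

    /** Reads the shift from the first byte and validates it. */
    static int shiftOf(byte[] term) {
        if (term.length == 0) {
            throw new NumberFormatException("Empty term");
        }
        int shift = term[0] - SHIFT_START_LONG;
        if (shift < 0 || shift > 63) {
            throw new NumberFormatException("Invalid shift prefix byte: " + term[0]);
        }
        return shift;
    }

    /** Rebuilds the value; the bits removed by the shift are returned as zeros. */
    static long decode(byte[] term) {
        int shift = shiftOf(term);
        long bits = 0L;
        for (int i = 1; i < term.length; i++) {
            if (term[i] < 0) {
                throw new NumberFormatException("Byte outside the 7-bit range at index " + i);
            }
            bits = (bits << 7) | term[i];
        }
        return (bits << shift) ^ 0x8000000000000000L;  // restore position, then undo the sign flip
    }

    public static void main(String[] args) {
        System.out.println(decode(encode(-42L, 0)));
        System.out.println(decode(encode(123456789L, 16)));
    }
}
```

At shift 0 the round trip reproduces the input exactly. At shift 16 the decoded value equals the input with its lowest 16 bits cleared, which is the behavior the description above refers to.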
-
prefixCodedToInt
public static int prefixCodedToInt(org.apache.lucene.util.BytesRef val)
Deprecated.
Returns an int from prefixCoded bytes. Rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value.
- Throws:
- NumberFormatException - if the supplied BytesRef is not correctly prefix encoded.
-
splitLongRange
public static void splitLongRange(LegacyNumericUtils.LongRangeBuilder builder, int precisionStep, long minBound, long maxBound)
Deprecated.
Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its LegacyNumericUtils.LongRangeBuilder.addRange(BytesRef,BytesRef) method.
This method is used by LegacyNumericRangeQuery.
-
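The idea of splitting a range so that its edges are matched at full precision and its center at lower precision can be sketched as follows. This is a standalone illustration of one way such a split can work, not the library's code: the class RangeSplitSketch and its split method are hypothetical, and instead of calling a builder the sketch collects each sub-range as a {lower, upper, shift} triple.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitSketch {
    /** Splits [min, max] into sub-ranges; each entry is {lower, upper, shift}. */
    static List<long[]> split(int precisionStep, long min, long max) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<long[]> out = new ArrayList<>();
        if (min > max) {
            return out;
        }
        for (int shift = 0; ; shift += precisionStep) {
            long diff = 1L << (shift + precisionStep);          // size of one block at the next precision
            long mask = ((1L << precisionStep) - 1L) << shift;  // bits handled at the current precision
            boolean hasLower = (min & mask) != 0L;               // min is not aligned to the next precision
            boolean hasUpper = (max & mask) != mask;             // max does not end a block of the next precision
            long nextMin = (hasLower ? (min + diff) : min) & ~mask;
            long nextMax = (hasUpper ? (max - diff) : max) & ~mask;
            boolean wrapped = nextMin < min || nextMax > max;    // arithmetic overflow guard

            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                add(out, min, max, shift);                       // nothing left for a coarser level
                break;
            }
            if (hasLower) {
                add(out, min, min | mask, shift);                // exact lower edge
            }
            if (hasUpper) {
                add(out, max & ~mask, max, shift);               // exact upper edge
            }
            min = nextMin;                                       // continue with the coarser center
            max = nextMax;
        }
        return out;
    }

    private static void add(List<long[]> out, long lower, long upper, int shift) {
        // Set the bits that the shift removes, so the triple shows the full covered interval.
        out.add(new long[] {lower, upper | ((1L << shift) - 1L), shift});
    }

    public static void main(String[] args) {
        for (long[] r : split(4, 5L, 100L)) {
            System.out.println("[" + r[0] + ", " + r[1] + "] at shift " + r[2]);
        }
    }
}
```

For split(4, 5, 100) the sketch yields [5, 15] and [96, 100] at shift 0 and [16, 95] at shift 4: the two edges need full-precision terms, while the center is covered by the five 16-wide blocks from 16 to 95 at the coarser level.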
splitIntRange
public static void splitIntRange(LegacyNumericUtils.IntRangeBuilder builder, int precisionStep, int minBound, int maxBound)
Deprecated.
Splits an int range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its LegacyNumericUtils.IntRangeBuilder.addRange(BytesRef,BytesRef) method.
This method is used by LegacyNumericRangeQuery.
-
filterPrefixCodedLongs
public static org.apache.lucene.index.TermsEnum filterPrefixCodedLongs(org.apache.lucene.index.TermsEnum termsEnum)
Deprecated.
Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of 0.
- Parameters:
- termsEnum - the terms enum to filter
- Returns:
- a filtered TermsEnum that only returns prefix coded 64 bit terms with a shift value of 0.
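Since the shift is stored in the first byte of every term, selecting the full-precision terms amounts to checking that byte. The following plain-Java sketch shows the check on a list of byte arrays. It is an illustration only: the class ShiftZeroFilterSketch, the method keepFullPrecision, and the value 0x20 for the shift prefix are assumptions, and the sketch works on a List rather than on a TermsEnum.

```java
import java.util.ArrayList;
import java.util.List;

public class ShiftZeroFilterSketch {
    // Assumed value for illustration only; consult LegacyNumericUtils.SHIFT_START_LONG.
    static final int SHIFT_START_LONG = 0x20;

    /** Keeps only the terms whose leading byte encodes a shift of 0. */
    static List<byte[]> keepFullPrecision(List<byte[]> terms) {
        List<byte[]> kept = new ArrayList<>();
        for (byte[] term : terms) {
            if (term.length > 0 && term[0] - SHIFT_START_LONG == 0) {
                kept.add(term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<byte[]> terms = new ArrayList<>();
        terms.add(new byte[] {0x20, 1, 2});   // shift 0: full precision
        terms.add(new byte[] {0x24, 1});      // shift 4: lower precision
        System.out.println(keepFullPrecision(terms).size() + " full-precision term(s)");
    }
}
```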
-
filterPrefixCodedInts
public static org.apache.lucene.index.TermsEnum filterPrefixCodedInts(org.apache.lucene.index.TermsEnum termsEnum)
Deprecated.
Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of 0.
- Parameters:
- termsEnum - the terms enum to filter
- Returns:
- a filtered TermsEnum that only returns prefix coded 32 bit terms with a shift value of 0.
-
getMinInt
public static Integer getMinInt(org.apache.lucene.index.Terms terms)
Deprecated.
Returns the minimum int value indexed into this numeric field or null if no terms exist.
- Throws:
- IOException
-
getMaxInt
public static Integer getMaxInt(org.apache.lucene.index.Terms terms)
Deprecated.
Returns the maximum int value indexed into this numeric field or null if no terms exist.
- Throws:
- IOException
-
getMinLong
public static Long getMinLong(org.apache.lucene.index.Terms terms)
Deprecated.
Returns the minimum long value indexed into this numeric field or null if no terms exist.
- Throws:
- IOException
-
getMaxLong
public static Long getMaxLong(org.apache.lucene.index.Terms terms)
Deprecated.
Returns the maximum long value indexed into this numeric field or null if no terms exist.
- Throws:
- IOException
-
Use PointValues instead.