Class DenseVectorField
-
- All Implemented Interfaces:
FloatValueFieldType, NumericValueFieldType
public class DenseVectorField extends FloatPointField
Provides a field type to support Lucene's KnnVectorField. See KnnVectorQuery for more details. It supports a fixed-cardinality dimension for the vector and a fixed similarity function. The default similarity is EUCLIDEAN_HNSW (L2). The default algorithm is HNSW. For Lucene 9.1, for example, see HnswGraph for more details about the implementation.
Only the Indexed and Stored attributes are supported.
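To make the terms above concrete, here is a minimal sketch of what a K-nearest-neighbors search over fixed-dimension vectors with Euclidean (L2) distance computes. It is an exact brute-force scan written purely for illustration. It is not how Lucene's HNSW search works (HNSW is an approximate graph search that avoids scanning every vector), and the class and method names are hypothetical, not part of Solr or Lucene.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration; not part of Solr or Lucene. */
final class BruteForceKnn {

  /** Squared Euclidean (L2) distance; both vectors must share one fixed dimension. */
  static float squaredL2(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "dimension mismatch: " + a.length + " vs " + b.length);
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  /** Returns the indexes of the topK stored vectors closest to the query, nearest first. */
  static List<Integer> topK(List<float[]> vectors, float[] query, int topK) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < vectors.size(); i++) {
      ids.add(i);
    }
    // Exact search: every stored vector is compared with the query.
    ids.sort(Comparator.comparingDouble(i -> squaredL2(vectors.get(i), query)));
    return new ArrayList<>(ids.subList(0, Math.min(topK, ids.size())));
  }

  public static void main(String[] args) {
    List<float[]> docs = List.of(
        new float[] {0f, 0f}, new float[] {1f, 1f}, new float[] {5f, 5f});
    System.out.println(topK(docs, new float[] {0.9f, 0.9f}, 2)); // prints [1, 0]
  }
}
```

The dimension check mirrors the fixed-dimension requirement described above: a vector of any other length cannot be compared with the stored ones.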
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.solr.schema.FieldType
FieldType.DefaultAnalyzer, FieldType.MultiValueSelector
-
-
Field Summary
Modifier and Type, Field, Description
static String
DEFAULT_KNN_ALGORITHM
static String
HNSW_ALGORITHM
-
Fields inherited from class org.apache.solr.schema.PointField
TEST_HACK_IGNORE_USELESS_TRIEFIELD_ARGS
-
Fields inherited from class org.apache.solr.schema.NumericFieldType
doubleOrFloat, type
-
Fields inherited from class org.apache.solr.schema.FieldType
ANALYZER, args, AUTO_GENERATE_PHRASE_QUERIES, CHAR_FILTER, CHAR_FILTERS, CLASS_NAME, docValuesFormat, ENABLE_GRAPH_QUERIES, falseProperties, FILTER, FILTERS, INDEX, INDEX_ANALYZER, MULTI_TERM, MULTI_TERM_ANALYZER, POLY_FIELD_SEPARATOR, postingsFormat, properties, QUERY, QUERY_ANALYZER, similarity, SIMILARITY, similarityFactory, SYNONYM_QUERY_STYLE, TOKENIZER, trueProperties, TYPE, TYPE_NAME, typeName
-
Fields inherited from class org.apache.solr.schema.FieldProperties
BINARY, DOC_VALUES, INDEXED, LARGE_FIELD, MULTIVALUED, OMIT_NORMS, OMIT_POSITIONS, OMIT_TF_POSITIONS, REQUIRED, SORT_MISSING_FIRST, SORT_MISSING_LAST, STORE_OFFSETS, STORE_TERMOFFSETS, STORE_TERMPAYLOADS, STORE_TERMPOSITIONS, STORE_TERMVECTORS, STORED, TOKENIZED, UNINVERTIBLE, USE_DOCVALUES_AS_STORED
-
-
Constructor Summary
Constructor, Description
DenseVectorField()
DenseVectorField(int dimension)
DenseVectorField(int dimension, org.apache.lucene.index.VectorEncoding vectorEncoding)
DenseVectorField(int dimension, org.apache.lucene.index.VectorSimilarityFunction similarityFunction, org.apache.lucene.index.VectorEncoding vectorEncoding)
-
Method Summary
Modifier and Type, Method, Description
void
checkSchemaField(SchemaField field)
Checks SchemaField instances constructed using this field type to ensure that they are valid.
org.apache.lucene.index.IndexableField
createField(SchemaField field, Object vectorValue)
Used for adding a document when a field needs to be created from a type and a string.
List<org.apache.lucene.index.IndexableField>
createFields(SchemaField field, Object value)
Given a SchemaField, create one or more IndexableField instances.
int
getDimension()
org.apache.lucene.search.Query
getFieldQuery(QParser parser, SchemaField field, String externalVal)
Not Supported.
Integer
getHnswBeamWidth()
Integer
getHnswMaxConn()
String
getKnnAlgorithm()
org.apache.lucene.search.Query
getKnnVectorQuery(String fieldName, String vectorToSearch, int topK, org.apache.lucene.search.Query filterQuery)
org.apache.lucene.search.Query
getRangeQuery(QParser parser, SchemaField field, String part1, String part2, boolean minInclusive, boolean maxInclusive)
Not Supported.
org.apache.lucene.index.VectorSimilarityFunction
getSimilarityFunction()
org.apache.lucene.search.SortField
getSortField(SchemaField field, boolean top)
Not Supported.
UninvertingReader.Type
getUninversionType(SchemaField sf)
If DocValues is not enabled for a field, but it's indexed, docvalues can be constructed on the fly (uninverted, aka fieldcache) on the first request to sort, facet, etc.
org.apache.lucene.queries.function.ValueSource
getValueSource(SchemaField field, QParser parser)
Called to get the default value source (normally, from the Lucene FieldCache).
DenseVectorParser
getVectorBuilder(Object inputValue, DenseVectorParser.BuilderPhase phase)
Index Time Parsing. The inputValue is an ArrayList with a type that depends on the loader used:
- XMLLoader, CSVLoader produce an ArrayList of String
- JsonLoader produces an ArrayList of Double
- JavabinLoader produces an ArrayList of Float
org.apache.lucene.index.VectorEncoding
getVectorEncoding()
void
init(IndexSchema schema, Map<String,String> args)
NOTE: This method can be removed completely when PointField.TEST_HACK_IGNORE_USELESS_TRIEFIELD_ARGS is removed.
Object
toObject(org.apache.lucene.index.IndexableField f)
Convert the stored-field format to an external object.
-
Methods inherited from class org.apache.solr.schema.FloatPointField
getExactQuery, getPointRangeQuery, getSetQuery, getSingleValueSource, getStoredField, indexedToReadable, readableToIndexed, toNativeType, toObject
-
Methods inherited from class org.apache.solr.schema.PointField
getPrefixQuery, getSingleValueSource, getSpecializedRangeQuery, indexedToReadable, indexedToReadable, isFieldUsed, isPointField, isTokenized, multiValuedFieldCache, storedToIndexed, storedToReadable, toInternal, toInternalByteRef, write
-
Methods inherited from class org.apache.solr.schema.NumericFieldType
getDocValuesRangeQuery, getNumberType, getRangeQueryForFloatDoubleDocValues, getRangeQueryForMultiValuedDoubleDocValues, getRangeQueryForMultiValuedFloatDocValues, getSpecializedExistenceQuery, numericDocValuesRangeQuery, treatUnboundedRangeAsExistence
-
Methods inherited from class org.apache.solr.schema.PrimitiveFieldType
checkSupportsDocValues, getDefaultMultiValueSelectorForSort
-
Methods inherited from class org.apache.solr.schema.FieldType
createField, getAnalyzerProperties, getClassArg, getDocValuesFormat, getExistenceQuery, getFieldTermQuery, getIndexAnalyzer, getNamedPropertyValues, getNonFieldPropertyArgs, getNumericSort, getPostingsFormat, getQueryAnalyzer, getRewriteMethod, getSimilarity, getSimilarityFactory, getSortedNumericSortField, getSortedSetSortField, getSortField, getStringSort, getTypeName, hasProperty, isExplicitAnalyzer, isExplicitQueryAnalyzer, isMultiValued, isPolyField, isUtf8Field, marshalBase64SortValue, marshalSortValue, marshalStringSortValue, readableToIndexed, restrictProps, setArgs, setIndexAnalyzer, setIsExplicitAnalyzer, setIsExplicitQueryAnalyzer, setQueryAnalyzer, setSimilarity, supportsAnalyzers, toExternal, toString, unmarshalBase64SortValue, unmarshalSortValue, unmarshalStringSortValue, useDocValuesAsStored, write
-
-
-
-
Field Detail
-
HNSW_ALGORITHM
public static final String HNSW_ALGORITHM
- See Also:
- Constant Field Values
-
DEFAULT_KNN_ALGORITHM
public static final String DEFAULT_KNN_ALGORITHM
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
DenseVectorField
public DenseVectorField()
-
DenseVectorField
public DenseVectorField(int dimension)
-
DenseVectorField
public DenseVectorField(int dimension, org.apache.lucene.index.VectorEncoding vectorEncoding)
-
DenseVectorField
public DenseVectorField(int dimension, org.apache.lucene.index.VectorSimilarityFunction similarityFunction, org.apache.lucene.index.VectorEncoding vectorEncoding)
-
-
Method Detail
-
init
public void init(IndexSchema schema, Map<String,String> args)
Description copied from class: PointField
NOTE: This method can be removed completely when PointField.TEST_HACK_IGNORE_USELESS_TRIEFIELD_ARGS is removed.
- Overrides:
init
in class PointField
-
getDimension
public int getDimension()
-
getSimilarityFunction
public org.apache.lucene.index.VectorSimilarityFunction getSimilarityFunction()
-
getKnnAlgorithm
public String getKnnAlgorithm()
-
getHnswMaxConn
public Integer getHnswMaxConn()
-
getHnswBeamWidth
public Integer getHnswBeamWidth()
-
getVectorEncoding
public org.apache.lucene.index.VectorEncoding getVectorEncoding()
-
checkSchemaField
public void checkSchemaField(SchemaField field) throws org.apache.solr.common.SolrException
Description copied from class: FieldType
Checks SchemaField instances constructed using this field type to ensure that they are valid.
This method is called by the SchemaField constructor to check that its initialization does not violate any fundamental requirements of the FieldType. Subclasses may choose to throw a SolrException if invariants are violated by the SchemaField.
- Overrides:
checkSchemaField
in class FieldType
- Throws:
org.apache.solr.common.SolrException
-
createFields
public List<org.apache.lucene.index.IndexableField> createFields(SchemaField field, Object value)
Description copied from class: FieldType
Given a SchemaField, create one or more IndexableField instances.
- Overrides:
createFields
in class PointField
- Parameters:
field - the SchemaField
value - The value to add to the field
- Returns:
- An array of IndexableField
- See Also:
FieldType.createField(SchemaField, Object), FieldType.isPolyField()
-
createField
public org.apache.lucene.index.IndexableField createField(SchemaField field, Object vectorValue)
Description copied from class: FieldType
Used for adding a document when a field needs to be created from a type and a string.
By default, the indexed value is the same as the stored value (taken from toInternal()). Having a different representation for external, internal, and indexed would present quite a few problems given the current Lucene architecture. An analyzer for adding docs would need to translate internal->indexed while an analyzer for querying would need to translate external->indexed.
The only other alternative to having internal==indexed would be to have internal==external. In this case, toInternal() should convert to the indexed representation, toExternal() should do nothing, and createField() should *not* call toInternal(), but use the external value and set tokenized=true to get Lucene to convert to the internal (indexed) form. :TODO: clean up and clarify this explanation.
- Overrides:
createField
in class FloatPointField
- See Also:
FieldType.toInternal(java.lang.String)
-
toObject
public Object toObject(org.apache.lucene.index.IndexableField f)
Description copied from class: FieldType
Convert the stored-field format to an external object.
- Overrides:
toObject
in class FloatPointField
- See Also:
FieldType.toInternal(java.lang.String)
-
getVectorBuilder
public DenseVectorParser getVectorBuilder(Object inputValue, DenseVectorParser.BuilderPhase phase)
Index Time Parsing. The inputValue is an ArrayList with a type that depends on the loader used:
- XMLLoader, CSVLoader produce an ArrayList of String
- JsonLoader produces an ArrayList of Double
- JavabinLoader produces an ArrayList of Float
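The loader-dependent element types listed above could be normalized to a single float[] along the following lines. This is a hedged sketch, not the actual DenseVectorParser logic: the class name, the method name, and the explicit dimension check are assumptions made for illustration.

```java
import java.util.List;

/** Hypothetical helper; the real parsing is done by DenseVectorParser. */
final class VectorInputNormalizer {

  /**
   * Converts a list of String, Double, or Float elements into a float[]
   * of the expected dimension.
   */
  static float[] toFloatArray(List<?> inputValue, int dimension) {
    if (inputValue.size() != dimension) {
      throw new IllegalArgumentException(
          "expected " + dimension + " elements but got " + inputValue.size());
    }
    float[] vector = new float[dimension];
    for (int i = 0; i < dimension; i++) {
      Object element = inputValue.get(i);
      if (element instanceof Number) {
        // Double (JSON) and Float (javabin) elements are both Number instances.
        vector[i] = ((Number) element).floatValue();
      } else if (element instanceof String) {
        // XML and CSV input arrives as String elements.
        vector[i] = Float.parseFloat(((String) element).trim());
      } else {
        throw new IllegalArgumentException(
            "unsupported element type at index " + i + ": " + element.getClass().getName());
      }
    }
    return vector;
  }
}
```

For example, `VectorInputNormalizer.toFloatArray(List.of("1.5", "2"), 2)` and `VectorInputNormalizer.toFloatArray(List.of(1.5d, 2.0d), 2)` both yield the same two-element array.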
-
getUninversionType
public UninvertingReader.Type getUninversionType(SchemaField sf)
Description copied from class: FieldType
If DocValues is not enabled for a field, but it's indexed, docvalues can be constructed on the fly (uninverted, aka fieldcache) on the first request to sort, facet, etc. This specifies the structure to use.
This method will not be used if the field is (effectively) uninvertible="false".
- Overrides:
getUninversionType
in class FloatPointField
- Parameters:
sf - field instance
- Returns:
- type to uninvert, or null (to disallow uninversion for the field)
- See Also:
SchemaField.isUninvertible()
-
getValueSource
public org.apache.lucene.queries.function.ValueSource getValueSource(SchemaField field, QParser parser)
Description copied from class: FieldType
Called to get the default value source (normally, from the Lucene FieldCache).
- Overrides:
getValueSource
in class FloatPointField
-
getKnnVectorQuery
public org.apache.lucene.search.Query getKnnVectorQuery(String fieldName, String vectorToSearch, int topK, org.apache.lucene.search.Query filterQuery)
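The vectorToSearch argument is passed as a String, so it has to be turned into numeric components before a search can run. The sketch below shows one way this could be done, assuming the vector is written as a bracketed, comma-separated list such as [1.0, 2.0, 3.0]. The input format, the class name, and the method name are assumptions for illustration; this is not the parsing code Solr actually uses.

```java
/** Hypothetical parser for a textual query vector; not Solr's implementation. */
final class VectorStringParser {

  /** Parses "[v1, v2, ...]" into a float[] and checks it against the field's dimension. */
  static float[] parse(String vectorToSearch, int dimension) {
    String s = vectorToSearch.trim();
    if (s.length() < 2 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
      throw new IllegalArgumentException(
          "expected a vector like [1.0, 2.0], got: " + vectorToSearch);
    }
    String[] parts = s.substring(1, s.length() - 1).split(",");
    if (parts.length != dimension) {
      throw new IllegalArgumentException(
          "expected " + dimension + " components but got " + parts.length);
    }
    float[] vector = new float[dimension];
    for (int i = 0; i < dimension; i++) {
      // Float.parseFloat throws NumberFormatException on non-numeric input.
      vector[i] = Float.parseFloat(parts[i].trim());
    }
    return vector;
  }
}
```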
-
getFieldQuery
public org.apache.lucene.search.Query getFieldQuery(QParser parser, SchemaField field, String externalVal)
Not Supported. Please use the {!knn} query parser to run K nearest neighbors search queries.
- Overrides:
getFieldQuery
in class PointField
- Parameters:
parser - The QParser calling the method
field - The SchemaField of the field to search
externalVal - The String representation of the value to search
- Returns:
- The Query instance. This implementation returns a TermQuery but overriding queries may not
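Because field queries are not supported on this type, searches go through the {!knn} query parser. The following is a minimal sketch of assembling such a query string, assuming the parser's `f` (field name) and `topK` local parameters followed by a bracketed vector. The helper class and method are hypothetical and only build the string; they do not send a request.

```java
/** Hypothetical helper that formats a {!knn} query string. */
final class KnnQueryStrings {

  /** Builds a string such as {!knn f=vector topK=10}[1.0, 2.0, 3.0, 4.0]. */
  static String knnQuery(String field, int topK, float[] vector) {
    StringBuilder sb = new StringBuilder();
    sb.append("{!knn f=").append(field).append(" topK=").append(topK).append("}[");
    for (int i = 0; i < vector.length; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(vector[i]);
    }
    return sb.append(']').toString();
  }
}
```

The resulting string would typically be supplied as the `q` or `fq` parameter of a search request; it must be URL-encoded when sent over HTTP.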
-
getRangeQuery
public org.apache.lucene.search.Query getRangeQuery(QParser parser, SchemaField field, String part1, String part2, boolean minInclusive, boolean maxInclusive)
Not Supported.
- Overrides:
getRangeQuery
in class FieldType
- Parameters:
parser - the QParser calling the method
field - the schema field
part1 - the lower boundary of the range, nulls are allowed.
part2 - the upper boundary of the range, nulls are allowed.
minInclusive - whether the minimum of the range is inclusive or not
maxInclusive - whether the maximum of the range is inclusive or not
- Returns:
- a Query instance to perform range search according to given parameters
-
getSortField
public org.apache.lucene.search.SortField getSortField(SchemaField field, boolean top)
Not Supported
-
-