Class DenseVectorField

All Implemented Interfaces:
FloatValueFieldType, NumericValueFieldType
Direct Known Subclasses:
BinaryQuantizedDenseVectorField, ScalarQuantizedDenseVectorField

public class DenseVectorField extends FloatPointField
Provides a field type to support Lucene's KnnByteVectorField and KnnFloatVectorField. See KnnByteVectorQuery and KnnFloatVectorQuery for more details. It supports a fixed dimension (cardinality) for the vector and a fixed similarity function. The default similarity is EUCLIDEAN_HNSW (L2). The default algorithm is HNSW. See HnswGraph (for example, in Lucene 9.1) for more details about the implementation.
Only Indexed and Stored attributes are supported.
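As a rough illustration of what an L2 similarity over fixed-dimension vectors involves, the sketch below computes a squared Euclidean distance and turns it into a higher-is-better score. The class and method names are hypothetical and are not part of Solr or Lucene; the mapping 1 / (1 + squared distance) is an assumption made for this example rather than a statement of the exact scoring used by the field type.

```java
// Illustrative sketch only; EuclideanSketch is a hypothetical helper, not a Solr or Lucene class.
final class EuclideanSketch {

  /** Returns the squared L2 distance between two vectors of the same dimension. */
  static float squareDistance(float[] a, float[] b) {
    if (a.length != b.length) {
      // A fixed dimension means every vector in the field must have the same length.
      throw new IllegalArgumentException(
          "dimension mismatch: " + a.length + " != " + b.length);
    }
    float sum = 0f;
    for (int i = 0; i < a.length; i++) {
      float diff = a[i] - b[i];
      sum += diff * diff;
    }
    return sum;
  }

  /** Maps the distance to a score in (0, 1]; assumed form: 1 / (1 + d^2). */
  static float score(float[] a, float[] b) {
    return 1f / (1f + squareDistance(a, b));
  }
}
```

Under this assumed mapping, identical vectors score 1 and the score decreases toward 0 as the vectors move apart.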
  • Field Details

  • Constructor Details

    • DenseVectorField

      public DenseVectorField()
    • DenseVectorField

      public DenseVectorField(int dimension)
    • DenseVectorField

      public DenseVectorField(int dimension, org.apache.lucene.index.VectorEncoding vectorEncoding)
    • DenseVectorField

      public DenseVectorField(int dimension, org.apache.lucene.index.VectorSimilarityFunction similarityFunction, org.apache.lucene.index.VectorEncoding vectorEncoding)
  • Method Details

    • init

      public void init(IndexSchema schema, Map<String,String> args)
      Description copied from class: PointField
      NOTE: This method can be removed completely when PointField.TEST_HACK_IGNORE_USELESS_TRIEFIELD_ARGS is removed
      Overrides:
      init in class PointField
    • getDimension

      public int getDimension()
    • getSimilarityFunction

      public org.apache.lucene.index.VectorSimilarityFunction getSimilarityFunction()
    • getKnnAlgorithm

      public String getKnnAlgorithm()
    • getHnswMaxConn

      @Deprecated public Integer getHnswMaxConn()
      Deprecated.
    • getHnswBeamWidth

      @Deprecated public Integer getHnswBeamWidth()
      Deprecated.
    • getHnswM

      public Integer getHnswM()
    • getHnswEfConstruction

      public Integer getHnswEfConstruction()
    • getVectorEncoding

      public org.apache.lucene.index.VectorEncoding getVectorEncoding()
    • getCuvsWriterThreads

      public int getCuvsWriterThreads()
    • getCuvsIntGraphDegree

      public int getCuvsIntGraphDegree()
    • getCuvsGraphDegree

      public int getCuvsGraphDegree()
    • getCuvsHnswLayers

      public int getCuvsHnswLayers()
    • getCuvsHnswMaxConn

      public int getCuvsHnswMaxConn()
    • getCuvsHnswEfConstruction

      public int getCuvsHnswEfConstruction()
    • enableDocValuesByDefault

      protected boolean enableDocValuesByDefault()
      Description copied from class: FieldType
      Returns whether this field type should enable docValues by default for schemaVersion >= 1.7. This should not be enabled for fields that did not have docValues implemented by Solr 9.7, as users may have indexed documents without docValues (since they weren't supported). Flipping the default docValues values when they upgrade to a new version will break their index compatibility.

      New field types can enable this without issue, as long as they support docValues.

      Overrides:
      enableDocValuesByDefault in class PrimitiveFieldType
    • checkSchemaField

      public void checkSchemaField(SchemaField field) throws org.apache.solr.common.SolrException
      Description copied from class: FieldType
      Checks SchemaField instances constructed using this field type to ensure that they are valid.

      This method is called by the SchemaField constructor to check that its initialization does not violate any fundamental requirements of the FieldType. Subclasses may choose to throw a SolrException if invariants are violated by the SchemaField.

      Overrides:
      checkSchemaField in class FieldType
      Throws:
      org.apache.solr.common.SolrException
    • createFields

      public List<org.apache.lucene.index.IndexableField> createFields(SchemaField field, Object value)
      Description copied from class: FieldType
      Given a SchemaField, create one or more IndexableField instances
      Overrides:
      createFields in class PointField
      Parameters:
      field - the SchemaField
      value - The value to add to the field
      Returns:
      A list of IndexableField
    • createField

      public org.apache.lucene.index.IndexableField createField(SchemaField field, Object vectorValue)
      Description copied from class: FieldType
      Used for adding a document when a field needs to be created from a type and a string.

      By default, the indexed value is the same as the stored value (taken from toInternal()). Having a different representation for external, internal, and indexed would present quite a few problems given the current Lucene architecture. An analyzer for adding docs would need to translate internal->indexed while an analyzer for querying would need to translate external->indexed.

      The only other alternative to having internal==indexed would be to have internal==external. In this case, toInternal should convert to the indexed representation, toExternal() should do nothing, and createField() should *not* call toInternal, but use the external value and set tokenized=true to get Lucene to convert to the internal(indexed) form. :TODO: clean up and clarify this explanation.

      Overrides:
      createField in class FloatPointField
    • toObject

      public Object toObject(org.apache.lucene.index.IndexableField f)
      Description copied from class: FieldType
      Convert the stored-field format to an external object.
      Overrides:
      toObject in class FloatPointField
    • getVectorBuilder

      public DenseVectorParser getVectorBuilder(Object inputValue, DenseVectorParser.BuilderPhase phase)
      Index Time Parsing
      The inputValue is an ArrayList with a type that depends on the loader used:
      - XMLLoader and CSVLoader produce an ArrayList of String
      - JsonLoader produces an ArrayList of Double
      - JavabinLoader produces an ArrayList of Float
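The loader-dependent element types described for getVectorBuilder can be normalized into a single float array. The following is a minimal sketch of that idea under stated assumptions: VectorInputSketch and toFloatArray are hypothetical names, and this is not how DenseVectorParser is actually implemented.

```java
import java.util.List;

// Illustrative sketch only; VectorInputSketch is a hypothetical helper, not a Solr class.
final class VectorInputSketch {

  /**
   * Converts a list of String, Double, or Float elements into a float[] of the
   * expected dimension, rejecting inputs of the wrong size or element type.
   */
  static float[] toFloatArray(List<?> inputValue, int dimension) {
    if (inputValue.size() != dimension) {
      throw new IllegalArgumentException(
          "expected " + dimension + " elements but got " + inputValue.size());
    }
    float[] vector = new float[dimension];
    for (int i = 0; i < dimension; i++) {
      Object element = inputValue.get(i);
      if (element instanceof Number) {
        // Double (JSON) and Float (javabin) elements both land here.
        vector[i] = ((Number) element).floatValue();
      } else if (element instanceof String) {
        // String elements (XML, CSV) are parsed as decimal numbers.
        vector[i] = Float.parseFloat(((String) element).trim());
      } else {
        throw new IllegalArgumentException(
            "unsupported element at index " + i + ": " + String.valueOf(element));
      }
    }
    return vector;
  }
}
```

Treating Double and Float through the common Number supertype keeps the sketch short; a real parser may validate each element type more strictly.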
    • buildKnnVectorsFormat

      public org.apache.lucene.codecs.KnnVectorsFormat buildKnnVectorsFormat()
    • getUninversionType

      public UninvertingReader.Type getUninversionType(SchemaField sf)
      Description copied from class: FieldType
      If DocValues is not enabled for a field, but it's indexed, docvalues can be constructed on the fly (uninverted, aka fieldcache) on the first request to sort, facet, etc. This specifies the structure to use.

      This method will not be used if the field is (effectively) uninvertible="false"

      Overrides:
      getUninversionType in class FloatPointField
      Parameters:
      sf - field instance
      Returns:
      type to uninvert, or null (to disallow uninversion for the field)
    • getValueSource

      public org.apache.lucene.queries.function.ValueSource getValueSource(SchemaField field, QParser parser)
      Description copied from class: FieldType
      Called to get the default value source (normally from the Lucene FieldCache).
      Overrides:
      getValueSource in class FloatPointField
    • getKnnVectorQuery

      public org.apache.lucene.search.Query getKnnVectorQuery(String fieldName, String vectorToSearch, int topK, int efSearch, org.apache.lucene.search.Query filterQuery, org.apache.lucene.search.Query seedQuery, KnnQParser.EarlyTerminationParams earlyTermination, Integer filteredSearchThreshold)
    • getFieldQuery

      public org.apache.lucene.search.Query getFieldQuery(QParser parser, SchemaField field, String externalVal)
      Not Supported. Please use the {!knn} query parser to run K nearest neighbors search queries.
      Overrides:
      getFieldQuery in class PointField
      Parameters:
      parser - The QParser calling the method
      field - The SchemaField of the field to search
      externalVal - The String representation of the value to search
      Returns:
      The Query instance. This implementation returns a TermQuery but overriding queries may not
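Because getFieldQuery is not supported, nearest-neighbor searches go through the {!knn} query parser. The sketch below assembles such a query string using the parser's f and topK local parameters; KnnQuerySketch and knnQuery are hypothetical names for this example, and the field name passed in the usage note is an assumption.

```java
// Illustrative sketch only; KnnQuerySketch is a hypothetical helper, not a Solr class.
final class KnnQuerySketch {

  /** Builds a query string of the form {!knn f=<field> topK=<k>}[v1, v2, ...]. */
  static String knnQuery(String field, int topK, float[] vector) {
    StringBuilder sb = new StringBuilder();
    sb.append("{!knn f=").append(field).append(" topK=").append(topK).append("}[");
    for (int i = 0; i < vector.length; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(vector[i]);
    }
    return sb.append(']').toString();
  }
}
```

For instance, knnQuery("vector", 10, new float[] {1.0f, 2.5f}) yields a string that could be sent as the q parameter of a search request against a field named vector.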
    • getRangeQuery

      public org.apache.lucene.search.Query getRangeQuery(QParser parser, SchemaField field, String part1, String part2, boolean minInclusive, boolean maxInclusive)
      Not Supported
      Overrides:
      getRangeQuery in class FieldType
      Parameters:
      parser - the QParser calling the method
      field - the schema field
      part1 - the lower boundary of the range, nulls are allowed.
      part2 - the upper boundary of the range, nulls are allowed.
      minInclusive - whether the minimum of the range is inclusive or not
      maxInclusive - whether the maximum of the range is inclusive or not
      Returns:
      a Query instance to perform range search according to given parameters
    • getSortField

      public org.apache.lucene.search.SortField getSortField(SchemaField field, boolean top)
      Not Supported
      Overrides:
      getSortField in class PointField