Class DFRSimilarityFactory

java.lang.Object
org.apache.solr.schema.SimilarityFactory
org.apache.solr.search.similarities.DFRSimilarityFactory

public class DFRSimilarityFactory extends SimilarityFactory
Factory for DFRSimilarity

You must specify an implementation for each of the three DFR components, given as strings. In general the models are parameter-free, but each normalization other than none takes a floating-point parameter (see below):

  1. basicModel: Basic model of information content:
    • G: Geometric approximation of Bose-Einstein
    • I(n): Inverse document frequency
    • I(ne): Inverse expected document frequency [mixture of Poisson and IDF]
    • I(F): Inverse term frequency [approximation of I(ne)]
  2. afterEffect: First normalization of information gain:
    • L: Laplace's law of succession
    • B: Ratio of two Bernoulli processes
  3. normalization: Second (length) normalization:
    • H1: Uniform distribution of term frequency
      • parameter c (float): hyper-parameter that controls the term frequency normalization with respect to the document length. The default is 1
    • H2: Term frequency density inversely related to length
      • parameter c (float): hyper-parameter that controls the term frequency normalization with respect to the document length. The default is 1
    • H3: Term frequency normalization provided by Dirichlet prior
      • parameter mu (float): smoothing parameter μ. The default is 800
    • Z: Term frequency normalization provided by a Zipfian relation
      • parameter z (float): represents A/(A+1) where A measures the specificity of the language. The default is 0.3
    • none: No second normalization
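
The rules above (three required components, a fixed set of names for each, and one float parameter with a documented default for H1, H2, H3 and Z) can be sketched as a small stand-alone check. This is a minimal illustration, not the factory's actual code: the class DfrConfigSketch, its methods, and the use of a plain Map for the settings are all hypothetical, and it does not depend on Solr or Lucene.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of Solr): checks DFR settings against the values listed above. */
final class DfrConfigSketch {

    // Allowed names, copied from the component lists above.
    static final List<String> BASIC_MODELS = List.of("G", "I(n)", "I(ne)", "I(F)");
    static final List<String> AFTER_EFFECTS = List.of("L", "B");

    // Normalization name -> name of its float parameter ("" when it takes none).
    static final Map<String, String> NORMALIZATION_PARAM =
        Map.of("H1", "c", "H2", "c", "H3", "mu", "Z", "z", "none", "");

    // Documented default for each normalization's parameter.
    static final Map<String, Float> PARAM_DEFAULT =
        Map.of("H1", 1f, "H2", 1f, "H3", 800f, "Z", 0.3f);

    /** Validates the three required settings; throws if one is missing or unknown. */
    static void validate(Map<String, String> settings) {
        require(settings, "basicModel", BASIC_MODELS);
        require(settings, "afterEffect", AFTER_EFFECTS);
        require(settings, "normalization", List.copyOf(NORMALIZATION_PARAM.keySet()));
    }

    private static void require(Map<String, String> settings, String key, List<String> allowed) {
        String value = settings.get(key);
        // Null check comes first: immutable lists reject contains(null).
        if (value == null || !allowed.contains(value)) {
            throw new IllegalArgumentException(key + " must be one of " + allowed + ", got: " + value);
        }
    }

    /** Returns the configured normalization parameter, or the documented default if it is not set. */
    static float normalizationParameter(Map<String, String> settings) {
        validate(settings);
        String normalization = settings.get("normalization");
        String paramName = NORMALIZATION_PARAM.get(normalization);
        if (paramName.isEmpty()) {
            throw new IllegalArgumentException("normalization 'none' takes no parameter");
        }
        String raw = settings.get(paramName);
        return raw == null ? PARAM_DEFAULT.get(normalization) : Float.parseFloat(raw);
    }

    public static void main(String[] args) {
        Map<String, String> settings =
            Map.of("basicModel", "I(F)", "afterEffect", "B", "normalization", "H3");
        System.out.println(normalizationParameter(settings)); // 800.0 (default mu)
    }
}
```

Keeping the allowed names and the defaults in lookup tables makes the correspondence with the list above easy to audit; a real factory would build the matching Lucene component objects instead of returning a bare float.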

Optional settings:

  • discountOverlaps (bool): Sets the value reported by Similarity.getDiscountOverlaps()
WARNING: This API is experimental and might change in incompatible ways in the next release.
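
To illustrate what discountOverlaps controls: in Lucene, overlap tokens are tokens with a position increment of 0 (for example, a synonym stacked on the same position as the original term), and discounting them leaves them out of the field length used for length normalization. The sketch below shows only that counting rule. DiscountOverlapsSketch and fieldLength are hypothetical names, and the code is not taken from Lucene or Solr.

```java
/** Hypothetical helper (not part of Solr or Lucene): counts field length with or without overlap tokens. */
final class DiscountOverlapsSketch {

    /**
     * @param positionIncrements one entry per token; 0 marks a token stacked on the previous position
     * @param discountOverlaps   when true, tokens with increment 0 are left out of the length
     */
    static int fieldLength(int[] positionIncrements, boolean discountOverlaps) {
        int length = 0;
        for (int increment : positionIncrements) {
            if (discountOverlaps && increment == 0) {
                continue; // overlap token: skipped when discounting
            }
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        // Four tokens; the third shares a position with the second (e.g. an injected synonym).
        int[] increments = {1, 1, 0, 1};
        System.out.println(fieldLength(increments, true));  // 3
        System.out.println(fieldLength(increments, false)); // 4
    }
}
```

With discounting enabled, documents whose analysis chain injects many synonyms are not treated as longer than they really are by the second (length) normalization.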
  • Constructor Details

    • DFRSimilarityFactory

      public DFRSimilarityFactory()
  • Method Details

    • init

      public void init(org.apache.solr.common.params.SolrParams params)
      Overrides:
      init in class SimilarityFactory
    • getSimilarity

      public org.apache.lucene.search.similarities.Similarity getSimilarity()
      Specified by:
      getSimilarity in class SimilarityFactory