Class SweetSpotSimilarityFactory


  • public class SweetSpotSimilarityFactory
    extends ClassicSimilarityFactory

    Factory for SweetSpotSimilarity.

    SweetSpotSimilarity is an extension of ClassicSimilarity that provides additional tuning options for specifying the "sweetspot" of optimal tf and lengthNorm values in the source data.

    In addition to the discountOverlaps init param supported by ClassicSimilarityFactory, the following sets of init params are supported by this factory:

    • Length Norm Settings:
      • lengthNormMin (int)
      • lengthNormMax (int)
      • lengthNormSteepness (float)
    • Baseline TF Settings:
      • baselineTfBase (float)
      • baselineTfMin (float)
    • Hyperbolic TF Settings:
      • hyperbolicTfMin (float)
      • hyperbolicTfMax (float)
      • hyperbolicTfBase (double)
      • hyperbolicTfOffset (float)

    Note:

    • If any individual setting from one of the sets above is specified, then all settings from that set must be specified.
    • If Baseline TF settings are specified, then Hyperbolic TF settings are not permitted, and vice versa. (The settings specified determine whether SweetSpotSimilarity.baselineTf(float) or SweetSpotSimilarity.hyperbolicTf(float) will be used.)
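
    The two rules in the note can be sketched as a small standalone check. This is only an illustration: SweetSpotParamCheck and its methods are hypothetical names, not part of Solr, and the real factory may report errors differently. The sketch takes the set of init param names that were supplied and either rejects the combination or reports which TF set was chosen.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, NOT part of Solr: it only mirrors the two rules in the note.
final class SweetSpotParamCheck {
    static final List<String> LENGTH_NORM =
        List.of("lengthNormMin", "lengthNormMax", "lengthNormSteepness");
    static final List<String> BASELINE_TF =
        List.of("baselineTfBase", "baselineTfMin");
    static final List<String> HYPERBOLIC_TF =
        List.of("hyperbolicTfMin", "hyperbolicTfMax", "hyperbolicTfBase", "hyperbolicTfOffset");

    /** Rule 1: a set is either fully specified (true), absent (false), or an error. */
    static boolean checkAllOrNone(String label, List<String> names, Set<String> given) {
        Set<String> present = new LinkedHashSet<>(names);
        present.retainAll(given);
        if (present.isEmpty()) {
            return false;
        }
        if (present.size() != names.size()) {
            throw new IllegalArgumentException(
                label + " settings require all of " + names + " but only got " + present);
        }
        return true;
    }

    /** Returns "baseline", "hyperbolic", or "unspecified" for a valid combination. */
    static String validate(Set<String> given) {
        checkAllOrNone("Length Norm", LENGTH_NORM, given);
        boolean baseline = checkAllOrNone("Baseline TF", BASELINE_TF, given);
        boolean hyperbolic = checkAllOrNone("Hyperbolic TF", HYPERBOLIC_TF, given);
        // Rule 2: the two TF sets are mutually exclusive.
        if (baseline && hyperbolic) {
            throw new IllegalArgumentException(
                "Baseline TF and Hyperbolic TF settings cannot be combined");
        }
        return baseline ? "baseline" : hyperbolic ? "hyperbolic" : "unspecified";
    }
}
```

    For instance, under these assumptions, validate(Set.of("baselineTfBase", "baselineTfMin")) returns "baseline", while validate(Set.of("lengthNormMin")) throws because the Length Norm set is only partially specified.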

    Example usage...

     <!-- using baseline TF -->
     <fieldType name="text_baseline" class="solr.TextField"
                indexed="true" stored="false">
       <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
       <similarity class="solr.SweetSpotSimilarityFactory">
         <!-- TF -->
         <float name="baselineTfMin">6.0</float>
         <float name="baselineTfBase">1.5</float>
         <!-- plateau norm -->
         <int name="lengthNormMin">3</int>
         <int name="lengthNormMax">5</int>
         <float name="lengthNormSteepness">0.5</float>
       </similarity>
     </fieldType>
     
     <!-- using hyperbolic TF -->
     <fieldType name="text_hyperbolic" class="solr.TextField"
                indexed="true" stored="false">
       <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
       <similarity class="solr.SweetSpotSimilarityFactory">
         <float name="hyperbolicTfMin">3.3</float>
         <float name="hyperbolicTfMax">7.7</float>
         <double name="hyperbolicTfBase">2.718281828459045</double> <!-- e -->
         <float name="hyperbolicTfOffset">5.0</float>
         <!-- plateau norm, shallower slope -->
         <int name="lengthNormMin">1</int>
         <int name="lengthNormMax">5</int>
         <float name="lengthNormSteepness">0.2</float>
       </similarity>
     </fieldType>
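
    To get a feel for what the settings in the two examples do, the curves can be approximated in plain Java without Lucene on the classpath. The formulas below are assumptions modeled on the SweetSpotSimilarity javadocs, not a copy of the library code, and SweetSpotCurves is a hypothetical class name; verify the exact behavior against the Lucene version in use.

```java
// Standalone sketch; does not call Lucene. The formulas are ASSUMPTIONS modeled on
// the SweetSpotSimilarity javadocs and may differ from the actual implementation.
final class SweetSpotCurves {

    /** Plateau norm: 1.0 for lengths in [min, max], falling off outside that range. */
    static double lengthNorm(int numTerms, int min, int max, float steepness) {
        // Distance from the plateau; zero anywhere inside [min, max].
        double distance = Math.abs(numTerms - min) + Math.abs(numTerms - max) - (max - min);
        return 1.0 / Math.sqrt(steepness * distance + 1.0);
    }

    /** Baseline tf: constant 'base' for freq <= min, square-root growth above it. */
    static double baselineTf(double freq, double base, double min) {
        if (freq == 0.0) {
            return 0.0;
        }
        return freq <= min ? base : Math.sqrt(freq + base * base - min);
    }

    /** Hyperbolic tf: tanh-shaped curve rising from min to max, centered on offset. */
    static double hyperbolicTf(double freq, double min, double max, double base, double offset) {
        if (freq == 0.0) {
            return 0.0;
        }
        double x = freq - offset;
        double up = Math.pow(base, x);
        double down = Math.pow(base, -x);
        double result = min + (max - min) / 2.0 * ((up - down) / (up + down) + 1.0);
        // Very large freq overflows to infinity/infinity; treat that as the upper bound.
        return Double.isNaN(result) ? max : result;
    }

    public static void main(String[] args) {
        // Values taken from the "text_baseline" example.
        System.out.println(lengthNorm(4, 3, 5, 0.5f));   // inside the plateau: 1.0
        System.out.println(lengthNorm(8, 3, 5, 0.5f));   // past the plateau: 0.5
        System.out.println(baselineTf(10.0, 1.5, 6.0));  // sqrt(6.25) = 2.5
        // Values taken from the "text_hyperbolic" example; at freq == offset the
        // curve sits roughly midway between hyperbolicTfMin and hyperbolicTfMax.
        System.out.println(hyperbolicTf(5.0, 3.3, 7.7, Math.E, 5.0));
    }
}
```

    Under these assumed formulas, lengthNormMin and lengthNormMax bound the range of field lengths that receive the full norm, and a smaller lengthNormSteepness (0.2 in the second example) makes the penalty outside that range fall off more gradually.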
     
    See Also:
    The javadocs for the individual methods in SweetSpotSimilarity for SVG diagrams showing how each function behaves with various settings/inputs.