Class StatsCache

  • All Implemented Interfaces:
    PluginInfoInitialized
    Direct Known Subclasses:
    ExactStatsCache, LocalStatsCache

    public abstract class StatsCache
    extends Object
    implements PluginInfoInitialized
    This class represents a cache of global document frequency information for selected terms. This information is periodically updated from all shards, either through scheduled events of some kind, or on every request when no global stats are available for the terms involved in the query (or when this information is stale due to changes in the shards).

    There are instances of this class at the aggregator node (where the partial data from shards is aggregated), and on each core involved in a shard request (where this data is maintained and updated from the aggregator's cache).

    • Constructor Detail

      • StatsCache

        public StatsCache()
    • Method Detail

      • retrieveStatsRequest

        public ShardRequest retrieveStatsRequest(ResponseBuilder rb)
        Creates a ShardRequest to retrieve per-shard stats related to the current query and the current state of the requester's StatsCache.

        This method updates the cache metrics and calls doRetrieveStatsRequest(ResponseBuilder).

        Parameters:
        rb - contains current request
        Returns:
        shard request to retrieve stats for terms in the current request, or null if no additional request is needed (e.g. if the information in global cache is already sufficient to satisfy this request).
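
The "update metrics, then delegate" behaviour described above can be pictured as a template method. The following is only a minimal sketch under stated assumptions: SketchStatsCache, KnownTermsCache, the retrieveStatsCalls counter, and the use of a plain Set<String> in place of ResponseBuilder and ShardRequest are all hypothetical stand-ins, not Solr's actual implementation.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical stand-in for a stats cache; this is NOT the Solr class.
abstract class SketchStatsCache {
    // Hypothetical metric: how many times a stats request was considered.
    final LongAdder retrieveStatsCalls = new LongAdder();

    // Public entry point: record the metric, then delegate to the subclass hook.
    public final Set<String> retrieveStatsRequest(Set<String> queryTerms) {
        retrieveStatsCalls.increment();
        return doRetrieveStatsRequest(queryTerms);
    }

    // Returns the terms to ask the shards about, or null if nothing is missing.
    protected abstract Set<String> doRetrieveStatsRequest(Set<String> queryTerms);
}

// Hypothetical subclass that only remembers which terms already have global stats.
class KnownTermsCache extends SketchStatsCache {
    private final Set<String> termsWithGlobalStats = new HashSet<>();

    void remember(String term) {
        termsWithGlobalStats.add(term);
    }

    @Override
    protected Set<String> doRetrieveStatsRequest(Set<String> queryTerms) {
        Set<String> missing = new LinkedHashSet<>(queryTerms);
        missing.removeAll(termsWithGlobalStats);
        // null signals that no additional shard request is needed.
        return missing.isEmpty() ? null : missing;
    }
}
```

In this sketch a null return plays the same role as in the documented contract: the cached global information already covers every term in the request.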
      • mergeToGlobalStats

        public void mergeToGlobalStats(SolrQueryRequest req,
                                       List<ShardResponse> responses)
        Process shard responses that contain partial local stats. Usually this entails combining per-shard stats for each term.

        This method updates the cache metrics and calls doMergeToGlobalStats(SolrQueryRequest, List).

        Parameters:
        req - query request
        responses - responses from shards containing local stats for each shard
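
As an illustration of "combining per-shard stats for each term", one plausible reading is a per-term sum of document frequency and total term frequency across shards. The sketch below assumes exactly that; ShardTermStats and MergeStatsSketch are hypothetical names, and the nested lists stand in for the real List<ShardResponse>.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-term statistics reported by a single shard.
class ShardTermStats {
    final String term;
    final long docFreq;
    final long totalTermFreq;

    ShardTermStats(String term, long docFreq, long totalTermFreq) {
        this.term = term;
        this.docFreq = docFreq;
        this.totalTermFreq = totalTermFreq;
    }
}

class MergeStatsSketch {
    // Sums docFreq (index 0) and totalTermFreq (index 1) per term over all shards.
    static Map<String, long[]> merge(List<List<ShardTermStats>> shardResponses) {
        Map<String, long[]> global = new HashMap<>();
        for (List<ShardTermStats> response : shardResponses) {
            for (ShardTermStats s : response) {
                long[] acc = global.computeIfAbsent(s.term, t -> new long[2]);
                acc[0] += s.docFreq;
                acc[1] += s.totalTermFreq;
            }
        }
        return global;
    }
}
```

A term that appears on only one shard simply keeps that shard's values, since there is nothing else to add to them.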
      • receiveGlobalStats

        public void receiveGlobalStats(SolrQueryRequest req)
        Receive global stats data from the leader and update a local cache of global stats with this global data. This event occurs either as a separate request, or together with the regular query request, in which case this method is called first, before preparing a QueryCommand to be submitted to the local SolrIndexSearcher.

        This method updates the cache metrics and calls doReceiveGlobalStats(SolrQueryRequest).

        Parameters:
        req - query request with global stats data
      • doReceiveGlobalStats

        protected abstract void doReceiveGlobalStats(SolrQueryRequest req)
      • get

        public StatsSource get(SolrQueryRequest req)
        Prepare a StatsSource that provides stats information to perform local scoring (to be precise, to build a local Weight from the query).

        This method updates the cache metrics and calls doGet(SolrQueryRequest).

        Parameters:
        req - query request
        Returns:
        an instance of StatsSource to use in creating a query Weight
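
To make the role of the returned stats source more concrete, here is a small sketch of a source that prefers cached global document frequencies and falls back to local ones. FallbackStatsSource and its docFreq method are hypothetical, and the fallback policy is an assumption for illustration; the real StatsSource interface and the behaviour of concrete subclasses may differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stats source: global document frequencies first, local ones as a fallback.
class FallbackStatsSource {
    private final Map<String, Long> globalDocFreq;
    private final Map<String, Long> localDocFreq;

    FallbackStatsSource(Map<String, Long> globalDocFreq, Map<String, Long> localDocFreq) {
        // Defensive copies so later changes by the caller do not affect scoring.
        this.globalDocFreq = new HashMap<>(globalDocFreq);
        this.localDocFreq = new HashMap<>(localDocFreq);
    }

    // Document frequency used when building a local Weight for the term.
    long docFreq(String term) {
        Long global = globalDocFreq.get(term);
        if (global != null) {
            return global;
        }
        // Assumption: with no global entry, only the local value is available.
        return localDocFreq.getOrDefault(term, 0L);
    }
}
```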
      • clear

        public void clear()
        Clear cached statistics.
      • approxCheckMissingStats

        public int approxCheckMissingStats(ResponseBuilder rb,
                                           StatsSource statsSource,
                                           Consumer<org.apache.lucene.index.Term> missingTermStats,
                                           Consumer<String> missingFieldStats)
                                    throws IOException
        Check if the statsSource is missing some term or field statistics info, which then needs to be retrieved.

        NOTE: this uses the local IndexReader for query rewriting, which may expand the query to fewer (or different) terms than rewriting the same query on other shards' readers. As a result, this method may fail to inform the consumers about some missing stats, which may cause the fetching of full stats to be skipped. The global IDF data for those terms would then be incorrect, because only local stats would be used for them.

        Parameters:
        rb - request to evaluate against the statsSource
        statsSource - stats source to check
        missingTermStats - consumer of missing term stats
        missingFieldStats - consumer of missing field stats
        Returns:
        approximate number of missing term stats and field stats combined
        Throws:
        IOException
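
The check described for approxCheckMissingStats can be sketched as a walk over the terms of the (locally rewritten) query that reports every term or field the stats source does not know. This is a simplified illustration under stated assumptions: MissingStatsSketch, the field-to-terms map, and the "field:term" string keys are hypothetical replacements for ResponseBuilder, StatsSource, and org.apache.lucene.index.Term.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

class MissingStatsSketch {
    // Reports missing term and field stats to the consumers and returns the combined count.
    static int checkMissing(Map<String, List<String>> termsByField,
                            Set<String> knownTerms,
                            Set<String> knownFields,
                            Consumer<String> missingTermStats,
                            Consumer<String> missingFieldStats) {
        int missing = 0;
        for (Map.Entry<String, List<String>> entry : termsByField.entrySet()) {
            String field = entry.getKey();
            if (!knownFields.contains(field)) {
                missingFieldStats.accept(field);
                missing++;
            }
            for (String text : entry.getValue()) {
                String term = field + ":" + text; // hypothetical "field:text" key
                if (!knownTerms.contains(term)) {
                    missingTermStats.accept(term);
                    missing++;
                }
            }
        }
        return missing;
    }
}
```

Because the term list here comes from a single local view of the query, the sketch shares the limitation noted above: terms that would only appear after rewriting on another shard are never checked.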