JSON Faceting Domain Changes
Facet computation operates on a "domain" of documents. By default, this domain consists of the documents matched by the main query. For sub-facets, the domain consists of all documents placed in their bucket by the parent facet.
Users can also override the "domain" of a facet that partitions data by using an explicit domain attribute. Its value is a JSON object that supports various options for restricting, expanding, or completely changing the original domain before the buckets are computed for the associated facet.
Adding Domain Filters
The simplest example of a domain change is to specify an additional filter that is applied to the existing domain. This can be done via the filter keyword in the domain block of the facet.
curl
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "*:*",
"facet": {
"categories": {
"type": "terms",
"field": "cat",
"limit": 3,
"domain": {
"filter": "popularity:[5 TO 10]"
}
}
}
}'
SolrJ
final TermsFacetMap categoryFacet = new TermsFacetMap("cat")
.setLimit(3)
.withDomain(new DomainMap().withFilter("popularity:[5 TO 10]"));
final JsonQueryRequest request = new JsonQueryRequest()
.setQuery("*:*")
.withFacet("categories", categoryFacet);
QueryResponse queryResponse = request.process(solrClient, COLLECTION_NAME);
The value of filter can be a single query to treat as a filter, or a JSON list of filter queries. Each query can be:
- a string containing a query in Solr query syntax.
- a reference to a request parameter containing Solr query syntax, of the form {param: <request_param_name>}. It is also possible to refer to one or more queries in DSL syntax defined under the queries key in the JSON Request API. The referenced parameter may have zero (absent) or many values.
  - When no values are specified, no filter is applied and no error is thrown.
  - When many values are specified, each value is parsed and all of them are applied as filters in conjunction.
Here is an example of referencing DSL queries:
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "*:*",
"queries": {
"sample_filtrs":[
{"field":{"f":"text", "query":"usb"}},
{"field":{"f":"text", "query":"lcd"}}
],
"another_filtr":
{"field":{"f":"text", "query":"usb"}}
},
"facet": {
"usblcd": {
"type": "terms",
"field": "cat",
"limit": 3,
"domain": {
"filter": {"param":"sample_filtrs"}
}
},
"justusb": {
"type": "terms",
"field": "cat",
"limit": 3,
"domain": {
"filter": {"param":"another_filtr"}
}
}
}
}'
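The parameter-reference rules described above (an absent parameter applies no filter, several values are combined in conjunction) can be modeled with a small in-memory sketch. This is not Solr code: the class ParamFilterSketch, its methods, and the representation of a document as a set of terms are all assumptions made purely for illustration.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Toy model only (hypothetical names): a "document" is the set of terms in
// its text field, and each referenced query is a single required term.
class ParamFilterSketch {

    // Resolve a {"param": name} reference into one predicate over documents.
    static Predicate<Set<String>> resolve(Map<String, List<String>> queries, String name) {
        // An absent parameter yields an empty list: nothing is applied, no error.
        List<String> terms = queries.getOrDefault(name, Collections.emptyList());
        Predicate<Set<String>> combined = doc -> true;
        for (String term : terms) {
            // Several values are combined as a conjunction (AND).
            combined = combined.and(doc -> doc.contains(term));
        }
        return combined;
    }

    // Number of documents left in the domain after applying the filter.
    static long domainSize(List<Set<String>> docs, Predicate<Set<String>> filter) {
        return docs.stream().filter(filter).count();
    }

    public static void main(String[] args) {
        List<Set<String>> docs = List.of(
            Set.of("usb", "lcd"), Set.of("usb"), Set.of("lcd"), Set.of("cable"));
        Map<String, List<String>> queries = Map.of(
            "sample_filtrs", List.of("usb", "lcd"),
            "another_filtr", List.of("usb"));
        System.out.println(domainSize(docs, resolve(queries, "sample_filtrs"))); // 1
        System.out.println(domainSize(docs, resolve(queries, "another_filtr"))); // 2
        System.out.println(domainSize(docs, resolve(queries, "not_defined")));   // 4
    }
}
```

In this model the two-valued reference narrows the domain to documents containing both terms, while the undefined reference leaves the domain unchanged.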
When a filter option is combined with other domain changing options, the filtering is applied after the other domain changes take place.
Filter Exclusions
Domains can also be expanded by using the excludeTags keyword to discard or ignore particular tagged query filters.
This is used in the example below to show the top two manufacturers matching a search. The search results match the filter manu_id_s:apple, but the computed facet discards this filter and operates on a domain widened by discarding the manu_id_s filter.
curl
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "cat:electronics",
"filter": "{!tag=MANU}manu_id_s:apple",
"facet": {
"stock": {"type": "terms", "field": "inStock", "limit": 2},
"manufacturers": {
"type": "terms",
"field": "manu_id_s",
"limit": 2,
"domain": { "excludeTags":"MANU" }
}
}
}'
SolrJ
final TermsFacetMap inStockFacet = new TermsFacetMap("inStock").setLimit(2);
final TermsFacetMap allManufacturersFacet = new TermsFacetMap("manu_id_s")
.setLimit(2)
.withDomain(new DomainMap().withTagsToExclude("MANU"));
final JsonQueryRequest request = new JsonQueryRequest()
.setQuery("cat:electronics")
.withFilter("{!tag=MANU}manu_id_s:apple")
.withFacet("stock", inStockFacet)
.withFacet("manufacturers", allManufacturersFacet);
QueryResponse queryResponse = request.process(solrClient, COLLECTION_NAME);
The value of excludeTags can be a single string tag, an array of string tags, or comma-separated tags in a single string.
When an excludeTags option is combined with other domain changing options, it expands the domain before any other domain changes take place.
See also the section on multi-select faceting.
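To make the effect of excludeTags concrete, the following is a minimal in-memory sketch of the idea, not Solr's implementation. It assumes each document is reduced to its manufacturer ID and that tagged filters are held in a map from tag name to predicate; the class ExcludeTagsSketch and its method are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model only (hypothetical names): documents are represented by their
// manufacturer ID, and each tagged filter is a predicate keyed by its tag.
class ExcludeTagsSketch {

    // Count documents per manufacturer, applying every tagged filter whose
    // tag is NOT listed in excludeTags.
    static Map<String, Long> manufacturerCounts(List<String> matchedByQuery,
                                                Map<String, Predicate<String>> taggedFilters,
                                                Set<String> excludeTags) {
        return matchedByQuery.stream()
            .filter(manu -> taggedFilters.entrySet().stream()
                .filter(e -> !excludeTags.contains(e.getKey()))
                .allMatch(e -> e.getValue().test(manu)))
            .collect(Collectors.groupingBy(manu -> manu, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Documents matched by the main query, before any filter is applied.
        List<String> matched = List.of("apple", "apple", "belkin", "canon", "belkin");
        Predicate<String> appleOnly = manu -> manu.equals("apple");
        Map<String, Predicate<String>> filters = Map.of("MANU", appleOnly);

        // Filter honored: only the filtered manufacturer is counted.
        System.out.println(manufacturerCounts(matched, filters, Set.of()));       // {apple=2}
        // Filter excluded: the facet sees the wider domain.
        System.out.println(manufacturerCounts(matched, filters, Set.of("MANU"))); // {apple=2, belkin=2, canon=1}
    }
}
```

The second call mirrors the manufacturers facet in the example: the search results stay filtered, but the facet counts are computed as if the tagged filter were absent.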
Arbitrary Domain Query
A query domain can be specified when you wish to compute a facet against an arbitrary set of documents, regardless of the original domain. The most common use case is computing a top-level facet against a specific subset of the collection, regardless of the main query. It can also be useful on nested facets when building Semantic Knowledge Graphs.
Example:
curl
curl http://localhost:8983/solr/techproducts/query -d '
{
"query": "apple",
"facet": {
"popular_categories": {
"type": "terms",
"field": "cat",
"domain": { "query": "popularity:[8 TO 10]" },
"limit": 3
}
}
}'
SolrJ
final TermsFacetMap popularCategoriesFacet = new TermsFacetMap("cat")
.withDomain(new DomainMap().withQuery("popularity:[8 TO 10]"))
.setLimit(3);
final JsonQueryRequest request = new JsonQueryRequest()
.setQuery("apple")
.withFacet("popular_categories", popularCategoriesFacet);
QueryResponse queryResponse = request.process(solrClient, COLLECTION_NAME);
The value of query can be a single query, or a JSON list of queries. Each query can be:
- a string containing a query in Solr query syntax.
- a reference to a request parameter containing Solr query syntax, of the form {param: <request_param_name>}. The referenced parameter may have zero (absent) or many values.
  - When no values are specified, no error is thrown.
  - When many values are specified, each value is parsed and used as a query.
While a query domain can be combined with an additional domain filter, it is not possible to also use excludeTags, because the tags would be meaningless: the query domain already completely ignores the top-level query and all previous filters.
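The way a query domain replaces the original domain can be illustrated with a small in-memory sketch. This is only a model under stated assumptions: the class QueryDomainSketch, the Doc type, and the sample data are hypothetical, and the predicates stand in for parsed Solr queries.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Toy model only (hypothetical names and data).
class QueryDomainSketch {

    static final class Doc {
        final String name;
        final String cat;
        final int popularity;

        Doc(String name, String cat, int popularity) {
            this.name = name;
            this.cat = cat;
            this.popularity = popularity;
        }
    }

    // Build the facet domain from the whole collection. The main query is
    // deliberately not a parameter: a query domain ignores it. An optional
    // domain filter is applied after the domain query.
    static List<Doc> queryDomain(List<Doc> collection,
                                 Predicate<Doc> domainQuery,
                                 Predicate<Doc> domainFilter) {
        return collection.stream()
            .filter(domainQuery)
            .filter(domainFilter)
            .collect(Collectors.toList());
    }

    // Terms facet on the cat field over the given domain.
    static Map<String, Long> categoryCounts(List<Doc> domain) {
        return domain.stream()
            .collect(Collectors.groupingBy(d -> d.cat, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<Doc> collection = List.of(
            new Doc("apple ipod", "music", 10),
            new Doc("apple cable", "electronics", 3),
            new Doc("samsung tv", "electronics", 9),
            new Doc("canon camera", "camera", 8));
        Predicate<Doc> popular = d -> d.popularity >= 8 && d.popularity <= 10;
        Predicate<Doc> noFilter = d -> true;
        // Counts cover popular documents even if the main query would not match them.
        System.out.println(categoryCounts(queryDomain(collection, popular, noFilter)));
    }
}
```

Because the main query never enters queryDomain, the facet counts are the same whatever the user searched for, which is the behavior the popular_categories example relies on.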
Block Join Domain Changes
When a collection contains Nested Documents, the blockChildren or blockParent domain options can be used to transform an existing domain containing one type of document into a domain containing the documents with the specified relationship (child or parent of) to the documents from the original domain.
Both of these options work similarly to the corresponding Block Join Query Parsers by taking in a single String query that exclusively matches all parent documents in the collection. If blockParent is used, then the resulting domain will contain all parent documents of the children from the original domain. If blockChildren is used, then the resulting domain will contain all child documents of the parents from the original domain.
{
"colors": {
"type": "terms",
"field": "sku_color",
"facet" : {
"brands" : {
"type": "terms",
"field": "product_brand",
"domain": {
"blockParent": "doc_type:product"
}
}}}}
- This example assumes we have parent documents corresponding to Products, with child documents corresponding to individual SKUs with unique colors, and that our original query was against SKU documents.
- The colors facet will be computed against all of the original SKU documents matching our search.
- For each bucket in the colors facet, the set of all matching SKU documents will be transformed into the set of corresponding parent Product documents. The resulting brands sub-facet will count how many Product documents (that have SKUs with the associated color) exist for each Brand.
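The child-to-parent transformation described above can be sketched in memory as follows. This is a simplified model, not Solr's block join implementation: the class BlockParentSketch, the Sku type, the brandByProductId lookup, and the sample IDs are all hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only (hypothetical names): each SKU knows its color and the ID
// of its parent product; brandByProductId stands in for the parent documents.
class BlockParentSketch {

    static final class Sku {
        final String color;
        final String productId;

        Sku(String color, String productId) {
            this.color = color;
            this.productId = productId;
        }
    }

    static Map<String, Map<String, Long>> brandsByColor(List<Sku> matchedSkus,
                                                        Map<String, String> brandByProductId) {
        // Parent facet: bucket the SKUs by color. Keeping a set of parent IDs
        // per bucket performs the SKU -> Product domain change (deduplicated).
        Map<String, Set<String>> parentsByColor = new TreeMap<>();
        for (Sku sku : matchedSkus) {
            parentsByColor.computeIfAbsent(sku.color, c -> new TreeSet<>()).add(sku.productId);
        }
        // Sub-facet: count Product documents (not SKUs) per brand in each bucket.
        Map<String, Map<String, Long>> result = new TreeMap<>();
        for (Map.Entry<String, Set<String>> bucket : parentsByColor.entrySet()) {
            Map<String, Long> brandCounts = new TreeMap<>();
            for (String productId : bucket.getValue()) {
                brandCounts.merge(brandByProductId.get(productId), 1L, Long::sum);
            }
            result.put(bucket.getKey(), brandCounts);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Sku> skus = List.of(
            new Sku("red", "p1"), new Sku("red", "p1"),
            new Sku("red", "p2"), new Sku("blue", "p2"));
        Map<String, String> brands = Map.of("p1", "Acme", "p2", "Bolt");
        System.out.println(brandsByColor(skus, brands)); // {blue={Bolt=1}, red={Acme=1, Bolt=1}}
    }
}
```

Note that the red bucket holds three SKUs but only two products, so the brand counts under red sum to two: the sub-facet counts parents, not children.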
Join Query Domain Changes
A join domain change option can be used to specify arbitrary from and to fields to use in transforming from the existing domain to a related set of documents.
This works very similarly to the Join Query Parser, and has the same limitations when dealing with multi-shard collections.
Example:
{
"colors": {
"type": "terms",
"field": "sku_color",
"facet": {
"brands": {
"type": "terms",
"field": "product_brand",
"domain" : {
"join" : {
"from": "product_id_of_this_sku",
"to": "id"
},
"filter": "doc_type:product"
}
}
}
}
}
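The join transformation in the example above can be modeled with a short in-memory sketch. It assumes the usual join reading, where values of the from field are collected from the existing domain and documents whose to field holds one of those values form the new domain; the class JoinDomainSketch, the flat single-valued document maps, and the sample data are hypothetical simplifications.

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

// Toy model only (hypothetical names): a document is a flat map from field
// name to a single string value.
class JoinDomainSketch {

    // Collect the "from" values of the current domain, then return every
    // document in the collection whose "to" field holds one of those values.
    static List<Map<String, String>> join(List<Map<String, String>> domain,
                                          List<Map<String, String>> collection,
                                          String from, String to) {
        Set<String> keys = domain.stream()
            .map(doc -> doc.get(from))
            .filter(Objects::nonNull)
            .collect(Collectors.toSet());
        return collection.stream()
            .filter(doc -> keys.contains(doc.get(to)))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> sku = Map.of(
            "id", "s1", "doc_type", "sku", "sku_color", "red", "product_id_of_this_sku", "p1");
        Map<String, String> acme = Map.of("id", "p1", "doc_type", "product", "product_brand", "Acme");
        Map<String, String> bolt = Map.of("id", "p2", "doc_type", "product", "product_brand", "Bolt");
        List<Map<String, String>> collection = List.of(sku, acme, bolt);

        // The domain filter is applied after the join has changed the domain.
        List<String> brands = join(List.of(sku), collection, "product_id_of_this_sku", "id").stream()
            .filter(doc -> "product".equals(doc.get("doc_type")))
            .map(doc -> doc.get("product_brand"))
            .collect(Collectors.toList());
        System.out.println(brands); // [Acme]
    }
}
```

Unlike the block join options, nothing here depends on documents being indexed together as a block: the relationship is carried entirely by the two field values.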
Graph Traversal Domain Changes
A graph domain change option works similarly to the join domain option, but can traverse multiple hops from the existing domain to other documents.
This works very similarly to the Graph Query Parser, supporting all of its optional parameters, and has the same limitations when dealing with multi-shard collections.
Example:
{
"related_brands": {
"type": "terms",
"field": "brand",
"domain": {
"graph": {
"from": "related_product_ids",
"to": "id",
"maxDepth": 3
}
}
}
}
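The multi-hop behavior can be pictured as a bounded breadth-first traversal. The sketch below is a generic model, not Solr's graph implementation: the adjacency map (node ID to the IDs reachable in one hop) abstracts away how the from and to fields define edges, and it assumes that maxDepth counts hops from the starting set and that the starting documents stay in the result. The class GraphDomainSketch is a hypothetical name.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only (hypothetical names): edges maps a node ID to the IDs
// reachable from it in a single hop.
class GraphDomainSketch {

    // Breadth-first traversal that stops after maxDepth hops.
    static Set<String> traverse(Map<String, List<String>> edges, Set<String> roots, int maxDepth) {
        Set<String> visited = new LinkedHashSet<>(roots); // assumption: roots are kept
        Set<String> frontier = new LinkedHashSet<>(roots);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            Set<String> next = new LinkedHashSet<>();
            for (String node : frontier) {
                for (String neighbor : edges.getOrDefault(node, Collections.emptyList())) {
                    // Only nodes seen for the first time join the next frontier.
                    if (visited.add(neighbor)) {
                        next.add(neighbor);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> edges = Map.of(
            "a", List.of("b"),
            "b", List.of("c"),
            "c", List.of("d", "a"),
            "d", List.of("e"));
        System.out.println(traverse(edges, Set.of("a"), 1)); // [a, b]
        System.out.println(traverse(edges, Set.of("a"), 3)); // [a, b, c, d]
    }
}
```

With a depth of 1 the result matches what a single join hop would reach in this model; larger depths keep expanding the domain until the limit is hit or no new documents are found.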