Block Join Query Parser
There are two query parsers that support block joins. These parsers allow searching relational content that has been indexed as Nested Documents.
The example usage of the query parsers below assumes the following documents have been indexed:
<add>
<doc>
<field name="id">1</field>
<field name="content_type">parent</field>
<field name="title">Solr has block join support</field>
<doc>
<field name="id">2</field>
<field name="content_type">child</field>
<field name="comments">SolrCloud supports it too!</field>
</doc>
</doc>
<doc>
<field name="id">3</field>
<field name="content_type">parent</field>
<field name="title">New Lucene and Solr release</field>
<doc>
<field name="id">4</field>
<field name="content_type">child</field>
<field name="comments">Lots of new features</field>
</doc>
</doc>
</add>
Block Join Children Query Parser
This parser wraps a query that matches some parent documents and returns the children of those documents.
Using parentPath
If your schema supports nested documents, you should specify parentPath.
Specify the path at which the parent documents live:
q={!child parentPath=<path>}<someParents>
Key points about parentPath:
- Must start with /.
- Use parentPath="/" to treat root-level documents as the parents.
- A trailing / is stripped automatically (e.g., "/skus/" is treated as "/skus").
- parentPath and of are mutually exclusive; specifying both returns a 400 Bad Request error.
- Optionally, use childPath to narrow the returned children to docs at exactly parentPath/childPath. Without childPath, all descendants of parents at parentPath are returned.
For example, using the deeply nested documents described in Searching Nested Child Documents, the following query returns all children of root-level product documents that match a description query:
q={!child parentPath="/"}description_t:staplers
To return only skus children of root documents matching a description query (excluding other child types):
q={!child parentPath="/" childPath="skus"}description_t:staplers
Using the of Parameter
This approach is used with anonymous child documents (schemas without _nest_path_).
It is more verbose and has some gotchas.
The syntax is: q={!child of=<blockMask>}<someParents>.
- The inner subordinate query string (someParents) must be a query that will match some parent documents.
- The of parameter must be a query string to use as a Block Mask, typically a query that matches the set of all possible parent documents.
The resulting query will match all documents which do not match the <blockMask> query and are children (or descendants) of the documents matched by <someParents>.
Using the example documents above, we can construct a query such as q={!child of="content_type:parent"}title:lucene.
We only get one document in response:
<result name="response" numFound="1" start="0">
<doc>
<str name="id">4</str>
<arr name="content_type"><str>child</str></arr>
<str name="comments">Lots of new features</str>
</doc>
</result>
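To make the role of the Block Mask concrete, the following sketch simulates the child join in plain Python. It is not how Solr implements the parser. It assumes the flat index stores each block with the children first and the parent last, and it replaces real query matching with simple Python predicates.

```python
# Flat index order (assumed): children first, then the parent that closes the block.
INDEX = [
    {"id": "2", "content_type": "child", "comments": "SolrCloud supports it too!"},
    {"id": "1", "content_type": "parent", "title": "Solr has block join support"},
    {"id": "4", "content_type": "child", "comments": "Lots of new features"},
    {"id": "3", "content_type": "parent", "title": "New Lucene and Solr release"},
]


def child_join(index, block_mask, some_parents):
    """Return docs outside the mask whose closing masked doc matches some_parents."""
    results, pending = [], []
    for doc in index:
        if block_mask(doc):
            # A masked doc ends the current block; keep its children on a match.
            if some_parents(doc):
                results.extend(pending)
            pending = []
        else:
            pending.append(doc)
    return results


def is_parent(doc):
    # Stands in for of="content_type:parent"
    return doc["content_type"] == "parent"


def title_has_lucene(doc):
    # Stands in for title:lucene
    return "lucene" in doc.get("title", "").lower()


print([d["id"] for d in child_join(INDEX, is_parent, title_has_lucene)])  # ['4']
```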
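The <someParents> query must match only documents that the Block Mask also matches; otherwise the query may fail with the error "Parent query must not match any docs besides parent filter. Combine them as must (+) and must-not (-) clauses to find a problem doc." As the message suggests, combining the two queries as must (+) and must-not (-) clauses finds the offending documents.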
Filtering and Tagging
{!child} also supports filters and excludeTags local params like the following:
?q={!child of=<blockMask> filters=$parentfq excludeTags=certain}<someParents>
&parentfq=BRAND:Foo
&parentfq=NAME:Bar
&parentfq={!tag=certain}CATEGORY:Baz
This is equivalent to:
q={!child of=<blockMask>}+<someParents> +BRAND:Foo +NAME:Bar
Notice the "$" syntax in filters for referencing queries.
Comma-separated tags in excludeTags allow excluding certain queries by tagging; here the CATEGORY:Baz filter is tagged certain and is therefore left out.
Overall the idea is similar to excluding fq in facets.
Note that filtering is applied to the subordinate clause (<someParents>) first, and the intersection result is joined to the children.
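The way excludeTags removes tagged filters can be sketched as follows. This is an illustration only: the helper effective_filters is hypothetical, and the regular expression handles just a leading {!tag=...} local param, which is far simpler than Solr's real local-params parsing.

```python
import re

# Simplified assumption: a tag appears only as a leading {!tag=a,b} prefix.
TAG_PREFIX = re.compile(r"^\{!tag=([^}\s]+)\}")


def effective_filters(param_values, exclude_tags):
    """Drop every filter that carries at least one of the excluded tags."""
    excluded = {t.strip() for t in exclude_tags.split(",") if t.strip()}
    kept = []
    for value in param_values:
        match = TAG_PREFIX.match(value)
        tags = set(match.group(1).split(",")) if match else set()
        if tags & excluded:
            continue  # tagged with an excluded tag, so it is skipped
        kept.append(value)
    return kept


parentfq = ["BRAND:Foo", "NAME:Bar", "{!tag=certain}CATEGORY:Baz"]
kept = effective_filters(parentfq, "certain")
print(" ".join("+" + f for f in kept))  # +BRAND:Foo +NAME:Bar
```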
Block Join Parent Query Parser
This parser takes a query that matches child documents and returns their parents.
Using parentPath
If your schema supports nested documents, you should specify parentPath.
Specify the path at which the parent documents live:
q={!parent parentPath=<path>}<someChildren>
Key points about parentPath:
- Must start with /.
- Use parentPath="/" to treat root-level documents as the parents.
- A trailing / is stripped automatically (e.g., "/skus/" is treated as "/skus").
- parentPath and which are mutually exclusive; specifying both returns a 400 Bad Request error.
- Optionally, use childPath to constrain the child query to docs at exactly parentPath/childPath. Without childPath, all descendants of parentPath are eligible as children.
For example, using the deeply nested documents described in Searching Nested Child Documents, the following query returns the root-level product documents that are ancestors of manuals with exactly one page:
q={!parent parentPath="/"}pages_i:1
To instead return the skus that are ancestors of one-page manuals (only manuals, not other sku children):
q={!parent parentPath="/skus" childPath="manuals"}pages_i:1
Filtering to a Specific Nest Path
When the subordinate query is omitted (empty), {!parent parentPath=<path>} is a convenient way to filter documents to exactly one nest path without needing to reference _nest_path_ directly:
# Return all root-level documents (no _nest_path_):
q={!parent parentPath=/}
# Return all documents at exactly /skus (not deeper descendants like /skus/manuals):
q={!parent parentPath=/skus}
Using the which Parameter
This approach is used with anonymous child documents (schemas without _nest_path_).
It is more verbose and has some gotchas.
The syntax is: q={!parent which=<blockMask>}<someChildren>.
- The inner subordinate query string (someChildren) must be a query that will match some child documents.
- The which parameter must be a query string to use as a Block Mask, typically a query that matches the set of all possible parent documents.
The resulting query will match all documents which do match the <blockMask> query and are parents (or ancestors) of the documents matched by <someChildren>.
Again using the example documents above, we can construct a query such as q={!parent which="content_type:parent"}comments:SolrCloud.
We get this document in response:
<result name="response" numFound="1" start="0">
<doc>
<str name="id">1</str>
<arr name="content_type"><str>parent</str></arr>
<arr name="title"><str>Solr has block join support</str></arr>
</doc>
</result>
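The parent join can be simulated in the same spirit. The sketch below is not Solr's implementation. It assumes each block is stored with the children first and the parent last, and it uses plain Python predicates in place of real queries.

```python
# Flat index order (assumed): children first, then the parent that closes the block.
INDEX = [
    {"id": "2", "content_type": "child", "comments": "SolrCloud supports it too!"},
    {"id": "1", "content_type": "parent", "title": "Solr has block join support"},
    {"id": "4", "content_type": "child", "comments": "Lots of new features"},
    {"id": "3", "content_type": "parent", "title": "New Lucene and Solr release"},
]


def parent_join(index, block_mask, some_children):
    """Return masked docs whose block contains a doc matching some_children."""
    results, child_hit = [], False
    for doc in index:
        if block_mask(doc):
            # The masked doc closes the block; keep it if any child matched.
            if child_hit:
                results.append(doc)
            child_hit = False
        elif some_children(doc):
            child_hit = True
    return results


def is_parent(doc):
    # Stands in for which="content_type:parent"
    return doc["content_type"] == "parent"


def comments_has_solrcloud(doc):
    # Stands in for comments:SolrCloud
    return "solrcloud" in doc.get("comments", "").lower()


print([d["id"] for d in parent_join(INDEX, is_parent, comments_has_solrcloud)])  # ['1']
```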
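The <someChildren> query must not match any document that the Block Mask also matches; otherwise the query may fail with the error "Child query must not match same docs with parent filter. Combine them as must clauses (+) to find a problem doc." As the message suggests, combining the two queries as must (+) clauses finds the offending documents.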
Filtering and Tagging
The {!parent} query supports filters and excludeTags local params like the following:
?q={!parent which=<blockMask> filters=$childfq excludeTags=certain}<someChildren>
&childfq=COLOR:Red
&childfq=SIZE:XL
&childfq={!tag=certain}PRINT:Hatched
This is equivalent to:
q={!parent which=<blockMask>}+<someChildren> +COLOR:Red +SIZE:XL
Notice the "$" syntax in filters for referencing queries.
Comma-separated tags in excludeTags allow excluding certain queries by tagging.
Overall the idea is similar to excluding fq in facets.
Note that filtering is applied to the subordinate clause (<someChildren>) first, and the intersection result is joined to the parents.
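Because childfq is sent several times, the request needs a repeated parameter. A minimal sketch using only the Python standard library is shown below. The Block Mask content_type:parent and the child query comments:SolrCloud are illustrative assumptions that stand in for <blockMask> and <someChildren>.

```python
from urllib.parse import parse_qs, urlencode

# A list of pairs (not a dict) lets the same parameter name appear several times.
params = [
    ("q", '{!parent which="content_type:parent" filters=$childfq excludeTags=certain}'
          "comments:SolrCloud"),
    ("childfq", "COLOR:Red"),
    ("childfq", "SIZE:XL"),
    ("childfq", "{!tag=certain}PRINT:Hatched"),
]

# urlencode percent-encodes braces, quotes, colons, and spaces for the URL.
query_string = urlencode(params)

# Decoding the string again shows that all three childfq values survive.
print(parse_qs(query_string)["childfq"])
```

The resulting query_string can be appended after the ? of a select request URL.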
Scoring with the Block Join Parent Query Parser
You can optionally use the score local parameter to return scores of the subordinate query.
The value of this parameter selects how the scores of the matching children are aggregated: avg (average), max (maximum), min (minimum), or total (sum).
The implicit default is none, which returns 0.0.
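The aggregation modes can be illustrated with a short sketch. The child scores below are made-up values chosen for the example, and the helper parent_score is hypothetical; it only mirrors the arithmetic of each mode, not Solr's scoring code.

```python
from statistics import mean

# One aggregation function per documented value of the score local parameter.
AGGREGATORS = {
    "avg": mean,
    "max": max,
    "min": min,
    "total": sum,
    "none": lambda scores: 0.0,  # the implicit default ignores child scores
}


def parent_score(child_scores, mode="none"):
    """Combine the scores of the matching children into one parent score."""
    return float(AGGREGATORS[mode](child_scores))


scores = [0.5, 1.5, 4.0]  # illustrative child scores, not real Solr output
for mode in ("avg", "max", "min", "total", "none"):
    print(mode, parent_score(scores, mode))
```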
Block Masks: The of and which local params
The purpose of the "Block Mask" query specified as either an of or which param (depending on the parser used) is to identify the set of all documents in the index which should be treated as "parents" (or their ancestors) and which documents should be treated as "children".
This is important because in the "on disk" index, the relationships are flattened into "blocks" of documents, so the of / which params are needed to serve as a "mask" against the flat document blocks to identify the boundaries of every hierarchical relationship.
In the example queries above, we were able to use a very simple Block Mask of content_type:parent because our data is very simple: every document is either a parent or a child.
So this query string easily distinguishes all of our documents.
A common mistake is to try to use a which parameter that is more restrictive than the set of all parent documents, in order to filter the parents that are matched, as in this bad example:
// BAD! DO NOT USE!
q={!parent which="title:join"}comments:support
This type of query will frequently not work the way you might expect.
Because the which param identifies only some of the "parent" documents, the resulting query can match "parent" documents it should not: every document that does not match the which="title:join" Block Mask is mistakenly treated as a child of the next "parent" document in the index that does match the Mask.
A similar problematic situation can arise when mixing parent/child documents with "simple" documents that have no children and do not match the query used to identify 'parent' documents. For example, if we add the following document to our existing parent/child example documents:
<add>
<doc>
<field name="id">0</field>
<field name="content_type">plain</field>
<field name="title">Lucene and Solr are cool</field>
</doc>
</add>
…then our simple content_type:parent Block Mask would no longer be adequate.
We would instead need to use *:* -content_type:child or content_type:(plain parent) to prevent our "simple" document from mistakenly being treated as a "child" of an adjacent "parent" document.
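The effect of a Block Mask that is too narrow can be reproduced with a small simulation. The sketch below is not Solr's implementation. It assumes the childless document with id 0 happens to sit directly before the block of document 1 in the flat index, that each block stores children first and the parent last, and it uses Python predicates in place of real queries.

```python
# Assumed flat index order: the childless doc 0 precedes the block of doc 1.
INDEX = [
    {"id": "0", "content_type": "plain", "title": "Lucene and Solr are cool"},
    {"id": "2", "content_type": "child", "comments": "SolrCloud supports it too!"},
    {"id": "1", "content_type": "parent", "title": "Solr has block join support"},
    {"id": "4", "content_type": "child", "comments": "Lots of new features"},
    {"id": "3", "content_type": "parent", "title": "New Lucene and Solr release"},
]


def child_join(index, block_mask, some_parents):
    """Return docs outside the mask whose closing masked doc matches some_parents."""
    results, pending = [], []
    for doc in index:
        if block_mask(doc):
            if some_parents(doc):
                results.extend(pending)
            pending = []
        else:
            pending.append(doc)
    return results


def narrow_mask(doc):
    # Only documents typed "parent" count as block boundaries.
    return doc["content_type"] == "parent"


def wide_mask(doc):
    # Everything except children counts as a block boundary.
    return doc["content_type"] != "child"


def title_has_join(doc):
    # Stands in for a parent query such as title:join
    return "join" in doc.get("title", "").lower()


print([d["id"] for d in child_join(INDEX, narrow_mask, title_has_join)])  # ['0', '2']
print([d["id"] for d in child_join(INDEX, wide_mask, title_has_join)])    # ['2']
```

With the narrow mask, document 0 is returned as if it were a child of document 1. With the wider mask it forms its own one-document block and is no longer returned.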
The Searching Nested Child Documents section contains more detailed examples of specifying Block Mask queries with nontrivial hierarchies of documents.