Searching Nested Child Documents

This section presents techniques for searching deeply nested documents and shows how more complex queries can be constructed using some of Solr’s query parsers and document transformers.

These features require _root_ and _nest_path_ to be declared in the schema. Please refer to Indexing Nested Documents for details about schema and index configuration.

This section does not demonstrate faceting on nested documents. For nested document faceting, please refer to the Block Join Facet Counts section.

Query Examples

For the upcoming examples, we’ll assume an index containing the same documents covered in Indexing Nested Documents:

[{ "id": "P11!prod",
   "name_s": "Swingline Stapler",
   "description_t": "The Cadillac of office staplers ...",
   "skus": [ { "id": "P11!S21",
               "color_s": "RED",
               "price_i": 42,
               "manuals": [ { "id": "P11!D41",
                              "name_s": "Red Swingline Brochure",
                              "pages_i":1,
                              "content_t": "..."
                            } ]
             },
             { "id": "P11!S31",
               "color_s": "BLACK",
               "price_i": 3
             } ],
   "manuals": [ { "id": "P11!D51",
                  "name_s": "Quick Reference Guide",
                  "pages_i":1,
                  "content_t": "How to use your stapler ..."
                },
                { "id": "P11!D61",
                  "name_s": "Warranty Details",
                  "pages_i":42,
                  "content_t": "... lifetime guarantee ..."
                } ]
 },
 { "id": "P22!prod",
   "name_s": "Mont Blanc Fountain Pen",
   "description_t": "A Premium Writing Instrument ...",
   "skus": [ { "id": "P22!S22",
               "color_s": "RED",
               "price_i": 89,
               "manuals": [ { "id": "P22!D42",
                              "name_s": "Red Mont Blanc Brochure",
                              "pages_i":1,
                              "content_t": "..."
                            } ]
             },
             { "id": "P22!S32",
               "color_s": "BLACK",
               "price_i": 67
             } ],
   "manuals": [ { "id": "P22!D52",
                  "name_s": "How To Use A Pen",
                  "pages_i":42,
                  "content_t": "Start by removing the cap ..."
                } ]
 } ]

Child Doc Transformer

By default, documents that match a query do not include any of their nested children in the response. The [child] Doc Transformer can be used to enrich query results with the documents' descendants.

For a detailed explanation of this transformer, and specifics on its syntax & limitations, please refer to the section [child - ChildDocTransformerFactory].

A simple query matching all documents with a description that includes "staplers":

$ curl 'http://localhost:8983/solr/gettingstarted/select?omitHeader=true&q=description_t:staplers'
{
  "response":{"numFound":1,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
      {
        "id":"P11!prod",
        "name_s":"Swingline Stapler",
        "description_t":"The Cadillac of office staplers ...",
        "_version_":1672933224035123200}]
  }}

The same query with the addition of the [child] transformer is shown below. Note that numFound has not changed: we are still matching the same set of documents, but when those documents are returned, their nested children are also returned as pseudo-fields.

$ curl 'http://localhost:8983/solr/gettingstarted/select?omitHeader=true&q=description_t:staplers&fl=*,[child]'
{
  "response":{"numFound":1,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
      {
        "id":"P11!prod",
        "name_s":"Swingline Stapler",
        "description_t":"The Cadillac of office staplers ...",
        "_version_":1672933224035123200,
        "skus":[
          {
            "id":"P11!S21",
            "color_s":"RED",
            "price_i":42,
            "_version_":1672933224035123200,
            "manuals":[
              {
                "id":"P11!D41",
                "name_s":"Red Swingline Brochure",
                "pages_i":1,
                "content_t":"...",
                "_version_":1672933224035123200}]},

          {
            "id":"P11!S31",
            "color_s":"BLACK",
            "price_i":3,
            "_version_":1672933224035123200}],
        "manuals":[
          {
            "id":"P11!D51",
            "name_s":"Quick Reference Guide",
            "pages_i":1,
            "content_t":"How to use your stapler ...",
            "_version_":1672933224035123200},

          {
            "id":"P11!D61",
            "name_s":"Warranty Details",
            "pages_i":42,
            "content_t":"... lifetime guarantee ...",
            "_version_":1672933224035123200}]}]
  }}
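
Client code often needs to walk the pseudo-fields that [child] adds. The following is a minimal Python sketch of that traversal, assuming the response has already been parsed from JSON. The function name descendant_ids and the trimmed sample document (ids only) are illustrative, not part of Solr.

```python
# Trimmed copy of the document returned above: only ids and child pseudo-fields.
doc = {
    "id": "P11!prod",
    "skus": [
        {"id": "P11!S21", "manuals": [{"id": "P11!D41"}]},
        {"id": "P11!S31"},
    ],
    "manuals": [{"id": "P11!D51"}, {"id": "P11!D61"}],
}


def descendant_ids(document):
    """Collect the ids of all nested documents, depth first, in response order."""
    ids = []
    for value in document.values():
        # A child pseudo-field may hold a list of documents or a single document.
        candidates = value if isinstance(value, list) else [value]
        for candidate in candidates:
            if isinstance(candidate, dict):
                ids.append(candidate["id"])
                ids.extend(descendant_ids(candidate))
    return ids


found = descendant_ids(doc)
print(found)
```

Treating any dict-valued field as a nested document is an assumption that holds for the sample data here, where all other fields are strings or numbers.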

Child Query Parser

The {!child} query parser can be used to search for the descendant documents of parent documents matching a wrapped query. For a detailed explanation of this parser, see the section Block Join Children Query Parser.

Let’s consider again the description_t:staplers query used above. If we wrap that query in a {!child} query parser, then instead of matching and returning the product-level documents, we match all of the descendant documents of the products that the original query matched:

$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' -d 'q={!child of="*:* -_nest_path_:*"}description_t:staplers'
{
  "response":{"numFound":5,"start":0,"maxScore":0.30136836,"numFoundExact":true,"docs":[
      {
        "id":"P11!D41",
        "name_s":"Red Swingline Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1672933224035123200},
      {
        "id":"P11!S21",
        "color_s":"RED",
        "price_i":42,
        "_version_":1672933224035123200},
      {
        "id":"P11!S31",
        "color_s":"BLACK",
        "price_i":3,
        "_version_":1672933224035123200},
      {
        "id":"P11!D51",
        "name_s":"Quick Reference Guide",
        "pages_i":1,
        "content_t":"How to use your stapler ...",
        "_version_":1672933224035123200},
      {
        "id":"P11!D61",
        "name_s":"Warranty Details",
        "pages_i":42,
        "content_t":"... lifetime guarantee ...",
        "_version_":1672933224035123200}]
  }}

In this example we’ve used *:* -_nest_path_:* as our of parameter to indicate that we want to consider all documents which don’t have a nest path (i.e., all "root" level documents) as the set of possible parents.

By changing the of parameter to match ancestors at specific _nest_path_ levels, we can narrow down the list of children we return. In the query below, we search for all descendants of skus that have a price_i of 50 or less (using an of parameter that identifies all documents that do not have a _nest_path_ with the prefix /skus/*):

$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!child of="*:* -_nest_path_:\\/skus\\/*"}(+price_i:[* TO 50] +_nest_path_:\/skus)'
{
  "response":{"numFound":1,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[
      {
        "id":"P11!D41",
        "name_s":"Red Swingline Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1675662666752851968}]
  }}
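
To see why the first {!child} query above returns five documents and the second only one, it can help to model one block of documents as an ordered list. The Python sketch below is a simplified model, not Solr's implementation. It assumes the documents of the block are ordered as in the first response above, with descendants before their ancestors, and it covers only the P11 block. The names BLOCK and children_of are hypothetical.

```python
# One block, as (id, nest_path, price). nest_path is None for the root document.
BLOCK = [
    ("P11!D41", "/skus/manuals", None),
    ("P11!S21", "/skus", 42),
    ("P11!S31", "/skus", 3),
    ("P11!D51", "/manuals", None),
    ("P11!D61", "/manuals", None),
    ("P11!prod", None, None),
]


def children_of(block, is_parent, matches):
    """Return ids of the non-parent docs that directly precede a matching parent."""
    result = []
    pending = []  # non-parent docs seen since the last doc in the parent mask
    for doc in block:
        if is_parent(doc):
            if matches(doc):
                result.extend(d[0] for d in pending)
            pending = []
        else:
            pending.append(doc)
    return result


# of="*:* -_nest_path_:*": only documents without a nest path are parents.
# Matching on the id stands in for description_t:staplers.
all_descendants = children_of(
    BLOCK,
    is_parent=lambda d: d[1] is None,
    matches=lambda d: d[0] == "P11!prod",
)

# of excludes paths with the prefix /skus/: roots, skus and root-level manuals
# all count as parents, so only documents nested under a sku can be children.
sku_descendants = children_of(
    BLOCK,
    is_parent=lambda d: d[1] is None or not d[1].startswith("/skus/"),
    matches=lambda d: d[1] == "/skus" and d[2] is not None and d[2] <= 50,
)

print(all_descendants)
print(sku_descendants)
```

In the second call the BLACK sku also matches the price clause, but it has no documents nested under it, so it contributes nothing to the result.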

Double Escaping _nest_path_ slashes in of

Note that in the above example, the / characters in the _nest_path_ were "double escaped" in the of parameter:

  • One level of \ escaping is necessary to prevent the / from being interpreted as a Regex Query.

  • An additional level of "escaping the escape character" is necessary because the of local parameter is a quoted string, so we need a second \ to ensure the first \ is preserved and passed as is to the query parser.

(Only a single level of \ escaping, the one that prevents the Regex syntax, is needed in the body of the query string, because the body is not a quoted string local param.)
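
The two levels of escaping described above can be applied mechanically when a query string is built in code. The following Python sketch rebuilds the q value of the example above. The helper names escape_slashes and escape_for_quoted_local_param are hypothetical, and the second helper only handles backslashes, which is all this example needs.

```python
def escape_slashes(path):
    """First level: escape each '/' so it is not read as the start of a regex."""
    return path.replace("/", "\\/")


def escape_for_quoted_local_param(text):
    """Second level: double each backslash so it survives inside a quoted local param."""
    return text.replace("\\", "\\\\")


# Body of the query: not a quoted local param, so one level of escaping is enough.
body = "(+price_i:[* TO 50] +_nest_path_:" + escape_slashes("/skus") + ")"

# The parent mask is passed as the quoted value of `of`, so it needs both levels.
mask = "*:* -_nest_path_:" + escape_slashes("/skus/") + "*"
q = '{!child of="' + escape_for_quoted_local_param(mask) + '"}' + body

print(q)
```

The printed string is the value to pass as q. URL encoding, which curl's --data-urlencode performs in the example above, is a separate step.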

You may find it more convenient to use parameter references in conjunction with other parsers that do not treat / as a special character to express the same query in a more verbose form:

$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!child of=$block_mask}(+price_i:[* TO 50] +{!field f="_nest_path_" v="/skus"})' --data-urlencode 'block_mask=(*:* -{!prefix f="_nest_path_" v="/skus/"})'

Parent Query Parser

The inverse of the {!child} query parser is the {!parent} query parser, which lets you search for the ancestor documents of some child documents matching a wrapped query. For a detailed explanation of this parser, see the section Block Join Parent Query Parser.

Let’s first consider this example of searching for all "manual" type documents that have exactly 1 page:

$ curl 'http://localhost:8983/solr/gettingstarted/select?omitHeader=true&q=pages_i:1'
{
  "response":{"numFound":3,"start":0,"maxScore":1.0,"numFoundExact":true,"docs":[
      {
        "id":"P11!D41",
        "name_s":"Red Swingline Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1676585794196733952},
      {
        "id":"P11!D51",
        "name_s":"Quick Reference Guide",
        "pages_i":1,
        "content_t":"How to use your stapler ...",
        "_version_":1676585794196733952},
      {
        "id":"P22!D42",
        "name_s":"Red Mont Blanc Brochure",
        "pages_i":1,
        "content_t":"...",
        "_version_":1676585794347728896}]
  }}

We can wrap a variant of that query, restricted to manuals nested under skus, in a {!parent} query to return the details of all products that are ancestors of those manuals:

$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!parent which="*:* -_nest_path_:*"}(+_nest_path_:\/skus\/manuals +pages_i:1)'
{
  "response":{"numFound":2,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
      {
        "id":"P11!prod",
        "name_s":"Swingline Stapler",
        "description_t":"The Cadillac of office staplers ...",
        "_version_":1676585794196733952},
      {
        "id":"P22!prod",
        "name_s":"Mont Blanc Fountain Pen",
        "description_t":"A Premium Writing Instrument ...",
        "_version_":1676585794347728896}]
  }}

In this example we’ve used *:* -_nest_path_:* as our which parameter to indicate that we want to consider all documents which don’t have a nest path (i.e., all "root" level documents) as the set of possible parents.

By changing the which parameter to match ancestors at specific _nest_path_ levels, we can change the type of ancestors we return. In the query below, we search for skus (using a which parameter that identifies all documents that do not have a _nest_path_ with the prefix /skus/*) that are the ancestors of manuals with exactly 1 page:

$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' --data-urlencode 'q={!parent which="*:* -_nest_path_:\\/skus\\/*"}(+_nest_path_:\/skus\/manuals +pages_i:1)'
{
  "response":{"numFound":2,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
      {
        "id":"P11!S21",
        "color_s":"RED",
        "price_i":42,
        "_version_":1676585794196733952},
      {
        "id":"P22!S22",
        "color_s":"RED",
        "price_i":89,
        "_version_":1676585794347728896}]
  }}

Note that in the above example, the / characters in the _nest_path_ were "double escaped" in the which parameter, for the same reasons discussed above regarding the {!child} parser's of parameter.
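
The effect of the which parameter can be modeled in the same spirit. The Python sketch below is a simplified model, not Solr's implementation. It assumes that, within a block, descendants are stored before their ancestors, and it covers only the P11 block. The names BLOCK and parents_of are hypothetical.

```python
# One block, as (id, nest_path, pages). nest_path is None for the root document.
BLOCK = [
    ("P11!D41", "/skus/manuals", 1),
    ("P11!S21", "/skus", None),
    ("P11!S31", "/skus", None),
    ("P11!D51", "/manuals", 1),
    ("P11!D61", "/manuals", 42),
    ("P11!prod", None, None),
]


def parents_of(block, is_parent, matches):
    """Return ids of the first parent-mask doc that follows each matching child."""
    result = []
    hit = False  # True once a matching child has been seen since the last parent
    for doc in block:
        if is_parent(doc):
            if hit:
                result.append(doc[0])
            hit = False
        elif matches(doc):
            hit = True
    return result


def one_page_sku_manual(doc):
    """Stand-in for (+_nest_path_:\\/skus\\/manuals +pages_i:1)."""
    return doc[1] == "/skus/manuals" and doc[2] == 1


# which="*:* -_nest_path_:*": only root documents are possible parents.
products = parents_of(BLOCK, lambda d: d[1] is None, one_page_sku_manual)

# which excludes paths with the prefix /skus/, so skus are possible parents too
# and the nearest one "wins".
skus = parents_of(
    BLOCK,
    lambda d: d[1] is None or not d[1].startswith("/skus/"),
    one_page_sku_manual,
)

print(products)
print(skus)
```

The same child document leads to a different ancestor depending only on which documents the which parameter marks as parents.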

Combining Block Join Query Parsers with Child Doc Transformer

The combination of these two parsers with the [child] transformer makes it possible to express fairly complex queries in a single request.

Here for example is a query where:

  • the (sku) documents returned must have a color of "RED"

  • the (sku) documents returned must be the descendants of root level (product) documents which have:

    • immediate child "manuals" documents which have:

      • "lifetime guarantee" in their content

  • each returned (sku) document also includes any descendant (manuals) documents it has

$ curl 'http://localhost:8983/solr/gettingstarted/select' -d 'omitHeader=true' -d 'fq=color_s:RED' --data-urlencode 'q={!child of="*:* -_nest_path_:*" filters=$parent_fq}' --data-urlencode 'parent_fq={!parent which="*:* -_nest_path_:*"}(+_nest_path_:"/manuals" +content_t:"lifetime guarantee")' -d 'fl=*,[child]'
{
  "response":{"numFound":1,"start":0,"maxScore":1.4E-45,"numFoundExact":true,"docs":[
      {
        "id":"P11!S21",
        "color_s":"RED",
        "price_i":42,
        "_version_":1676585794196733952,
        "manuals":[
          {
            "id":"P11!D41",
            "name_s":"Red Swingline Brochure",
            "pages_i":1,
            "content_t":"...",
            "_version_":1676585794196733952}]}]
  }}
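
When a request like this is issued from code rather than from curl, it can be easier to keep each parameter separate and let a library do the URL encoding. The following Python sketch builds the same POST request with the standard library only. It does not send it, since that would require a running Solr at the URL used in the examples above. The variable names are illustrative.

```python
from urllib.parse import parse_qsl, urlencode
from urllib.request import Request

# Mask matching all root level documents, shared by both block join parsers.
ROOTS = "*:* -_nest_path_:*"

params = [
    ("omitHeader", "true"),
    ("fq", "color_s:RED"),
    ("q", '{!child of="%s" filters=$parent_fq}' % ROOTS),
    (
        "parent_fq",
        '{!parent which="%s"}(+_nest_path_:"/manuals" +content_t:"lifetime guarantee")'
        % ROOTS,
    ),
    ("fl", "*,[child]"),
]

# urlencode percent-encodes each value, like curl's --data-urlencode.
body = urlencode(params).encode("utf-8")
request = Request("http://localhost:8983/solr/gettingstarted/select", data=body)

# Decoding the body again shows the parameters as the server would receive them.
for name, value in parse_qsl(body.decode("utf-8")):
    print(name, "=", value)
```

Passing request to urllib.request.urlopen would send it as a POST, because a request body is present.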