You might want to interpret some document fields in more than one way. Solr has a mechanism for making copies of fields so that you can apply several distinct field types to a single piece of incoming information.
The name of the field you want to copy is the source, and the name of the copy is the destination. In schema.xml, copying a field takes a single line:
<copyField source="cat" dest="text" maxChars="30000" />
In this example, we want Solr to copy the cat field to a field named text. Fields are copied before analysis is done, meaning you can have two fields with identical original content, but which use different analysis chains and are stored in the index differently.
In the example above, if the text destination field has data of its own in the input documents, the contents of the cat field will be added as additional values, just as if all of the values had originally been specified by the client. Remember to configure your fields as multiValued="true" if they will ultimately get multiple values (either from a multivalued source or from multiple copyField directives).
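As a rough sketch, the two fields from the example might be declared in schema.xml as follows. The field types string and text_general are assumptions borrowed from Solr's stock configsets, not something this section prescribes:

```xml
<!-- Assumed definitions; adjust the types to your own schema. -->
<!-- Source field: indexed verbatim, without tokenization. -->
<field name="cat" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- Destination field: tokenized for full-text search. It is multiValued
     because it receives its own values plus everything copied from cat. -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

<copyField source="cat" dest="text" maxChars="30000"/>
```

Both fields receive the same raw input, but each one runs it through the analysis chain of its own field type.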
A common usage for this functionality is to create a single "search" field that will serve as the default query field when users or clients do not specify a field to query. For example, body and several other fields may all need to be searched by default, with a copy field rule for each of them copying to a catchall field (the name is only an example; it could be anything). Later you can set a rule in solrconfig.xml to search the catchall field by default. One caveat is that your index will grow when you use copy fields. How large it becomes, and whether that is a problem for you, depends on the number of fields being copied, the number of destination fields being copied to, the analysis in use, and the available disk space.
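The catchall pattern described above could look roughly like this. Only body comes from the text; the title field, the text_general type, and the /select handler name are illustrative assumptions, and the two fragments belong in two different files:

```xml
<!-- schema.xml (fragment): title is a hypothetical field name. -->
<field name="catchall" type="text_general" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="catchall"/>
<copyField source="body" dest="catchall"/>

<!-- solrconfig.xml (fragment): df sets the default query field. -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="df">catchall</str>
  </lst>
</requestHandler>
```

Declaring the destination with stored="false" is one way to limit index growth, since the copied text is then indexed for searching but not kept a second time for retrieval.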
The maxChars parameter, an int parameter, establishes an upper limit for the number of characters to be copied from the source value when constructing the value added to the destination field. This limit is useful for situations in which you want to copy some data from the source field, but also control the size of index files.
Both the source and the destination of copyField can contain either leading or trailing asterisks, which will match anything. For example, the following line will copy the contents of all incoming fields that match the wildcard pattern *_t to the text field:
<copyField source="*_t" dest="text" maxChars="25000" />
The copyField command can use a wildcard (*) character in the dest parameter only if the source parameter contains one as well. copyField uses the matching glob from the source field for the dest field name into which the source content is copied.
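A wildcard destination might be sketched as follows. The *_s pattern and its dynamic field definition are assumptions for illustration; the destination names produced by the rule need to be resolvable by the schema:

```xml
<!-- Assumed dynamic field so that generated destination names are valid. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true" multiValued="true"/>
<!-- A field named title_t would be copied to title_s: the text matched
     by * in source is substituted for * in dest. -->
<copyField source="*_t" dest="*_s"/>
```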
Copying is done at the stream source level and no copy feeds into another copy. This means that copy fields cannot be chained, i.e., you cannot copy from here to there and then from there to elsewhere. However, the same source field can be copied to multiple destination fields:
<copyField source="here" dest="there"/>
<copyField source="here" dest="elsewhere"/>