Schema Factory Definition in SolrConfig

Solr’s Schema API enables remote clients to access schema information, and make schema modifications, through a REST interface.

Other features such as Solr’s Schemaless Mode also work via schema modifications made programatically at run time.

Using the Managed Schema is required to be able to use the Schema API to modify your schema. However, using Managed Schema does not by itself mean you are also using Solr in Schemaless Mode (or "schema guessing" mode).

Schemaless mode requires enabling the Managed Schema if it is not already, but full schema guessing requires additional configuration as described in the section Schemaless Mode.

While the "read" features of the Schema API are supported for all schema types, support for making schema modifications programatically depends on the <schemaFactory/> in use.

Solr Uses Managed Schema by Default

When a <schemaFactory/> is not explicitly declared in a solrconfig.xml file, Solr implicitly uses a ManagedIndexSchemaFactory, which is by default "mutable" and keeps schema information in a managed-schema file.

 <!-- An example of Solr's implicit default behavior if no
      no schemaFactory is explicitly defined.
 -->
  <schemaFactory class="ManagedIndexSchemaFactory">
    <bool name="mutable">true</bool>
    <str name="managedSchemaResourceName">managed-schema</str>
  </schemaFactory>

If you wish to explicitly configure ManagedIndexSchemaFactory the following options are available:

  • mutable - controls whether changes may be made to the Schema data. This must be set to true to allow edits to be made with the Schema API.
  • managedSchemaResourceName is an optional parameter that defaults to "managed-schema", and defines a new name for the schema file that can be anything other than “schema.xml”.

With the default configuration shown above, you can use the Schema API to modify the schema as much as you want, and then later change the value of mutable to false if you wish to "lock" the schema in place and prevent future changes.

Classic schema.xml

An alternative to using a managed schema is to explicitly configure a ClassicIndexSchemaFactory. ClassicIndexSchemaFactory requires the use of a schema.xml configuration file, and disallows any programatic changes to the Schema at run time. The schema.xml file must be edited manually and is only loaded only when the collection is loaded.

  <schemaFactory class="ClassicIndexSchemaFactory"/>

Switching from schema.xml to Managed Schema

If you have an existing Solr collection that uses ClassicIndexSchemaFactory, and you wish to convert to use a managed schema, you can simply modify the solrconfig.xml to specify the use of the ManagedIndexSchemaFactory.

Once Solr is restarted and it detects that a schema.xml file exists, but the managedSchemaResourceName file (i.e., “managed-schema”) does not exist, the existing schema.xml file will be renamed to schema.xml.bak and the contents are re-written to the managed schema file. If you look at the resulting file, you’ll see this at the top of the page:

<!-- Solr managed schema - automatically generated - DO NOT EDIT -->

You are now free to use the Schema API as much as you want to make changes, and remove the schema.xml.bak.

Switching from Managed Schema to Manually Edited schema.xml

If you have started Solr with managed schema enabled and you would like to switch to manually editing a schema.xml file, you should take the following steps:

  1. Rename the managed-schema file to schema.xml.
  2. Modify solrconfig.xml to replace the schemaFactory class.
    1. Remove any ManagedIndexSchemaFactory definition if it exists.
    2. Add a ClassicIndexSchemaFactory definition as shown above
  3. Reload the core(s).

If you are using SolrCloud, you may need to modify the files via ZooKeeper. The bin/solr script provides an easy way to download the files from ZooKeeper and upload them back after edits. See the section ZooKeeper Operations for more information.

To have full control over your schema.xml file, you may also want to disable schema guessing, which allows unknown fields to be added to the schema during indexing. The properties that enable this feature are discussed in the section Schemaless Mode.