There are some major changes in Solr 6 to consider before starting to migrate your configurations and indexes.
There are many hundreds of changes, so a thorough review of the Solr Upgrade Notes section as well as the CHANGES.txt file in your Solr instance will help you plan your migration to Solr 6. This section attempts to highlight some of the major changes you should be aware of.
Highlights of New Features in Solr 6
Some of the major improvements in Solr 6 include:
Streaming Expressions
Introduced in Solr 5, Streaming Expressions allow querying Solr and getting results as a stream of data, sorted and aggregated as requested.
Several new expression types have been added in Solr 6:
-
Parallel expressions using a MapReduce-like shuffling for faster throughput of high-cardinality fields.
-
Daemon expressions to support continuous push or pull streaming.
-
Advanced parallel relational algebra like distributed joins, intersections, unions and complements.
-
Publish/Subscribe messaging.
-
JDBC connections to pull data from other systems and join with documents in the Solr index.
Parallel SQL Interface
Built on streaming expressions, new in Solr 6 is a Parallel SQL interface to be able to send SQL queries to Solr. SQL statements are compiled to streaming expressions on the fly, providing the full range of aggregations available to streaming expression requests. A JDBC driver is included, which allows using SQL clients and database visualization tools to query your Solr index and import data to other systems.
Cross Data Center Replication
Replication across data centers is now possible with Cross Data Center Replication. Using an active-passive model, a SolrCloud cluster can be replicated to another data center, and monitored with a new API.
Graph QueryParser
A new graph
query parser makes it possible to to graph traversal queries of Directed (Cyclic) Graphs modelled using Solr documents.
DocValues
Most non-text field types in the Solr sample configsets now default to using DocValues.
Java 8 Required
The minimum supported version of Java for Solr 6 (and the SolrJ client libraries) is now Java 8.
Index Format Changes
Solr 6 has no support for reading Lucene/Solr 4.x and earlier indexes. Be sure to run the Lucene IndexUpgrader
included with Solr 5.5 if you might still have old 4x formatted segments in your index. Alternatively: fully optimize your index with Solr 5.5 to make sure it consists only of one up-to-date index segment.
Managed Schema is now the Default
Solr’s default behavior when a solrconfig.xml
does not explicitly define a <schemaFactory/>
is now dependent on the luceneMatchVersion
specified in that solrconfig.xml
. When luceneMatchVersion < 6.0
, ClassicIndexSchemaFactory
will continue to be used for back compatibility, otherwise an instance of ManagedIndexSchemaFactory
will be used.
The most notable impacts of this change are:
-
Existing
solrconfig.xml
files that are modified to useluceneMatchVersion >= 6.0
, but do not have an explicitly configuredClassicIndexSchemaFactory
, will have theirschema.xml
file automatically upgraded to amanaged-schema
file. -
Schema modifications via the Schema API will now be enabled by default.
Please review the Schema Factory Definition in SolrConfig section for more details.
Default Similarity Changes
Solr’s default behavior when a Schema does not explicitly define a global <similarity/>
is now dependent on the luceneMatchVersion
specified in the solrconfig.xml
. When luceneMatchVersion < 6.0
, an instance of ClassicSimilarityFactory
will be used, otherwise an instance of SchemaSimilarityFactory
will be used. Most notably this change means that users can take advantage of per Field Type similarity declarations, without needing to also explicitly declare a global usage of SchemaSimilarityFactory
.
Regardless of whether it is explicitly declared, or used as an implicit global default, SchemaSimilarityFactory
's implicit behavior when a Field Types do not declare an explicit <similarity />
has also been changed to depend on the the luceneMatchVersion
. When luceneMatchVersion < 6.0
, an instance of ClassicSimilarity
will be used, otherwise an instance of BM25Similarity
will be used. A defaultSimFromFieldType
init option may be specified on the SchemaSimilarityFactory
declaration to change this behavior. Please review the SchemaSimilarityFactory
javadocs for more details
Replica & Shard Delete Command Changes
DELETESHARD and DELETEREPLICA now default to deleting the instance directory, data directory, and index directory for any replica they delete. Please review the Collection API documentation for details on new request parameters to prevent this behavior if you wish to keep all data on disk when using these commands
facet.date.* Parameters Removed
The facet.date
parameter (and associated facet.date.*
parameters) that were deprecated in Solr 3.x have been removed completely. If you have not yet switched to using the equivalent facet.range
functionality you must do so now before upgrading.