Configuring solr.xml
The solr.xml
file defines some global configuration options that apply to all or many cores.
This section will describe the default solr.xml
file included with Solr and how to modify it for your needs.
For details on how to configure core.properties
, see the section Core Discovery.
Defining solr.xml
You can find solr.xml
in your $SOLR_HOME
directory (usually server/solr
or /var/solr/data
) or optionally in ZooKeeper when using SolrCloud. If $SOLR_HOME/solr.xml
is not found, Solr will use the default solr.xml
file.
Loading solr.xml from Zookeeper is deprecated, and will not be supported in a future version.
Being the node config of Solr, this file must be available at early startup and also be allowed to differ between nodes.
|
The default solr.xml
file is found in $SOLR_TIP/server/solr/solr.xml
and looks like this:
<solr>
<int name="maxBooleanClauses">${solr.max.booleanClauses:1024}</int>
<str name="sharedLib">${solr.sharedLib:}</str>
<str name="modules">${solr.modules:}</str>
<str name="allowPaths">${solr.allowPaths:}</str>
<str name="allowUrls">${solr.allowUrls:}</str>
<str name="hideStackTrace">${solr.hideStackTrace:false}</str>
<solrcloud>
<str name="host">${host:}</str>
<int name="hostPort">${solr.port.advertise:0}</int>
<str name="hostContext">${hostContext:solr}</str>
<bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
<int name="zkClientTimeout">${zkClientTimeout:30000}</int>
<int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>
<int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:60000}</int>
<str name="zkCredentialsProvider">${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider}</str>
<str name="zkACLProvider">${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider}</str>
<str name="zkCredentialsInjector">${zkCredentialsInjector:org.apache.solr.common.cloud.DefaultZkCredentialsInjector}</str>
<bool name="distributedClusterStateUpdates">${distributedClusterStateUpdates:false}</bool>
<bool name="distributedCollectionConfigSetExecution">${distributedCollectionConfigSetExecution:false}</bool>
<int name="minStateByteLenForCompression">${minStateByteLenForCompression:-1}</int>
<str name="stateCompressor">${stateCompressor:org.apache.solr.common.util.ZLibCompressor}</str>
</solrcloud>
<shardHandlerFactory name="shardHandlerFactory"
class="HttpShardHandlerFactory">
<int name="socketTimeout">${socketTimeout:600000}</int>
<int name="connTimeout">${connTimeout:60000}</int>
</shardHandlerFactory>
<metrics enabled="${metricsEnabled:true}">
<!--reporter name="jmx_metrics" group="core" class="org.apache.solr.metrics.reporters.SolrJmxReporter"/-->
</metrics>
</solr>
As you can see, the discovery Solr configuration is "SolrCloud friendly".
However, the presence of the <solrcloud>
element does not mean that the Solr instance is running in SolrCloud mode.
Unless the -DzkHost
or -DzkRun
are specified at startup time, this section is ignored.
Solr.xml Parameters
The <solr> Element
There are no attributes that you can specify in the <solr>
tag, which is the root element of solr.xml
.
The tables below list the child nodes of each XML element in solr.xml
.
configSetService
-
Optional
Default:
configSetService
This attribute does not need to be set.
If used, this attribute should be set to the FQN (Fully qualified name) of a class that inherits from
ConfigSetService
, and you must provide a constructor with one parameter of typeorg.apache.solr.core.CoreContainer
. For example,<str name="configSetService">com.myorg.CustomConfigSetService</str>
.If this attribute isn’t set, Solr uses the default
configSetService
, with zookeeper aware oforg.apache.solr.cloud.ZkConfigSetService
, without zookeeper aware oforg.apache.solr.core.FileSystemConfigSetService
. adminHandler
-
Optional
Default:
org.apache.solr.handler.admin.CoreAdminHandler
This attribute does not need to be set.
If used, this attribute should be set to the FQN (Fully qualified name) of a class that inherits from CoreAdminHandler. For example,
<str name="adminHandler">com.myorg.MyAdminHandler</str>
would configure the custom admin handler (MyAdminHandler) to handle admin requests.If this attribute isn’t set, Solr uses the default admin handler,
org.apache.solr.handler.admin.CoreAdminHandler
. coreAdminHandlerActions
-
Optional
Default: none
This attribute does not need to be set.
If defined, it should contain a list of custom actions to be registered within the CoreAdminHandler. Each entry of the list should be of
str
type where name of the entry defines the name of the action and value is a FQN (Fully qualified name) of an action class that inherits fromCoreAdminOp
.For example, actions can be defined like this:
<coreAdminHandlerActions> <str name="foo">com.example.FooAction</str> <str name="bar">com.example.BarAction</str> </coreAdminHandlerActions>
After defining custom actions they can be called using their names:
http://localhost:8983/solr/admin/cores?action=foo
collectionsHandler
-
Optional
Default:
org.apache.solr.handler.admin.CollectionsHandler
As above, for custom CollectionsHandler implementations.
infoHandler
-
Optional
Default:
org.apache.solr.handler.admin.InfoHandler
As above, for custom InfoHandler implementations.
coreLoadThreads
-
Optional
Default: none
Specifies the number of threads that will be assigned to load cores in parallel.
replayUpdatesThreads
-
Optional
Default: see description
Specifies the number of threads that will be assigned to replay updates in parallel. This pool is shared for all cores of the node. The default value is equal to the number of processors.
indexSearcherExecutorThreads
-
Optional
Default: number of available processors
Specifies the number of threads that will be assigned for search queries.
coreRootDirectory
-
Optional
Default:
server/solr
The root of the core discovery tree, defaults to
$SOLR_HOME
. coresLocator
-
Optional
Default:
org.apache.solr.core.CorePropertiesLocator
This attribute does not need to be set.
If used, this attribute should be set to the FQN (fully qualified name) of a class that implements
CoresLocator
, and you must provide a constructor with one parameter of typeorg.apache.solr.core.NodeConfig
. For example,<str name="coresLocator">com.myorg.CustomCoresLocator</str>
would configure a custom cores locator. coreSorter
-
Optional
Default:
org.apache.solr.core.CoreSorter
This attribute does not need to be set.
If used, this attribute should be set to the FQN (fully qualified name) of a class that implements
CoreSorter
, and you must provide a constructor with one parameter of typeorg.apache.solr.core.CoreContainer
. This service is used when Solr is starting to prioritize which cores should be loaded first. For example,<str name="coresLocator">com.myorg.CustomCoresLocator</str>
would configure a custom core sorter. managementPath
-
Optional
Default: none
Currently non-operational.
sharedLib
-
Optional
Default: none
Specifies the path to a common library directory that will be shared across all cores. Any JAR files in this directory will be added to the search path for Solr plugins. If the specified path is not absolute, it will be relative to
$SOLR_HOME
. Custom handlers may be placed in this directory. Note that specifyingsharedLib
will not remove$SOLR_HOME/lib
from Solr’s class path. modules
-
Optional
Default: none
Takes a list of bundled Solr Modules to enable on startup. This way of adding modules will add them to the shared class loader, making them available to every collection in Solr, unlike
<lib>
tag insolrconfig.xml
which is only for that one collection. Example value:extracting,ltr
. See the Solr Modules chapter for more details. allowPaths
-
Optional
Default: none
Solr will normally only access folders relative to
$SOLR_HOME
,$SOLR_DATA_HOME
orcoreRootDir
. If you need to e.g., create a core outside of these paths, you can explicitly allow the path withallowPaths
. It is a comma separated string of file system paths to allow. The special value of*
will allow any path on the system.
allowUrls
-
Optional
Default: see description
Comma-separated list of Solr hosts to allow.
The HTTP/HTTPS protocol may be omitted, and only the host and port are checked, i.e.,
10.0.0.1:8983/solr,10.0.0.1:8984/solr
.When running Solr as a user-managed cluster and using the
shards
parameter, a list of hosts needs to be specifically configured as allowed or Solr will forbid the request.In SolrCloud mode, the allow-list is automatically configured to include all live nodes in the cluster.
The allow-list can also be configured with the
solr.allowUrls
system property insolr.in.sh
/solr.in.cmd
. If you need to disable this feature for backwards compatibility, you can set the system propertysolr.disable.allowUrls=true
. hideStackTrace
-
Optional
Default: none
When this attribute is set to
true
, Solr will not return any stack traces in the HTTP response in case of errors. By default (false
), stack traces are hidden only for predictable Solr exceptions, but are returned in the response for unexpected exceptions (i.e.: HTTP 500) shareSchema
-
Optional
Default: none
This attribute, when set to
true
, ensures that the multiple cores pointing to the same Schema resource file will be referring to the same IndexSchema Object. Sharing the IndexSchema Object makes loading the core faster. If you use this feature, make sure that no core-specific property is used in your Schema file. transientCacheSize
-
Optional
Default: none
Deprecated as of 9.2. Defines how many Solr cores with
transient=true
that can be loaded before unloading an unused core for one that is needed. configSetBaseDir
-
Optional
Default:
$SOLR_HOME/configsets
The directory under which configsets for Solr cores can be found.
maxBooleanClauses
-
Optional
Default: see description
Sets the maximum number of (nested) clauses allowed in any query.
This global limit provides a safety constraint on the total number of clauses allowed in any query against any collection — regardless of whether those clauses were explicitly specified in a query string, or were the result of query expansion/re-writing from a more complex type of query based on the terms in the index. This limit is enforced at multiple points in Lucene, both to prevent primitive query objects (mainly
BooleanQuery
) from being constructed with an excessive number of clauses in a way that may exhaust the JVM heap, but also to ensure that no composite query (made up of multiple primitive queries) can be executed with an excessive total number of nested clauses in a way that may cause a search thread to use excessive CPU.In default configurations this property uses the value of the
solr.max.booleanClauses
system property if specified. This is the same system property used in the_default
configset for the<maxBooleanClauses>
element ofsolrconfig.xml
making it easy for Solr administrators to increase both values (in all collections) without needing to search through and update all of their configs.<maxBooleanClauses>${solr.max.booleanClauses:1024}</maxBooleanClauses>
The <solrcloud> Element
This element defines several parameters that relate so SolrCloud.
This section is ignored unless the Solr instance is started with either -DzkRun
or -DzkHost
distribUpdateConnTimeout
-
Optional
Default: none
Used to set the underlying
connTimeout
for intra-cluster updates. distribUpdateSoTimeout
-
Optional
Default: none
Used to set the underlying
socketTimeout
for intra-cluster updates. host
-
Optional
Default: none
The hostname Solr uses to access cores.
hostContext
-
Optional
Default: none
The url context path.
hostPort
-
Optional
Default:
${solr.port.advertise:0}
The port Solr uses to access cores, and advertise Solr node locations through liveNodes. This option is only necessary if a Solr instance is listening on a different port than it wants other nodes to contact it at. For example, if the Solr node is running behind a proxy or in a cloud environment that allows for port mapping, such as Kubernetes.
hostPort
is the port that the Solr instance wants other nodes to contact it at.In the default
solr.xml
file, this is set to${solr.port.advertise:0}
. If no port is passed via thesolr.xml
(i.e.,0
), then Solr will default to the port that jetty is listening on, defined by${jetty.port}
. leaderVoteWait
-
Optional
Default: none
When SolrCloud is starting up, how long each Solr node will wait for all known replicas for that shard to be found before assuming that any nodes that haven’t reported are down.
leaderConflictResolveWait
-
Optional
Default:
180000
millisecondsWhen trying to elect a leader for a shard, this property sets the maximum time a replica will wait to see conflicting state information to be resolved; temporary conflicts in state information can occur when doing rolling restarts, especially when the node hosting the Overseer is restarted.
Typically, the default value of
180000
(ms) is sufficient for conflicts to be resolved; you may need to increase this value if you have hundreds or thousands of small collections in SolrCloud. zkClientTimeout
-
Optional
Default: none
A timeout for connection to a ZooKeeper server. It is used with SolrCloud.
zkHost
-
Optional
Default: none
In SolrCloud mode, the URL of the ZooKeeper host that Solr should use for cluster state information.
genericCoreNodeNames
-
Optional
Default: none
If
true
, node names are not based on the address of the node, but on a generic name that identifies the core. When a different machine takes over serving that core things will be much easier to understand. zkCredentialsProvider
,zkACLProvider
&zkCredentialsInjector
-
Optional
Default: none
Optional parameters that can be specified if you are using ZooKeeper Access Control.
distributedClusterStateUpdates
-
Optional
Default: none
If
true
, the internal behavior of SolrCloud is changed to not use the Overseer for collections'state.json
updates but do this directly against ZooKeeper. minStateByteLenForCompression
-
Optional
Default: -1
Optional parameter to enable compression of the state.json over the wire and stored in Zookeeper. The value provided is the minimum length of bytes to compress state.json, i.e. any state.json above that size in bytes will be compressed. The default is -1, meaning state.json is always uncompressed.
stateCompressor
-
Optional
Default:org.apache.solr.common.util.ZLibCompressor
Optional parameter to provide a compression implementation for state.json over the wire and stored in Zookeeper. The value provided is the class to use for state compression. This is only used if minStateByteLenForCompression is set to a value above -1.
The <logging> Element
class
-
Optional
Default: none
The class to use for logging. The corresponding JAR file must be available to Solr, perhaps through a
<lib>
directive insolrconfig.xml
. enabled
-
Optional
Default:
true
Whether to enable logging or not.
The <shardHandlerFactory> Element
Solr uses "Shard Handlers" to send and track the inter-node requests made internally to process a distributed search or other request.
A factory, configured via the <shardHandlerFactory>
element, is used to create new Shard Handlers as needed.
The factory defined here will be used throughout Solr, unless overridden by particular requestHandler’s in solrconfig.xml.
Two factory implementations are available, each creating a corresponding Shard Handler.
The default, HttpShardHandlerFactory
, serves as the best option for most deployments.
However some deployments, especially those using authentication or with massively sharded collections, may benefit from the additional parallelization offered by ParallelHttpShardHandlerFactory
.
Custom shard handlers are also supported and should be referenced in solr.xml
by their fully-qualified class name:
<shardHandlerFactory name="ShardHandlerFactory" class="qualified.class.name"/>
Sub-elements of <shardHandlerFactory>
may vary in the case of custom shard handlers, but both HttpShardHandlerFactory
and ParallelShardHandlerFactory
support the following configuration options:
socketTimeout
-
Optional
Default: see description
The read timeout for intra-cluster query and administrative requests. The default is the same as the
distribUpdateSoTimeout
specified in the<solrcloud>
section. connTimeout
-
Optional
Default: see description
The connection timeout for intra-cluster query and administrative requests. Defaults to the
distribUpdateConnTimeout
specified in the<solrcloud>
section. urlScheme
-
Optional
Default: none
The URL scheme to be used in distributed search.
maxConnectionsPerHost
-
Optional
Default:
100000
Maximum connections allowed per host.
corePoolSize
-
Optional
Default:
0
The initial core size of the threadpool servicing requests.
maximumPoolSize
-
Optional
Default: none
The maximum size of the threadpool servicing requests. Default is unlimited.
maxThreadIdleTime
-
Optional
Default:
5
secondsThe amount of time in seconds that idle threads persist for in the queue, before being killed.
sizeOfQueue
-
Optional
Default: none
If the threadpool uses a backing queue, what is its maximum size to use direct handoff. Default is to use a SynchronousQueue.
fairnessPolicy
-
Optional
Default:
false
A boolean to configure if the threadpool favors fairness over throughput.
replicaRouting
-
Optional
Default: see description
A NamedList specifying replica routing preference configuration. This may be used to select and configure replica routing preferences.
default=true
may be used to set the default base replica routing preference. Only positive default status assertions are respected; i.e.,default=false
has no effect. If no explicit default base replica routing preference is configured, the implicit default will berandom
.
<shardHandlerFactory class="HttpShardHandlerFactory"> <lst name="replicaRouting"> <lst name="stable"> <bool name="default">true</bool> <str name="dividend">routingDividend</str> <str name="hash">q</str> </lst> </lst> </shardHandlerFactory>
Replica routing may also be specified (overriding defaults) per-request, via the shards.preference
request parameter.
If a request contains both dividend
and hash
, dividend
takes priority for routing.
For configuring stable
routing, the hash
parameter implicitly defaults to a hash of the String value of the main query parameter (i.e., q
).
+
The dividend
parameter must be configured explicitly; there is no implicit default.
If only dividend
routing is desired, hash
may be explicitly set to the empty string, entirely disabling implicit hash-based routing.
The <replicaPlacementFactory> Element
A default replica placement plugin can be defined in solr.xml
.
To allow this, the solr.cluster.plugin.edit.enabled
System Property must be set to false. This setting will disable the /cluster/plugins
edit APIs, preventing modification of cluster plugins at runtime.
<replicaPlacementFactory class="org.apache.solr.cluster.placement.plugins.AffinityPlacementFactory">
<int name="minimalFreeDiskGB">10</int>
<int name="prioritizedFreeDiskGB">200</int>
</replicaPlacementFactory>
The class
attribute should be set to the FQN (fully qualified name) of a class that extends PlacementPluginFactory
.
Sub-elements are specific to the implementation.
The <clusterSingleton> Element
One or more clusterSingleton
elements may be declared in solr.xml.
To allow this, the solr.cluster.plugin.edit.enabled
System Property must be set to false. This setting will disable the /cluster/plugins
edit APIs, preventing modification of cluster plugins at runtime.
Each clusterSingleton
element specifies a cluster plugin that should be loaded when Solr stars, along with it’s associated configuration.
<clusterSingleton name="pluginName" class="qualified.plugin.class">
<int name="value1">20</int>
</clusterSingleton>
The name
attribute is required and must be unique for each clusterSingleton
.
The class
attribute should be set to the FQN (fully qualified name) of a class that extends ClusterSingleton
.
Sub-elements are specific to the implementation, value1
is provided as an example here.
The <metrics> Element
The <metrics>
element in solr.xml
allows you to customize the metrics reported by Solr.
You can define system properties that should not be returned, or define custom suppliers and reporters.
If you would like to customize the metrics for your installation, see the Metrics Configuration section.
The <caches> Element
The <caches>
element in solr.xml
supports defining and configuring named node-level caches.
These caches are analogous to user-defined caches in solrconfig.xml
, except that each named cache exists as a long-lived singleton at the node level. These node-level caches are accessible from application code via CoreContainer.getCache(String cacheName)
.
Note that because node-level caches exist above the context of an individual core, config parameters that hook into the lifecycle of a core/searcher (such as autowarmCount
and regenerator
) are irrelevant/ignored for node-level caches.
<solr>
<caches>
<cache name="myNodeLevelUserCache"
class="solr.CaffeineCache"
size="4096"
initialSize="1024" />
</caches>
</solr>
Substituting JVM System Properties in solr.xml
Solr supports variable substitution of JVM system property values in solr.xml
, which allows runtime specification of various configuration options.
The syntax is ${propertyname[:option default value]}
.
This allows defining a default that can be overridden when Solr is launched.
If a default value is not specified, then the property must be specified at runtime or the solr.xml
file will generate an error when parsed.
Any JVM system properties usually specified using the -D
flag when starting the JVM, can be used as variables in the solr.xml
file.
For example, in the solr.xml
file shown below, the socketTimeout
and connTimeout
values are each set to "60000".
However, if you start Solr using bin/solr start -DsocketTimeout=1000
, the socketTimeout
option of the HttpShardHandlerFactory
to be overridden using a value of 1000ms, while the connTimeout
option will continue to use the default property value of "60000".
<solr>
<shardHandlerFactory name="shardHandlerFactory"
class="HttpShardHandlerFactory">
<int name="socketTimeout">${socketTimeout:60000}</int>
<int name="connTimeout">${connTimeout:60000}</int>
</shardHandlerFactory>
</solr>