Autoscaling Policy and Preferences
The autoscaling policy and preferences are a set of rules and sorting preferences that help Solr select the target of cluster management operations so the overall load on the cluster remains balanced.
The configured autoscaling policy and preferences are used by Collections API commands in all contexts: manual, for example using bin/solr
to create a collection; semi-automatic, via the Suggestions API or the Admin UI’s Suggestions Screen; or fully automatic, via configured Triggers.
See the section Example: Manual Collection Creation with a Policy for an example of how policy and preferences affect replica placement.
Cluster Preferences Specification
A preference is a hint to Solr on how to sort nodes based on their utilization.
The default cluster preference is to sort by the total number of Solr cores (or replicas) hosted by a node, with a precision of 1. Therefore, by default, when selecting a node to which to add a replica, Solr can apply the preferences and choose the node with the fewest cores. In the case of a tie in the number of cores, available freedisk will be used to further sort nodes.
More than one preference can be added to break ties. For example, we may choose to use free disk space to break ties if the number of cores on two nodes is the same. The node with the higher free disk space can be chosen as the target of the cluster operation.
Each preference takes the following form:
{"<sort_order>":"<sort_param>", "precision":"<precision_val>"}
sort_order
The value can be either
maximize
orminimize
. Chooseminimize
to sort the nodes with least value as the least loaded. For example,{"minimize":"cores"}
sorts the nodes with the least number of cores as the least loaded node. A sort order such as{"maximize":"freedisk"}
sorts the nodes with maximum free disk space as the least loaded node.The objective of the system is to make every node the least loaded. So, in case of a
MOVEREPLICA
operation, it usually targets the most loaded node and takes load off of it. In a sort of more loaded to less loaded,minimize
is akin to sorting in descending order andmaximize
is akin to sorting in ascending order.This is a required parameter.
sort_param
- One and only one of the following supported parameters must be specified:
cores
: The number of total Solr cores on a node.freedisk
: The amount of free disk space for Solr’s data home directory. This is always in gigabytes.sysLoadAvg
: The system load average on a node as reported by the Metrics API under the keysolr.jvm/os.systemLoadAverage
. This is always a double value between 0 and 1 and the higher the value, the more loaded the node is.heapUsage
: The heap usage of a node as reported by the Metrics API under the keysolr.jvm/memory.heap.usage
. This is always a double value between 0 and 1 and the higher the value, the more loaded the node is.
precision
Precision tells the system the minimum (absolute) difference between 2 values to treat them as distinct values.
For example, a precision of 10 for
freedisk
means that two nodes whose free disk space is within 10GB of each other should be treated as equal for the purpose of sorting. This helps create ties without which specifying multiple preferences is not useful. This is an optional parameter whose value must be a positive integer. The maximum value ofprecision
must be less than the maximum value of thesort_value
, if any.
See the section Create and Modify Cluster Preferences for details on how to manage cluster preferences with the API.
Examples of Cluster Preferences
Default Preferences
The following shows the default cluster preferences. This is applied automatically by Solr when no explicit cluster preferences have been set using the Autoscaling API.
[
{"minimize":"cores"}
]
Minimize Cores; Maximize Free Disk
In this example, we want to minimize the number of Solr cores and in case of a tie, maximize the amount of free disk space on each node.
[
{"minimize" : "cores"},
{"maximize" : "freedisk"}
]
Add Precision to Free Disk; Minimize System Load
In this example, we add a precision to the freedisk
parameter so that nodes with free disk space within 10GB of each other are considered equal. In such a case, the tie is broken by minimizing sysLoadAvg
.
[
{"minimize" : "cores"},
{"maximize" : "freedisk", "precision" : 10},
{"minimize" : "sysLoadAvg"}
]
Policy Specification
A policy is a hard rule to be satisfied by each node. If a node does not satisfy the rule then it is called a violation. Solr ensures that the number of violations are minimized while invoking any cluster management operations.
Policy Rule Structure
Rule Types
Policy rules can be either global or per-collection:
- Global rules constrain the number of cores per node or node group. This type of rule applies to cores from all collections hosted on the specified node(s). As a result, collection-specific policies, which are associated with individual collections, may not contain global rules.
- Per-collection rules constrain the number of replicas per node or node group.
Global rules have three parts:
- Node Selector
- Core Count Constraint (
"cores": "…"
) - Rule Strictness (optional)
Per-collection rules have five parts:
- Node Selector
- Replica Selector and Rule Evaluation Context
- Replica Count Constraint (
"replica": "…"
) - Rule Strictness (optional)
put
(optional) specifies how to place these replicas on the selected nodes. All the selected nodes are considered as one bucket by default."put" : "on-each-node"
treats each selected node as a bucket
Node Selector
A node selector is specified using the node
nodeset
attribute. This is used to filter the set of nodes where this rules needs to be applied
examples
{ "replica" : "<2", "node":"#ANY"}
{ "replica" : "3", "nodeset":["node-name1","node-name2"]}
{ "nodeset":{"<property-name>":"<property-value>"}}
The property names can be one of node
, host
, sysprop.
, freedisk
, ip_
, nodeRole
, heapUsage
, metrics.*
when using the nodeset
attribute, an optional attribute put
can be used to specify how to distribute the replicas in that node set.
example: put one replica on each node with a system property zone=east
{ "replica":1, "put" :"on-each-node", "nodeset":{"sysprop.zone":"east"}}
example: put a total of 2 replicas on the set of nodes with property zone=east
{ "replica":2, "put" :"on-each-node", "nodeset":{"sysprop.zone":"east"}}
Rule evaluation is restricted to node(s) matching the value of one of the following attributes: node
, port
, ip_*
, sysprop.*
, or diskType
. For replica/core count constraints other than #EQUAL
, a condition specified in one of the following attributes may instead be used to select nodes: freedisk
, host
, sysLoadAvg
, heapUsage
, nodeRole
, or metrics.*
.
Except for node
, the attributes above cause selected nodes to be partitioned into node groups. A node group is referred to as a "bucket". Those attributes usable with the #EQUAL
directive may define buckets either via the special function #EACH
or an array ["value1", …]
(a subset of all possible values); in both cases, each node is placed in the bucket corresponding to the matching attribute value.
The node
attribute always places each selected node into its own bucket, regardless of the attribute value’s form (#ANY
, node-name
, or ["node1-name", …]
).
Replica and core count constraints, described below, are evaluated against the total number in each bucket.
Core Count Constraint
The cores
attribute value can be specified in one of the following forms:
#EQUAL
: distribute all cores equally across all the selected nodes.- a constraint on the core count on each selected node; see Specifying Replica and Core Count Constraints.
Replica Selector and Rule Evaluation Context
Rule evaluation can be restricted to replicas that meet any combination of conditions specified with the following attributes:
collection
: The replica is of a shard belonging to the collection specified in the attribute value. (Not usable with collection-specific policies.)shard
: The replica is of the shard named in the attribute value.type
: The replica has the specified replica type (NRT
,TLOG
, orPULL
).
If none of the above attributes is specified, then the rule is evaluated separately for each collection against all types of replicas of all shards.
Specifying #EACH
as the shard
attribute value causes the rule to be evaluated separately for each shard of each collection.
Replica Count Constraint
The replica
attribute value can be specified in one of the following forms:
#ALL
: All selected replicas will be placed on the selected nodes.#EQUAL
: Distribute selected replicas equally across all the selected nodes.- a constraint on the replica count on each selected node; see Specifying Replica and Core Count Constraints.
Specifying Replica and Core Count Constraints
Replica count constraints ("replica":"…"
) and core count constraints ("cores":"…"
) allow specification of acceptable counts for replicas (cores tied to a collection) and cores (regardless of the collection to which they belong), respectively.
You can specify one of the following as the value of a replica
and cores
policy rule attribute:
- an exact integer (e.g.,
2
) - an exclusive lower integer bound (e.g.,
>0
) - an exclusive upper integer bound (e.g.,
<3
) - a decimal value, interpreted as an acceptable range of core counts, from the floor of the value to the ceiling of the value, with the system preferring the rounded value (e.g.,
1.6
:1
or2
is acceptable, and2
is preferred) - a range of acceptable replica/core counts, as inclusive lower and upper integer bounds separated by a hyphen (e.g.,
3-5
) - a percentage (e.g.,
33%
), which is multiplied at runtime either by the number of selected replicas (for areplica
constraint) or the number of cores in the cluster (for acores
constraint). This value is then interpreted as described above for a literal decimal value.
Using an exact integer value for count constraints is of limited utility, since collection or cluster changes could quickly invalidate them. For example, attempting to add a third replica to each shard of a collection on a two-node cluster with policy rule {"replica":1, "shard":"#EACH", "node":"#ANY"} would cause a violation, since at least one node would have to host more than one replica. Percentage rules are less brittle. Rewriting the rule as {"replica":"50%", "shard":"#EACH", "node":"#ANY"} eliminates the violation: 50% of 3 replicas = 1.5 replicas per node , meaning that it’s acceptable for a node to host either one or two replicas of each shard.
|
Policy Rule Attributes
Rule Strictness
This attribute is usable in all rules:
strict
- An optional boolean value. The default is
true
. If true, the rule must be satisfied; if the rule is not satisfied, the resulting violation will cause the cluster management operation to fail. If false, Solr tries to satisfy the rule on a best effort basis, but if no node can satisfy the rule, the cluster management operation will not fail, and any node may be chosen. If multiple rules declared to bestrict:false
can not be satisfied by some nodes, then a node will be chosen such that the number of such violations is minimized.
Global Rule Attributes
cores
- The number of cores that must exist to satisfy the rule. This is a required attribute for global policy rules. The
node
attribute must also be specified, and the only other allowed attribute is the optionalstrict
attribute. See Core Count Constraint for possible attribute values.
Per-collection Rule Attributes
The following attributes are usable with per-collection policy rules, in addition to the attributes in the Node Selection Attributes section below:
collection
- The name of the collection to which the policy rule should apply. If omitted, the rule applies to all collections. This attribute is optional.
shard
- The name of the shard to which the policy rule should apply. If omitted, the rule is applied for all shards in the collection. It supports the special function
#EACH
which means that the rule is applied for each shard in the collection.
type
- The type of the replica to which the policy rule should apply. If omitted, the rule is applied for all replica types of this collection/shard. The allowed values are
NRT
,TLOG
andPULL
replica
- The number of replicas that must exist to satisfy the rule. This is a required attribute for per-collection rules. See Replica Count Constraint for possible attribute values.
Node Selection Attributes
One and only one of the following attributes can be specified in addition to the above attributes. See the Node Selector section for more information:
node
- The name of the node to which the rule should apply. The
!
(not) operator or the array operator or the#ANY
function may be used in this attribute’s value.
port
- The port of the node to which the rule should apply. The
!
(not) operator or the array operator may be used in this attribute’s value.
freedisk
- The free disk space in gigabytes of the node. This must be a positive 64-bit integer value, or a percentage. If a percentage is specified, either an upper or lower bound may also be specified using the
<
or>
operators, respectively, e.g.,>50%
,<25%
.
host
- The host name of the node.
sysLoadAvg
- The system load average of the node as reported by the Metrics API under the key
solr.jvm/os.systemLoadAverage
. This is floating point value between 0 and 1.
heapUsage
- The heap usage of the node as reported by the Metrics API under the key
solr.jvm/memory.heap.usage
. This is floating point value between 0 and 1.
nodeRole
- The role of the node. The only supported value currently is
overseer
.
ip_1, ip_2, ip_3, ip_4
- The least significant to most significant segments of IP address. For example, for an IP address
192.168.1.2
,"ip_1":"2", "ip_2":"1", "ip_3":"168", "ip_4":"192"
. The array operator may be used in any of these attributes' values.
sysprop.<system_property_name>
- Any arbitrary system property set on the node on startup. The
!
(not) operator or the array operator may be used in this attribute’s value.
metrics:<full-path-to-the metric>
- Any arbitrary metric. For example,
metrics:solr.node:CONTAINER.fs.totalSpace
. Refer to thekey
parameter in the Metrics API section.
diskType
The type of disk drive being used for Solr’s
coreRootDirectory
. The only two supported values arerotational
andssd
. Refer tocoreRootDirectory
parameter in the Solr.xml Parameters section. The!
(not) operator or the array operator may be used in this attribute’s value.Its value is fetched from the Metrics API with the key named
solr.node:CONTAINER.fs.coreRoot.spins
. The disk type is auto-detected by Lucene using various heuristics and it is not guaranteed to be correct across all platforms or operating systems. Refer to the Dynamic defaults for ConcurrentMergeScheduler section for more details.
Policy Operators
Each attribute in the policy may specify one of the following operators along with the value.
- No operator means equality
<
: Less than>
: Greater than!
: Not(-)
: a value such as"3-5"
means a value between 3 to 5 (inclusive). This is only supported in thereplica
andcores
attributes. Range operator []
: e.g.,sysprop.zone= ["east","west","apac"]
. This is equivalent to having multiple rules with each of these values. This can be used in the following attributes: Array operator
Special Functions
This supports values calculated at the time of execution.
%
: A certain percentage of the value. This is supported by the following attributes:#ANY
: Applies to thenode
attribute only. This means the rule applies to any node.#ALL
: Applies to thereplica
attribute only. This means all replicas that meet the rule condition.#EACH
: Applies to theshard
attribute (meaning the rule should be evaluated separately for each shard), and to the attributes used to define the buckets for the #EQUAL function (meaning all possible values for the bucket-defining attribute).#EQUAL
: Applies to thereplica
andcores
attributes only. This means an equal number of replicas/cores in each bucket. The buckets can be defined using the below attributes with a value that can either be#EACH
or a list specified with the array operator ([]
):node
<- global rules, i.e., those with thecores
attribute, may only specify this attributesysprop.*
port
diskType
ip_*
Examples of Policy Rules
Limit Replica Placement
Do not place more than one replica of the same shard on the same node. The rule is evaluated separately for each shard in each collection. The rule is applied to any node.
{"replica": "<2", "shard": "#EACH", "node": "#ANY"}
Limit Cores per Node
Do not place more than 10 cores in any node. This rule can only be added to the cluster policy because it is a global rule.
{"cores": "<10", "node": "#ANY"}
Place Replicas Based on Port
Place exactly 1 replica of each shard of collection xyz
on a node running on port 8983
.
{"replica": 1, "shard": "#EACH", "collection": "xyz", "nodeset": {"port": "8983"}}
Place Replicas Based on a System Property
Place all replicas on nodes with system property availability_zone=us-east-1a
.
{"replica": "#ALL", "nodeset": {"sysprop.availability_zone": "us-east-1a"}}
Use Percentage
Place a maximum of (roughly) a third of the replicas of each shard in any node. In the following example, the value of replica
is computed in real time as a percentage of the replicas of each shard of each collection:
{"replica": "33%", "shard": "#EACH", "node": "#ANY"}
If the number of replicas in a shard is 2
, 33% of 2 = 0.66
. This means a node may have a maximum of 1
and a minimum of 0
replicas of each shard.
It is possible to get the same effect by hard coding the value of replica
as a decimal value:
{"replica": 0.66, "shard": "#EACH", "node": "#ANY"}
or using the range operator:
{"replica": "0-1", "shard": "#EACH", "node": "#ANY"}
Multiple Percentage Rules
Distribute replicas of each shard of each collection across datacenters east
and west
at a 1:2
ratio:
{"replica": "33%", "shard": "#EACH", "nodeset":{ "sysprop.zone": "east"}}
{"replica": "66%", "shard": "#EACH", "nodeset":{"sysprop.zone": "west"}}
For the above rules to work, all nodes must the started with a system property called "zone"
Distribute Replicas Equally in Each Zone
For each shard of each collection, distribute replicas equally across the east
and west
zones.
{"replica": "#EQUAL", "shard": "#EACH", "nodeset":[{"sysprop.zone": "east"},{"sysprop.zone": "west"}]}}
Place Replicas Based on Node Role
Do not place any replica on any node that has the overseer role. Note that the role is added by the addRole
collection API. It is not automatically the node which is currently the overseer.
{"replica": 0, "put" :"on-each-node", "nodeset":{ "nodeRole": "overseer"}}
Place Replicas Based on Free Disk
Place all replicas in nodes where freedisk is greater than 500GB.
{"replica": "#ALL", "nodeset":{ "freedisk": ">500"}}
Keep all replicas in nodes where freedisk percentage is greater than 50%
.
{"replica": "#ALL", "nodeset":{"freedisk": ">50%"}}
Try to Place Replicas Based on Free Disk
When possible, place all replicas in nodes where freedisk is greater than 500GB. Here we use the strict
attribute to signal that this rule is to be honored on a best effort basis.
{"replica": "#ALL", "nodeset":{ "freedisk": ">500"}, "strict": false}
Place All Replicas of Type TLOG on Nodes with SSD Drives
{"replica": "#ALL", "type": "TLOG", "nodeset": {"diskType": "ssd"}}
Place All Replicas of Type PULL on Nodes with Rotational Disk Drives
{"replica": "#ALL", "type": "PULL", "nodeset" : {"diskType": "rotational"}}
Defining Collection-Specific Policies
By default, the cluster policy, if it exists, is used automatically for all collections in the cluster. However, we can create named policies that can be attached to a collection at the time of its creation by specifying the policy name along with a policy
parameter.
When a collection-specific policy is used, the rules in that policy are appended to the rules in the cluster policy and the combination of both are used. Therefore, it is recommended that you do not add rules to collection-specific policy that conflict with the ones in the cluster policy. Doing so will disqualify all nodes in the cluster from matching all criteria and make the policy useless.
It is possible to override rules specified in the cluster policy using collection-specific policy. For example, if a rule {replica:'<3', node:'#ANY'}
is present in the cluster policy and the collection-specific policy has a rule {replica:'<4', node:'#ANY'}
, the cluster policy is ignored in favor of the collection policy.
Also, if maxShardsPerNode
is specified during the time of collection creation, then both maxShardsPerNode
and the policy rules must be satisfied.
Some attributes such as cores
can only be used in the cluster policy. See the section Policy Rule Attributes for details.
To create a new named policy, use the set-policy
API. Once you have a named policy, you can specify the policy=<policy_name>
parameter to the CREATE command of the Collection API:
/admin/collections?action=CREATE&name=coll1&numShards=1&replicationFactor=2&policy=policy1
The above CREATE collection command will associate a policy named policy1
with the collection named coll1
. Only a single policy may be associated with a collection.
Example: Manual Collection Creation with a Policy
The starting state for this example is a Solr cluster with 3 nodes: "nodeA", "nodeB", and "nodeC". An existing 2-shard FirstCollection
with a replicationFactor
of 1 has one replica on "nodeB" and one on "nodeC". The default Autoscaling preferences are in effect:
[ {"minimize": "cores"} ]
The configured policy rule allows at most 1 core per node:
[ {"cores": "<2", "node": "#ANY"} ]
We now issue a CREATE command for a SecondCollection
with two shards and a replicationFactor
of 1:
http://localhost:8983/solr/admin/collections?action=CREATE&name=SecondCollection&numShards=2&replicationFactor=1
For each of the two replicas to be created, each Solr node is tested, in order from least to most loaded: would all policy rules be satisfied if a replica were placed there using an ADDREPLICA sub-command?
- ADDREPLICA for
shard1
: According to the Autoscaling preferences, the least loaded node is the one with the fewest cores: "nodeA", because it hosts no cores, while the other two nodes each host one core. The test to place a replica here succeeds, because doing so causes no policy violations, since the core count after adding the replica would not exceed the configured maximum of 1. Because "nodeA" can host the first shard’s replica, Solr skips testing of the other two nodes. - ADDREPLICA for
shard2
: After placing theshard1
replica, all nodes would be equally loaded, since each would have one core. The test to place theshard2
replica fails on each node, because placement would push the node over its maximum core count. This causes a policy violation.
Since there is no node that can host a replica for shard2
without causing a violation, the overall CREATE command fails. Let’s try again after increasing the maximum core count on all nodes to 2:
[ {"cores": "<3", "node": "#ANY"} ]
After re-issuing the SecondCollection
CREATE command, the replica for shard1
will be placed on "nodeA": it’s least loaded, so is tested first, and no policy violation will result from placement there. The shard2
replica could be placed on any of the 3 nodes, since they’re all equally loaded, and the chosen node will remain below its maximum core count after placement. The CREATE command succeeds.
Testing Autoscaling Configuration and Suggestions
It’s not always easy to predict the impact of autoscaling configuration changes on the cluster layout. Starting with release 8.1 Solr provides a tool for assessing the impact of such changes without affecting the state of the target cluster.
This testing tool is a part of bin/solr autoscaling
command. In addition to other
options that provide detailed status of the current cluster layout the following options
specifically allow users to test new autoscaling configurations and run "what if" scenarios:
-a <CONFIG>
- JSON file containing autoscaling configuration to test. This file needs to be in the same
format as the result of the
/solr/admin/autoscaling
call. If this parameter is missing then the currently deployed autoscaling configuration is used. -simulate
- Simulate the effects of applying all autoscaling suggestions on the cluster layout. NOTE: this does not affect in any way the actual cluster - this option uses the simulation framework to calculate the new layout without actually making the changes. Calculations are performed in the tool’s JVM so they don’t affect the performance of the running cluster either. This process is repeated several times until a limit is reached or there are no more suggestions left to apply (although unresolved violations may still remain!)
-i <NUMBER>
- Number of iterations of the simulation loop. Default is 10.
Results of the simulation contain the initial suggestions, suggestions at each step of the simulation and the final simulated state of the cluster.