SolrCloud Autoscaling Trigger Actions
Autoscaling is deprecated
The autoscaling framework in its current form is deprecated and will be removed in Solr 9.0. A new design for this feature is currently under development in SOLR-14613 with a goal for release with Solr 9.0.
TriggerAction implementations process events generated by triggers in order to ensure the cluster’s health and good use of resources.
Currently two implementations are provided: ComputePlanAction and ExecutePlanAction.
Compute Plan Action
The ComputePlanAction uses the policy and preferences to calculate the optimal set of Collection API commands that can re-balance the cluster in response to trigger events.
The following parameters are configurable:
collections
A comma-separated list of collection names, or a selector on collection properties that can be used to filter collections for which the plan is computed.
If a non-empty list or selector is specified, the computed plan includes only collection operations that affect the matched collections; operations for any other collection are ignored. Non-collection operations are not affected.
A collection selector is of the form
collections: {key1: value1, key2: value2, …}
where each key can be any collection property such as name, policy, numShards, etc. Each value must match exactly, and all specified properties must match for a collection to be selected.
A collection selector is useful in a cluster where collections are added and removed frequently, or where only collections that use a specific autoscaling policy should be selected.
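The matching rule described above can be sketched in a few lines of Python. This is an illustration only, not Solr's implementation: the function name matches_collections_param and the representation of a collection as a plain dict of properties are hypothetical, and comparing values as strings is an assumption of the sketch.

```python
def matches_collections_param(collection, param):
    """Return True if `collection` is selected by a `collections` parameter value.

    collection: dict of collection properties (hypothetical representation),
                e.g. {"name": "test1", "policy": "my_policy", "numShards": 4}
    param:      None or empty (no filtering), a comma-separated string of
                collection names, or a selector dict of property values.
    """
    if not param:
        # No list or selector given: nothing is filtered out.
        return True
    if isinstance(param, str):
        names = {n.strip() for n in param.split(",") if n.strip()}
        return collection.get("name") in names
    # Selector: every listed property must be present and match exactly.
    # Comparing as strings is an assumption of this sketch.
    return all(
        key in collection and str(collection[key]) == str(expected)
        for key, expected in param.items()
    )


collections = [
    {"name": "test1", "policy": "my_policy", "numShards": 4},
    {"name": "test2", "policy": "my_policy", "numShards": 2},
    {"name": "test3", "policy": "other_policy", "numShards": 4},
]

selector = {"policy": "my_policy", "numShards": "4"}
selected = [c["name"] for c in collections if matches_collections_param(c, selector)]
print(selected)  # ['test1']
```

Only test1 satisfies both properties; test2 fails on numShards and test3 fails on policy, which shows that all specified properties must match.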
Example configurations:
{
  "set-trigger" : {
    "name" : "node_added_trigger",
    "event" : "nodeAdded",
    "waitFor" : "1s",
    "enabled" : true,
    "actions" : [
      {
        "name" : "compute_plan",
        "class" : "solr.ComputePlanAction",
        "collections" : "test1,test2"
      },
      {
        "name" : "execute_plan",
        "class" : "solr.ExecutePlanAction"
      }
    ]
  }
}
In this example only collections test1 and test2 will potentially be replicated or moved to an added node. Other collections will be ignored even if they cause policy violations.
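One way to submit a configuration like the one above is to POST it as JSON to the autoscaling API. The sketch below only builds the request and does not send it. The host, port, and the /api/cluster/autoscaling path are assumptions about a default local installation, and build_set_trigger_request is a hypothetical helper name.

```python
import json
import urllib.request

# Assumed endpoint for a local node; adjust for your cluster.
AUTOSCALING_URL = "http://localhost:8983/api/cluster/autoscaling"


def build_set_trigger_request(collections, url=AUTOSCALING_URL):
    """Build (but do not send) a POST request carrying a set-trigger command.

    `collections` is either a comma-separated string of collection names
    or a selector dict such as {"policy": "my_policy"}.
    """
    command = {
        "set-trigger": {
            "name": "node_added_trigger",
            "event": "nodeAdded",
            "waitFor": "1s",
            "enabled": True,
            "actions": [
                {
                    "name": "compute_plan",
                    "class": "solr.ComputePlanAction",
                    "collections": collections,
                },
                {"name": "execute_plan", "class": "solr.ExecutePlanAction"},
            ],
        }
    }
    return urllib.request.Request(
        url,
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_set_trigger_request("test1,test2")
print(request.get_method(), request.full_url)
print(json.dumps(json.loads(request.data), indent=2))
# To actually send it: urllib.request.urlopen(request)
```

Passing a dict instead of a string produces the selector form of the collections parameter without any other change to the payload.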
{
  "set-trigger" : {
    "name" : "node_added_trigger",
    "event" : "nodeAdded",
    "waitFor" : "1s",
    "enabled" : true,
    "actions" : [
      {
        "name" : "compute_plan",
        "class" : "solr.ComputePlanAction",
        "collections" : {"policy": "my_policy"}
      },
      {
        "name" : "execute_plan",
        "class" : "solr.ExecutePlanAction"
      }
    ]
  }
}
In this example only collections that use my_policy as their autoscaling policy will potentially be replicated or moved to an added node. Other collections will be ignored even if they cause policy violations.
{
  "set-trigger" : {
    "name" : "node_added_trigger",
    "event" : "nodeAdded",
    "waitFor" : "1s",
    "enabled" : true,
    "actions" : [
      {
        "name" : "compute_plan",
        "class" : "solr.ComputePlanAction",
        "collections" : {"policy": "my_policy", "numShards" : "4"}
      },
      {
        "name" : "execute_plan",
        "class" : "solr.ExecutePlanAction"
      }
    ]
  }
}
In this example only collections that use my_policy as their autoscaling policy and that have numShards equal to 4 will potentially be replicated or moved to an added node. Other collections will be ignored even if they cause policy violations.
Execute Plan Action
The ExecutePlanAction executes the Collection API commands emitted by the ComputePlanAction against the cluster using SolrJ. It executes the commands serially, waiting for each of them to succeed before continuing with the next one.
Currently, it has the following configurable parameters:
taskTimeoutSeconds
- How long, in seconds, the action waits for a command to complete its execution. The default value is 120 seconds. If the timeout is reached while the command is still running, the command status is provisionally considered a success and a warning is logged, unless taskTimeoutFail is set to true.
taskTimeoutFail
- Boolean with a default value of false. If this value is true, a timeout in command processing is marked as a failure and an exception is thrown.
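The interaction between serial execution and the two timeout parameters can be sketched as follows. This is a simplified model, not Solr's code: execute_plan and the run_command callback are hypothetical names, and the callback is assumed to return True when a command completes within the timeout and False when it is still running at that point.

```python
import logging

log = logging.getLogger("execute_plan_sketch")


def execute_plan(commands, run_command, task_timeout_seconds=120, task_timeout_fail=False):
    """Run commands one at a time, applying the timeout rules described above.

    run_command(command, timeout_seconds) is a hypothetical callback that
    returns True if the command completed within the timeout, or False if
    it was still running when the timeout expired.
    """
    results = []
    for command in commands:
        # Serial execution: the next command starts only after this one returns.
        finished = run_command(command, task_timeout_seconds)
        if finished:
            results.append((command, "completed"))
        elif task_timeout_fail:
            # taskTimeoutFail=true: a timeout is a failure and raises.
            raise TimeoutError(
                f"{command!r} still running after {task_timeout_seconds}s"
            )
        else:
            # taskTimeoutFail=false: provisionally treat as success, but warn.
            log.warning(
                "%r still running after %ss; provisionally treated as success",
                command, task_timeout_seconds,
            )
            results.append((command, "timed-out-assumed-success"))
    return results


# Usage with a fake runner in which the command named "slow" never finishes in time.
calls = []

def fake_runner(command, timeout_seconds):
    calls.append(command)
    return command != "slow"

lenient = execute_plan(["a", "slow", "b"], fake_runner)
print(lenient)
```

With the default taskTimeoutFail of false, the command after the slow one still runs; with task_timeout_fail=True the same call raises TimeoutError at the slow command and the remaining commands are not attempted.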
If the Overseer node fails while ExecutePlanAction is running, the new Overseer node will run the chain of actions for the same event again after waiting for any running Collection API operations belonging to the event to complete.
Please see SolrCloud Autoscaling Fault Tolerance for more details on fault tolerance within the autoscaling framework.