SolrCloud Autoscaling Trigger Actions
TriggerAction implementations process events generated by triggers in order to ensure the cluster’s health and good use of resources.
Currently two implementations are provided: ComputePlanAction and ExecutePlanAction.
Compute Plan Action
The ComputePlanAction uses the policy and preferences to calculate the optimal set of Collection API commands which can re-balance the cluster in response to trigger events.
The following parameters are configurable:
collections
- A comma-separated list of collection names. If this list is not empty, the computed plan includes only operations that affect the listed collections; operations for any other collection are ignored. Non-collection operations are not affected by this setting.
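The filtering rule can be illustrated with a small Python sketch. This is not Solr's implementation: the operation dictionaries, the NON_COLLECTION_OP placeholder, and the parse_collections and filter_operations helpers are hypothetical names, chosen only to show which operations survive when the collections parameter is set.

```python
from typing import Dict, List, Optional


def parse_collections(param: str) -> List[str]:
    """Split the comma-separated `collections` value into collection names."""
    return [name.strip() for name in param.split(",") if name.strip()]


def filter_operations(operations: List[Dict[str, Optional[str]]],
                      collections_param: str) -> List[Dict[str, Optional[str]]]:
    """Keep only the operations allowed by the `collections` parameter.

    Hypothetical model: each operation is a dict whose "collection" key
    is None for non-collection operations.
    """
    allowed = parse_collections(collections_param)
    if not allowed:
        # An empty list means no filtering is applied.
        return list(operations)
    return [
        op for op in operations
        # Non-collection operations always pass through the filter.
        if op.get("collection") is None or op["collection"] in allowed
    ]


ops = [
    {"action": "MOVEREPLICA", "collection": "test1"},
    {"action": "ADDREPLICA", "collection": "other"},
    # Placeholder for an operation that does not target a collection.
    {"action": "NON_COLLECTION_OP", "collection": None},
]
kept = filter_operations(ops, "test1,test2")
```

With the value test1,test2, the operation on the collection named other is dropped, while the operation on test1 and the non-collection placeholder are kept.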
Example configuration:
{
  "set-trigger" : {
    "name" : "node_added_trigger",
    "event" : "nodeAdded",
    "waitFor" : "1s",
    "enabled" : true,
    "actions" : [
      {
        "name" : "compute_plan",
        "class" : "solr.ComputePlanAction",
        "collections" : "test1,test2"
      },
      {
        "name" : "execute_plan",
        "class" : "solr.ExecutePlanAction"
      }
    ]
  }
}
In this example, only collections test1 and test2 will potentially be replicated or moved to an added node. Other collections will be ignored even if they cause policy violations.
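A configuration like this is submitted as a JSON POST to the autoscaling API. The sketch below builds the same set-trigger command in Python and sends it with the standard library. The base URL http://localhost:8983/solr is an assumption about a local installation, the /admin/autoscaling path should be checked against the Solr version in use, and build_set_trigger and post_autoscaling_command are hypothetical helper names.

```python
import json
import urllib.request


def build_set_trigger(collections: str) -> dict:
    """Build the set-trigger command from the example as a Python dict."""
    return {
        "set-trigger": {
            "name": "node_added_trigger",
            "event": "nodeAdded",
            "waitFor": "1s",
            "enabled": True,
            "actions": [
                {
                    "name": "compute_plan",
                    "class": "solr.ComputePlanAction",
                    "collections": collections,
                },
                {
                    "name": "execute_plan",
                    "class": "solr.ExecutePlanAction",
                },
            ],
        }
    }


def post_autoscaling_command(command: dict,
                             base_url: str = "http://localhost:8983/solr") -> str:
    """POST the command to the autoscaling endpoint (the URL is an assumption)."""
    request = urllib.request.Request(
        url=base_url + "/admin/autoscaling",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


payload = build_set_trigger("test1,test2")
body = json.dumps(payload, indent=2)  # serialized request body
```

Calling post_autoscaling_command(payload) against a running cluster would send the command; it is not invoked here so that the sketch runs without a Solr instance.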
Execute Plan Action
The ExecutePlanAction executes the Collection API commands emitted by the ComputePlanAction against the cluster using SolrJ. It executes the commands serially, waiting for each of them to succeed before continuing with the next one.
Currently, it has the following configurable parameters:
taskTimeoutSeconds
- The default value of this parameter is 120 seconds. This value defines how long the action will wait for a command to complete its execution. If the timeout is reached while the command is still running, the command status is provisionally considered a success and a warning is logged, unless taskTimeoutFail is set to true.
taskTimeoutFail
- Boolean with a default value of false. If this value is true, a timeout in command processing is marked as a failure and an exception is thrown.
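The serial execution and timeout handling described above can be modelled roughly as follows. This is an illustrative Python sketch, not the ExecutePlanAction source: Command, wait_for and execute_plan are hypothetical names, and each command is simulated by a callable that reports whether it would finish within the given timeout.

```python
import logging
from typing import Callable, List, NamedTuple

log = logging.getLogger("execute_plan_sketch")


class Command(NamedTuple):
    """Hypothetical stand-in for one Collection API command."""
    name: str
    # Returns True if the command completes within the given timeout (seconds).
    wait_for: Callable[[float], bool]


def execute_plan(commands: List[Command],
                 task_timeout_seconds: float = 120,
                 task_timeout_fail: bool = False) -> List[str]:
    """Run commands one at a time; return the names counted as successful."""
    succeeded = []
    for command in commands:
        finished = command.wait_for(task_timeout_seconds)
        if not finished:
            if task_timeout_fail:
                # taskTimeoutFail=true: a timeout is treated as a failure.
                raise TimeoutError(
                    f"{command.name} timed out after {task_timeout_seconds}s")
            # Default: provisionally treat the command as successful, but warn.
            log.warning("%s still running after %ss; assuming success",
                        command.name, task_timeout_seconds)
        succeeded.append(command.name)
    return succeeded


# Simulated commands: one needs 5 seconds, the other 300 seconds.
fast = Command("move_replica_1", lambda timeout: 5 <= timeout)
slow = Command("move_replica_2", lambda timeout: 300 <= timeout)

# With the defaults, the slow command times out but is provisionally accepted.
result = execute_plan([fast, slow])
```

By analogy with collections in the earlier example, these two parameters would presumably be set as properties of the execute_plan action in the trigger configuration, next to name and class.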
If the Overseer node fails while ExecutePlanAction is running, then the new Overseer node will run the chain of actions for the same event again after waiting for any running Collection API operations belonging to the event to complete.
Please see SolrCloud Autoscaling Fault Tolerance for more details on fault tolerance within the autoscaling framework.