Alias Management
A collection alias is a virtual collection which Solr treats the same as a normal collection. The alias collection may point to one or more real collections.
Some use cases for collection aliasing:
-
Time series data
-
Reindexing content behind the scenes
For an overview of aliases in Solr, see the section Aliases.
CREATEALIAS: Create or Modify an Alias for a Collection
The CREATEALIAS
action will create a new alias pointing to one or more collections.
Aliases come in 2 flavors: standard and routed.
Standard aliases are simple: CREATEALIAS
registers the alias name with the names of one or more collections provided by the command.
If an existing alias exists, it is replaced/updated.
A standard alias can serve as a means to rename a collection, and can be used to atomically swap which backing/underlying collection is "live" for various purposes.
When Solr searches an alias pointing to multiple collections, Solr will search all shards of all the collections as an aggregated whole. While it is possible to send updates to an alias spanning multiple collections, standard aliases have no logic for distributing documents among the referenced collections so all updates will go to the first collection in the list.
/admin/collections?action=CREATEALIAS&name=name&collections=collectionlist
Routed aliases are aliases with additional capabilities to act as a kind of super-collection that route updates to the correct collection.
Routing is data driven and may be based on a temporal field or on categories specified in a field (normally string based). See Routed Aliases for some important high-level information before getting started.
$ http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=timedata&router.start=NOW/DAY&router.field=evt_dt&router.name=time&router.interval=%2B1DAY&router.maxFutureMs=3600000&create-collection.collection.configName=myConfig&create-collection.numShards=2
If run on Jan 15, 2018, the above will create an time routed alias named timedata, that contains collections with names prefixed with timedata
and an initial collection named timedata_2018_01_15
will be created immediately.
Updates sent to this alias with a (required) value in evt_dt
that is before or after 2018-01-15 will be rejected, until the last 60 minutes of 2018-01-15.
After 2018-01-15T23:00:00 documents for either 2018-01-15 or 2018-01-16 will be accepted.
As soon as the system receives a document for an allowable time window for which there is no collection it will automatically create the next required collection (and potentially any intervening collections if router.interval
is
smaller than router.maxFutureMs
).
Both the initial collection and any subsequent collections will be created using
the specified configset.
All collection creation parameters other than name
are allowed, prefixed
by create-collection.
This means that one could, for example, partition their collections by day, and within each daily collection route the data to shards based on customer id. Such shards can be of any type (NRT, PULL or TLOG), and rule-based replica placement strategies may also be used.
The values supplied in this command for collection creation will be retained
in alias properties, and can be verified by inspecting aliases.json
in ZooKeeper.
Only updates are routed and queries are distributed to all collections in the alias. |
CREATEALIAS Parameters
name
-
Required
Default: none
The alias name to be created. If the alias is to be routed it also functions as a prefix for the names of the dependent collections that will be created. It must therefore adhere to normal requirements for collection naming.
async
-
Optional
Default: none
Request ID to track this action which will be processed asynchronously.
Standard Alias Parameters
collections
-
Optional
Default: none
A comma-separated list of collections to be aliased. The collections must already exist in the cluster. This parameter signals the creation of a standard alias. If it is present all routing parameters are prohibited. If routing parameters are present this parameter is prohibited.
Routed Alias Parameters
Most routed alias parameters become alias properties that can subsequently be inspected and modified.
router.name
-
Required
Default: none
The type of routing to use. Presently only
time
andcategory
andDimensional[]
are valid.In the case of a multi-dimensional routed alias (aka "DRA"), it is required to express all the dimensions in the same order that they will appear in the dimension array. The format for a DRA
router.name
isDimensional[dim1,dim2]
wheredim1
anddim2
are validrouter.name
values for each sub-dimension. Note that DRA’s are very new, and only 2D DRA’s are presently supported. Higher numbers of dimensions will be supported soon. See examples below for further clarification on how to configure individual dimensions. router.field
-
Required
Default: none
The field to inspect to determine which underlying collection an incoming document should be routed to. This field is required on all incoming documents.
create-collection.*
-
Optional
Default: none
The
*
wildcard can be replaced with any parameter from the CREATE command exceptname
. All other fields are identical in requirements and naming except that we insist that the configset be explicitly specified. The configset must be created beforehand, either uploaded or copied and modified. It’s probably a bad idea to use "data driven" mode as schema mutations might happen concurrently leading to errors.
Time Routed Alias Parameters
router.start
-
Required
Default: none
The start date/time of data for this time routed alias in Solr’s standard date/time format (i.e., ISO-8601 or "NOW" optionally with date math).
The first collection created for the alias will be internally named after this value. If a document is submitted with an earlier value for
router.field
then the earliest collection the alias points to then it will yield an error since it can’t be routed. This date/time MUST NOT have a milliseconds component other than 0. Particularly, this meansNOW
will fail 999 times out of 1000, thoughNOW/SECOND
,NOW/MINUTE
, etc., will work just fine. TZ
-
Optional
Default:
UTC
The timezone to be used when evaluating any date math in
router.start
orrouter.interval
. This is equivalent to the same parameter supplied to search queries, but understand in this case it’s persisted with most of the other parameters as an alias property.If GMT-4 is supplied for this value then a document dated 2018-01-14T21:00:00:01.2345Z would be stored in the myAlias_2018-01-15_01 collection (assuming an interval of +1HOUR).
router.interval
-
Required
Default: none
A date math expression that will be appended to a timestamp to determine the next collection in the series. Any date math expression that can be evaluated if appended to a timestamp of the form 2018-01-15T16:17:18 will work here.
router.maxFutureMs
-
Optional
Default:
600000
millisecondsThe maximum milliseconds into the future that a document is allowed to have in
router.field
for it to be accepted without error. If there was no limit, then an erroneous value could trigger many collections to be created. router.preemptiveCreateMath
-
Optional
Default: none
A date math expression that results in early creation of new collections.
If a document arrives with a timestamp that is after the end time of the most recent collection minus this interval, then the next (and only the next) collection will be created asynchronously.
Without this setting, collections are created synchronously when required by the document time stamp and thus block the flow of documents until the collection is created (possibly several seconds). Preemptive creation reduces these hiccups. If set to enough time (perhaps an hour or more) then if there are problems creating a collection, this window of time might be enough to take corrective action. However, after a successful preemptive creation the collection is consuming resources without being used, and new documents will tend to be routed through it only to be routed elsewhere.
Also, note that
router.autoDeleteAge
is currently evaluated relative to the date of a newly created collection, so you may want to increase the delete age by the preemptive window amount so that the oldest collection isn’t deleted too soon.It must be possible to subtract the interval specified from a date, so if prepending a minus sign creates invalid date math, this will cause an error. Also note that a document that is itself destined for a collection that does not exist will still trigger synchronous creation up to that destination collection but will not trigger additional async preemptive creation. Only one type of collection creation can happen per document. Example:
90MINUTES
.This property is empty by default indicating just-in-time, synchronous creation of new collections.
router.autoDeleteAge
-
Optional
Default: none
A date math expression that results in the oldest collections getting deleted automatically.
The date math is relative to the timestamp of a newly created collection (typically close to the current time), and thus this must produce an earlier time via rounding and/or subtracting. Collections to be deleted must have a time range that is entirely before the computed age. Collections are considered for deletion immediately prior to new collections getting created. Example:
/DAY-90DAYS
.The default is not to delete.
Category Routed Alias Parameters
router.maxCardinality
-
Optional
Default: none
The maximum number of categories allowed for this alias. This setting safeguards against the inadvertent creation of an infinite number of collections in the event of bad data.
router.mustMatch
-
Optional
Default: none
A regular expression that the value of the field specified by
router.field
must match before a corresponding collection will be created. Changing this setting after data has been added will not alter the data already indexed.Any valid Java regular expression pattern may be specified. This expression is pre-compiled at the start of each request so batching of updates is strongly recommended. Overly complex patterns will produce CPU or garbage collection overhead during indexing as determined by the JVM’s implementation of regular expressions.
Dimensional Routed Alias Parameters
router.#.
-
Optional
Default: none
This prefix denotes which position in the dimension array is being referred to for purposes of dimension configuration.
For example in a
Dimensional[time,category]
alias,router.0.start
would be used to set the start time for the time dimension.
CREATEALIAS Response
The output will simply be a responseHeader with details of the time it took to process the request.
To confirm the creation of the alias, you can look in the Solr Admin UI, under the Cloud section and find the aliases.json
file.
The initial collection for routed aliases should also be visible in various parts of the admin UI.
Examples using CREATEALIAS
Create an alias named "testalias" and link it to the collections named "foo" and "bar".
V1 API
Input
http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=testalias&collections=foo,bar&wt=xml
Output
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">122</int>
</lst>
</response>
V2 API Input
curl -X POST http://localhost:8983/api/collections -H 'Content-Type: application/json' -d '
{
"create-alias":{
"name":"testalias",
"collections":["foo","bar"]
}
}
'
Output
{
"responseHeader": {
"status": 0,
"QTime": 125
}
}
A somewhat contrived example demonstrating creating a TRA with many additional collection creation options.
V1 API
Input
http://localhost:8983/solr/admin/collections?action=CREATEALIAS
&name=somethingTemporalThisWayComes
&router.name=time
&router.start=NOW/MINUTE
&router.field=evt_dt
&router.interval=%2B2HOUR
&router.maxFutureMs=14400000
&create-collection.collection.configName=_default
&create-collection.router.name=implicit
&create-collection.router.field=foo_s
&create-collection.numShards=3
&create-collection.shards=foo,bar,baz
&create-collection.tlogReplicas=1
&create-collection.pullReplicas=1
&create-collection.property.foobar=bazbam
&wt=xml
Output
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1234</int>
</lst>
</response>
V2 API
Input
curl -X POST http://localhost:8983/api/collections -H 'Content-Type: application/json' -d '
{
"create-alias" : {
"name": "somethingTemporalThisWayComes",
"router" : {
"name": "time",
"field": "evt_dt",
"start":"NOW/MINUTE",
"interval":"+2HOUR",
"maxFutureMs":"14400000"
},
"create-collection" : {
"config":"_default",
"router": {
"name":"implicit",
"field":"foo_s"
},
"shards":"foo,bar,baz",
"numShards": 3,
"tlogReplicas":1,
"pullReplicas":1,
"properties" : {
"foobar":"bazbam"
}
}
}
}
'
Output
{
"responseHeader": {
"status": 0,
"QTime": 1234
}
}
Another example, this time of a Dimensional Routed Alias demonstrating how to specify parameters for the individual dimensions
V1 API
Input
http://localhost:8983/solr/admin/collections?action=CREATEALIAS
&name=dra_test1
&router.name=Dimensional[time,category]
&router.0.start=2019-01-01T00:00:00Z
&router.0.field=myDate_tdt
&router.0.interval=%2B1MONTH
&router.0.maxFutureMs=600000
&create-collection.collection.configName=_default
&create-collection.numShards=2
&router.1.maxCardinality=20
&router.1.field=myCategory_s
&wt=xml
Output
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">1234</int>
</lst>
</response>
V2 API
Input
curl -X POST http://localhost:8983/api/collections -H 'Content-Type: application/json' -d '
{
"create-alias":{
"name":"dra_test1",
"router": {
"name": "Dimensional[time,category]",
"routerList" : [ {
"field":"myDate_tdt",
"start":"2019-01-01T00:00:00Z",
"interval":"+1MONTH",
"maxFutureMs":600000
},{
"field":"myCategory_s",
"maxCardinality":20
}]
},
"create-collection": {
"config":"_default",
"numShards":2
}
}
}
'
Output
{
"responseHeader": {
"status": 0,
"QTime": 1234
}
}
LISTALIASES: List of all aliases in the cluster
V1 API
http://localhost:8982/solr/admin/collections?action=LISTALIASES
V2 API
curl -X GET http://localhost:8983/api/cluster/aliases
The LISTALIASES action does not take any parameters.
LISTALIASES Response
The output will contain a list of aliases with the corresponding collection names.
Examples using LISTALIASES
Input
List the existing aliases, requesting information as XML from Solr:
http://localhost:8983/solr/admin/collections?action=LISTALIASES&wt=xml
Output
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
<lst name="aliases">
<str name="testalias1">collection1</str>
<str name="testalias2">collection1,collection2</str>
</lst>
<lst name="properties">
<lst name="testalias1"/>
<lst name="testalias2">
<str name="someKey">someValue</str>
</lst>
</lst>
</response>
ALIASPROP: Modify Alias Properties for a Collection
The ALIASPROP
action modifies the properties (metadata) on an alias.
If a key is set with a value that is empty it will be removed.
V1 API
http://localhost:8983/admin/collections?action=ALIASPROP&name=techproducts_alias&property.foo=bar
V2 API
curl -X POST http://localhost:8983/api/collections -H 'Content-Type: application/json' -d '
{
"set-alias-property":{
"name":"techproducts_alias",
"properties": {"foo":"bar"}
}
}
'
This command allows you to revise any property. No alias specific validation is performed. Routed aliases may cease to function, function incorrectly, or cause errors if property values are set carelessly. |
ALIASPROP Parameters
name
-
Required
Default: none
The alias name on which to set properties.
property.name=value
(v1)-
Optional
Default: none
Set property name to value.
"properties":{"name":"value"}
(v2)-
Optional
Default: none
A dictionary of name/value pairs of properties to be set.
async
-
Optional
Default: none
Request ID to track this action which will be processed asynchronously.
DELETEALIAS: Delete a Collection Alias
V1 API
http://localhost:8983/solr/admin/collections?action=DELETEALIAS&name=testalias
V2 API
curl -X POST http://localhost:8983/api/collections -H 'Content-Type: application/json' -d '
{
"delete-alias":{
"name":"testalias"
}
}
'
DELETEALIAS Parameters
name
-
Required
Default: none
The name of the alias to delete.
async
-
Optional
Default: none
Request ID to track this action which will be processed asynchronously.