public class SkipExistingDocumentsProcessorFactory extends UpdateRequestProcessorFactory implements SolrCoreAware, UpdateRequestProcessorFactory.RunAlways
This Factory generates an UpdateProcessor that will (by default) skip inserting new documents if there already exists a document with the same uniqueKey value in the index. It will also skip Atomic Updates to a document if that document does not already exist. This behaviour is applied to each document in turn, so adding a batch of documents can result in some being added and some ignored, depending on what is already in the index. If all of the documents are skipped, no changes to the index will occur.
These two forms of skipping can be switched on or off independently, by using init params:
- skipInsertIfExists - This boolean parameter defaults to true, but if set to false then inserts (i.e. not Atomic Updates) will be passed through unchanged even if the document already exists.
- skipUpdateIfMissing - This boolean parameter defaults to true, but if set to false then Atomic Updates will be passed through unchanged regardless of whether the document exists.
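The per-document decision described above can be illustrated with a minimal, self-contained sketch. It is not the Solr implementation: the `Update` class, the `shouldSkip` and `process` helpers, and the in-memory `Set` standing in for the index are all hypothetical names introduced for illustration only.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SkipDecisionSketch {
    // Hypothetical stand-in for an update command: a uniqueKey value plus
    // a flag saying whether it is an Atomic Update or a plain insert.
    static final class Update {
        final String id;
        final boolean atomic;

        Update(String id, boolean atomic) {
            this.id = id;
            this.atomic = atomic;
        }
    }

    // Returns true when the update should be skipped under the two flags.
    static boolean shouldSkip(Update u, boolean exists,
                              boolean skipInsertIfExists, boolean skipUpdateIfMissing) {
        if (u.atomic) {
            // Atomic Updates are skipped only when the target document is missing.
            return skipUpdateIfMissing && !exists;
        }
        // Plain inserts are skipped only when the document already exists.
        return skipInsertIfExists && exists;
    }

    // Applies the decision to each update in turn and returns the ids passed through.
    static List<String> process(Set<String> index, List<Update> batch,
                                boolean skipInsertIfExists, boolean skipUpdateIfMissing) {
        List<String> passed = new ArrayList<>();
        for (Update u : batch) {
            boolean exists = index.contains(u.id);
            if (!shouldSkip(u, exists, skipInsertIfExists, skipUpdateIfMissing)) {
                passed.add(u.id);
                // Simplifying assumption of this sketch: a passed-through
                // update is immediately visible to later existence checks.
                index.add(u.id);
            }
        }
        return passed;
    }

    public static void main(String[] args) {
        Set<String> index = new HashSet<>(Set.of("doc1"));
        List<Update> batch = List.of(
                new Update("doc1", false),  // insert, already exists
                new Update("doc2", false),  // insert, new document
                new Update("doc3", true),   // Atomic Update, document missing
                new Update("doc1", true));  // Atomic Update, document exists
        // With both flags at their default of true, only doc2 and the
        // Atomic Update to doc1 are passed through.
        System.out.println(process(index, batch, true, true));
    }
}
```

With both flags set to false, every update in the batch is passed through unchanged.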
These params can also be specified per-request, to override the configured behaviour for specific updates, e.g. /update?skipUpdateIfMissing=true
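A rough sketch of what "override per-request" means in practice: a value supplied on the request wins, and the configured init param is used otherwise. The `effective` helper and the plain `Map` of request parameters are hypothetical stand-ins for illustration, not the Solr API.

```java
import java.util.Map;

public class PerRequestOverrideSketch {
    // Returns the per-request value when the parameter is present,
    // otherwise falls back to the value configured in the init params.
    static boolean effective(Map<String, String> requestParams, String name, boolean configured) {
        String value = requestParams.get(name);
        return value == null ? configured : Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        // Assumed configuration: skipUpdateIfMissing is false in the chain definition.
        boolean configured = false;

        // A request such as /update?skipUpdateIfMissing=true overrides it.
        Map<String, String> withOverride = Map.of("skipUpdateIfMissing", "true");
        System.out.println(effective(withOverride, "skipUpdateIfMissing", configured));

        // A request without the parameter keeps the configured behaviour.
        Map<String, String> withoutOverride = Map.of();
        System.out.println(effective(withoutOverride, "skipUpdateIfMissing", configured));
    }
}
```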
This implementation is a simpler alternative to DocBasedVersionConstraintsProcessorFactory when you are not concerned with versioning, and just want to quietly ignore duplicate documents and/or silently skip updates to non-existent documents (in the same way a database UPDATE would).
If your documents do have an explicit version field, and you want to ensure older versions are skipped instead of replacing the indexed document, you should consider DocBasedVersionConstraintsProcessorFactory instead.
An example chain configuration to use this for skipping duplicate inserts, but not skipping updates to missing documents by default, is:
<updateRequestProcessorChain name="skipexisting">
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.SkipExistingDocumentsProcessorFactory">
    <bool name="skipInsertIfExists">true</bool>
    <bool name="skipUpdateIfMissing">false</bool> <!-- Can override this per-request -->
  </processor>
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
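As a hedged illustration of sending an update through this chain, the sketch below builds a request URL that selects the chain with Solr's `update.chain` request parameter and overrides `skipUpdateIfMissing` for that request. The base URL, the collection name `mycollection`, and the `updateUri` helper are assumptions made for the example; parameter values are assumed to be URL-safe, so no encoding is applied.

```java
import java.net.URI;

public class ChainRequestUrlSketch {
    // Builds an /update URL that names the processor chain and sets a
    // per-request override for skipUpdateIfMissing.
    static URI updateUri(String baseUrl, String collection, String chain,
                         boolean skipUpdateIfMissing) {
        String query = "update.chain=" + chain
                + "&skipUpdateIfMissing=" + skipUpdateIfMissing;
        return URI.create(baseUrl + "/" + collection + "/update?" + query);
    }

    public static void main(String[] args) {
        // Hypothetical host and collection, used only for illustration.
        URI uri = updateUri("http://localhost:8983/solr", "mycollection", "skipexisting", true);
        System.out.println(uri);
    }
}
```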
Constructor and Description |
---|
SkipExistingDocumentsProcessorFactory() |
Modifier and Type | Method and Description |
---|---|
org.apache.solr.update.processor.SkipExistingDocumentsProcessorFactory.SkipExistingDocumentsUpdateProcessor | getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next) |
void | inform(SolrCore core) |
void | init(NamedList args) |
public SkipExistingDocumentsProcessorFactory()
public void init(NamedList args)
Specified by: init in interface NamedListInitializedPlugin
Overrides: init in class UpdateRequestProcessorFactory
public org.apache.solr.update.processor.SkipExistingDocumentsProcessorFactory.SkipExistingDocumentsUpdateProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next)
Specified by: getInstance in class UpdateRequestProcessorFactory
public void inform(SolrCore core)
Specified by: inform in interface SolrCoreAware
Copyright © 2000-2019 Apache Software Foundation. All Rights Reserved.