Class SkipExistingDocumentsProcessorFactory
- java.lang.Object
  - org.apache.solr.update.processor.UpdateRequestProcessorFactory
    - org.apache.solr.update.processor.SkipExistingDocumentsProcessorFactory
- All Implemented Interfaces:
  UpdateRequestProcessorFactory.RunAlways, NamedListInitializedPlugin, SolrCoreAware
public class SkipExistingDocumentsProcessorFactory extends UpdateRequestProcessorFactory implements SolrCoreAware, UpdateRequestProcessorFactory.RunAlways
This Factory generates an UpdateProcessor that will (by default) skip inserting new documents if there already exists a document with the same uniqueKey value in the index. It will also skip Atomic Updates to a document if that document does not already exist. This behaviour is applied to each document in turn, so adding a batch of documents can result in some being added and some ignored, depending on what is already in the index. If all of the documents are skipped, no changes to the index will occur.
These two forms of skipping can be switched on or off independently, by using init params:
- skipInsertIfExists: This boolean parameter defaults to true, but if set to false then inserts (i.e. not Atomic Updates) will be passed through unchanged even if the document already exists.
- skipUpdateIfMissing: This boolean parameter defaults to true, but if set to false then Atomic Updates will be passed through unchanged regardless of whether the document exists.
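The skip rules described above can be read as a small decision table. The following is a minimal sketch of that table in plain Java, not the processor's actual source: the class name SkipDecisionSketch and the method shouldSkip are hypothetical, and the lookup of whether a document exists in the index is reduced to a boolean argument.

```java
public class SkipDecisionSketch {

    /**
     * Hypothetical helper: returns true if the document should be skipped
     * instead of being passed to the next processor in the chain.
     *
     * @param isAtomicUpdate      true for an Atomic Update, false for a plain insert
     * @param existsInIndex       true if a document with the same uniqueKey is already indexed
     * @param skipInsertIfExists  effective value of the skipInsertIfExists param
     * @param skipUpdateIfMissing effective value of the skipUpdateIfMissing param
     */
    public static boolean shouldSkip(boolean isAtomicUpdate,
                                     boolean existsInIndex,
                                     boolean skipInsertIfExists,
                                     boolean skipUpdateIfMissing) {
        if (isAtomicUpdate) {
            // Atomic Updates are skipped only when the target document is missing.
            return skipUpdateIfMissing && !existsInIndex;
        }
        // Plain inserts are skipped only when the document already exists.
        return skipInsertIfExists && existsInIndex;
    }

    public static void main(String[] args) {
        // Defaults (both true): a duplicate insert is skipped.
        System.out.println(shouldSkip(false, true, true, true));   // true
        // Defaults: an Atomic Update to a missing document is skipped.
        System.out.println(shouldSkip(true, false, true, true));   // true
        // skipUpdateIfMissing=false: the same Atomic Update passes through.
        System.out.println(shouldSkip(true, false, true, false));  // false
    }
}
```

Because the decision is made per document, a batch can contain a mix of skipped and accepted documents.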
These params can also be specified per-request, to override the configured behaviour for specific updates e.g.
/update?skipUpdateIfMissing=true
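The per-request override can be pictured as "use the request param if present, otherwise fall back to the configured value". The sketch below illustrates that precedence in plain Java under stated assumptions: the class ParamOverrideSketch and the method effectiveValue are hypothetical, and request params are modelled as a simple Map rather than Solr's own request types.

```java
import java.util.Collections;
import java.util.Map;

public class ParamOverrideSketch {

    /**
     * Hypothetical helper: resolves the effective value of a boolean param.
     * A value supplied on the request wins; otherwise the configured
     * (init param) value is used.
     */
    public static boolean effectiveValue(Map<String, String> requestParams,
                                         String name,
                                         boolean configuredValue) {
        String fromRequest = requestParams.get(name);
        if (fromRequest == null) {
            return configuredValue;
        }
        return Boolean.parseBoolean(fromRequest);
    }

    public static void main(String[] args) {
        // Chain configured with skipUpdateIfMissing=false, request sends ...=true.
        Map<String, String> request =
                Collections.singletonMap("skipUpdateIfMissing", "true");
        System.out.println(effectiveValue(request, "skipUpdateIfMissing", false)); // true

        // No override on the request: the configured value applies.
        Map<String, String> empty = Collections.emptyMap();
        System.out.println(effectiveValue(empty, "skipUpdateIfMissing", false));   // false
    }
}
```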
This implementation is a simpler alternative to DocBasedVersionConstraintsProcessorFactory when you are not concerned with versioning, and just want to quietly ignore duplicate documents and/or silently skip updates to non-existent documents (in the same way a database UPDATE would). If your documents do have an explicit version field, and you want to ensure older versions are skipped instead of replacing the indexed document, you should consider DocBasedVersionConstraintsProcessorFactory instead.
An example chain configuration to use this for skipping duplicate inserts, but not skipping updates to missing documents by default, is:
  <updateRequestProcessorChain name="skipexisting">
    <processor class="solr.LogUpdateProcessorFactory" />
    <processor class="solr.SkipExistingDocumentsProcessorFactory">
      <bool name="skipInsertIfExists">true</bool>
      <bool name="skipUpdateIfMissing">false</bool> <!-- Can override this per-request -->
    </processor>
    <processor class="solr.DistributedUpdateProcessorFactory" />
    <processor class="solr.RunUpdateProcessorFactory" />
  </updateRequestProcessorChain>
- Since:
  6.4.0
Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.solr.update.processor.UpdateRequestProcessorFactory:
  UpdateRequestProcessorFactory.RunAlways
Constructor Summary
- SkipExistingDocumentsProcessorFactory()
Method Summary
- org.apache.solr.update.processor.SkipExistingDocumentsProcessorFactory.SkipExistingDocumentsUpdateProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next)
- void inform(SolrCore core)
- void init(NamedList args)
Method Detail
init
public void init(NamedList args)
- Specified by:
  init in interface NamedListInitializedPlugin
- Overrides:
  init in class UpdateRequestProcessorFactory
getInstance
public org.apache.solr.update.processor.SkipExistingDocumentsProcessorFactory.SkipExistingDocumentsUpdateProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp, UpdateRequestProcessor next)
- Specified by:
  getInstance in class UpdateRequestProcessorFactory
inform
public void inform(SolrCore core)
- Specified by:
  inform in interface SolrCoreAware