Package org.apache.solr.handler.tagger
Class XmlOffsetCorrector
- java.lang.Object
  - org.apache.solr.handler.tagger.OffsetCorrector
    - org.apache.solr.handler.tagger.XmlOffsetCorrector
public class XmlOffsetCorrector extends OffsetCorrector
Corrects offsets to account for XML-formatted data. The goal is that the caller can insert a start XML tag at the corrected start offset and a corresponding end XML tag at the corrected end offset of a tag produced by the tagger, and the result is still valid XML. See OffsetCorrector.correctPair(int, int). This will not work on invalid XML.
Not thread-safe.
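The following is a minimal sketch of why offsets need correcting, using only the JDK's StAX parser and not the Solr classes. The class name TagInsertionSketch, the helpers insertTags and isWellFormed, the sample document, and the hand-picked offsets are all assumptions made for illustration; they are not part of this API and are not claimed to match what correctPair returns.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class TagInsertionSketch {

    // Hypothetical helper: wraps docText[start, end) in <tag>...</tag>.
    static String insertTags(String docText, int start, int end, String tag) {
        return docText.substring(0, start) + "<" + tag + ">"
                + docText.substring(start, end) + "</" + tag + ">"
                + docText.substring(end);
    }

    // Hypothetical helper: true if the JDK StAX parser reads the whole text without error.
    static boolean isWellFormed(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                reader.next();
            }
            reader.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String doc = "<doc>Visit <b>New</b> York today</doc>";

        // Raw offsets of "New</b> York": the span starts inside <b> and ends outside it.
        String raw = insertTags(doc, 14, 26, "tag");
        System.out.println(raw + " -> " + isWellFormed(raw));

        // Hand-adjusted offsets: the start is moved left so the whole <b> element is enclosed.
        String adjusted = insertTags(doc, 11, 26, "tag");
        System.out.println(adjusted + " -> " + isWellFormed(adjusted));
    }
}
```

With the raw offsets the inserted element overlaps the b element, so the parser rejects the result; moving the start offset to enclose the whole b element keeps the elements properly nested. This is the kind of adjustment the class description refers to.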
Field Summary
Fields inherited from class org.apache.solr.handler.tagger.OffsetCorrector
docText, nonTaggableOffsets, offsetPair, parentChangeIds, parentChangeOffsets, tagInfo
Constructor Summary
XmlOffsetCorrector(String docText)
Initialize based on the document text.
Method Summary
Methods inherited from class org.apache.solr.handler.tagger.OffsetCorrector
correctEndOffsetForCloseElement, correctPair, getCloseEndOff, getCloseStartOff, getOpenEndOff, getOpenStartOff, getParentTag, hasNonWhitespace, lookupTag, spansNonTaggable, tagEnclosesOffset
Constructor Detail
XmlOffsetCorrector
public XmlOffsetCorrector(String docText) throws XMLStreamException
Initialize based on the document text.
Parameters:
docText - non-null XML content.
Throws:
XMLStreamException - If there's a problem parsing the XML.
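As a hedged illustration of the failure mode a caller has to handle, the sketch below parses document text with the JDK's StAX reader and reports whether an XMLStreamException was raised. It does not call XmlOffsetCorrector. The class ParseFailureSketch and the methods parseAll and describe are hypothetical stand-ins, and the assumption that the constructor parses the text eagerly in a similar way is suggested by the declared exception type but not stated on this page.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ParseFailureSketch {

    // Hypothetical stand-in for a constructor that parses docText up front.
    static void parseAll(String docText) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(docText));
        try {
            while (reader.hasNext()) {
                reader.next();
            }
        } finally {
            reader.close();
        }
    }

    // Shows the handling a caller would wrap around such a constructor call.
    static String describe(String docText) {
        try {
            parseAll(docText);
            return "parsed";
        } catch (XMLStreamException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("<doc>New York</doc>")); // well-formed input
        System.out.println(describe("<doc>New York"));       // root element never closed
    }
}
```

Because XMLStreamException is a checked exception, code that constructs the corrector must either catch it, as describe does here, or declare it.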