Package org.apache.solr.schema
Class SimplePreAnalyzedParser
- java.lang.Object
  - org.apache.solr.schema.SimplePreAnalyzedParser

- All Implemented Interfaces:
  - PreAnalyzedField.PreAnalyzedParser
 
public final class SimplePreAnalyzedParser
extends Object
implements PreAnalyzedField.PreAnalyzedParser

Simple plain text format parser for PreAnalyzedField.

Serialization format

The format of the serialization is as follows:

    content ::= version (stored)? tokens
    version ::= digit+ " "
    ; stored field value - any "=" inside must be escaped!
    stored ::= "=" text "="
    tokens ::= (token ((" ") + token)*)*
    token ::= text ("," attrib)*
    attrib ::= name '=' value
    name ::= text
    value ::= text

Special characters in "text" values can be escaped using the escape character \ . The following escape sequences are recognized:

- "\ " - literal space character
- "\," - literal , character
- "\=" - literal = character
- "\\" - literal \ character
- "\n" - newline
- "\r" - carriage return
- "\t" - horizontal tab

Please note that Unicode sequences (e.g. \u0001) are not supported.

Supported attribute names

The following token attributes are supported, and identified with short symbolic names:

- i - position increment (integer)
- s - token offset, start position (integer)
- e - token offset, end position (integer)
- t - token type (string)
- f - token flags (hexadecimal integer)
- p - payload (bytes in hexadecimal format; whitespace is ignored)

Token offsets are tracked and implicitly added to the token stream - the start and end offsets consider only the term text and whitespace, and exclude the space taken by token attributes.

Example token streams

    1 one two three
     - version 1
     - stored: 'null'
     - tok: '(term=one,startOffset=0,endOffset=3)'
     - tok: '(term=two,startOffset=4,endOffset=7)'
     - tok: '(term=three,startOffset=8,endOffset=13)'

    1 one two three
     - version 1
     - stored: 'null'
     - tok: '(term=one,startOffset=0,endOffset=3)'
     - tok: '(term=two,startOffset=5,endOffset=8)'
     - tok: '(term=three,startOffset=11,endOffset=16)'

    1 one,s=123,e=128,i=22 two three,s=20,e=22
     - version 1
     - stored: 'null'
     - tok: '(term=one,positionIncrement=22,startOffset=123,endOffset=128)'
     - tok: '(term=two,positionIncrement=1,startOffset=5,endOffset=8)'
     - tok: '(term=three,positionIncrement=1,startOffset=20,endOffset=22)'

    1 \ one\ \,,i=22,a=\, two\= \n,\ =\ \
     - version 1
     - stored: 'null'
     - tok: '(term= one ,,positionIncrement=22,startOffset=0,endOffset=6)'
     - tok: '(term=two= ,positionIncrement=1,startOffset=7,endOffset=15)'
     - tok: '(term=\,positionIncrement=1,startOffset=17,endOffset=18)'

    1 ,i=22 ,i=33,s=2,e=20 ,
     - version 1
     - stored: 'null'
     - tok: '(term=,positionIncrement=22,startOffset=0,endOffset=0)'
     - tok: '(term=,positionIncrement=33,startOffset=2,endOffset=20)'
     - tok: '(term=,positionIncrement=1,startOffset=2,endOffset=2)'

    1 =This is the stored part with \= \n \t escapes.=one two three
     - version 1
     - stored: 'This is the stored part with = \n \t escapes.'
     - tok: '(term=one,startOffset=0,endOffset=3)'
     - tok: '(term=two,startOffset=4,endOffset=7)'
     - tok: '(term=three,startOffset=8,endOffset=13)'

    1 ==
     - version 1
     - stored: ''
     - (no tokens)

    1 =this is a test.=
     - version 1
     - stored: 'this is a test.'
     - (no tokens)
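The escape rules and the "," and "=" separators described above can be illustrated with a small stand-alone sketch. It does not use Solr's classes and is not the parser's actual implementation: `EscapeSketch`, `unescape`, and `splitUnescaped` are hypothetical names introduced here. How an unrecognized escape is handled (the following character is kept as-is) is an assumption, not documented behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: illustrates the documented escape rules only.
class EscapeSketch {

  // Replace each recognized escape sequence with the character it stands for.
  static String unescape(String text) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      if (c != '\\' || i + 1 == text.length()) {
        out.append(c); // ordinary character, or a trailing lone backslash
        continue;
      }
      char next = text.charAt(++i);
      switch (next) {
        case 'n': out.append('\n'); break;
        case 'r': out.append('\r'); break;
        case 't': out.append('\t'); break;
        // "\ ", "\,", "\=", "\\" map to the escaped character itself.
        // Assumption: any other escaped character is also kept as-is.
        default: out.append(next);
      }
    }
    return out.toString();
  }

  // Split on a separator character, skipping separators that are escaped.
  // The returned parts are still in escaped form.
  static List<String> splitUnescaped(String s, char sep) {
    List<String> parts = new ArrayList<>();
    StringBuilder cur = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c == '\\' && i + 1 < s.length()) {
        cur.append(c).append(s.charAt(++i)); // keep the escaped pair intact
      } else if (c == sep) {
        parts.add(cur.toString());
        cur.setLength(0);
      } else {
        cur.append(c);
      }
    }
    parts.add(cur.toString());
    return parts;
  }

  public static void main(String[] args) {
    // Raw first token of the fourth example stream, including its attributes.
    String token = "\\ one\\ \\,,i=22,a=\\,";
    List<String> parts = splitUnescaped(token, ',');
    System.out.println("term   = '" + unescape(parts.get(0)) + "'");
    for (String attrib : parts.subList(1, parts.size())) {
      List<String> kv = splitUnescaped(attrib, '=');
      System.out.println("attrib = " + unescape(kv.get(0)) + " -> " + unescape(kv.get(1)));
    }
  }
}
```

Splitting before unescaping matters here: the term in this token contains an escaped comma, which must not be mistaken for the start of an attribute.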
Constructor Summary

Constructors
- SimplePreAnalyzedParser()
Method Summary

- PreAnalyzedField.ParseResult parse(Reader reader, org.apache.lucene.util.AttributeSource parent)
  Parse input.
- String toFormattedString(org.apache.lucene.document.Field f)
  Format a field so that the resulting String is valid for parsing with PreAnalyzedField.PreAnalyzedParser.parse(Reader, AttributeSource).

Method Detail
parse

public PreAnalyzedField.ParseResult parse(Reader reader, org.apache.lucene.util.AttributeSource parent) throws IOException

Description copied from interface: PreAnalyzedField.PreAnalyzedParser

Parse input.

- Specified by:
- parse in interface PreAnalyzedField.PreAnalyzedParser
- Parameters:
- reader - input to read from
- parent - parent who will own the resulting states (tokens with attributes)
- Returns:
- parse result, with possibly null stored and/or states fields.
- Throws:
- IOException - if a parsing error or IO error occurs
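As a rough illustration of the first steps a parse of this format has to take, the sketch below reads all input from a `Reader`, extracts the version, and separates the optional stored section from the raw token text. It follows the grammar given in the class description and reports malformed input as an `IOException`, as this method's contract does. It is not Solr's implementation: `HeaderSketch` and its members are hypothetical names, and the stored value is returned still in escaped form.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-alone sketch; it does not produce a PreAnalyzedField.ParseResult.
class HeaderSketch {
  final int version;
  final String stored; // null when the input has no stored section; still escaped
  final String tokens; // raw, still-escaped token text following the header

  HeaderSketch(int version, String stored, String tokens) {
    this.version = version;
    this.stored = stored;
    this.tokens = tokens;
  }

  static HeaderSketch read(Reader reader) throws IOException {
    // Read the whole input into memory first.
    StringBuilder sb = new StringBuilder();
    char[] buf = new char[1024];
    int n;
    while ((n = reader.read(buf)) != -1) {
      sb.append(buf, 0, n);
    }
    String s = sb.toString();

    // version ::= digit+ " "
    int pos = 0;
    while (pos < s.length() && Character.isDigit(s.charAt(pos))) {
      pos++;
    }
    if (pos == 0 || pos == s.length() || s.charAt(pos) != ' ') {
      throw new IOException("expected a numeric version followed by a space");
    }
    int version = Integer.parseInt(s.substring(0, pos));
    pos++; // skip the space after the version

    // stored ::= "=" text "=" (optional; any "=" inside is escaped)
    String stored = null;
    if (pos < s.length() && s.charAt(pos) == '=') {
      int end = pos + 1;
      while (end < s.length() && s.charAt(end) != '=') {
        if (s.charAt(end) == '\\') {
          end++; // skip the character following the escape character
        }
        end++;
      }
      if (end >= s.length()) {
        throw new IOException("unterminated stored section");
      }
      stored = s.substring(pos + 1, end);
      pos = end + 1;
    }
    return new HeaderSketch(version, stored, s.substring(pos));
  }

  public static void main(String[] args) throws IOException {
    HeaderSketch h = read(new StringReader("1 =this is a test.=one two"));
    System.out.println(h.version + " | " + h.stored + " | " + h.tokens);
  }
}
```

A null `stored` here mirrors the "possibly null stored" field mentioned in the return description, and distinguishes a missing stored section from an empty one such as `1 ==`.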
 
toFormattedString

public String toFormattedString(org.apache.lucene.document.Field f) throws IOException

Description copied from interface: PreAnalyzedField.PreAnalyzedParser

Format a field so that the resulting String is valid for parsing with PreAnalyzedField.PreAnalyzedParser.parse(Reader, AttributeSource).

- Specified by:
- toFormattedString in interface PreAnalyzedField.PreAnalyzedParser
- Parameters:
- f - field instance
- Returns:
- formatted string
- Throws:
- IOException - If there is a low-level I/O error.
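For output to be valid for parsing, special characters in term text and attribute values have to be written using the escape sequences listed in the class description. The sketch below shows one way to do that for a single token. It is an assumption-based illustration, not the actual body of `toFormattedString`: `FormatSketch`, `escape`, and `formatToken` are hypothetical names, and it works on plain strings rather than on a Lucene `Field`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr: writes text in the documented escaped form.
class FormatSketch {

  // Escape the characters that carry meaning in the serialization format.
  static String escape(String text) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
      char c = text.charAt(i);
      switch (c) {
        case '\n': out.append("\\n"); break;
        case '\r': out.append("\\r"); break;
        case '\t': out.append("\\t"); break;
        case ' ':
        case ',':
        case '=':
        case '\\':
          out.append('\\').append(c); // prefix with the escape character
          break;
        default:
          out.append(c);
      }
    }
    return out.toString();
  }

  // token ::= text ("," attrib)*, attrib ::= name '=' value
  static String formatToken(String term, Map<String, String> attribs) {
    StringBuilder out = new StringBuilder(escape(term));
    for (Map.Entry<String, String> e : attribs.entrySet()) {
      out.append(',').append(escape(e.getKey())).append('=').append(escape(e.getValue()));
    }
    return out.toString();
  }

  public static void main(String[] args) {
    Map<String, String> attribs = new LinkedHashMap<>(); // keeps insertion order
    attribs.put("i", "22");
    attribs.put("t", "my type");
    // Version prefix, then a single token with its attributes.
    System.out.println("1 " + formatToken(" one ,", attribs));
  }
}
```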