public class CSVParser extends Object
Parses CSV input according to a given CSVStrategy.
Parsing of a csv-string having tabs as separators, '"' as an optional value encapsulator, and comments starting with '#':
String[][] data = (new CSVParser(new StringReader("a\tb\nc\td"), new CSVStrategy('\t','"','#'))).getAllValues();
Parsing of a csv-string in Excel CSV format:
String[][] data = (new CSVParser(new StringReader("a;b\nc;d"), CSVStrategy.EXCEL_STRATEGY)).getAllValues();
Internal parser state is completely covered by the strategy and the reader-state.
See the package documentation for more details.
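The roles of the three strategy characters in the first example (separator, optional value encapsulator, comment start) can be illustrated with a small stand-alone sketch. MiniCsvSketch and its parse method are hypothetical names introduced here; this is not the CSVParser implementation, and details such as treating a doubled encapsulator as a literal character are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not the Solr CSVParser implementation. */
final class MiniCsvSketch {

    /** Splits input into records and values using three strategy-like characters. */
    static String[][] parse(String input, char delimiter, char encapsulator, char commentStart) {
        List<String[]> records = new ArrayList<>();
        List<String> record = new ArrayList<>();
        StringBuilder value = new StringBuilder();
        boolean quoted = false;      // currently inside an encapsulated value
        boolean atLineStart = true;  // nothing of the current line consumed yet
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (quoted) {
                if (c != encapsulator) {
                    value.append(c); // separators and line breaks are literal here
                } else if (i + 1 < input.length() && input.charAt(i + 1) == encapsulator) {
                    value.append(encapsulator); // assumption: doubled encapsulator is a literal
                    i++;
                } else {
                    quoted = false;
                }
                continue;
            }
            if (atLineStart && c == commentStart) {
                while (i < input.length() && input.charAt(i) != '\n') {
                    i++; // skip the comment text
                }
                continue; // the loop increment steps over the line break
            }
            atLineStart = false;
            if (c == encapsulator) {
                quoted = true;
            } else if (c == delimiter) {
                record.add(value.toString());
                value.setLength(0);
            } else if (c == '\n') {
                record.add(value.toString());
                value.setLength(0);
                records.add(record.toArray(new String[0]));
                record.clear();
                atLineStart = true;
            } else {
                value.append(c);
            }
        }
        if (!atLineStart) { // flush a final record that has no trailing line break
            record.add(value.toString());
            records.add(record.toArray(new String[0]));
        }
        return records.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String[][] data = parse("# header comment\na\tb\nc\t\"d\te\"", '\t', '"', '#');
        System.out.println(Arrays.deepToString(data));
    }
}
```

In this sketch the comment line produces no record, and the tab inside the encapsulated value stays part of that value instead of splitting it.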
Modifier and Type | Field and Description |
---|---|
protected static int | TT_EOF: Token (which can have content) when end of file is reached. |
protected static int | TT_EORECORD: Token with content when end of a line is reached. |
protected static int | TT_INVALID: Token has no valid content. |
protected static int | TT_TOKEN: Token with content, at beginning or in the middle of a line. |
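As a rough illustration of how these token types relate to positions in the input, the following stand-alone sketch tags each value of a simple delimited string. TokenTypeSketch, its Token class, and the numeric constant values are assumptions made for this example and are not taken from CSVParser; encapsulators, comments, and escapes are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the Solr CSVParser lexer. */
final class TokenTypeSketch {
    // The numeric values are assumptions chosen for this sketch.
    static final int TT_TOKEN = 0;     // value at the beginning or in the middle of a line
    static final int TT_EOF = 1;       // value (possibly empty) that ends at end of input
    static final int TT_EORECORD = 2;  // value that ends at a line break

    static final class Token {
        final int type;
        final String content;

        Token(int type, String content) {
            this.type = type;
            this.content = content;
        }
    }

    /** Splits on the delimiter and on '\n', tagging every value with a token type. */
    static List<Token> tokenize(String input, char delimiter) {
        List<Token> tokens = new ArrayList<>();
        StringBuilder content = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == delimiter) {
                tokens.add(new Token(TT_TOKEN, content.toString()));
                content.setLength(0);
            } else if (c == '\n') {
                tokens.add(new Token(TT_EORECORD, content.toString()));
                content.setLength(0);
            } else {
                content.append(c);
            }
        }
        // Whatever remains is reported together with the end of input.
        tokens.add(new Token(TT_EOF, content.toString()));
        return tokens;
    }

    public static void main(String[] args) {
        for (Token t : tokenize("a,b\nc", ',')) {
            System.out.println(t.type + " " + t.content);
        }
    }
}
```

A record reader built on such a token stream would collect values until it sees an end-of-record or end-of-file token, then hand the collected values back as one line.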
Constructor and Description |
---|
CSVParser(Reader input): CSV parser using the default CSVStrategy. |
CSVParser(Reader input, CSVStrategy strategy): Customized CSV parser using the given CSVStrategy. |
Modifier and Type | Method and Description |
---|---|
String[][] | getAllValues(): Parses the CSV according to the given strategy and returns the content as an array of records (where each record is an array of single values). |
String[] | getLine(): Parses from the current point in the stream until the end of the current line. |
int | getLineNumber(): Returns the current line number in the input stream. |
CSVStrategy | getStrategy(): Obtains the specified CSV strategy. |
protected org.apache.solr.internal.csv.CSVParser.Token | nextToken(): Convenience method for nextToken(null). |
protected org.apache.solr.internal.csv.CSVParser.Token | nextToken(org.apache.solr.internal.csv.CSVParser.Token tkn): Returns the next token. |
String | nextValue(): Parses the CSV according to the given strategy and returns the next csv-value as a string. |
protected int | unicodeEscapeLexer(int c): Decodes Unicode escapes. |
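Decoding a Unicode escape of the form \uXXXX from a character stream can be sketched as follows. This is a minimal stand-alone example under stated assumptions, not the code behind unicodeEscapeLexer: the class and method names are hypothetical, and the caller is assumed to have already consumed the backslash, so the reader is positioned at the 'u'.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical illustration only; not the Solr unicodeEscapeLexer implementation. */
final class UnicodeEscapeSketch {

    /**
     * Reads 'u' followed by exactly four hex digits and returns the decoded value.
     * Assumes the backslash that introduces the escape was consumed by the caller.
     */
    static int decodeAfterBackslash(Reader in) throws IOException {
        if (in.read() != 'u') {
            throw new IOException("expected 'u' after backslash");
        }
        int value = 0;
        for (int i = 0; i < 4; i++) {
            int ch = in.read();
            int digit = ch < 0 ? -1 : Character.digit(ch, 16);
            if (digit < 0) {
                throw new IOException("malformed unicode escape");
            }
            value = value * 16 + digit;
        }
        return value;
    }

    /** Convenience wrapper for strings; reports malformed input as an unchecked exception. */
    static int decode(String afterBackslash) {
        try {
            return decodeAfterBackslash(new StringReader(afterBackslash));
        } catch (IOException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        System.out.println((char) decode("u0041"));
    }
}
```

Reporting a short or non-hexadecimal sequence as an IOException mirrors the documented behaviour that a wrong escape sequence and a read error surface through the same exception type.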
protected static final int TT_INVALID
protected static final int TT_TOKEN
protected static final int TT_EOF
protected static final int TT_EORECORD
public CSVParser(Reader input)
CSV parser using the default CSVStrategy.
Parameters:
input - a Reader containing "csv-formatted" input
public CSVParser(Reader input, CSVStrategy strategy)
Customized CSV parser using the given CSVStrategy.
Parameters:
input - a Reader containing "csv-formatted" input
strategy - the CSVStrategy used for CSV parsing
public String[][] getAllValues() throws IOException
The returned content starts at the current parse-position in the stream.
Throws:
IOException - on parse error or input read-failure
public String nextValue() throws IOException
Throws:
IOException - on parse error or input read-failure
public String[] getLine() throws IOException
Throws:
IOException - on parse error or input read-failure
public int getLineNumber()
protected org.apache.solr.internal.csv.CSVParser.Token nextToken() throws IOException
Convenience method for nextToken(null).
Throws:
IOException
protected org.apache.solr.internal.csv.CSVParser.Token nextToken(org.apache.solr.internal.csv.CSVParser.Token tkn) throws IOException
Parameters:
tkn - an existing Token object to reuse. The caller is responsible for initializing the Token.
Throws:
IOException - on stream access error
protected int unicodeEscapeLexer(int c) throws IOException
Parameters:
c - current char, which is discarded because it is the "\\" of "\\uXXXX"
Throws:
IOException - on wrong unicode escape sequence or read error
public CSVStrategy getStrategy()
Copyright © 2000-2017 Apache Software Foundation. All Rights Reserved.