Class SolrInputDocumentReader

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Readable

    public class SolrInputDocumentReader
    extends Reader
    Reader on top of SolrInputDocument that can "stream" a document as a character stream in a memory-efficient way, avoiding potentially large intermediate string buffers that hold the whole document content.
    WARNING: This API is experimental and might change in incompatible ways in the next release.
    • Constructor Detail

      • SolrInputDocumentReader

        public SolrInputDocumentReader(SolrInputDocument doc,
                                       int maxTotalChars,
                                       int maxCharsPerFieldValue)
        Creates a character-stream reader that streams all String fields in the document, using a space as the separator.
        Parameters:
        doc - Solr input document
        maxTotalChars - max chars to consume total
        maxCharsPerFieldValue - max chars to consume per field value
      • SolrInputDocumentReader

        public SolrInputDocumentReader(SolrInputDocument doc,
                                       String[] fields,
                                       int maxTotalChars,
                                       int maxCharsPerFieldValue,
                                       String fieldValueSep)
        Creates a character-stream reader that reads the listed fields in order, with max lengths as specified.
        Parameters:
        doc - Solr input document
        fields - list of field names to include
        maxTotalChars - max chars to consume total
        maxCharsPerFieldValue - max chars to consume per field value
        fieldValueSep - separator to insert between field values
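The interplay of the two limits and the separator can be illustrated with a JDK-only stand-in. This is a minimal sketch, not Solr's implementation: the class name `CappedValuesReader` is hypothetical, it takes a plain list of strings instead of a SolrInputDocument, and it assumes that separators count toward the total limit and that each value is truncated before the separator is prepended.

```java
import java.io.Reader;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in: streams values lazily instead of concatenating them up front. */
final class CappedValuesReader extends Reader {
    private final Iterator<String> values;
    private final int maxTotalChars;
    private final int maxCharsPerValue;
    private final String separator;
    private String segment = "";   // chunk currently being emitted
    private int segmentPos = 0;    // read position inside the chunk
    private int emitted = 0;       // characters handed out so far
    private boolean first = true;  // no separator before the first value

    CappedValuesReader(List<String> values, int maxTotalChars,
                       int maxCharsPerValue, String separator) {
        this.values = values.iterator();
        this.maxTotalChars = maxTotalChars;
        this.maxCharsPerValue = maxCharsPerValue;
        this.separator = separator;
    }

    @Override
    public int read(char[] buf, int off, int len) {
        if (len == 0) return 0;
        int copied = 0;
        while (copied < len && emitted < maxTotalChars) {
            if (segmentPos == segment.length()) {
                if (!values.hasNext()) break;
                String v = values.next();
                // Assumption: the per-value cap truncates the value itself.
                String capped = v.length() > maxCharsPerValue
                        ? v.substring(0, maxCharsPerValue) : v;
                segment = first ? capped : separator + capped;
                first = false;
                segmentPos = 0;
                continue;
            }
            int n = Math.min(Math.min(len - copied, segment.length() - segmentPos),
                             maxTotalChars - emitted);
            segment.getChars(segmentPos, segmentPos + n, buf, off + copied);
            segmentPos += n;
            copied += n;
            emitted += n;
        }
        return copied == 0 ? -1 : copied; // -1 signals end of stream
    }

    @Override
    public void close() { /* nothing to release */ }

    public static void main(String[] args) {
        CappedValuesReader r = new CappedValuesReader(
                Arrays.asList("hello world", "foo", "bar"), 100, 5, " ");
        char[] buf = new char[4];
        StringBuilder out = new StringBuilder();
        for (int n; (n = r.read(buf, 0, buf.length)) != -1; ) out.append(buf, 0, n);
        System.out.println(out); // prints "hello foo bar"
    }
}
```

Only one capped value is held in memory at a time, which is the point of streaming rather than building the full concatenated string first.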
    • Method Detail

      • setEodReturnValue

        public void setEodReturnValue(int eodReturnValue)
        Sets a return value other than -1 to signal that the end of the document has been reached. Warning: use this only to work around buggy consumers such as LangDetect 1.1.
        Parameters:
        eodReturnValue - value to return when the end of the document is reached; defaults to -1
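To show what overriding the end-of-document value means for a caller, here is a minimal JDK-only sketch. The wrapper class `EodOverrideReader` is hypothetical and is not part of Solr; it also assumes that only the bulk `read(char[], int, int)` call is affected, which the text above does not confirm.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical wrapper that reports a configurable value instead of -1 at end of stream. */
final class EodOverrideReader extends FilterReader {
    private int eodReturnValue = -1; // standard Reader contract by default

    EodOverrideReader(Reader in) {
        super(in);
    }

    void setEodReturnValue(int eodReturnValue) {
        this.eodReturnValue = eodReturnValue;
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        // Assumption: only the bulk read is remapped; single-char read() still returns -1.
        return n == -1 ? eodReturnValue : n;
    }

    public static void main(String[] args) throws IOException {
        EodOverrideReader r = new EodOverrideReader(new StringReader("ab"));
        r.setEodReturnValue(0);
        char[] buf = new char[8];
        System.out.println(r.read(buf, 0, buf.length)); // prints 2
        System.out.println(r.read(buf, 0, buf.length)); // prints 0 instead of -1
    }
}
```

A consumer that loops until it sees -1 would never terminate on such a reader, so a non-default value is only appropriate for the specific consumer it was chosen for.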
      • asString

        public static String asString(Reader reader)
        Reads the entire content of the reader into a String.
        Returns:
        string of concatenated fields
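Draining a reader into a String can be sketched with the JDK alone. The helper below, `ReaderDrain.drain`, is a hypothetical illustration of the general technique and is not the code behind `asString`; it assumes the reader signals end of stream with the default -1.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical helper showing one way to collect a Reader's content into a String. */
final class ReaderDrain {
    static String drain(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        // Loop until the standard end-of-stream marker (-1) is returned.
        while ((n = reader.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(drain(new StringReader("title body"))); // prints "title body"
    }
}
```

Collecting the full content this way gives up the memory benefit of streaming, so it suits debugging and tests more than large documents.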