public interface ExtractingParams
| Modifier and Type | Field and Description | 
|---|---|
| static String | BOOST_PREFIX: The boost value for the name of the field. | 
| static String | CAPTURE_ATTRIBUTES: Capture attributes separately according to the name of the element, instead of just adding them to the string buffer. | 
| static String | CAPTURE_ELEMENTS: Capture the specified fields (and everything included below them that isn't captured by some other capture field) separately from the default. | 
| static String | DEFAULT_FIELD: Optional. | 
| static String | EXTRACT_FORMAT: Content output format if extractOnly is true. | 
| static String | EXTRACT_ONLY: Only extract and return the content, do not index it. | 
| static String | IGNORE_TIKA_EXCEPTION: If true, ignore TikaException (give up extracting text but still index metadata). | 
| static String | LITERALS_OVERRIDE: Literal field values will by default override other values such as metadata and content. | 
| static String | LITERALS_PREFIX: Pass in literal values to be added to the document, as in literal.myField=Foo. | 
| static String | LOWERNAMES: Map all generated attribute names to field names with lowercase and underscores. | 
| static String | MAP_PREFIX: The param prefix for mapping Tika metadata to Solr fields. | 
| static String | PASSWORD_MAP_FILE: Optional. | 
| static String | RESOURCE_NAME: Optional. | 
| static String | RESOURCE_PASSWORD: Optional. | 
| static String | STREAM_TYPE: The type of the stream. | 
| static String | UNKNOWN_FIELD_PREFIX: Optional. | 
| static String | XPATH_EXPRESSION: Restrict the extracted parts of a document to be indexed by passing in an XPath expression. | 
static final String LOWERNAMES
static final String IGNORE_TIKA_EXCEPTION
static final String MAP_PREFIX
fmap.title=solr.title
In this example, the Tika "title" metadata value will be added to a Solr field named "solr.title".
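As a rough illustration of how such a mapping parameter could be assembled on the client side, the sketch below builds the parameter name from a prefix and a Tika metadata name. The class `FieldMappingParams` and its helper are hypothetical, and the prefix value "fmap." is an assumption inferred from the example; in real code you would reference ExtractingParams.MAP_PREFIX instead.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FieldMappingParams {
    // Assumed value of ExtractingParams.MAP_PREFIX, inferred from the "fmap.title" example.
    static final String MAP_PREFIX = "fmap.";

    /** Adds one Tika-metadata-to-Solr-field mapping to the request parameters. */
    static void mapField(Map<String, String> params, String tikaName, String solrField) {
        params.put(MAP_PREFIX + tikaName, solrField);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        mapField(params, "title", "solr.title");
        System.out.println(params); // prints {fmap.title=solr.title}
    }
}
```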
static final String BOOST_PREFIX
fmap.title=solr.title
boost.solr.title=2.5
These parameters boost the solr.title field for this document by 2.5.
static final String LITERALS_PREFIX
literal.myField=Foo
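A minimal sketch of turning literal values into request parameters follows. The class `LiteralParams` is hypothetical, the prefix value "literal." is assumed from the example above (real code would use ExtractingParams.LITERALS_PREFIX), and the `id` field is made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class LiteralParams {
    // Assumed value of ExtractingParams.LITERALS_PREFIX, inferred from "literal.myField=Foo".
    static final String LITERALS_PREFIX = "literal.";

    /** Builds a URL-encoded query string with one literal.<field>=<value> pair per entry. */
    static String toQueryString(Map<String, String> literals) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : literals.entrySet()) {
            query.add(URLEncoder.encode(LITERALS_PREFIX + e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return query.toString();
    }

    public static void main(String[] args) {
        Map<String, String> literals = new LinkedHashMap<>();
        literals.put("myField", "Foo");
        literals.put("id", "doc 1"); // hypothetical field, shows value encoding
        System.out.println(toQueryString(literals)); // prints literal.myField=Foo&literal.id=doc+1
    }
}
```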
static final String XPATH_EXPRESSION
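Restrict the extracted parts of a document to be indexed by passing in an XPath expression. All content that satisfies the expression is passed to the SolrContentHandler.

See Tika's docs for what the extracted document looks like.
See also: CAPTURE_ELEMENTS
static final String EXTRACT_ONLY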
static final String EXTRACT_FORMAT
static final String CAPTURE_ATTRIBUTES
static final String LITERALS_OVERRIDE
static final String CAPTURE_ELEMENTS
Capture the specified fields (and everything included below them that isn't captured by some other capture field) separately from the default. The capture field is based on the localName returned to the SolrContentHandler by Tika, not to be confused with the mapped field. The field name can then be mapped into the index schema.

For instance, a Tika document may look like:

  <html>
    ...
    <body>
      <p>some text here.  <div>more text</div></p>
      Some more text
    </body>
  </html>

By passing in the p tag, you could capture all P tags separately from the rest of the text.
Thus, in the example, the capture of the P tag would be: "some text here.  more text"
static final String STREAM_TYPE
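To make the CAPTURE_ELEMENTS example concrete, here is a small stand-alone sketch of the capture idea using a plain SAX handler. It is not the SolrContentHandler implementation; the class `CaptureSketch` is hypothetical, it matches on the element's qualified name because the default JAXP parser is not namespace-aware, and it assumes well-formed input.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class CaptureSketch extends DefaultHandler {
    private final String captureName;
    private int depth = 0; // > 0 while inside a captured element
    final StringBuilder captured = new StringBuilder();
    final StringBuilder rest = new StringBuilder();

    CaptureSketch(String captureName) { this.captureName = captureName; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Enter capture mode on the named element; nested elements deepen it.
        if (depth > 0 || qName.equals(captureName)) depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth > 0) depth--;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text below a captured element goes to its own buffer, everything else to the default.
        (depth > 0 ? captured : rest).append(ch, start, length);
    }

    static CaptureSketch parse(String xml, String captureName) {
        CaptureSketch handler = new CaptureSketch(captureName);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return handler;
    }

    public static void main(String[] args) {
        String doc = "<html><body><p>some text here.  <div>more text</div></p>"
                + "Some more text</body></html>";
        CaptureSketch h = parse(doc, "p");
        System.out.println(h.captured); // prints: some text here.  more text
        System.out.println(h.rest);     // prints: Some more text
    }
}
```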
static final String RESOURCE_NAME
static final String RESOURCE_PASSWORD
static final String UNKNOWN_FIELD_PREFIX
static final String DEFAULT_FIELD
static final String PASSWORD_MAP_FILE
The file format is Java properties format, with one key=value pair per line. The key is evaluated as a regex against the file name, and the value is the password. The rules are evaluated top to bottom, i.e. the first match is used. If you want a fallback password that is always used, supply a .*=<defaultmypassword> rule at the end.
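The first-match rule evaluation described above can be sketched as follows. This is an illustration, not Solr's implementation: the class `PasswordRules` is hypothetical, the parser is a simplification that splits each line on the first "=" and ignores properties-file escape handling, and it assumes the regex must match the whole file name. The rules are kept in a LinkedHashMap because java.util.Properties does not preserve file order.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class PasswordRules {
    // Insertion-ordered so rules are tried top to bottom.
    private final Map<Pattern, String> rules = new LinkedHashMap<>();

    /** Parses key=value lines; blank lines and lines starting with # are skipped. */
    static PasswordRules parse(String text) {
        PasswordRules result = new PasswordRules();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;
            int eq = line.indexOf('=');
            if (eq < 0) continue; // not a key=value line
            result.rules.put(Pattern.compile(line.substring(0, eq).trim()),
                    line.substring(eq + 1).trim());
        }
        return result;
    }

    /** Returns the password of the first rule whose regex matches, or null if none does. */
    String passwordFor(String fileName) {
        for (Map.Entry<Pattern, String> rule : rules.entrySet()) {
            if (rule.getKey().matcher(fileName).matches()) return rule.getValue();
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up rules and passwords, for illustration only.
        PasswordRules rules = parse(".*\\.pdf=pdfSecret\nreport-.*=reportSecret\n.*=fallbackSecret");
        System.out.println(rules.passwordFor("report-2013.pdf")); // prints pdfSecret (first match wins)
        System.out.println(rules.passwordFor("report-q1.doc"));   // prints reportSecret
        System.out.println(rules.passwordFor("notes.txt"));       // prints fallbackSecret
    }
}
```

Because the first match wins, the catch-all .* rule only acts as a fallback when it is placed last.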
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.