ExtractingParams (Solr 4.1.0 API)

All Known Implementing Classes:

SolrContentHandler
```
public interface ExtractingParams
```
The Solr parameter names to use when extracting content.

Field Summary

Fields
Modifier and Type	Field and Description
`static String`	`BOOST_PREFIX` The boost value for the name of the field.
`static String`	`CAPTURE_ATTRIBUTES` Capture attributes separately according to the name of the element, instead of just adding them to the string buffer.
`static String`	`CAPTURE_ELEMENTS` Capture the specified fields (and everything included below them that isn't captured by some other capture field) separately from the default.
`static String`	`DEFAULT_FIELD` Optional.
`static String`	`EXTRACT_FORMAT` Content output format if extractOnly is true.
`static String`	`EXTRACT_ONLY` Only extract and return the content, do not index it.
`static String`	`IGNORE_TIKA_EXCEPTION` If true, ignore TikaException (give up on extracting text but still index the metadata).
`static String`	`LITERALS_OVERRIDE` Literal field values will by default override other values such as metadata and content.
`static String`	`LITERALS_PREFIX` Pass in literal values to be added to the document.
`static String`	`LOWERNAMES` Map all generated attribute names to field names with lowercase and underscores.
`static String`	`MAP_PREFIX` The param prefix for mapping Tika metadata to Solr fields.
`static String`	`PASSWORD_MAP_FILE` Optional.
`static String`	`RESOURCE_NAME` Optional.
`static String`	`RESOURCE_PASSWORD` Optional.
`static String`	`STREAM_TYPE` The type of the stream.
`static String`	`UNKNOWN_FIELD_PREFIX` Optional.
`static String`	`XPATH_EXPRESSION` Restrict the extracted parts of a document to be indexed by passing in an XPath expression.

- Field Detail
  - LOWERNAMES
```
static final String LOWERNAMES
```
    Map all generated attribute names to field names with lowercase and underscores.
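    A minimal sketch of this kind of name normalization, not the handler's actual code. It assumes that every character that is not a letter or digit is replaced by an underscore; the class and method names are hypothetical.
```java
final class LowerNamesSketch {
    // Assumption: letters are lowercased and anything that is not a
    // letter or digit becomes '_'. The exact rule is not stated on this page.
    static String toLowerName(String name) {
        StringBuilder sb = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            sb.append(Character.isLetterOrDigit(ch) ? Character.toLowerCase(ch) : '_');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toLowerName("Content-Type")); // prints content_type
    }
}
```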
    
    See Also:
    Constant Field Values
  - IGNORE_TIKA_EXCEPTION
```
static final String IGNORE_TIKA_EXCEPTION
```
    If true, ignore TikaException (give up on extracting text but still index the metadata).
    
    See Also:
    Constant Field Values
  - MAP_PREFIX
```
static final String MAP_PREFIX
```
    The param prefix for mapping Tika metadata to Solr fields.
    To map a field, add a name like:
```
fmap.title=solr.title
```
    In this example, the Tika "title" metadata value will be added to a Solr field named "solr.title".
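    The prefix lookup can be sketched in plain Java as follows. The "fmap." prefix is taken from the example above; the class, the method names, and the pass-through of unmapped names are assumptions made for illustration.
```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldMapSketch {
    static final String MAP_PREFIX = "fmap."; // as used in the example above

    // Collect "fmap.<tikaName>=<solrField>" request parameters into a lookup table.
    static Map<String, String> fieldMappings(Map<String, String> params) {
        Map<String, String> mappings = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (e.getKey().startsWith(MAP_PREFIX)) {
                mappings.put(e.getKey().substring(MAP_PREFIX.length()), e.getValue());
            }
        }
        return mappings;
    }

    // Assumption: a Tika name without a mapping keeps its own name.
    static String solrField(Map<String, String> mappings, String tikaName) {
        return mappings.getOrDefault(tikaName, tikaName);
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("fmap.title", "solr.title");
        System.out.println(solrField(fieldMappings(params), "title")); // prints solr.title
    }
}
```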
    See Also:
    Constant Field Values
  - BOOST_PREFIX
```
static final String BOOST_PREFIX
```
    The boost value for the name of the field. The boost can be specified by a name mapping.
    For example
```
fmap.title=solr.title
boost.solr.title=2.5
```
    will boost the solr.title field for this document by 2.5.
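    As a hedged illustration, reading such a boost from the request parameters might look like the sketch below. The "boost." prefix comes from the example; the helper name and the neutral default of 1.0 for fields without a boost parameter are assumptions.
```java
import java.util.HashMap;
import java.util.Map;

final class BoostSketch {
    static final String BOOST_PREFIX = "boost."; // as used in the example above

    // Look up "boost.<solrField>"; assume a neutral boost of 1.0 when absent.
    static float boostFor(Map<String, String> params, String solrField) {
        String raw = params.get(BOOST_PREFIX + solrField);
        return raw == null ? 1.0f : Float.parseFloat(raw);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("boost.solr.title", "2.5");
        System.out.println(boostFor(params, "solr.title")); // prints 2.5
    }
}
```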
    See Also:
    Constant Field Values
  - LITERALS_PREFIX
```
static final String LITERALS_PREFIX
```
    Pass in literal values to be added to the document, as in
```
literal.myField=Foo
```
    See Also:
    Constant Field Values
  - XPATH_EXPRESSION
```
static final String XPATH_EXPRESSION
```
    Restrict the extracted parts of a document to be indexed by passing in an XPath expression. All content that satisfies the XPath expression will be passed to the SolrContentHandler.
    See Tika's docs for what the extracted document looks like.
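    To illustrate the idea of selecting part of an extracted document, the sketch below uses the JDK's own javax.xml.xpath engine on a made-up document. It is not the matcher used during extraction, which may accept a more restricted XPath syntax and namespaced element names; the sample document and class name are hypothetical.
```java
import java.io.StringReader;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathRestrictSketch {
    // Return the text of the first node selected by the expression.
    static String select(String xml, String expression) {
        try {
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad XPath or unparsable input: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<html><body><div>keep me</div><p>skip me</p></body></html>";
        System.out.println(select(xml, "/html/body/div")); // prints keep me
    }
}
```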
    
    See Also:
    CAPTURE_ELEMENTS, Constant Field Values
  - EXTRACT_ONLY
```
static final String EXTRACT_ONLY
```
    Only extract and return the content, do not index it.
    
    See Also:
    Constant Field Values
  - EXTRACT_FORMAT
```
static final String EXTRACT_FORMAT
```
    Content output format if extractOnly is true. Default is "xml", alternative is "text".
    
    See Also:
    Constant Field Values
  - CAPTURE_ATTRIBUTES
```
static final String CAPTURE_ATTRIBUTES
```
    Capture attributes separately according to the name of the element, instead of just adding them to the string buffer.
    
    See Also:
    Constant Field Values
  - LITERALS_OVERRIDE
```
static final String LITERALS_OVERRIDE
```
    Literal field values will by default override other values such as metadata and content. Set this to false to revert to pre-4.0 behaviour.
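    The difference between the two modes can be sketched as below. This assumes that the pre-4.0 behaviour kept extracted values and added the literal values alongside them, which this page does not spell out; the class and method names are hypothetical.
```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class LiteralsOverrideSketch {
    // Combine the values extracted for one field with its literal values.
    // Assumption: with override off, both sets of values are kept.
    static List<String> fieldValues(List<String> extracted, List<String> literals,
                                    boolean literalsOverride) {
        List<String> out = new ArrayList<>();
        boolean literalWins = literalsOverride && !literals.isEmpty();
        if (!literalWins) {
            out.addAll(extracted);
        }
        out.addAll(literals);
        return out;
    }

    public static void main(String[] args) {
        List<String> extracted = Arrays.asList("Title from metadata");
        List<String> literals = Arrays.asList("Foo");
        System.out.println(fieldValues(extracted, literals, true));  // prints [Foo]
        System.out.println(fieldValues(extracted, literals, false).size()); // prints 2
    }
}
```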
    
    See Also:
    Constant Field Values
  - CAPTURE_ELEMENTS
```
static final String CAPTURE_ELEMENTS
```
    Capture the specified fields (and everything included below them that isn't captured by some other capture field) separately from the default. This is different from passing in an XPath expression.
    The capture field is based on the localName returned to the SolrContentHandler by Tika, not to be confused with the mapped field. The field name can then be mapped into the index schema.
    For instance, a Tika document may look like:
```
  <html>
    ...
    <body>
      <p>some text here.  <div>more text</div></p>
      Some more text
    </body>
  </html>
```
    By passing in the p tag, you could capture all P tags separately from the rest of the text. Thus, in the example, the capture of the P tag would be: "some text here. more text"
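    The capture behaviour described above can be approximated with a plain SAX handler. This is a sketch, not the SolrContentHandler implementation: it handles a single capture name, matches on qName because the JDK parser is not namespace-aware by default, and collapses whitespace for readability. The class name is hypothetical.
```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class CaptureSketch extends DefaultHandler {
    private final String captureName;                 // element to capture, e.g. "p"
    private final StringBuilder captured = new StringBuilder();
    private final StringBuilder rest = new StringBuilder();
    private int depth = 0;                            // > 0 while inside a captured element

    CaptureSketch(String captureName) { this.captureName = captureName; }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth > 0 || qName.equals(captureName)) depth++;
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth > 0) depth--;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        (depth > 0 ? captured : rest).append(ch, start, length);
    }

    // Returns { captured text, remaining text }, each with whitespace collapsed.
    static String[] capture(String xml, String element) {
        CaptureSketch handler = new CaptureSketch(element);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample document", e);
        }
        return new String[] {
            handler.captured.toString().replaceAll("\\s+", " ").trim(),
            handler.rest.toString().replaceAll("\\s+", " ").trim()
        };
    }

    public static void main(String[] args) {
        String xml = "<html><body><p>some text here.  <div>more text</div></p>"
                + "Some more text</body></html>";
        String[] parts = capture(xml, "p");
        System.out.println(parts[0]); // prints some text here. more text
        System.out.println(parts[1]); // prints Some more text
    }
}
```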
    See Also:
    Constant Field Values
  - STREAM_TYPE
```
static final String STREAM_TYPE
```
    The type of the stream. If not specified, Tika will use mime type detection.
    
    See Also:
    Constant Field Values
  - RESOURCE_NAME
```
static final String RESOURCE_NAME
```
    Optional. The file name. If specified, Tika can take this into account while guessing the MIME type.
    
    See Also:
    Constant Field Values
  - RESOURCE_PASSWORD
```
static final String RESOURCE_PASSWORD
```
    Optional. The password for this resource. Will be used instead of the rule-based password lookup mechanisms.
    
    See Also:
    Constant Field Values
  - UNKNOWN_FIELD_PREFIX
```
static final String UNKNOWN_FIELD_PREFIX
```
    Optional. If specified, the prefix will be prepended to all Metadata, such that it would be possible to set up a dynamic field to automatically capture it.
    
    See Also:
    Constant Field Values
  - DEFAULT_FIELD
```
static final String DEFAULT_FIELD
```
    Optional. If specified and the name of a potential field cannot be determined, the default Field specified will be used instead.
    
    See Also:
    Constant Field Values
  - PASSWORD_MAP_FILE
```
static final String PASSWORD_MAP_FILE
```
    Optional. If specified, loads the file as a source for password lookups for Tika encrypted documents.
    File format is Java properties format with one key=value per line. The key is evaluated as a regex against the file name, and the value is the password. The rules are evaluated top to bottom, i.e. the first match will be used. If you want a fallback password to always be used, supply a .*=<defaultmypassword> rule at the end.
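    The first-match rule lookup described above can be sketched as follows. This is an illustration only: it parses key=value lines by hand to keep file order, skips blank and '#' lines, does not process properties-style escapes, and assumes the regex must match the whole file name. All names and passwords in it are hypothetical.
```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class PasswordRulesSketch {
    // Parse "regex=password" lines, preserving top-to-bottom order.
    static Map<Pattern, String> parseRules(String text) {
        Map<Pattern, String> rules = new LinkedHashMap<>();
        for (String line : text.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;
            int eq = line.indexOf('=');   // split at the first '='
            if (eq < 0) continue;         // not a key=value line
            rules.put(Pattern.compile(line.substring(0, eq).trim()),
                      line.substring(eq + 1).trim());
        }
        return rules;
    }

    // Return the password of the first rule whose regex matches, or null.
    static String lookup(Map<Pattern, String> rules, String fileName) {
        for (Map.Entry<Pattern, String> rule : rules.entrySet()) {
            if (rule.getKey().matcher(fileName).matches()) {
                return rule.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<Pattern, String> rules = parseRules(".*[.]pdf=pdfpwd\n.*=fallback");
        System.out.println(lookup(rules, "report.pdf")); // prints pdfpwd
        System.out.println(lookup(rules, "notes.docx")); // prints fallback
    }
}
```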
    
    See Also:
    Constant Field Values
