Content Streams

Content streams are bulk data passed with a request to Solr.

When Solr RequestHandlers are accessed using path-based URLs, the SolrQueryRequest object containing the parameters of the request may also contain a list of ContentStreams containing bulk data for the request. (The name SolrQueryRequest is a bit misleading: it is used for all requests, whether they are query requests or update requests.)

Content Stream Sources

Currently, request handlers can receive content streams in the following ways:

  • For multipart file uploads, each file is passed as a stream.
  • For POST requests where the content-type is not application/x-www-form-urlencoded, the raw POST body is passed as a stream. When the content-type is application/x-www-form-urlencoded, the full POST body is instead parsed as parameters and included in the Solr parameters.
  • The contents of the stream.body parameter are passed as a stream.
  • If remote streaming is enabled and URL content is called for during request handling, the contents of each stream.url and stream.file parameter are fetched and passed as a stream.
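
As a rough illustration of the first and third sources above, the following sketch sends a multipart upload and a stream.body parameter with curl. It assumes Solr is reachable at http://localhost:8983 with the techproducts collection used elsewhere on this page, and that the implicit dump handler is available at /debug/dump. The function names and the CURL override variable are hypothetical helpers, not part of Solr. Depending on the Solr version, stream.body may need to be enabled separately.

```shell
# Assumed base URL; techproducts is the collection used elsewhere on this page.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/techproducts}"
# Set CURL=echo to print each command instead of sending it.
CURL="${CURL:-curl}"

# Multipart upload: each -F file part is passed to the handler as its own stream.
# $1 and $2 are paths to local files.
send_multipart() {
  $CURL -F "file1=@$1" -F "file2=@$2" "$SOLR_URL/debug/dump?wt=json"
}

# stream.body: the parameter value itself becomes the stream.
# -G moves the URL-encoded data into the query string of a GET request.
send_stream_body() {
  $CURL -G --data-urlencode "stream.body=$1" --data-urlencode "wt=json" "$SOLR_URL/debug/dump"
}
```

For example, send_stream_body 'hello world' should produce a dump response that lists one stream, and send_multipart a.xml b.xml (hypothetical file names) should list two.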

By default, curl sends a Content-Type: application/x-www-form-urlencoded header when it posts data. To test a content stream that is sent as a raw POST body, set a different content type with curl’s -H flag.
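
A minimal sketch of such a request follows. It assumes the same local Solr instance and techproducts collection as the Config API examples on this page. The post_raw_body function and the CURL override variable are hypothetical helpers introduced only for illustration.

```shell
# Assumed base URL; adjust host, port, and collection to your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/techproducts}"
# Set CURL=echo to print the command instead of sending it.
CURL="${CURL:-curl}"

# $1 = handler path, $2 = content type, $3 = file holding the request body.
# -H replaces curl's default form content type, so Solr sees a raw body.
# --data-binary sends the file unchanged, without stripping newlines.
post_raw_body() {
  $CURL -H "Content-Type: $2" --data-binary "@$3" "$SOLR_URL$1"
}
```

A call such as post_raw_body /update application/json docs.json (docs.json is a hypothetical file) would post the file as a single content stream rather than as form parameters.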

Remote Streaming

Remote streaming lets you send the contents of a URL as a stream to a given Solr RequestHandler. You could use remote streaming to send a remote or local file to an update plugin.
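
The sketch below shows what such requests might look like with curl, assuming remote streaming has been enabled as described later in this section and that Solr runs locally with the techproducts collection. The function names and the CURL override variable are hypothetical. stream.contentType is used here to tell the handler how to interpret the fetched data, and whether a given handler accepts these requests depends on your version and configuration.

```shell
# Assumed base URL; adjust host, port, and collection to your installation.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/techproducts}"
# Set CURL=echo to print each command instead of sending it.
CURL="${CURL:-curl}"

# Ask Solr to fetch a URL and pass its contents to the update handler.
# $1 = URL of the remote document.
stream_from_url() {
  $CURL -G --data-urlencode "stream.url=$1" --data-urlencode "commit=true" "$SOLR_URL/update"
}

# Ask Solr to read a file from the Solr server's own filesystem.
# $1 = path on the Solr host, $2 = content type of that file.
stream_from_file() {
  $CURL -G --data-urlencode "stream.file=$1" --data-urlencode "stream.contentType=$2" "$SOLR_URL/update"
}
```

Note that the path given to stream.file is resolved on the Solr host, not on the machine running curl.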

Remote streaming is disabled by default. Enabling it is not recommended in a production situation without additional security between you and untrusted remote clients.

In solrconfig.xml, you can enable it by changing the enableRemoteStreaming attribute of the requestParsers element, shown below, to true:

    <!--
    *** WARNING ***
    Before enabling remote streaming, you should make sure your
    system has authentication enabled.
    -->
    <requestParsers enableRemoteStreaming="false" />

When enableRemoteStreaming is not specified in solrconfig.xml, the default behavior is to not allow remote streaming (i.e., enableRemoteStreaming="false").

Remote streaming can also be enabled through the Config API as follows:

V1 API

curl -H 'Content-type:application/json' -d '{"set-property": {"requestDispatcher.requestParsers.enableRemoteStreaming":true}}' 'http://localhost:8983/solr/techproducts/config'

V2 API

curl -X POST -H 'Content-type: application/json' -d '{"set-property": {"requestDispatcher.requestParsers.enableRemoteStreaming":true}}' 'http://localhost:8983/api/collections/techproducts/config'

Be aware that setting enableRemoteStreaming="true" allows anyone who can send requests to Solr to make it fetch any URL or read any local file. If the DumpRequestHandler is also enabled, anyone can view any file on your system that the Solr process can read.

The source of the data can be compressed using gzip, and Solr will generally detect this. The detection is based on either the presence of a Content-Encoding: gzip HTTP header or a file name ending with .gz or .gzip. Gzip detection does not apply to stream.body.
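
The detection rule described above can be sketched as a small client-side check, for example to predict whether a source would be treated as compressed before sending it. The is_gzip_source function is a hypothetical helper that mirrors the rule as stated here; it is not Solr's own code, and it compares the header value with the exact lowercase string gzip.

```shell
# $1 = file name or URL path of the source
# $2 = value of the Content-Encoding header, or an empty string if absent
# Returns 0 (true) if the source would count as gzip under the rule above.
is_gzip_source() {
  # Rule 1: a Content-Encoding: gzip header is present.
  case "$2" in
    gzip) return 0 ;;
  esac
  # Rule 2: the name ends with .gz or .gzip.
  case "$1" in
    *.gz|*.gzip) return 0 ;;
  esac
  return 1
}
```

For instance, is_gzip_source docs.xml.gz "" succeeds because of the file extension, while is_gzip_source docs.xml "" fails because neither condition holds.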

Debugging Requests

The implicit "dump" RequestHandler (see Implicit RequestHandlers) simply outputs the contents of the SolrQueryRequest using the response writer specified by the wt parameter. This is a useful tool for understanding which streams are available to the RequestHandlers.