Content Streams
Content streams are bulk data passed with a request to Solr.
When Solr RequestHandlers are accessed using path-based URLs, the SolrQueryRequest object containing the parameters of the request may also contain a list of ContentStreams containing bulk data for the request.
(The name SolrQueryRequest is a bit misleading: it is involved in all requests, regardless of whether it is a query request or an update request.)
Content Stream Sources
Currently request handlers can get content streams in a variety of ways:
- For multipart file uploads, each file is passed as a stream.
- For POST requests where the content-type is not application/x-www-form-urlencoded, the raw POST body is passed as a stream. The full POST body is parsed as parameters and included in the Solr parameters.
- The contents of the stream.body parameter are passed as a stream.
- If remote streaming is enabled and URL content is called for during request handling, the contents of each stream.url and stream.file parameter are fetched and passed as a stream.
By default, curl sends a Content-Type: application/x-www-form-urlencoded header when posting data with -d.
If you need to test a SolrContentHeader content stream, you will need to set the content type explicitly with curl’s -H flag.
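As a rough illustration of the second source above, the sketch below builds (but does not send) a POST request whose content type is not application/x-www-form-urlencoded, so that its raw body would reach the handler as a content stream. The /update path, the JSON document shape, and the helper name build_update_request are assumptions made for this example; the host and the techproducts collection are borrowed from the Config API examples on this page.

```python
import json
from urllib.request import Request

# Assumption: a local Solr with a "techproducts" collection.
SOLR_BASE = "http://localhost:8983/solr/techproducts"


def build_update_request(docs, base=SOLR_BASE):
    """Build a POST whose raw body would be passed to the handler as a stream.

    The explicit Content-Type header plays the role of curl's -H flag.
    """
    body = json.dumps(docs).encode("utf-8")
    return Request(
        f"{base}/update",  # assumed handler path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_update_request([{"id": "doc1", "name": "Example"}])
print(req.get_method(), req.full_url)
print(req.get_header("Content-type"))  # urllib stores header names capitalized this way

# To actually send it against a running Solr:
#   from urllib.request import urlopen
#   urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the headers and body before anything reaches Solr.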
Remote Streaming
Remote streaming lets you send the contents of a URL as a stream to a given Solr RequestHandler. You could use remote streaming to send a remote or local file to an update plugin.
Remote streaming is disabled by default. Enabling it is not recommended in a production situation without additional security between you and untrusted remote clients.
In solrconfig.xml, you can enable it by changing the following enableRemoteStreaming parameter to true:
*** WARNING ***
Before enabling remote streaming, you should make sure your
system has authentication enabled.
<requestParsers enableRemoteStreaming="false" />
When enableRemoteStreaming is not specified in solrconfig.xml, the default behavior is to not allow remote streaming (i.e., enableRemoteStreaming="false").
Remote streaming can also be enabled through the Config API as follows:
V1 API
curl -H 'Content-type:application/json' -d '{"set-property": {"requestDispatcher.requestParsers.enableRemoteStreaming":true}}' 'http://localhost:8983/solr/techproducts/config'
V2 API
curl -X POST -H 'Content-type: application/json' -d '{"set-property": {"requestDispatcher.requestParsers.enableRemoteStreaming":true}}' 'http://localhost:8983/api/collections/techproducts/config'
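The same set-property call can be prepared from a script, which is handy when remote streaming should be switched on for a test and off again afterwards. This is a minimal sketch that only constructs the requests. The helper name remote_streaming_request is hypothetical, and the URLs are the ones used in the curl examples above.

```python
import json
from urllib.request import Request

PROPERTY = "requestDispatcher.requestParsers.enableRemoteStreaming"


def remote_streaming_request(config_url, enabled):
    """Build a Config API set-property request for the remote streaming flag."""
    payload = {"set-property": {PROPERTY: enabled}}
    return Request(
        config_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# V1-style endpoint, enabling remote streaming.
enable_v1 = remote_streaming_request(
    "http://localhost:8983/solr/techproducts/config", True
)
# V2-style endpoint, switching it back off.
disable_v2 = remote_streaming_request(
    "http://localhost:8983/api/collections/techproducts/config", False
)

print(enable_v1.data.decode("utf-8"))
print(disable_v2.data.decode("utf-8"))

# Sending is left to the caller, e.g. urllib.request.urlopen(enable_v1).
```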
The source of the data can be compressed using gzip, and Solr will generally detect this.
The detection is based on either the presence of a Content-Encoding: gzip HTTP header or the file ending with .gz or .gzip.
Gzip doesn’t apply to stream.body.
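The detection rule described above can be expressed as a small predicate. This is only an illustration of the rule as stated here, not Solr's actual implementation. The function name looks_gzipped is hypothetical, and the case-insensitive comparison and exact-case header lookup are assumptions of the sketch.

```python
def looks_gzipped(headers, source_name):
    """Return True if a stream source would count as gzip under the stated rule.

    headers: mapping of HTTP header names to values (exact-case lookup here).
    source_name: the file name or URL path of the stream source.
    """
    encoding = headers.get("Content-Encoding", "").strip().lower()
    if encoding == "gzip":
        return True
    # Assumption: the suffix check ignores case.
    return source_name.lower().endswith((".gz", ".gzip"))


print(looks_gzipped({"Content-Encoding": "gzip"}, "docs.xml"))
print(looks_gzipped({}, "docs.xml.gz"))
print(looks_gzipped({}, "docs.xml"))
```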
Debugging Requests
The implicit "dump" RequestHandler (see Implicit Request Handlers) simply outputs the contents of the SolrQueryRequest using the specified writer type wt.
This is a useful tool to help understand what streams are available to the RequestHandlers.
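A quick way to see which streams a request carries is to point the same parameters at the dump handler and read the response. The sketch below only builds such a URL. It assumes the dump handler is reachable at /debug/dump on a local techproducts collection, and the helper name dump_url is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: local Solr, "techproducts" collection, dump handler at /debug/dump.
SOLR_BASE = "http://localhost:8983/solr/techproducts"


def dump_url(params, wt="json", base=SOLR_BASE):
    """Build a dump-handler URL carrying the given request parameters."""
    query = urlencode({"wt": wt, **params})
    return f"{base}/debug/dump?{query}"


# A stream.body parameter should show up as a content stream in the dump output.
url = dump_url({"stream.body": "hello"})
print(url)

# Fetch it against a running Solr with, for example:
#   from urllib.request import urlopen
#   print(urlopen(url).read().decode("utf-8"))
```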