Post Tool

Solr includes a simple command line tool, part of the bin/solr CLI, for POSTing various types of content to a Solr server.

This tool is meant for use by new users exploring Solr’s capabilities, and is not intended as a robust solution to be used for indexing documents into production systems.
You may be familiar with SimplePostTool and the bin/post Unix shell script. While these are still available, they are deprecated and will be removed in Solr 10.

To run it, open a terminal window and enter:

$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update example/films/films.json

This will contact the server at localhost:8983. The --help (or simply -h) option will output information on its usage (i.e., bin/solr post -h).

Using the bin/solr post Tool

When using bin/solr post, you must either specify -url with the full path to the update handler, or provide a collection/core name with -c.

Both of the following specify the same target collection: -url http://localhost:8983/solr/gettingstarted/update or -c gettingstarted.
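
As a rough illustration of how the two forms relate, the following shell sketch composes an update handler URL from a collection name, host, and port. The function name solr_update_url is hypothetical and not part of bin/solr. The /solr/<collection>/update layout is the one used by the examples later on this page, and localhost:8983 is assumed as the default.

```shell
# Hypothetical helper (not part of bin/solr): build the update handler URL.
# Usage: solr_update_url <collection> [host] [port]
solr_update_url() {
  collection=$1
  host=${2:-localhost}   # assumed default host
  port=${3:-8983}        # assumed default port
  printf 'http://%s:%s/solr/%s/update\n' "$host" "$port" "$collection"
}

# URL that could be passed to -url for the gettingstarted collection.
solr_update_url gettingstarted
# Same collection on a Solr instance listening on port 8984.
solr_update_url gettingstarted localhost 8984
```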

The basic usage of bin/solr post is:

usage: post
 -c,--name <NAME>                                 Name of the collection.
 -d,--delay <delay>                               If recursive then delay
                                                  will be the wait time
                                                  between posts.  default:
                                                  10 for web, 0 for files
    --dry-run                                     Performs a dry run of
                                                  the posting process
                                                  without actually sending
                                                  documents to Solr.  Only
                                                  works with files mode.
 -f,--format                                      sends application/json
                                                  content as Solr commands
                                                  to /update instead of
                                                  /update/json/docs.
 -ft,--filetypes <<type>[,<type>,...]>            default:
                                                  xml,json,jsonl,csv,pdf,d
                                                  oc,docx,ppt,pptx,xls,xls
                                                  x,odt,odp,ods,ott,otp,ot
                                                  s,rtf,htm,html,txt,log
 -h,--help                                        Print this message.
    --mode <mode>                                 Which mode the Post tool
                                                  is running in, 'files'
                                                  crawls local directory,
                                                  'web' crawls website,
                                                  'args' processes input
                                                  args, and 'stdin' reads
                                                  a command from standard
                                                  in. default: files.
 -o,--optimize                                    Issue an optimize at end
                                                  of posting documents.
    --out                                         sends Solr response
                                                  outputs to console.
 -p,--params <<key>=<value>[&<key>=<value>...]>   values must be
                                                  URL-encoded; these pass
                                                  through to Solr update
                                                  request.
 -r,--recursive <recursive>                       For web crawl, how deep
                                                  to go. default: 1
    --skip-commit                                 Do not 'commit', and
                                                  thus changes won't be
                                                  visible till a commit
                                                  occurs.
 -t,--type <content-type>                         Specify a specific
                                                  mimetype to use, such as
                                                  application/json.
 -u,--credentials <credentials>                   Credentials in the
                                                  format
                                                  username:password.
                                                  Example: --credentials
                                                  solr:SolrRocks
 -url,--solr-update-url <UPDATEURL>               Solr Update URL, the
                                                  full url to the update
                                                  handler, including the
                                                  /update.
 -v,--verbose                                     Enable more verbose
                                                  command output.

Examples Using bin/solr post

There are several ways to use bin/solr post. This section presents examples of the most common ones.

Indexing JSON

Index all JSON files into gettingstarted.

$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update *.json

Indexing XML

Add all documents with file extension .xml to the collection named gettingstarted.

$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update *.xml

Add all documents whose names start with article and have the file extension .xml to the gettingstarted collection on Solr running on port 8984.

$ bin/solr post -url http://localhost:8984/solr/gettingstarted/update article*.xml

Send XML arguments to delete a document from gettingstarted.

$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update --mode args --type application/xml '<delete><id>42</id></delete>'

Indexing CSV and JSON

Index all CSV and JSON files from the current directory into gettingstarted:

$ bin/solr post -c gettingstarted --filetypes json,csv .

Index a tab-separated file into the signals collection on Solr running on port 8984:

$ bin/solr post -url http://localhost:8984/solr/signals/update --params "separator=%09" --type text/csv data.tsv

The content type (--type) parameter is required for the file to be treated as the proper type. Otherwise the file is ignored and a WARNING is logged, because the tool does not know what type of content a .tsv file is. The CSV handler supports the separator parameter, which is passed through using the --params setting.
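
Because --params values must be URL-encoded, a small helper can be handy when a value contains a character such as a tab. The sketch below is one possible POSIX shell approach. urlencode is a hypothetical helper, not something shipped with Solr, and it assumes the od and tr utilities are available.

```shell
# Hypothetical helper: percent-encode every byte of $1 except the
# RFC 3986 unreserved characters (A-Z, a-z, 0-9, '-', '.', '_', '~').
urlencode() {
  # od prints each input byte as two lowercase hex digits.
  for hex in $(printf '%s' "$1" | od -An -v -tx1); do
    case "$hex" in
      2d|2e|5f|7e|3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a])
        # Unreserved byte: convert hex to octal and print the literal character.
        printf "\\$(printf '%03o' "0x$hex")" ;;
      *)
        # Anything else becomes %XX with uppercase hex digits.
        printf '%%%s' "$(printf '%s' "$hex" | tr 'a-f' 'A-F')" ;;
    esac
  done
  printf '\n'
}

# Encode a literal tab for use as the CSV separator parameter.
sep=$(urlencode "$(printf '\t')")
printf 'separator=%s\n' "$sep"
```

The resulting string can be passed as the value of --params, as in the tab-separated example above.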

Indexing Rich Documents (PDF, Word, HTML, etc.)

Index a PDF file into gettingstarted.

$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update a.pdf

Automatically detect content types in a folder, and recursively scan it for documents for indexing into gettingstarted.

$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update afolder/

Automatically detect content types in a folder, but limit it to PPT and HTML files and index into gettingstarted.

$ bin/solr post -url http://localhost:8983/solr/gettingstarted/update --filetypes ppt,html afolder/
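
To get a feel for what an extension filter such as --filetypes ppt,html selects, the sketch below lists the files under a folder whose extension appears in a comma-separated list. It is only an approximation written for illustration: list_by_filetypes is a hypothetical helper, and the tool's own matching rules (for example, how it treats letter case) may differ. For a preview produced by the tool itself, the --dry-run option shown in the usage output is the better choice.

```shell
# Hypothetical helper: list files under a folder whose extension is in a
# comma-separated list. Usage: list_by_filetypes <dir> <ext[,ext...]>
list_by_filetypes() {
  dir=$1
  types=$2
  find "$dir" -type f | while IFS= read -r f; do
    name=${f##*/}                 # file name without the directory part
    case "$name" in
      *.*) ext=${name##*.} ;;     # text after the last dot
      *)   continue ;;            # no extension: skip
    esac
    case ",$types," in
      *",$ext,"*) printf '%s\n' "$f" ;;
    esac
  done | sort
}

# Demo on a throwaway folder so the sketch can run anywhere.
demo=$(mktemp -d)
touch "$demo/slides.ppt" "$demo/page.html" "$demo/notes.txt"
list_by_filetypes "$demo" ppt,html   # lists page.html and slides.ppt only
rm -rf "$demo"
```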

Indexing to a Password Protected Solr (Basic Auth)

Index a PDF as the user "solr" with password "SolrRocks":

$ bin/solr post -u solr:SolrRocks -url http://localhost:8983/solr/gettingstarted/update a.pdf
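
The username:password pair has the same shape that HTTP Basic authentication uses. Assuming the tool sends a standard Basic Authorization header (the heading suggests this, but the wire format is not spelled out here), the header value can be reproduced in the shell as sketched below. basic_auth_header is a hypothetical helper, and the base64 utility is assumed to be installed.

```shell
# Hypothetical helper: show the HTTP Basic Authorization header that
# corresponds to credentials given in username:password form.
basic_auth_header() {
  # printf '%s' avoids a trailing newline, which would change the encoding.
  printf 'Authorization: Basic %s\n' "$(printf '%s' "$1" | base64)"
}

basic_auth_header solr:SolrRocks
```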

Crawling a Website to Index Documents

Crawl the Apache Solr website going one layer deep and indexing the pages into Solr.

See Trying Out Solr Cell to learn more about setting up Solr for extracting content from web pages.

$ bin/solr post --mode web -c gettingstarted --recursive 1 --delay 1 https://solr.apache.org/

Standard Input as Source for Indexing

You can use standard input as the source of data to index. Notice the --out option, which prints the raw responses from Solr.

$ echo '{commit: {}}' | bin/solr post --mode stdin -url http://localhost:8983/solr/my_collection/update --out

Raw Data as Source for Indexing

Provide the raw document as a string for indexing.

$ bin/solr post -url http://localhost:8983/solr/signals/update --mode args --type text/csv --out $'id,value\n1,0.47'
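
The $'...' form in the command above is ANSI-C quoting, a feature of bash and some other shells that turns \n into a real newline, so the CSV payload arrives as a header line plus one data row. In a shell without that feature, printf can build the same string. The sketch below only constructs and prints the payload and does not contact Solr; the variable name payload is illustrative.

```shell
# Build the two-line CSV payload portably; printf interprets \n itself.
payload=$(printf 'id,value\n1,0.47')

# Show the payload: a header line followed by one data row.
printf '%s\n' "$payload"

# The payload could then be passed as the final argument, for example:
#   bin/solr post -url http://localhost:8983/solr/signals/update \
#     --mode args --type text/csv --out "$payload"
```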