Package org.apache.solr.cli
Class PostTool
java.lang.Object
org.apache.solr.cli.ToolBase
org.apache.solr.cli.PostTool
- All Implemented Interfaces:
Tool
Supports the post command in the bin/solr script.
-
Nested Class Summary
Nested Classes
static class: Utility class to hold the result from a page fetch
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method: Description
static String appendParam(String url, String param): Appends a URL query parameter to a URL
protected static URI appendUrlPath(URI uri, String append): Appends to the path of the URL
void commit(): Does a simple commit operation
protected static String computeFullUrl(URL baseUrl, String link): Computes the full URL based on a base URL and a possibly relative link found in the href param of an HTML anchor.
void execute(String mode): After initialization, call execute to start the post job.
getFileFilterFromFileTypes(String fileTypes)
getName(): Defines the interface to a Solr tool that can be run from this command-line app.
static NodeList getNodesFromXP(Node n, String xpath): Gets all nodes matching an XPath
org.apache.commons.cli.Options getOptions(): Retrieve the Options supported by this tool.
static String getXP: Gets the string content of the node(s) matching an XPath
protected static String guessType: Guesses the type of file, based on file name suffix. Returns "application/octet-stream" if no corresponding mimeMap type.
static Document makeDom(byte[] in): Takes a byte array as input and returns a DOM
protected static String normalizeUrlEnding(String link): Normalizes a URL string by removing the anchor part and trailing slash
void optimize(): Does a simple optimize operation
boolean postData(InputStream data, Long length, OutputStream output, String type, URI uri): Reads data from the data stream and posts it to Solr, writes the response to output
void postFile(Path file, OutputStream output, String type): Opens the file and posts its contents to the solrUrl, writes the response to output.
int postFiles(String[] args, int startIndexInArgs, OutputStream out, String type): Post all filenames provided in args
int postWebPages(String[] args, int startIndexInArgs, OutputStream out): Takes a list of start URL strings for crawling, converts the URL strings to URI strings, adds each one to a backlog, and then starts crawling
void runImpl(org.apache.commons.cli.CommandLine cli)
static InputStream stringToStream: Converts a string to an input stream
protected boolean typeSupported(String type): Uses the mime-type map to reverse lookup whether the file ending for our type is supported by the fileTypes option
protected int webCrawl(int level, OutputStream out): A very simple crawler, pulling URLs to fetch from a backlog and then recursing N levels deep if recursive > 0.
Methods inherited from class org.apache.solr.cli.ToolBase
echo, echoIfVerbose, getConnectionOptions, getRuntime, isVerbose, runTool
-
Field Details
-
DEFAULT_FILE_TYPES
- See Also:
-
DEFAULT_CONTENT_TYPE
- See Also:
-
-
Constructor Details
-
PostTool
-
-
Method Details
-
getName
Description copied from interface: Tool
Defines the interface to a Solr tool that can be run from this command-line app.
getOptions
public org.apache.commons.cli.Options getOptions()
Description copied from interface: Tool
Retrieve the Options supported by this tool.
- Specified by:
getOptions in interface Tool
- Overrides:
getOptions in class ToolBase
- Returns:
The Options this tool supports.
-
runImpl
-
execute
public void execute(String mode) throws org.apache.solr.client.solrj.SolrServerException, IOException
After initialization, call execute to start the post job. This method delegates to the correct mode method.
- Throws:
org.apache.solr.client.solrj.SolrServerException
IOException
-
postFiles
public int postFiles(String[] args, int startIndexInArgs, OutputStream out, String type) throws IOException
Post all filenames provided in args.
- Parameters:
args - array of file names
startIndexInArgs - offset to start
out - output stream to post data to
type - default content-type to use when posting (this may be overridden in auto mode)
- Returns:
number of files posted
- Throws:
IOException - if an I/O error occurs
-
postWebPages
Takes a list of start URL strings for crawling, converts the URL strings to URI strings, adds each one to a backlog, and then starts crawling.
- Parameters:
args - the raw input args from main()
startIndexInArgs - offset for where to start
out - output stream to write results to
- Returns:
the number of web pages posted
-
normalizeUrlEnding
Normalizes a URL string by removing the anchor part and trailing slash.
- Returns:
the normalized URL string
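To make the normalization concrete, here is a minimal sketch of removing the anchor part and a trailing slash. It is not the PostTool source: the class name NormalizeUrlEndingSketch and the example URLs are assumptions for illustration, and the real method may handle cases this one does not.

```java
final class NormalizeUrlEndingSketch {
    // Hypothetical re-implementation for illustration only.
    static String normalizeUrlEnding(String link) {
        int hash = link.indexOf('#');
        if (hash >= 0) {
            link = link.substring(0, hash); // drop the anchor (fragment) part
        }
        if (link.endsWith("/")) {
            link = link.substring(0, link.length() - 1); // drop one trailing slash
        }
        return link;
    }

    public static void main(String[] args) {
        System.out.println(normalizeUrlEnding("http://example.com/docs/#intro"));
        System.out.println(normalizeUrlEnding("http://example.com/docs"));
    }
}
```

Normalizing both forms to the same string is what lets a crawler recognize that two links point at the same page.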
-
webCrawl
A very simple crawler, pulling URLs to fetch from a backlog and then recursing N levels deep if recursive > 0. Links are parsed from HTML by first getting an XHTML version using SolrCell with extractOnly, and are followed if they are local. The crawler pauses for a default delay of 10 seconds between fetches; this can be configured in the delay variable. This is only meant for test purposes, as it does not respect robots or anything else fancy :)
- Parameters:
level - which level to crawl
out - output stream to write to
- Returns:
number of pages crawled on this level and below
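The backlog-and-levels idea can be sketched without any network access. The example below is an assumption-laden illustration, not the PostTool implementation: the class WebCrawlSketch, the in-memory SAMPLE_WEB link map, and the maxDepth parameter (standing in for the recursive setting) are all hypothetical, and the delay, the SolrCell extraction, and the local-link check are left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class WebCrawlSketch {
    // Hypothetical in-memory "web": page URL -> links found on that page.
    static final Map<String, List<String>> SAMPLE_WEB = Map.of(
        "http://example.com", List.of("http://example.com/a", "http://example.com/b"),
        "http://example.com/a", List.of("http://example.com/c", "http://example.com"));

    // Seeds level 0 of the backlog with the start URLs, then crawls.
    static int crawl(Map<String, List<String>> web, List<String> startUrls, int maxDepth) {
        List<Set<String>> backlog = new ArrayList<>();
        backlog.add(new LinkedHashSet<>(startUrls));
        return crawlLevel(web, backlog, new HashSet<>(), 0, maxDepth);
    }

    // Processes one backlog level and recurses into the next one if links were queued.
    static int crawlLevel(Map<String, List<String>> web, List<Set<String>> backlog,
                          Set<String> visited, int level, int maxDepth) {
        int posted = 0;
        Set<String> nextLevel = new LinkedHashSet<>();
        for (String url : backlog.get(level)) {
            if (!visited.add(url)) {
                continue; // already handled on this or an earlier level
            }
            posted++; // a real crawler would fetch and post the page here
            if (level < maxDepth) {
                for (String link : web.getOrDefault(url, List.of())) {
                    if (!visited.contains(link)) {
                        nextLevel.add(link);
                    }
                }
            }
        }
        if (!nextLevel.isEmpty()) {
            backlog.add(nextLevel);
            posted += crawlLevel(web, backlog, visited, level + 1, maxDepth);
        }
        return posted;
    }

    public static void main(String[] args) {
        System.out.println(crawl(SAMPLE_WEB, List.of("http://example.com"), 1));
    }
}
```

The returned count covers the current level and everything below it, matching the return value documented for webCrawl.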
-
computeFullUrl
protected static String computeFullUrl(URL baseUrl, String link) throws MalformedURLException, URISyntaxException
Computes the full URL based on a base URL and a possibly relative link found in the href param of an HTML anchor.
- Parameters:
baseUrl - the base URL from where the link was found
link - the absolute or relative link
- Returns:
the string version of the full URL
- Throws:
MalformedURLException
URISyntaxException
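Resolving a possibly relative link against a base can be illustrated with the standard java.net.URI class. This is a sketch under assumptions, not the PostTool code: it takes a URI rather than the URL in the documented signature, relies on URI.resolve for the resolution rules, and the class name and example addresses are made up.

```java
import java.net.URI;

final class ComputeFullUrlSketch {
    // Hypothetical variant: resolves link against base; absolute links pass through unchanged.
    static String computeFullUrl(URI base, String link) {
        return base.resolve(link).toString();
    }

    public static void main(String[] args) {
        URI base = URI.create("http://example.com/docs/index.html");
        System.out.println(computeFullUrl(base, "page2.html"));
        System.out.println(computeFullUrl(base, "../img/logo.png"));
        System.out.println(computeFullUrl(base, "https://other.example.org/a"));
    }
}
```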
-
typeSupported
Uses the mime-type map to reverse lookup whether the file ending for our type is supported by the fileTypes option.
- Parameters:
type - the content-type to look up
- Returns:
true if this is a supported content type
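A reverse lookup of this kind can be sketched as follows. Everything here is an assumption for illustration: the MIME_MAP contents, the comma-separated fileTypes string, and the two-argument signature are hypothetical and do not come from the PostTool source.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class TypeSupportedSketch {
    // Hypothetical suffix -> content-type map; the real mimeMap is larger.
    static final Map<String, String> MIME_MAP = Map.of(
        "xml", "application/xml",
        "json", "application/json",
        "csv", "text/csv",
        "pdf", "application/pdf");

    // True if some suffix that maps to the given type is listed in fileTypes.
    static boolean typeSupported(String type, String fileTypes) {
        Set<String> allowed =
            new HashSet<>(Arrays.asList(fileTypes.toLowerCase(Locale.ROOT).split(",")));
        for (Map.Entry<String, String> e : MIME_MAP.entrySet()) {
            if (e.getValue().equals(type) && allowed.contains(e.getKey())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(typeSupported("application/json", "xml,json,csv"));
        System.out.println(typeSupported("application/pdf", "xml,json,csv"));
    }
}
```

The lookup runs from content type back to file suffix because the fileTypes option is expressed in suffixes, while a fetched web page only reports a content type.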
-
commit
Does a simple commit operation.
- Throws:
IOException
org.apache.solr.client.solrj.SolrServerException
-
optimize
Does a simple optimize operation.
- Throws:
IOException
org.apache.solr.client.solrj.SolrServerException
-
appendParam
Appends a URL query parameter to a URL.
- Parameters:
url - the original URL
param - the parameter(s) to append, separated by "&"
- Returns:
the string version of the resulting URL
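Appending one or more "&"-separated parameters mostly comes down to choosing the right separator. The sketch below is a simplified assumption of how that could look, not the PostTool code: it does no URL encoding, and the class name and the localhost URL are hypothetical.

```java
final class AppendParamSketch {
    // Hypothetical helper: appends each "&"-separated parameter with "?" or "&" as needed.
    static String appendParam(String url, String param) {
        StringBuilder sb = new StringBuilder(url);
        for (String p : param.split("&")) {
            if (p.isEmpty()) {
                continue; // skip empty pieces such as a leading "&"
            }
            sb.append(sb.indexOf("?") >= 0 ? '&' : '?').append(p);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String url = appendParam("http://localhost:8983/solr/demo/update", "commit=true");
        System.out.println(url);
        System.out.println(appendParam(url, "wt=json&indent=on"));
    }
}
```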
-
postFile
public void postFile(Path file, OutputStream output, String type) throws MalformedURLException, URISyntaxException
Opens the file and posts its contents to the solrUrl, writes the response to output.
appendUrlPath
Appends to the path of the URL.
- Parameters:
uri - the URI
append - the path to append
- Returns:
the final URL version
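One way to append to the path while leaving the query string in place is to rebuild the URI from its components. This is a sketch under assumptions rather than the PostTool implementation; the class name, the unchecked exception wrapping, and the example URL are hypothetical.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class AppendUrlPathSketch {
    // Hypothetical helper: rebuilds the URI with extra text appended to its path.
    static URI appendUrlPath(URI uri, String append) {
        try {
            return new URI(uri.getScheme(), uri.getUserInfo(), uri.getHost(), uri.getPort(),
                uri.getPath() + append, uri.getQuery(), uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        URI base = URI.create("http://localhost:8983/solr/demo/update?commit=true");
        System.out.println(appendUrlPath(base, "/extract"));
    }
}
```

Plain string concatenation would put the appended text after the query string, which is why the path is handled as a separate component here.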
-
guessType
Guesses the type of file, based on file name suffix. Returns "application/octet-stream" if no corresponding mimeMap type.
- Parameters:
path - path to the file
- Returns:
the content-type guessed
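Suffix-based guessing with a fallback can be sketched like this. The MIME_MAP entries and the class name are assumptions for illustration; only the "application/octet-stream" default is taken from the description above.

```java
import java.nio.file.Path;
import java.util.Locale;
import java.util.Map;

final class GuessTypeSketch {
    // Hypothetical suffix -> content-type map; the real mimeMap is larger.
    static final Map<String, String> MIME_MAP = Map.of(
        "xml", "application/xml",
        "json", "application/json",
        "csv", "text/csv",
        "pdf", "application/pdf");

    // Looks up the lower-cased file name suffix, falling back to a generic binary type.
    static String guessType(Path path) {
        String name = path.getFileName().toString();
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            return "application/octet-stream"; // no suffix at all
        }
        String suffix = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return MIME_MAP.getOrDefault(suffix, "application/octet-stream");
    }

    public static void main(String[] args) {
        System.out.println(guessType(Path.of("docs/report.PDF")));
        System.out.println(guessType(Path.of("docs/notes.xyz")));
    }
}
```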
-
postData
Reads data from the data stream and posts it to Solr, writes the response to output.
- Returns:
true if success
-
stringToStream
Converts a string to an input stream.
- Parameters:
s - the string
- Returns:
the input stream
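A string-to-stream conversion is typically a thin wrapper around ByteArrayInputStream. The sketch below assumes UTF-8 encoding, which the description does not state; the class name and the readBack helper are hypothetical and exist only to show the round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class StringToStreamSketch {
    // Assumption: the string is encoded as UTF-8.
    static InputStream stringToStream(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    // Hypothetical helper that reads the stream back into a string.
    static String readBack(InputStream in) {
        try {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readBack(stringToStream("<commit/>")));
    }
}
```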
-
getFileFilterFromFileTypes
-
getNodesFromXP
Gets all nodes matching an XPath.
- Throws:
XPathExpressionException
-
getXP
Gets the string content of the node(s) matching an XPath.
- Parameters:
n - the node (or doc)
xpath - the xpath string
concatAll - if true, text from all matching nodes will be concatenated, else only the first is returned
- Throws:
XPathExpressionException
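The two XPath helpers can be illustrated with the standard javax.xml.xpath API. This is a sketch under assumptions, not the PostTool source: the parameter types, the space used to join matches when concatAll is true, the empty string returned when nothing matches, and the parse helper are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

final class XPathSketch {
    // Evaluates the expression against the node and returns every match.
    static NodeList getNodesFromXP(Node n, String xpath) throws XPathExpressionException {
        return (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(xpath, n, XPathConstants.NODESET);
    }

    // Returns the text of the first match, or of all matches joined by a space.
    static String getXP(Node n, String xpath, boolean concatAll)
            throws XPathExpressionException {
        NodeList nodes = getNodesFromXP(n, xpath);
        if (nodes.getLength() == 0) {
            return ""; // assumption: no match yields an empty string
        }
        if (!concatAll) {
            return nodes.item(0).getTextContent();
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < nodes.getLength(); i++) {
            sb.append(nodes.item(i).getTextContent()).append(' ');
        }
        return sb.toString().trim();
    }

    // Hypothetical helper so the example is self-contained.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<html><body><a href=\"a.html\">A</a><a href=\"b.html\">B</a></body></html>");
        System.out.println(getXP(doc, "//a/@href", false));
        System.out.println(getXP(doc, "//a/@href", true));
    }
}
```

An expression such as //a/@href is the kind of query a crawler would use to collect the links of a fetched page.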
-
makeDom
public static Document makeDom(byte[] in) throws SAXException, IOException, ParserConfigurationException
Takes a byte array as input and returns a DOM.
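Parsing bytes into a DOM can be done with the JDK's DocumentBuilderFactory. The sketch below mirrors the documented signature but is otherwise an assumption: the class name is made up, and any parser configuration the real method applies is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

final class MakeDomSketch {
    // Wraps the bytes in a stream and parses them with a default DocumentBuilder.
    static Document makeDom(byte[] in)
            throws SAXException, IOException, ParserConfigurationException {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(in));
    }

    public static void main(String[] args) throws Exception {
        byte[] xml = "<response><status>0</status></response>".getBytes(StandardCharsets.UTF_8);
        Document doc = makeDom(xml);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```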
-