Introduction to Solr

Solr is a search server built on top of Apache Lucene, an open source, Java-based, information retrieval library. It is designed to drive powerful document retrieval applications - wherever you need to serve data to users based on their queries, Solr can work for you.

Solr is based on open standards and it is highly extensible. Solr queries are simple HTTP request URLs and the response is a structured document: mainly JSON, but it could also be XML, CSV, or other formats. This means that a wide variety of clients will be able to use Solr, from other web applications to browser clients, rich client applications, and mobile devices. Any platform capable of HTTP can talk to Solr. See Client APIs for details on client APIs.

Flexible schema configurations allow nearly any type of data to be stored in Solr. The Schema Elements has more details on these options.

Solr offers support for the simplest keyword searching through to complex queries on multiple fields and faceted search results. Collapsing and clustering results offer compelling features for e-commerce and storefronts. Powerful math expressions provide the backbone for advanced analytics use cases. The Query Syntax and Parsers has more information about searching and queries.

If Solr’s capabilities are not impressive enough, its ability to handle very high-volume applications should do the trick.

A relatively common scenario is that you have so much data, or so many queries, that a single Solr server is unable to handle your entire workload. In this case, you can scale up the capabilities of your application using Solr Cluster Types to better distribute the data, and the processing of requests, across many servers. Multiple options can be mixed and matched depending on the scalability you need.

Best of all, this talk about high-volume applications is not just hypothetical: some of the famous Internet sites that use Solr today are Macy’s, EBay, and Zappo’s. For more examples, take a look at