This reference guide describes Apache Solr, the open source solution for search.
Solr builds on Lucene, an open source Java library that provides indexing and search technology, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities. Both Solr and Lucene are managed by the Apache Software Foundation (www.apache.org). You can download Apache Solr from the Solr website at http://lucene.apache.org/solr/.
This guide contains the following main sections:
Getting Started: This section guides you through the installation and setup of Solr.
Using the Solr Administration User Interface: This section introduces the Solr web-based user interface. From your browser you can view configuration files, submit queries, view log file settings and Java environment settings, and monitor and control distributed configurations.
Documents, Fields, and Schema Design: This section describes how Solr organizes its data for indexing. It explains how a Solr schema defines the fields and field types which Solr uses to organize data within the document files it indexes.
Understanding Analyzers, Tokenizers, and Filters: This section explains how Solr prepares text for indexing and searching. Analyzers parse text and produce a stream of tokens, lexical units used for indexing and searching. Tokenizers break field data down into tokens. Filters perform other transformational or selective work on token streams.
Indexing and Basic Data Operations: This section describes the indexing process and basic index operations, such as commit, optimize, and rollback.
Searching: This section presents an overview of the search process in Solr. It describes the main components used in searches, including request handlers, query parsers, and response writers. It lists the query parameters that can be passed to Solr, and it describes features such as boosting and faceting, which can be used to fine-tune search results.
The Well-Configured Solr Instance: This section discusses performance tuning for Solr. It begins with an overview of the solrconfig.xml file, then tells you how to configure cores with solr.xml, how to configure the Lucene index writer, and more.
Monitoring Solr: Administration and monitoring can be performed using the web-based administration console, through the command line interface, or using REST APIs.
Deployment and Operations: An important aspect of Solr is that all operations and deployment can be done online, with minimal or no impact to running applications. This includes minor upgrades, provisioning and removing nodes, backing up and restoring indexes, and editing configurations.
SolrCloud: This section describes SolrCloud, the newest of Solr’s features, which provides comprehensive distributed capabilities.
Securing Solr: When planning how to secure Solr, you should consider which of the available features or approaches are right for you.
Legacy Scaling and Distribution: This section tells you how to grow a Solr distribution by dividing a large index into sections called shards, which are then distributed across multiple servers, or by replicating a single index across multiple servers.
Client APIs: This section tells you how to access Solr through various client APIs, including JavaScript, JSON, and Ruby.
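As a rough illustration of the roles described under Understanding Analyzers, Tokenizers, and Filters, the following Python sketch chains a tokenizer and two filters into a simple analyzer. It is a conceptual model only and does not use Solr or Lucene APIs; the names simple_tokenizer, lowercase_filter, stop_filter, analyze, and the STOP_WORDS list are all hypothetical.

```python
import re
from typing import Callable, Iterable, List

# Hypothetical stop list, chosen only for this illustration.
STOP_WORDS = {"a", "an", "the", "of", "and"}

# A filter takes a stream of tokens and returns a transformed stream.
TokenFilter = Callable[[Iterable[str]], Iterable[str]]


def simple_tokenizer(text: str) -> List[str]:
    """Break field data into tokens by splitting on non-word characters."""
    return [t for t in re.split(r"\W+", text) if t]


def lowercase_filter(tokens: Iterable[str]) -> Iterable[str]:
    """Transformational filter: normalize every token to lowercase."""
    return (t.lower() for t in tokens)


def stop_filter(tokens: Iterable[str]) -> Iterable[str]:
    """Selective filter: drop tokens that appear in the stop list."""
    return (t for t in tokens if t not in STOP_WORDS)


def analyze(text: str,
            tokenizer: Callable[[str], List[str]],
            filters: List[TokenFilter]) -> List[str]:
    """Analyzer: run the tokenizer, then apply each filter in order."""
    tokens: Iterable[str] = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return list(tokens)


if __name__ == "__main__":
    print(analyze("The Art of Search", simple_tokenizer,
                  [lowercase_filter, stop_filter]))
```

The order of the filters matters in this sketch: lowercasing runs before stop-word removal so that "The" matches the lowercase entry in the stop list.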