ZooKeeper Ensemble Configuration
Although Solr comes bundled with Apache ZooKeeper, you are strongly encouraged to use an external ZooKeeper setup in production.
While using Solr’s embedded ZooKeeper instance is fine for getting started, you shouldn’t use this in production because it does not provide any failover: if the Solr instance that hosts ZooKeeper shuts down, ZooKeeper is also shut down. Any shards or Solr instances that rely on it will not be able to communicate with it or each other.
The solution to this problem is to set up an external ZooKeeper ensemble, which is a number of servers running ZooKeeper that communicate with each other to coordinate the activities of the cluster.
How Many ZooKeeper Nodes?
The first question to answer is the number of ZooKeeper nodes you will run in your ensemble.
When planning how many ZooKeeper nodes to configure, keep in mind that the main principle for a ZooKeeper ensemble is maintaining a majority of servers to serve requests. This majority is called a quorum.
"For a ZooKeeper service to be active, there must be a majority of non-failing machines that can communicate with each other. To create a deployment that can tolerate the failure of F machines, you should count on deploying 2xF+1 machines.
http://zookeeper.apache.org/doc/r3.9.1/zookeeperAdmin.html
To properly maintain a quorum, it’s highly recommended to have an odd number of ZooKeeper servers in your ensemble.
To see why, consider a two-node ensemble: if one node goes down, only 50% of the servers remain. Since that is not a majority, ZooKeeper will stop serving requests.
However, if you have three ZooKeeper nodes and one goes down, 66% of your servers are still available and ZooKeeper will continue operating normally while you repair the downed node. With five nodes, you can continue operating with two nodes down if necessary.
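To make the 2xF+1 arithmetic concrete, the smallest majority (quorum) for common ensemble sizes works out as follows:

3 nodes: quorum of 2, tolerates 1 failed node
5 nodes: quorum of 3, tolerates 2 failed nodes
7 nodes: quorum of 4, tolerates 3 failed nodes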
It’s not generally recommended to go above 5 nodes. While it may seem that more nodes provide greater fault tolerance and availability, in practice it becomes less efficient because of the amount of inter-node coordination that occurs. Unless you have a truly massive Solr cluster (on the scale of 1,000s of nodes), stay at 3 nodes as a general rule, or perhaps 5 if you have a larger cluster.
More information on ZooKeeper clusters is available from the ZooKeeper documentation at http://zookeeper.apache.org/doc/r3.9.1/zookeeperAdmin.html#sc_zkMulitServerSetup.
Download Apache ZooKeeper
The first step in setting up Apache ZooKeeper is, of course, to download the software. It’s available from http://zookeeper.apache.org/releases.html.
Solr currently uses Apache ZooKeeper v3.9.1.
When using an external ZooKeeper ensemble, you will need to keep your local installation up-to-date with the latest version distributed with Solr. Since it is a stand-alone application in this scenario, it does not get upgraded as part of a standard Solr upgrade.
ZooKeeper Installation
Installation consists of extracting the files into a target directory of your choice. The actual directory itself doesn’t matter, as long as you know where it is.
The command to unpack the ZooKeeper package is:
tar xvf apache-zookeeper-3.9.1-bin.tar.gz
The resulting directory is the <ZOOKEEPER_HOME> for ZooKeeper on this server.
Installing and unpacking ZooKeeper must be repeated on each server where ZooKeeper will be run.
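For example, a minimal sketch of unpacking into /opt on one server (the target directory is an assumption; any location works, and the archive name should match the release you actually downloaded):

tar xvf apache-zookeeper-3.9.1-bin.tar.gz -C /opt
# /opt/apache-zookeeper-3.9.1-bin is now <ZOOKEEPER_HOME> on this server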
Configuration for a ZooKeeper Ensemble
After installation, we’ll first take a look at the basic configuration for ZooKeeper, then specific parameters for configuring each node to be part of an ensemble.
Initial Configuration
To configure your ZooKeeper instance, create a file named <ZOOKEEPER_HOME>/conf/zoo.cfg.
A sample configuration file is included in your ZooKeeper installation as conf/zoo_sample.cfg; you can edit and rename that file instead of creating a new one if you prefer.
The file should contain the following information to start:
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
4lw.commands.whitelist=mntr,conf,ruok
The parameters are as follows:
tickTime
Required. Default: none.
Part of what ZooKeeper does is determine which servers are up and running at any given time, and the minimum session timeout is defined as two "ticks". The tickTime parameter specifies, in milliseconds, how long each tick should be.

dataDir
Required. Default: none.
This is the directory in which ZooKeeper will store data about the cluster. This directory must be empty before starting ZooKeeper for the first time.

clientPort
Required. Default: none.
This is the port on which Solr will access ZooKeeper.

4lw.commands.whitelist
Optional. Default: none.
This allows the Solr Admin UI to query ZooKeeper. Optionally, use * to enable all "4 letter word" commands; the three listed here (mntr, conf, ruok) are enough to enable the Admin UI.
These are the basic parameters that need to be in use on each ZooKeeper node, so this file must be copied to or created on each node.
Next we’ll customize this configuration to work within an ensemble.
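Once a node is running (see Starting and Stopping ZooKeeper below), you can sanity-check this part of the configuration by sending one of the whitelisted "4 letter word" commands to it, assuming a tool such as netcat (nc) is available on the server:

# A healthy node answers "imok"
echo ruok | nc localhost 2181
# "mntr" prints monitoring details, including the node's current state
echo mntr | nc localhost 2181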
Ensemble Configuration
To complete configuration for an ensemble, we need to set additional parameters so each node knows who it is in the ensemble and where every other node is.
Each of the examples below assumes you are installing ZooKeeper on different servers with different hostnames.
Once complete, your zoo.cfg file might look like this:
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
4lw.commands.whitelist=mntr,conf,ruok
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
We’ve added these parameters to the four we had already:
initLimit
Required. Default: none.
Amount of time, in ticks, to allow followers to connect and sync to a leader. In this case, you have 5 ticks, each of which is 2000 milliseconds long, so the server will wait as long as 10 seconds to connect and sync with the leader.

syncLimit
Required. Default: none.
Amount of time, in ticks, to allow followers to sync with ZooKeeper. If followers fall too far behind a leader, they will be dropped.

server.X
Required. Default: none.
These are the server IDs (the X part), hostnames (or IP addresses), and ports for all servers in the ensemble. The IDs differentiate each node of the ensemble and allow each node to know where each of the other nodes is located. The ports can be any ports you choose; ZooKeeper’s default ports are 2888:3888.
Since we’ve assigned server IDs to specific hosts/ports, we must also define which server in the list this node is. We do this with a myid file stored in the data directory (defined by the dataDir parameter). The contents of the myid file are only the server ID.
In the case of the configuration example above, you would create the file /var/lib/zookeeper/myid with the content "1" (without quotes), as in this example (see also the shell sketch after this list):
1

autopurge.snapRetainCount
Optional. Default: 3
The number of snapshots and corresponding transaction logs to retain when purging old snapshots and transaction logs.
ZooKeeper automatically keeps a transaction log and writes to it as changes are made. A snapshot of the current state is taken periodically, and this snapshot supersedes transaction logs older than the snapshot. However, ZooKeeper never cleans up either the old snapshots or the old transaction logs; over time they will silently fill available disk space on each server.
To avoid this, set the autopurge.snapRetainCount and autopurge.purgeInterval parameters to enable an automatic clean up (purge) to occur at regular intervals. The autopurge.snapRetainCount parameter will keep the defined number of snapshots and transaction logs when a clean up occurs. This parameter can be configured higher than 3, but cannot be set lower than 3.

autopurge.purgeInterval
Optional. Default: 0
The time in hours between purge tasks. The default for this parameter is 0, so it must be set to 1 or higher to enable automatic clean up of snapshots and transaction logs. Setting it as high as 24, for once a day, is acceptable if preferred.
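For example, a minimal sketch of creating the myid files, assuming dataDir=/var/lib/zookeeper as configured above (run each command on the server it names):

echo "1" > /var/lib/zookeeper/myid   # on zoo1
echo "2" > /var/lib/zookeeper/myid   # on zoo2
echo "3" > /var/lib/zookeeper/myid   # on zoo3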
We’ll repeat this configuration on each node.
On the second node, update the <ZOOKEEPER_HOME>/conf/zoo.cfg file so it matches the content on node 1 (particularly the server hosts and ports):
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
4lw.commands.whitelist=mntr,conf,ruok
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
On the second node, create a myid file with the content "2" and put it in the /var/lib/zookeeper directory:
2
On the third node, update the <ZOOKEEPER_HOME>/conf/zoo.cfg file so it matches the content on nodes 1 and 2 (particularly the server hosts and ports):
tickTime=2000
dataDir=/var/lib/zookeeper
clientPort=2181
4lw.commands.whitelist=mntr,conf,ruok
initLimit=5
syncLimit=2
server.1=zoo1:2888:3888
server.2=zoo2:2888:3888
server.3=zoo3:2888:3888
autopurge.snapRetainCount=3
autopurge.purgeInterval=1
And create the myid file in the /var/lib/zookeeper directory:
3
Repeat this for servers 4 and 5 if you are creating a 5-node ensemble (a rare case).
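In that case, each zoo.cfg would list all five servers; a sketch of the two additional entries (the hostnames zoo4 and zoo5 are assumptions):

server.4=zoo4:2888:3888
server.5=zoo5:2888:3888

Servers 4 and 5 would also need myid files containing 4 and 5, respectively.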
ZooKeeper Environment Configuration
To ease troubleshooting in case of problems with the ensemble later, it’s recommended to run ZooKeeper with logging enabled and with proper JVM garbage collection (GC) settings.
1. Create a file named zookeeper-env.sh and put it in the <ZOOKEEPER_HOME>/conf directory (the same place you put zoo.cfg). This file will need to exist on each server of the ensemble.
2. Add the following settings to the file:
ZOO_LOG_DIR="/path/for/log/files"
ZOO_LOG4J_PROP="INFO,ROLLINGFILE"
SERVER_JVMFLAGS="-Xms2048m -Xmx2048m -verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:$ZOO_LOG_DIR/zookeeper_gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=9 -XX:GCLogFileSize=20M"
The property ZOO_LOG_DIR defines the location on the server where ZooKeeper will write its logs. ZOO_LOG4J_PROP sets the logging level and log appenders.
With SERVER_JVMFLAGS, we’ve defined several parameters for garbage collection and logging GC-related events. One of these parameters is -Xloggc:$ZOO_LOG_DIR/zookeeper_gc.log, which puts the garbage collection logs in the same directory we’ve defined for ZooKeeper logs, in a file named zookeeper_gc.log.
3. Review the default settings in <ZOOKEEPER_HOME>/conf/log4j.properties, especially the log4j.appender.ROLLINGFILE.MaxFileSize parameter. This sets the size at which log files will be rolled over; by default it is 10MB.
4. Copy zookeeper-env.sh and any changes to log4j.properties to each server in the ensemble.
The above instructions are for Linux servers only. The default zkServer.sh script includes support for a zookeeper-env.sh file, but the Windows version of the script, zkServer.cmd, does not. To make the same configuration on a Windows server, the changes would need to be made directly in zkServer.cmd.
At this point, you are ready to start your ZooKeeper ensemble.
More Information about ZooKeeper
ZooKeeper provides a great deal of power through additional configurations, but delving into them is beyond the scope of Solr’s documentation. For more information, see the ZooKeeper documentation.
Starting and Stopping ZooKeeper
Start ZooKeeper
To start the ensemble, use the <ZOOKEEPER_HOME>/bin/zkServer.sh script (zkServer.cmd on Windows), as with this command:
zkServer.sh start
zkServer.cmd start
This command needs to be run on each server that will run ZooKeeper.
You should see the ZooKeeper logs in the directory you defined for storing them.
However, immediately after startup you may not see zookeeper_gc.log yet, as it likely will not appear until garbage collection has run for the first time.
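Once all nodes are up, you can check each node’s role, and later stop it, with the same script; a minimal sketch:

# Run on each node: one node should report "Mode: leader" and the others "Mode: follower"
zkServer.sh status
# To stop a node when needed
zkServer.sh stop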
Solr Configuration
When starting Solr, you must provide an address for ZooKeeper or Solr won’t know how to use it. This can be done in two ways: by defining the connect string, a list of servers where ZooKeeper is running, at every startup on every node of the Solr cluster, or by editing Solr’s include file as a permanent system parameter. Both approaches are described below.
When referring to the location of ZooKeeper within Solr, it’s best to use the addresses of all the servers in the ensemble. If one happens to be down, Solr will automatically be able to send its request to another server in the list.
ZooKeeper version 3.5 and later supports dynamic reconfiguration of server addresses and roles. But note that Solr will only be able to talk to the servers listed in the static ZooKeeper connect string.
Using a chroot
If your ensemble is or will be shared among other systems besides Solr, you should consider defining application-specific znodes, or a hierarchical namespace that will only include Solr’s files.
Once you create a znode for each application, you add its name, also called a chroot, to the end of your connect string whenever you tell Solr where to access ZooKeeper.
Creating a chroot is done with a bin/solr command:
bin/solr zk mkroot /solr -z zk1:2181,zk2:2181,zk3:2181
See the section Create a znode for more examples of this command.
Once the znode is created, it behaves in a similar way to a directory on a filesystem: the data stored by Solr in ZooKeeper is nested beneath the main data directory and won’t be mixed with data from another system or process that uses the same ZooKeeper ensemble.
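As an optional check after creating the chroot, you can list what is stored beneath it with the bin/solr zk ls command (the /solr chroot and hostnames match the assumptions used above):

bin/solr zk ls -r /solr -z zk1:2181,zk2:2181,zk3:2181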
Alternatively, Solr can create the znode automatically as part of the first Solr instance’s startup; see below for an example.
Using the -z Parameter with bin/solr
Pointing Solr at the ZooKeeper ensemble you’ve created is a simple matter of using the -z parameter with the bin/solr script.
For example, to point the Solr instance to the ZooKeeper you’ve started on port 2181 on three servers with chroot /solr (see Using a chroot above), this is what you’d need to do:
bin/solr start -e cloud -z zk1:2181,zk2:2181,zk3:2181/solr
If the znode does not exist, you can set the ZK_CREATE_CHROOT environment variable to true to create it automatically on start-up:
ZK_CREATE_CHROOT=true bin/solr start -e cloud -z zk1:2181,zk2:2181,zk3:2181/solr
Updating Solr Include Files
If you update the Solr include file (solr.in.sh or solr.in.cmd), which overrides defaults used with bin/solr, you will not have to use the -z parameter with bin/solr commands.
Linux: solr.in.sh
The section to look for will be commented out:
# Set the ZooKeeper connection string if using an external ZooKeeper ensemble
# e.g. host1:2181,host2:2181/chroot
# Leave empty if not using SolrCloud
#ZK_HOST=""
Remove the comment marks at the start of the line and enter the ZooKeeper connect string:
# Set the ZooKeeper connection string if using an external ZooKeeper ensemble
# e.g. host1:2181,host2:2181/chroot
# Leave empty if not using SolrCloud
ZK_HOST="zk1:2181,zk2:2181,zk3:2181/solr"
Windows: solr.in.cmd
The section to look for will be commented out:
REM Set the ZooKeeper connection string if using an external ZooKeeper ensemble
REM e.g. host1:2181,host2:2181/chroot
REM Leave empty if not using SolrCloud
REM set ZK_HOST=
Remove the comment marks at the start of the line and enter the ZooKeeper connect string:
REM Set the ZooKeeper connection string if using an external ZooKeeper ensemble
REM e.g. host1:2181,host2:2181/chroot
REM Leave empty if not using SolrCloud
set ZK_HOST=zk1:2181,zk2:2181,zk3:2181/solr
Now you will not have to enter the connection string when starting Solr.
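With ZK_HOST set in the include file, starting a node in SolrCloud mode picks up the ensemble automatically; a minimal sketch:

# -c starts Solr in SolrCloud mode; the connect string comes from ZK_HOST in the include file
bin/solr start -c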
Increasing the File Size Limit
ZooKeeper is designed to hold small files, on the order of kilobytes. By default, ZooKeeper’s file size limit is 1MB. Attempting to write or read files larger than this will cause errors.
Some Solr features, e.g., text analysis synonyms, LTR, and OpenNLP named entity recognition, require configuration resources that can be larger than the default limit.
ZooKeeper can be configured, via the Java system property jute.maxbuffer, to increase this limit.
Note that this configuration, which is required both for ZooKeeper server(s) and for all clients that connect to the server(s), must be the same everywhere it is specified.
Configuring jute.maxbuffer on ZooKeeper Nodes
jute.maxbuffer must be configured on each external ZooKeeper node.
This can be achieved in any of the following ways; note though that only the first option works on Windows:
- In <ZOOKEEPER_HOME>/conf/zoo.cfg, e.g., to increase the file size limit to one byte less than 10MB, add this line:
jute.maxbuffer=0x9fffff
- In <ZOOKEEPER_HOME>/conf/zookeeper-env.sh, e.g., to increase the file size limit to 50MiB, add this line:
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=50000000"
- In <ZOOKEEPER_HOME>/bin/zkServer.sh, add a JVMFLAGS environment variable assignment near the top of the script, e.g., to increase the file size limit to 5MiB:
JVMFLAGS="$JVMFLAGS -Djute.maxbuffer=5000000"
Configuring jute.maxbuffer for ZooKeeper Clients
The bin/solr script invokes Java programs that act as ZooKeeper clients.
When you use Solr’s bundled ZooKeeper server instead of setting up an external ZooKeeper ensemble, the configuration described below will also configure the ZooKeeper server.
Add the setting to the SOLR_OPTS environment variable in Solr’s include file (bin/solr.in.sh or solr.in.cmd):
Linux: solr.in.sh
The section to look for will start:
# Anything you add to the SOLR_OPTS variable will be included in the java
# start command line as-is, in ADDITION to other options. If you specify the
# -a option on start script, those options will be appended as well. Examples:
Add the following line to increase the file size limit to 2MB:
SOLR_OPTS="$SOLR_OPTS -Djute.maxbuffer=0x200000"
Windows: solr.in.cmd
The section to look for will start:
REM Anything you add to the SOLR_OPTS variable will be included in the java
REM start command line as-is, in ADDITION to other options. If you specify the
REM -a option on start script, those options will be appended as well. Examples:
Add the following line to increase the file size limit to 2MB:
set SOLR_OPTS=%SOLR_OPTS% -Djute.maxbuffer=0x200000
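Once both the ZooKeeper servers and Solr’s client-side setting have been raised, one way to sanity-check the new limit is to copy a file larger than the old 1MB default into ZooKeeper with bin/solr zk cp (the file name and chroot here are assumptions):

bin/solr zk cp ./large-synonyms.txt zk:/solr/large-synonyms.txt -z zk1:2181,zk2:2181,zk3:2181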
Securing the ZooKeeper Connection
You may also want to secure the communication between ZooKeeper and Solr.
To set up ACL protection of znodes, see the section ZooKeeper Access Control.