SolrCloud uses ZooKeeper for shared information and for coordination.
This section describes how to configure Solr to add more restrictive ACLs to the ZooKeeper content it creates, and how to tell Solr about the credentials required to access the content in ZooKeeper. If you want to use ACLs in your ZooKeeper nodes, you will have to activate this functionality; by default, Solr applies the open-unsafe ACL to everything it creates and uses no credentials.
Content stored in ZooKeeper is critical to the operation of a SolrCloud cluster. Open access to SolrCloud content on ZooKeeper could lead to a variety of problems. For example:
Changing configuration might cause Solr to fail or behave in an unintended way.
Changing cluster state information into something wrong or inconsistent might very well make a SolrCloud cluster behave strangely.
Adding a delete-collection job to be carried out by the Overseer will cause data to be deleted from the cluster.
You may want to enable ZooKeeper ACLs with Solr if you grant access to your ZooKeeper ensemble to entities you do not trust, or if you want to reduce risk of bad actions resulting from, for example:
Malware that found its way into your system.
Other systems using the same ZooKeeper ensemble (a "bad thing" might be done by accident).
You might even want to limit read access if ZooKeeper holds information that not everyone should see, or if you generally work on a need-to-know basis.
Protecting ZooKeeper itself could mean many different things. This section is about protecting Solr content in ZooKeeper. ZooKeeper content is persisted on disk and (partly) held in the memory of the ZooKeeper processes. This section is not about protecting ZooKeeper data at the storage or ZooKeeper process level; that is for ZooKeeper to deal with.
But this content is also available to "the outside" via the ZooKeeper API. Outside processes can connect to ZooKeeper and create/update/delete/read content; for example, a Solr node in a SolrCloud cluster wants to create/update/delete/read, and a SolrJ client wants to read from the cluster. It is the responsibility of the outside processes that create/update content to set up ACLs on the content. ACLs describe who is allowed to read, update, delete, create, etc. Each piece of information (znode/content) in ZooKeeper has its own set of ACLs, and inheritance or sharing is not possible. The default behavior in Solr is to add one ACL on all the content it creates: one ACL that gives anyone the permission to do anything (in ZooKeeper terms this is called "the open-unsafe ACL").
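To make "who is allowed to do what" concrete, the following is a minimal, self-contained sketch of how a single ACL entry can be modeled. The bit values mirror ZooKeeper's `ZooDefs.Perms` constants, and the open-unsafe entry corresponds to the `world:anyone` identity with all permissions. The `AclEntry` class and the `ZkPermsSketch` name are hypothetical stand-ins for illustration, not ZooKeeper or Solr classes.

```java
import java.util.ArrayList;
import java.util.List;

public class ZkPermsSketch {
    // Bit values mirror ZooKeeper's ZooDefs.Perms constants.
    static final int READ = 1 << 0;
    static final int WRITE = 1 << 1;
    static final int CREATE = 1 << 2;
    static final int DELETE = 1 << 3;
    static final int ADMIN = 1 << 4;
    static final int ALL = READ | WRITE | CREATE | DELETE | ADMIN;

    /** Hypothetical stand-in for one ACL entry: scheme, id, and permission bits. */
    static final class AclEntry {
        final String scheme;
        final String id;
        final int perms;

        AclEntry(String scheme, String id, int perms) {
            this.scheme = scheme;
            this.id = id;
            this.perms = perms;
        }

        /** True when every bit of the requested permission is granted. */
        boolean allows(int perm) {
            return (perms & perm) == perm;
        }
    }

    /** The "open-unsafe" entry: anyone may do anything. */
    static AclEntry openUnsafe() {
        return new AclEntry("world", "anyone", ALL);
    }

    /** Lists the permission names contained in a bit mask. */
    static List<String> describe(int perms) {
        List<String> names = new ArrayList<>();
        if ((perms & READ) != 0) names.add("READ");
        if ((perms & WRITE) != 0) names.add("WRITE");
        if ((perms & CREATE) != 0) names.add("CREATE");
        if ((perms & DELETE) != 0) names.add("DELETE");
        if ((perms & ADMIN) != 0) names.add("ADMIN");
        return names;
    }

    public static void main(String[] args) {
        AclEntry open = openUnsafe();
        System.out.println(open.scheme + ":" + open.id + " -> " + describe(open.perms));
    }
}
```

Because each znode carries its own list of such entries, restricting access means replacing this single permissive entry on every znode Solr creates, which is what the providers described in this section do.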
How to Enable ACLs
We want to be able to:
Control the credentials Solr uses for its ZooKeeper connections. The credentials are used to get permission to perform operations in ZooKeeper.
Control which ACLs Solr will add to znodes (ZooKeeper files/folders) it creates in ZooKeeper.
Control it "from the outside", so that you do not have to modify and/or recompile Solr code to turn this on.
Solr nodes, clients, and tools (e.g., ZkCLI) always use a Java class called SolrZkClient for their ZooKeeper interactions. The implementation of the solution described here is all about changing SolrZkClient. If you use SolrZkClient in your application, the descriptions below will be true for your application too.
You control which credentials provider will be used by configuring the zkCredentialsProvider property in solr.xml's <solrcloud> section to the name of a class (on the classpath) implementing the ZkCredentialsProvider interface. server/solr/solr.xml in the Solr distribution defines the zkCredentialsProvider such that it will take on the value of the same-named zkCredentialsProvider system property if it is defined (e.g., by uncommenting the SOLR_ZK_CREDS_AND_ACLS environment variable definition in solr.in.sh/.cmd - see below), or if not, default to the DefaultZkCredentialsProvider implementation.
Out of the Box Credential Implementations
You can always make your own implementation, but Solr comes with two implementations:
org.apache.solr.common.cloud.DefaultZkCredentialsProvider: Its getCredentials() returns a list of length zero, or "no credentials used". This is the default.
org.apache.solr.common.cloud.VMParamsSingleSetCredentialsDigestZkCredentialsProvider: This lets you define your credentials using system properties. It supports at most one set of credentials.
The scheme is "digest". The username and password are defined by system properties zkDigestUsername and zkDigestPassword. This set of credentials will be added to the list of credentials returned by getCredentials() if both username and password are provided.
If the one set of credentials above is not added to the list, this implementation will fall back to default behavior and use the (empty) credentials list from DefaultZkCredentialsProvider.
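The behavior described for the system-property based credentials provider can be sketched as follows. This is an illustrative model only, not Solr's source: the `Credential` class and `DigestCredentialsSketch` name are hypothetical, the properties are passed in explicitly so the logic is easy to exercise, and treating an empty string the same as a missing property is an assumption. The `user:password` byte layout is the form ZooKeeper's digest scheme expects for authentication data.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.List;
import java.util.Properties;

public class DigestCredentialsSketch {
    /** Hypothetical stand-in for one (scheme, auth) credential pair. */
    static final class Credential {
        final String scheme;
        final byte[] auth;

        Credential(String scheme, byte[] auth) {
            this.scheme = scheme;
            this.auth = auth;
        }
    }

    // Assumption: a missing or empty value counts as "not provided".
    private static boolean provided(String value) {
        return value != null && !value.isEmpty();
    }

    /** Returns one digest credential if both properties are set, else an empty list. */
    static List<Credential> getCredentials(Properties props) {
        String user = props.getProperty("zkDigestUsername");
        String pass = props.getProperty("zkDigestPassword");
        if (!provided(user) || !provided(pass)) {
            // Same result as the default provider: no credentials used.
            return Collections.emptyList();
        }
        byte[] auth = (user + ":" + pass).getBytes(StandardCharsets.UTF_8);
        return Collections.singletonList(new Credential("digest", auth));
    }

    public static void main(String[] args) {
        // e.g. java -DzkDigestUsername=admin-user -DzkDigestPassword=... DigestCredentialsSketch
        List<Credential> creds = getCredentials(System.getProperties());
        System.out.println("credentials in use: " + creds.size());
    }
}
```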
You control which ACLs will be added by configuring the zkACLProvider property in solr.xml's <solrcloud> section to the name of a class (on the classpath) implementing the ZkACLProvider interface. server/solr/solr.xml in the Solr distribution defines the zkACLProvider such that it will take on the value of the same-named zkACLProvider system property if it is defined (e.g., by uncommenting the SOLR_ZK_CREDS_AND_ACLS environment variable definition in solr.in.sh/.cmd - see below), or if not, default to the DefaultZkACLProvider implementation.
Out of the Box ACL Implementations
You can always make your own implementation, but Solr comes with:
org.apache.solr.common.cloud.DefaultZkACLProvider: It returns a list of length one for all zNodePath-s. The single ACL entry in the list is "open-unsafe". This is the default.
org.apache.solr.common.cloud.VMParamsAllAndReadonlyDigestZkACLProvider: This lets you define your ACLs using system properties. Its getACLsToAdd() implementation will apply only admin ACLs to pre-defined sensitive paths as defined by SecurityAwareZkACLProvider (/security.json and /security/*) and both admin and user ACLs to the rest of the contents. The two sets of roles will be defined as:
A user that is allowed to do everything.
The permission is ALL (corresponding to all of CREATE, READ, WRITE, DELETE, and ADMIN), and the scheme is "digest".
The username and password are defined by system properties zkDigestUsername and zkDigestPassword, respectively.
This ACL will not be added to the list of ACLs unless both username and password are provided.
A user that is only allowed to perform read operations.
The permission is READ and the scheme is "digest".
The username and password are defined by system properties zkDigestReadonlyUsername and zkDigestReadonlyPassword, respectively.
This ACL will not be added to the list of ACLs unless both username and password are provided.
org.apache.solr.common.cloud.SaslZkACLProvider: Requires SASL authentication. Gives all permissions for the user specified in system property solr.authorization.superuser (default: solr) when using SASL, and gives read permissions for anyone else. Designed for a setup where configurations have already been set up and will not be modified, or where configuration changes are controlled via Solr APIs. This provider will be useful for administration in a Kerberos environment. In such an environment, the administrator wants Solr to authenticate to ZooKeeper using SASL, since this is the only way to authenticate with ZooKeeper via Kerberos.
If none of the above ACLs is added to the list, the (open-unsafe) ACL list of DefaultZkACLProvider will be used by default.
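The selection rules described for the system-property based ACL provider can be sketched as below. This is a simplified model under stated assumptions, not Solr's implementation: `Acl`, `DigestAclSketch`, and `isSecurityPath` are hypothetical names, the path check follows the "/security.json and /security/*" wording literally, and the fallback is the single open-unsafe entry that DefaultZkACLProvider is described as returning. A real digest ACL stores a hashed `user:digest` id rather than the bare username; that detail is left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class DigestAclSketch {
    static final int READ = 1;  // same value as ZooDefs.Perms.READ
    static final int ALL = 31;  // same value as ZooDefs.Perms.ALL

    /** Hypothetical stand-in for one ACL entry. */
    static final class Acl {
        final String scheme;
        final String user;
        final int perms;

        Acl(String scheme, String user, int perms) {
            this.scheme = scheme;
            this.user = user;
            this.perms = perms;
        }
    }

    // Assumption: a missing or empty value counts as "not provided".
    private static boolean provided(String value) {
        return value != null && !value.isEmpty();
    }

    /** Sensitive paths that should receive admin ACLs only. */
    static boolean isSecurityPath(String path) {
        return path.equals("/security.json") || path.startsWith("/security/");
    }

    static List<Acl> getACLsToAdd(String path, Properties props) {
        List<Acl> acls = new ArrayList<>();
        String admin = props.getProperty("zkDigestUsername");
        String adminPw = props.getProperty("zkDigestPassword");
        if (provided(admin) && provided(adminPw)) {
            acls.add(new Acl("digest", admin, ALL));
        }
        String reader = props.getProperty("zkDigestReadonlyUsername");
        String readerPw = props.getProperty("zkDigestReadonlyPassword");
        // The read-only user is never granted access to security paths.
        if (!isSecurityPath(path) && provided(reader) && provided(readerPw)) {
            acls.add(new Acl("digest", reader, READ));
        }
        if (acls.isEmpty()) {
            // Nothing configured: fall back to the open-unsafe entry.
            acls.add(new Acl("world", "anyone", ALL));
        }
        return acls;
    }

    public static void main(String[] args) {
        for (String path : new String[] {"/collections", "/security.json"}) {
            System.out.println(path + " -> " + getACLsToAdd(path, System.getProperties()).size() + " ACL(s)");
        }
    }
}
```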
Notice the overlap in system property names with credentials provider VMParamsSingleSetCredentialsDigestZkCredentialsProvider (described above). This is to let the two providers collaborate in a nice and perhaps common way: we always protect access to content by limiting to two users - an admin-user and a readonly-user - AND we always connect with credentials corresponding to this same admin-user, basically so that we can do anything to the content/znodes we create ourselves.
You can give the readonly credentials to "clients" of your SolrCloud cluster - e.g., to be used by SolrJ clients. They will be able to read whatever is necessary to run a functioning SolrJ client, but they will not be able to modify any content in ZooKeeper.
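One detail behind the shared property names is that the credentials side sends `username:password` in the clear, while the ACL side stores only a hashed identity. The sketch below shows how such a digest id can be derived, assuming ZooKeeper's default digest algorithm (SHA-1, Base64-encoded), which is what `DigestAuthenticationProvider.generateDigest` produces by default as far as I am aware. The class and method names here are hypothetical, and the passwords are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class DigestIdSketch {
    /**
     * Derives the id stored in a digest ACL: user + ":" + base64(sha1("user:password")).
     * Assumes the default SHA-1 digest algorithm.
     */
    static String digestId(String user, String password) {
        try {
            String idPassword = user + ":" + password;
            byte[] sha1 = MessageDigest.getInstance("SHA-1")
                    .digest(idPassword.getBytes(StandardCharsets.UTF_8));
            return user + ":" + Base64.getEncoder().encodeToString(sha1);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Placeholder credentials for illustration only.
        System.out.println(digestId("readonly-user", "CHANGEME"));
    }
}
```

A client holding the read-only username and password can authenticate and match the READ entry, but because the znode stores only the hash, reading an ACL does not reveal the password itself.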
ZooKeeper ACLs in Solr Scripts
There are two scripts that impact ZooKeeper ACLs:
For *nix systems: bin/solr & server/scripts/cloud-scripts/zkcli.sh
For Windows systems: bin/solr.cmd & server/scripts/cloud-scripts/zkcli.bat
Both the solr.in.* and the zkcli.* files will need to be updated with the same password for everything to work. The contents may appear redundant, but the scripts will not consult each other during operations.
These Solr scripts can enable use of ZooKeeper ACLs by setting the appropriate system properties. To enable the VM parameters ACL and credentials providers described above, uncomment the SOLR_ZK_CREDS_AND_ACLS environment variable definition in the solr.in.* and zkcli.* files and replace the passwords with ones you choose.
Changing ACL Schemes
Over the lifetime of operating your Solr cluster, you may decide to move from an unsecured ZooKeeper to a secured instance. Changing the configured zkACLProvider in solr.xml will ensure that newly created nodes are secure, but will not protect the already existing data. To modify all existing ACLs, you can use the updateacls command with Solr’s ZkCLI. First uncomment the SOLR_ZK_CREDS_AND_ACLS environment variable definition in server/scripts/cloud-scripts/zkcli.sh (or zkcli.bat on Windows) and fill in the passwords for the admin-user and the readonly-user (see above). Then run server/scripts/cloud-scripts/zkcli.sh -cmd updateacls /zk-path, or on Windows run server\scripts\cloud-scripts\zkcli.bat -cmd updateacls /zk-path.
Changing ACLs in ZooKeeper should only be done while your SolrCloud cluster is stopped. Attempting to do so while Solr is running may result in inconsistent state and some nodes becoming inaccessible.
The VM properties zkACLProvider and zkCredentialsProvider, included in the SOLR_ZK_CREDS_AND_ACLS environment variable in zkcli.sh/.bat, control the conversion:
The Credentials Provider must be one that has current admin privileges on the nodes. When omitted, the process will use no credentials (suitable for an unsecured configuration).
The ACL Provider will be used to compute the new ACLs. When omitted, the process will set all permissions to all users, removing any security present.
The uncommented SOLR_ZK_CREDS_AND_ACLS environment variable in zkcli.sh/.bat sets the credentials and ACL providers to the VMParamsSingleSetCredentialsDigestZkCredentialsProvider and VMParamsAllAndReadonlyDigestZkACLProvider implementations, described earlier in the page.