Package org.apache.solr.cloud
Class ZkController
- java.lang.Object
-
- org.apache.solr.cloud.ZkController
-
- All Implemented Interfaces:
Closeable
,AutoCloseable
public class ZkController extends Object implements Closeable
Handle ZooKeeper interactions.notes: loads everything on init, creates what's not there - further updates are prompted with Watches.
TODO: exceptions during close on attempts to update cloud state
-
-
Nested Class Summary
Nested Classes Modifier and Type Class Description static class
ZkController.NotInClusterStateException
Thrown during pre register process if the replica is not present in clusterstatestatic class
ZkController.ResourceModifiedInZkException
-
Field Summary
Fields Modifier and Type Field Description static String
COLLECTION_PARAM_PREFIX
static String
CONFIGNAME_PROP
protected Overseer
overseer
static byte[]
TOUCHED_ZNODE_DATA
org.apache.solr.common.cloud.ZkStateReader
zkStateReader
-
Constructor Summary
Constructors Constructor Description ZkController(CoreContainer cc, String zkServerAddress, int zkClientConnectTimeout, CloudConfig cloudConfig, Supplier<List<CoreDescriptor>> descriptorsSupplier)
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description void
addOnReconnectListener(org.apache.solr.common.cloud.OnReconnect listener)
Add a listener to be notified once there is a new session created after a ZooKeeper session expiration occurs; in most cases, listeners will be components that have watchers that need to be re-created.static boolean
checkChrootPath(String zkHost, boolean create)
Validates if the chroot exists in zk (or if it is successfully created).boolean
checkIfCoreNodeNameAlreadyExists(CoreDescriptor dcore)
void
checkOverseerDesignate()
boolean
claimAsyncId(String asyncId)
When an operation needs to be performed in an asynchronous mode, the asyncId needs to be claimed by calling this method to make sure it's not duplicate (hasn't been claimed by other request).boolean
clearAsyncId(String asyncId)
Clears an asyncId previously claimed by callingclaimAsyncId(String)
void
clearZkCollectionTerms()
void
close()
Closes the underlying ZooKeeper client.static void
createClusterZkNodes(org.apache.solr.common.cloud.SolrZkClient zkClient)
Create the zknodes necessary for a cluster to operateString
getBaseUrl()
org.apache.solr.common.cloud.ClusterState
getClusterState()
org.apache.solr.common.cloud.OnReconnect
getConfigDirListener()
CoreContainer
getCoreContainer()
String
getCoreNodeName(CoreDescriptor descriptor)
DistributedClusterStateUpdater
getDistributedClusterStateUpdater()
String
getHostName()
int
getHostPort()
int
getLeaderConflictResolveWait()
org.apache.solr.common.cloud.ZkCoreNodeProps
getLeaderProps(String collection, String slice, int timeoutms)
Get leader props directly from zk nodes.org.apache.solr.common.cloud.ZkCoreNodeProps
getLeaderProps(String collection, String slice, int timeoutms, boolean failImmediatelyOnExpiration)
Get leader props directly from zk nodes.int
getLeaderVoteWait()
String
getNodeName()
Overseer
getOverseer()
OverseerTaskQueue
getOverseerCollectionQueue()
DistributedMap
getOverseerCompletedMap()
OverseerTaskQueue
getOverseerConfigSetQueue()
LeaderElector
getOverseerElector()
DistributedMap
getOverseerFailureMap()
ZkDistributedQueue
getOverseerJobQueue()
DistributedMap
getOverseerRunningMap()
ZkShardTerms
getShardTerms(String collection, String shardId)
org.apache.solr.client.solrj.cloud.SolrCloudManager
getSolrCloudManager()
org.apache.solr.common.cloud.NodesSysPropsCacher
getSysPropsCacher()
org.apache.solr.common.cloud.SolrZkClient
getZkClient()
String
getZkServerAddress()
org.apache.solr.common.cloud.ZkStateReader
getZkStateReader()
void
giveupLeadership(CoreDescriptor cd)
Best effort to give up the leadership of a shard in a core after hitting a tragic exceptionboolean
isConnected()
static void
linkConfSet(org.apache.solr.common.cloud.SolrZkClient zkClient, String collection, String confSetName)
boolean
pathExists(String path)
Returns true if the path existsstatic int
persistConfigResourceToZooKeeper(ZkSolrResourceLoader zkLoader, int znodeVersion, String resourceName, byte[] content, boolean createIfNotExists)
Persists a config file to ZooKeeper using optimistic concurrency.void
preClose()
void
preRegister(CoreDescriptor cd, boolean publishState)
void
publish(CoreDescriptor cd, org.apache.solr.common.cloud.Replica.State state)
void
publish(CoreDescriptor cd, org.apache.solr.common.cloud.Replica.State state, boolean updateLastState, boolean forcePublish)
Publish core state to overseer.void
publishAndWaitForDownStates()
void
publishAndWaitForDownStates(int timeoutSeconds)
Collection<String>
publishNodeAsDown(String nodeName)
Best effort to set DOWN state for all replicas on node.String
register(String coreName, CoreDescriptor desc, boolean skipRecovery)
Register shard with ZooKeeper.String
register(String coreName, CoreDescriptor desc, boolean recoverReloadedCores, boolean afterExpiration, boolean skipRecovery)
Register shard with ZooKeeper.void
registerConfListenerForCore(String confDir, SolrCore core, Runnable listener)
This will give a callback to the listener whenever a child is modified in the conf directory.void
rejoinOverseerElection(String electionNode, boolean joinAtHead)
void
rejoinShardLeaderElection(org.apache.solr.common.params.SolrParams params)
void
removeEphemeralLiveNode()
void
removeOnReconnectListener(org.apache.solr.common.cloud.OnReconnect listener)
Removed a previously registered OnReconnect listener, such as when a core is removed or reloaded.void
setPreferredOverseer()
void
startReplicationFromLeader(String coreName, boolean switchTransactionLog)
void
stopReplicationFromLeader(String coreName)
void
throwErrorIfReplicaReplaced(CoreDescriptor desc)
static void
touchConfDir(ZkSolrResourceLoader zkLoader)
static String
trimLeadingAndTrailingSlashes(String in)
Utility method for trimming and leading and/or trailing slashes from its input.void
tryCancelAllElections()
Attempts to cancel all leader elections.void
unregister(String coreName, CoreDescriptor cd)
void
unregister(String coreName, CoreDescriptor cd, boolean removeCoreFromZk)
-
-
-
Field Detail
-
COLLECTION_PARAM_PREFIX
public static final String COLLECTION_PARAM_PREFIX
- See Also:
- Constant Field Values
-
CONFIGNAME_PROP
public static final String CONFIGNAME_PROP
- See Also:
- Constant Field Values
-
TOUCHED_ZNODE_DATA
public static final byte[] TOUCHED_ZNODE_DATA
-
zkStateReader
public final org.apache.solr.common.cloud.ZkStateReader zkStateReader
-
overseer
protected volatile Overseer overseer
-
-
Constructor Detail
-
ZkController
public ZkController(CoreContainer cc, String zkServerAddress, int zkClientConnectTimeout, CloudConfig cloudConfig, Supplier<List<CoreDescriptor>> descriptorsSupplier) throws InterruptedException, TimeoutException, IOException
- Parameters:
cc
- Core container associated with this controller. cannot be null.zkServerAddress
- where to connect to the zk serverzkClientConnectTimeout
- timeout in mscloudConfig
- configuration for this controller. TODO: possibly redundant with CoreContainerdescriptorsSupplier
- a supplier of the current core descriptors. used to know which cores to re-register on reconnect- Throws:
InterruptedException
TimeoutException
IOException
-
-
Method Detail
-
getLeaderVoteWait
public int getLeaderVoteWait()
-
getLeaderConflictResolveWait
public int getLeaderConflictResolveWait()
-
getSysPropsCacher
public org.apache.solr.common.cloud.NodesSysPropsCacher getSysPropsCacher()
-
preClose
public void preClose()
-
close
public void close()
Closes the underlying ZooKeeper client.- Specified by:
close
in interfaceAutoCloseable
- Specified by:
close
in interfaceCloseable
-
giveupLeadership
public void giveupLeadership(CoreDescriptor cd)
Best effort to give up the leadership of a shard in a core after hitting a tragic exception- Parameters:
cd
- The current core descriptor
-
getClusterState
public org.apache.solr.common.cloud.ClusterState getClusterState()
- Returns:
- information about the cluster from ZooKeeper
-
getDistributedClusterStateUpdater
public DistributedClusterStateUpdater getDistributedClusterStateUpdater()
-
getSolrCloudManager
public org.apache.solr.client.solrj.cloud.SolrCloudManager getSolrCloudManager()
-
getHostName
public String getHostName()
-
getHostPort
public int getHostPort()
-
getZkClient
public org.apache.solr.common.cloud.SolrZkClient getZkClient()
-
getZkServerAddress
public String getZkServerAddress()
- Returns:
- zookeeper server address
-
createClusterZkNodes
public static void createClusterZkNodes(org.apache.solr.common.cloud.SolrZkClient zkClient) throws org.apache.zookeeper.KeeperException, InterruptedException, IOException
Create the zknodes necessary for a cluster to operate- Parameters:
zkClient
- a SolrZkClient- Throws:
org.apache.zookeeper.KeeperException
- if there is a Zookeeper errorInterruptedException
- on interruptIOException
-
publishAndWaitForDownStates
public void publishAndWaitForDownStates() throws org.apache.zookeeper.KeeperException, InterruptedException
- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
publishAndWaitForDownStates
public void publishAndWaitForDownStates(int timeoutSeconds) throws InterruptedException
- Throws:
InterruptedException
-
checkChrootPath
public static boolean checkChrootPath(String zkHost, boolean create) throws org.apache.zookeeper.KeeperException, InterruptedException
Validates if the chroot exists in zk (or if it is successfully created). Optionally, if create is set to true this method will create the path in case it doesn't exist- Returns:
- true if the path exists or is created false if the path doesn't exist and 'create' = false
- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
isConnected
public boolean isConnected()
-
removeEphemeralLiveNode
public void removeEphemeralLiveNode() throws org.apache.zookeeper.KeeperException, InterruptedException
- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
getNodeName
public String getNodeName()
-
pathExists
public boolean pathExists(String path) throws org.apache.zookeeper.KeeperException, InterruptedException
Returns true if the path exists- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
register
public String register(String coreName, CoreDescriptor desc, boolean skipRecovery) throws Exception
Register shard with ZooKeeper.- Returns:
- the shardId for the SolrCore
- Throws:
Exception
-
register
public String register(String coreName, CoreDescriptor desc, boolean recoverReloadedCores, boolean afterExpiration, boolean skipRecovery) throws Exception
Register shard with ZooKeeper.- Returns:
- the shardId for the SolrCore
- Throws:
Exception
-
startReplicationFromLeader
public void startReplicationFromLeader(String coreName, boolean switchTransactionLog)
-
stopReplicationFromLeader
public void stopReplicationFromLeader(String coreName)
-
getLeaderProps
public org.apache.solr.common.cloud.ZkCoreNodeProps getLeaderProps(String collection, String slice, int timeoutms) throws InterruptedException, org.apache.zookeeper.KeeperException.SessionExpiredException
Get leader props directly from zk nodes.- Throws:
org.apache.zookeeper.KeeperException.SessionExpiredException
- on zk session expiration.InterruptedException
-
getLeaderProps
public org.apache.solr.common.cloud.ZkCoreNodeProps getLeaderProps(String collection, String slice, int timeoutms, boolean failImmediatelyOnExpiration) throws InterruptedException, org.apache.zookeeper.KeeperException.SessionExpiredException
Get leader props directly from zk nodes.- Returns:
- leader props
- Throws:
org.apache.zookeeper.KeeperException.SessionExpiredException
- on zk session expiration.InterruptedException
-
getBaseUrl
public String getBaseUrl()
-
publish
public void publish(CoreDescriptor cd, org.apache.solr.common.cloud.Replica.State state) throws org.apache.zookeeper.KeeperException, InterruptedException
- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
publish
public void publish(CoreDescriptor cd, org.apache.solr.common.cloud.Replica.State state, boolean updateLastState, boolean forcePublish) throws org.apache.zookeeper.KeeperException, InterruptedException
Publish core state to overseer.- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
getShardTerms
public ZkShardTerms getShardTerms(String collection, String shardId)
-
clearZkCollectionTerms
public void clearZkCollectionTerms()
-
unregister
public void unregister(String coreName, CoreDescriptor cd) throws Exception
- Throws:
Exception
-
unregister
public void unregister(String coreName, CoreDescriptor cd, boolean removeCoreFromZk) throws Exception
- Throws:
Exception
-
getZkStateReader
public org.apache.solr.common.cloud.ZkStateReader getZkStateReader()
-
getCoreNodeName
public String getCoreNodeName(CoreDescriptor descriptor)
-
preRegister
public void preRegister(CoreDescriptor cd, boolean publishState)
-
tryCancelAllElections
public void tryCancelAllElections()
Attempts to cancel all leader elections. This method should be called on node shutdown.
-
linkConfSet
public static void linkConfSet(org.apache.solr.common.cloud.SolrZkClient zkClient, String collection, String confSetName) throws org.apache.zookeeper.KeeperException, InterruptedException
- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
getOverseerJobQueue
public ZkDistributedQueue getOverseerJobQueue()
-
getOverseerCollectionQueue
public OverseerTaskQueue getOverseerCollectionQueue()
-
getOverseerConfigSetQueue
public OverseerTaskQueue getOverseerConfigSetQueue()
-
getOverseerRunningMap
public DistributedMap getOverseerRunningMap()
-
getOverseerCompletedMap
public DistributedMap getOverseerCompletedMap()
-
getOverseerFailureMap
public DistributedMap getOverseerFailureMap()
-
claimAsyncId
public boolean claimAsyncId(String asyncId) throws org.apache.zookeeper.KeeperException
When an operation needs to be performed in an asynchronous mode, the asyncId needs to be claimed by calling this method to make sure it's not duplicate (hasn't been claimed by other request). If this method returns true, the asyncId in the parameter has been reserved for the operation, meaning that no other thread/operation can claim it. If for whatever reason, the operation is not scheduled, the asuncId needs to be cleared usingclearAsyncId(String)
. If this method returns false, no reservation has been made, and this asyncId can't be used, since it's being used by another operation (currently or in the past)- Parameters:
asyncId
- A string representing the asyncId of an operation. Can't be null.- Returns:
- True if the reservation succeeds. False if this ID is already in use.
- Throws:
org.apache.zookeeper.KeeperException
-
clearAsyncId
public boolean clearAsyncId(String asyncId) throws org.apache.zookeeper.KeeperException
Clears an asyncId previously claimed by callingclaimAsyncId(String)
- Parameters:
asyncId
- A string representing the asyncId of an operation. Can't be null.- Returns:
- True if the asyncId existed and was cleared. False if the asyncId didn't exist before.
- Throws:
org.apache.zookeeper.KeeperException
-
getOverseer
public Overseer getOverseer()
-
getOverseerElector
public LeaderElector getOverseerElector()
-
trimLeadingAndTrailingSlashes
public static String trimLeadingAndTrailingSlashes(String in)
Utility method for trimming and leading and/or trailing slashes from its input. May return the empty string. May return null if and only if the input is null.
-
rejoinOverseerElection
public void rejoinOverseerElection(String electionNode, boolean joinAtHead)
-
rejoinShardLeaderElection
public void rejoinShardLeaderElection(org.apache.solr.common.params.SolrParams params)
-
checkOverseerDesignate
public void checkOverseerDesignate()
-
setPreferredOverseer
public void setPreferredOverseer() throws org.apache.zookeeper.KeeperException, InterruptedException
- Throws:
org.apache.zookeeper.KeeperException
InterruptedException
-
getCoreContainer
public CoreContainer getCoreContainer()
-
throwErrorIfReplicaReplaced
public void throwErrorIfReplicaReplaced(CoreDescriptor desc)
-
addOnReconnectListener
public void addOnReconnectListener(org.apache.solr.common.cloud.OnReconnect listener)
Add a listener to be notified once there is a new session created after a ZooKeeper session expiration occurs; in most cases, listeners will be components that have watchers that need to be re-created.
-
removeOnReconnectListener
public void removeOnReconnectListener(org.apache.solr.common.cloud.OnReconnect listener)
Removed a previously registered OnReconnect listener, such as when a core is removed or reloaded.
-
persistConfigResourceToZooKeeper
public static int persistConfigResourceToZooKeeper(ZkSolrResourceLoader zkLoader, int znodeVersion, String resourceName, byte[] content, boolean createIfNotExists)
Persists a config file to ZooKeeper using optimistic concurrency.- Returns:
- true on success
-
touchConfDir
public static void touchConfDir(ZkSolrResourceLoader zkLoader)
-
registerConfListenerForCore
public void registerConfListenerForCore(String confDir, SolrCore core, Runnable listener)
This will give a callback to the listener whenever a child is modified in the conf directory. It is the responsibility of the listener to check if the individual item of interest has been modified. When the last core which was interested in this conf directory is gone the listeners will be removed automatically.
-
getConfigDirListener
public org.apache.solr.common.cloud.OnReconnect getConfigDirListener()
-
checkIfCoreNodeNameAlreadyExists
public boolean checkIfCoreNodeNameAlreadyExists(CoreDescriptor dcore)
-
publishNodeAsDown
public Collection<String> publishNodeAsDown(String nodeName)
Best effort to set DOWN state for all replicas on node.- Parameters:
nodeName
- to operate on- Returns:
- the names of the collections that have replicas on the given node
-
-