CDCR Configuration | Apache Solr Reference Guide 7.3

The Source and Target configurations differ in the case of the data centers being in separate clusters. "Cluster" here means separate ZooKeeper ensembles controlling disjoint Solr instances. Whether these data centers are physically separated or not is immaterial for this discussion.

As described in the section CDCR Architecture, two approaches are supported: uni-directional updates and bi-directional updates.

All CDCR configuration is done in the solrconfig.xml file. Because this is a per-collection configuration file, all CDCR configuration is done for each collection.

Uni-Directional Updates

Source Configuration

Here is a sample Source configuration, a section of solrconfig.xml. The presence of the replica list (<lst name="replica">) causes CDCR to use this cluster as the Source, so it should not be present in the Target collections. The sample contains no buffer list, so buffering keeps its default state (enabled); it can be disabled as described in The Buffer Element. Details about each setting follow the two examples:

<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.240.18.211:2181,10.240.18.212:2181</str>
    <!--
    If you have chrooted your Solr information at the target you must include the chroot, for example:
    <str name="zkHost">10.240.18.211:2181,10.240.18.212:2181/solr</str>
    -->
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>

  <lst name="replicator">
    <str name="threadPoolSize">8</str>
    <str name="schedule">1000</str>
    <str name="batchSize">128</str>
  </lst>

  <lst name="updateLogSynchronizer">
    <str name="schedule">1000</str>
  </lst>

</requestHandler>

<!-- Modify the <updateLog> section of your existing <updateHandler>
     in your config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>

  <!-- Other configuration options such as autoCommit should still be present -->
</updateHandler>

Target Configuration

Here is a typical Target configuration.

The Target instance must configure an update processor chain that is specific to CDCR. The chain must include the CdcrUpdateProcessorFactory, whose task is to ensure that the version numbers attached to update requests coming from a CDCR Source SolrCloud are reused and not overwritten by the Target. A properly configured Target looks similar to this:

<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <!-- recommended for Target clusters -->
  <lst name="buffer">
    <str name="defaultState">disabled</str>
  </lst>
</requestHandler>

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">cdcr-processor-chain</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="cdcr-processor-chain">
  <processor class="solr.CdcrUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<!-- Modify the <updateLog> section of your existing <updateHandler> in your
    config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>

  <!-- Other configuration options such as autoCommit should still be present -->

</updateHandler>

Bi-Directional Updates

The configurations of Cluster 1 and Cluster 2 are identical apart from the zkHost string: each cluster’s solrconfig.xml specifies the ZooKeeper address of the other cluster.

Either cluster can act as Source or as Target at any given point in time, but a cluster cannot be both Source and Target at the same time.

Cluster 1 Configuration

Here is a sample Cluster 1 configuration, a section of solrconfig.xml. The Cluster 2 zkHost string is specified in the CdcrRequestHandler declaration:

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">cdcr-processor-chain</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="cdcr-processor-chain">
  <processor class="solr.CdcrUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.240.19.241:2181,10.240.19.242:2181</str>
    <!--
    If you have chrooted your Solr information at the target you must include the chroot, for example:
    <str name="zkHost">10.240.19.241:2181,10.240.19.242:2181/solr</str>
    -->
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>

  <lst name="replicator">
    <str name="threadPoolSize">8</str>
    <str name="schedule">1000</str>
    <str name="batchSize">128</str>
  </lst>

  <lst name="updateLogSynchronizer">
    <str name="schedule">1000</str>
  </lst>

</requestHandler>

<!-- Modify the <updateLog> section of your existing <updateHandler>
     in your config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>
</updateHandler>

Cluster 2 Configuration

The configuration of Cluster 2 is identical to that of Cluster 1, except that the Cluster 1 zkHost string is specified in the CdcrRequestHandler definition:

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">cdcr-processor-chain</str>
  </lst>
</requestHandler>

<updateRequestProcessorChain name="cdcr-processor-chain">
  <processor class="solr.CdcrUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.250.18.211:2181,10.250.18.212:2181</str>
    <!--
    If you have chrooted your Solr information at the target you must include the chroot, for example:
    <str name="zkHost">10.250.18.211:2181,10.250.18.212:2181/solr</str>
    -->
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>

  <lst name="replicator">
    <str name="threadPoolSize">8</str>
    <str name="schedule">1000</str>
    <str name="batchSize">128</str>
  </lst>

  <lst name="updateLogSynchronizer">
    <str name="schedule">1000</str>
  </lst>

</requestHandler>

<!-- Modify the <updateLog> section of your existing <updateHandler>
     in your config as below -->
<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog class="solr.CdcrUpdateLog">
    <str name="dir">${solr.ulog.dir:}</str>
    <!--Any parameters from the original <updateLog> section -->
  </updateLog>
</updateHandler>

CDCR Configuration Parameters

The configuration details, defaults and options are as follows:

The Replica Element

CDCR can be configured to forward update requests to one or more Target collections. A Target collection is defined with a “replica” list as follows:

zkHost: The host address for ZooKeeper of the Target SolrCloud. Usually this is a comma-separated list of addresses to each node in the Target ZooKeeper ensemble. This parameter is required.
source: The name of the collection on the Source SolrCloud to be replicated. This parameter is required.
target: The name of the collection on the Target SolrCloud to which updates will be forwarded. This parameter is required.
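
As an illustration of forwarding to more than one Target, the sketch below declares two replica lists in one handler. It assumes that repeating the replica list is how additional Targets are declared; the first zkHost is taken from the Source sample earlier on this page, and the second address is hypothetical.

```xml
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <!-- First Target cluster (address reused from the Source sample) -->
  <lst name="replica">
    <str name="zkHost">10.240.18.211:2181,10.240.18.212:2181</str>
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>

  <!-- Second Target cluster: hypothetical address, for illustration only -->
  <lst name="replica">
    <str name="zkHost">10.240.20.211:2181,10.240.20.212:2181</str>
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>
</requestHandler>
```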

The Replicator Element

The CDC Replicator is the component in charge of forwarding updates to the replicas. The replicator will monitor the update logs of the Source collection and will forward any new updates to the Target collection.

The replicator uses a fixed thread pool to forward updates to multiple replicas in parallel. If more than one replica is configured, one thread will forward a batch of updates from one replica at a time in a round-robin fashion. The replicator can be configured with a “replicator” list as follows:

threadPoolSize: The number of threads to use for forwarding updates. One thread per replica is recommended. The default is 2.
schedule: The delay in milliseconds for monitoring the update log(s). The default is 10.
batchSize: The number of updates to send in one batch. The optimal size depends on the size of the documents. Large batches of large documents can increase your memory usage significantly. The default is 128.
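
For reference, a replicator list that spells out the documented defaults might look like the fragment below. It is only a sketch: the list is nested inside the /cdcr request handler as in the samples earlier on this page, and a threadPoolSize of 2 would match the one-thread-per-replica recommendation for a Source with two replicas.

```xml
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <!-- replica list(s) omitted for brevity -->
  <lst name="replicator">
    <!-- one thread per replica is recommended; 2 is the default -->
    <str name="threadPoolSize">2</str>
    <!-- check the update logs every 10 ms (default) -->
    <str name="schedule">10</str>
    <!-- send up to 128 updates per batch (default) -->
    <str name="batchSize">128</str>
  </lst>
</requestHandler>
```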

The updateLogSynchronizer Element

Expert: Non-leader nodes need to synchronize their update logs with their leader node from time to time in order to clean deprecated transaction log files. By default, this synchronization is performed every minute. The schedule can be modified with an “updateLogSynchronizer” list as follows:

schedule: The delay in milliseconds for synchronizing the update logs. The default is 60000.

If the updateLogSynchronizer element is omitted from the Source cluster, transaction logs may accumulate on non-leaders.
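
A fragment that states the default schedule explicitly could look like this. It is a sketch that follows the structure of the samples earlier on this page, which use a shorter interval of 1000 ms.

```xml
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <!-- replica and replicator lists omitted for brevity -->
  <lst name="updateLogSynchronizer">
    <!-- synchronize update logs with the leader every 60000 ms (default) -->
    <str name="schedule">60000</str>
  </lst>
</requestHandler>
```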

The Buffer Element

When buffering is enabled, the update logs store all updates indefinitely and grow without limit. It is therefore best to disable buffering on both the Source and Target clusters during normal operation. Enabling buffering is intended for special maintenance periods. Buffering can be disabled at startup with a “buffer” list and the “defaultState” parameter as follows:

defaultState: The state of the buffer at startup. The default is enabled.
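
The following sketch shows a Source handler that starts with buffering disabled. It assumes the buffer list can sit next to the replica list inside the Source's /cdcr handler, mirroring its placement in the Target sample; the zkHost and collection names are reused from the Source sample earlier on this page.

```xml
<requestHandler name="/cdcr" class="solr.CdcrRequestHandler">
  <lst name="replica">
    <str name="zkHost">10.240.18.211:2181,10.240.18.212:2181</str>
    <str name="source">collection1</str>
    <str name="target">collection1</str>
  </lst>

  <!-- Start with buffering off instead of the default (enabled) -->
  <lst name="buffer">
    <str name="defaultState">disabled</str>
  </lst>
</requestHandler>
```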

Buffering should be enabled only for maintenance windows

Buffering is designed to augment maintenance windows. The following points should be kept in mind:

When buffering is enabled, the Update Logs will grow without limit; they will never be purged.
During normal operation, the Update Logs will automatically accrue on the Source data center if the Target data center is unavailable; it is not necessary to enable buffering for CDCR to handle routine network disruptions.
- For this reason, monitoring disk usage on the Source data center is recommended as an additional check that the Target data center is receiving updates.
For uni-directional updates, buffering should not be enabled on the Target data center as Update Logs would accrue without limit.
If buffering is enabled and then disabled, the Update Logs will be removed when their contents have been sent to the Target data center. This process may take some time and is triggered by additional updates to the Source cluster.
- Update Log cleanup is not triggered until a new update is sent to the Source data center.

Initial Startup

CDCR Bootstrapping

Solr 6.2 added the functionality to allow CDCR to replicate the entire index from the Source to the Target data centers on first time startup as an alternative to the following procedure. For very large indexes, time should be allocated for the initial synchronization if this option is chosen.

This is a general approach for initializing CDCR in a production environment based upon an approach taken by the initial working installation of CDCR and generously contributed to illustrate a "real world" scenario.

Customer uses the CDCR approach to keep a remote disaster-recovery instance available for production backup. This is a uni-directional solution.
Customer has 26 clouds with 200 million assets per cloud (15GB indexes). Total document count is over 4.8 billion.
- Source and Target clouds were synched in 2-3 hour maintenance windows to establish the base index for the Targets.

As usual, it is good to start small. Sync a single cloud and monitor for a period of time before doing the others. You may need to adjust your settings several times before finding the right balance.

Before starting, stop or pause the indexers. This is best done during a small maintenance window.
Stop the SolrCloud instances at the Source.
Upload the modified solrconfig.xml to ZooKeeper on both Source and Target as appropriate, see the examples above.
Sync the index directories of the Source collection to the corresponding shard nodes of the Target collection. rsync works well for this.

For example, if there are 2 shards on collection1 with 2 replicas for each shard, copy the corresponding index directories from:

shard1replica1Source to shard1replica1Target
shard1replica2Source to shard1replica2Target
shard2replica1Source to shard2replica1Target
shard2replica2Source to shard2replica2Target

Start the ZooKeeper on the Target (DR) side.
Start the SolrCloud on the Target (DR) side.
Start the ZooKeeper on the Source side.
Start the SolrCloud on the Source side. As a general rule, the Target (DR) side of the SolrCloud should be started before the Source side.
Activate CDCR on the Source instance using the CDCR API:
```
http://host:port/solr/<collection_name>/cdcr?action=START
```
There is no need to run the /cdcr?action=START command on the Target.

Disable the buffer on the Target and Source:

http://host:port/solr/<collection_name>/cdcr?action=DISABLEBUFFER

Re-enable indexing.

ZooKeeper Settings

With CDCR, the Target ZooKeepers will have connections from the Target clouds and the Source clouds. You may need to increase the maxClientCnxns setting in zoo.cfg.

## Set the number of connections allowed from a client to 800
## maxClientCnxns=0 means no limit
maxClientCnxns=800
