Repository Types
Note: all repositories are defined in the SolrCloud specification.
In order to use a repository in the SolrBackup CRD, it must be defined in the SolrCloud spec.
All yaml examples below are SolrCloud resources, not SolrBackup resources.
The Solr-operator currently supports three different backup repository types: Google Cloud Storage ("GCS"), AWS S3 ("S3"), and Volume ("local"). The cloud backup solutions (GCS and S3) are strongly suggested because they are cloud-native; however, they require newer Solr versions (8.9.0 or later for GCS, 8.10.0 or later for S3).
Multiple repositories can be defined under the SolrCloud.spec.backupRepositories field.
For each repository, specify a unique name and a single repository type to connect to.
Options specific to a repository type are found under the object named after that type.
Examples can be found in each repository-type section below.
Feel free to mix and match multiple backup repository types to fit your use case (or multiple repositories of the same type):
spec:
  backupRepositories:
    - name: "local-collection-backups-1"
      volume:
        ...
    - name: "gcs-collection-backups-1"
      gcs:
        ...
    - name: "s3-collection-backups-1"
      s3:
        ...
    - name: "s3-collection-backups-2"
      s3:
        ...
GCS Backup Repositories
GCS Repositories store backup data remotely in Google Cloud Storage.
This repository type is only supported in deployments that use a Solr version >= 8.9.0.
Each repository must specify the GCS bucket to store data in (the bucket property), and the name of a Kubernetes secret containing credentials for accessing GCS (the gcsCredentialSecret property).
This secret must have a key service-account-key.json whose value is a JSON service account key, as described in the Google Cloud documentation on service account keys.
If you already have your service account key, this secret can be created using a command like the one below.
kubectl create secret generic <secretName> --from-file=service-account-key.json=<path-to-service-account-key>
An example of a SolrCloud spec with only one backup repository, with type GCS:
spec:
  backupRepositories:
    - name: "gcs-backups-1"
      gcs:
        bucket: "backup-bucket" # Required
        gcsCredentialSecret: # Required
          name: "secretName"
          key: "service-account-key.json"
        baseLocation: "/store/here" # Optional
S3 Backup Repositories
S3 Repositories store backup data remotely in AWS S3 (or a supported S3 compatible interface).
This repository type is only supported in deployments that use a Solr version >= 8.10.0.
Each repository must specify an S3 bucket and region to store data in (the bucket and region properties).
Users will want to set up credentials so that the SolrCloud can connect to the S3 bucket and region. More information can be found in the S3 Credentials section.
spec:
  backupRepositories:
    - name: "s3-backups-1"
      s3:
        region: "us-west-2" # Required
        bucket: "backup-bucket" # Required
        credentials: {} # Optional
        proxyUrl: "https://proxy-url-for-s3:3242" # Optional
        endpoint: "https://custom-s3-endpoint:3242" # Optional
Users can also optionally set a proxyUrl or endpoint for the S3Repository.
More information on these settings can be found in the Solr Reference Guide.
S3 Credentials
The Solr S3Repository module uses the default credential chain for AWS.
All of the options below are designed to be utilized by this credential chain.
There are a few options for giving a SolrCloud the credentials for connecting to S3.
The two most straightforward options are configured via the spec.backupRepositories.s3.credentials property.
spec:
  backupRepositories:
    - name: "s3-backups-1"
      s3:
        region: "us-west-2"
        bucket: "backup-bucket"
        credentials:
          accessKeyIdSecret: # Optional
            name: aws-secrets
            key: access-key-id
          secretAccessKeySecret: # Optional
            name: aws-secrets
            key: secret-access-key
          sessionTokenSecret: # Optional
            name: aws-secrets
            key: session-token
          credentialsFileSecret: # Optional
            name: aws-credentials
            key: credentials
All options in the credentials property are optional, as users can pick and choose which ones to use.
If you have all of your credentials set up in an AWS Credentials File, then credentialsFileSecret will be the only property you need to set.
However, if you don’t have a credentials file, you will likely need to set at least the accessKeyIdSecret and secretAccessKeySecret properties.
All of these options require the referenced Kubernetes secrets to already exist before creating the SolrCloud resource.
(If desired, all options can be combined, e.g. use accessKeyIdSecret and credentialsFileSecret together. The ordering of the default credentials chain will determine which options are used.)
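Since the referenced secrets must exist ahead of time, it may help to see what one could look like. The manifest below is a minimal sketch of a Kubernetes Secret that matches the secret name and keys used in the credentials example above (aws-secrets, access-key-id, secret-access-key). The values are placeholders, and the Secret is assumed to live in the same namespace as the SolrCloud resource. A session-token key would only be needed when using temporary credentials.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-secrets   # matches the name referenced under credentials
type: Opaque
stringData:           # plain-text input; Kubernetes stores it base64-encoded under data
  access-key-id: "<your-access-key-id>"           # placeholder
  secret-access-key: "<your-secret-access-key>"   # placeholder
```

Apply it with kubectl apply -f before creating the SolrCloud resource.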
The options in the credentials property above merely set environment variables on the pod or, in the case of credentialsFileSecret, use an environment variable and a volume mount.
Users can decide to not use the credentials section of the s3 repository config, and instead set these environment variables themselves via spec.customSolrKubeOptions.podOptions.env.
Lastly, if running in EKS, it is possible to add IAM information to Kubernetes serviceAccounts.
If this is done correctly, you will only need to specify the serviceAccount for the SolrCloud pods via spec.customSolrKubeOptions.podOptions.serviceAccount.
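On EKS this is commonly done with IAM Roles for Service Accounts, where the service account carries an annotation naming the IAM role to assume. The manifest below is a minimal sketch under that assumption. The service account name (solr-backup-sa) is hypothetical and the role ARN is a placeholder. The IAM role, its trust policy, and its S3 permissions must be configured on the AWS side and are not shown.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: solr-backup-sa   # hypothetical name; reference it from the SolrCloud pod options
  annotations:
    # Placeholder ARN of an IAM role that grants access to the backup bucket
    eks.amazonaws.com/role-arn: "arn:aws:iam::<account-id>:role/<role-name>"
```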
Note: Because the Solr S3 Repository uses system-wide settings for AWS credentials, you cannot specify different credentials for different S3 repositories. This may be addressed in future Solr versions, but for now use the same credentials for all S3 repositories.
Volume Backup Repositories
Volume repositories store backup data "locally" on a Kubernetes volume mounted to each Solr pod. An example of a SolrCloud spec with only one backup repository, with type Volume:
spec:
  backupRepositories:
    - name: "local-collection-backups-1"
      volume:
        source: # Required
          persistentVolumeClaim:
            claimName: "collection-backup-pvc"
        directory: "store/here" # Optional
All persistent volumes used with Volume Repositories must have the ReadWriteMany access mode set; otherwise the backups will not succeed.
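As an illustration of this requirement, the following is a minimal sketch of a PersistentVolumeClaim that would satisfy the claimName used in the example above. In a PVC manifest the access mode is listed under accessModes. The storage class and requested size are placeholders, and the chosen storage class is assumed to support ReadWriteMany (for example, an NFS-backed class).

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: collection-backup-pvc   # matches claimName in the volume repository example
spec:
  accessModes:
    - ReadWriteMany             # required so every Solr pod can mount the volume
  storageClassName: "<rwx-capable-storage-class>"   # placeholder
  resources:
    requests:
      storage: 10Gi             # hypothetical size; adjust to your backup volume
```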