Backing up and Restoring TrackMe
Introduction to backup and restore
TrackMe is a complex Splunk application. Backing it up and restoring it means taking the following into account:
TrackMe creates Splunk Knowledge Objects on the fly, including KVstore definitions, different types of Splunk transforms, alerts and reports, etc.
TrackMe relies heavily on KVstores: the application stores various states as well as its own data and configuration in KVstores.
When backing up and restoring TrackMe, both aspects have to be managed together: you cannot restore TrackMe configuration files without restoring the KVstore data, and vice versa.
This process is simpler in a standalone Search Head, but is also fully relevant in a Search Head Cluster context.
We focus on Splunk Enterprise deployments. Splunk Cloud customers may require the participation of Splunk Cloud Support: it is required for file-system level restoration as well as for KVstore restoration using Splunk level backups, but KVstore collections can be restored freely using TrackMe builtin backups.
KVstore backup and restore: Splunk level backups versus TrackMe level backups
TrackMe KVstore collections can be backed up and restored in two ways:
Using Splunk level backup and restore capabilities.
Using TrackMe builtin backup and restore capabilities.
Splunk level KVstore backup and restore is less flexible, and has the drawback of affecting all collections, not only the TrackMe KVstore collections.
In contrast, TrackMe builtin KVstore backup and restore only affects TrackMe collections, and offers more flexibility.
Finally, note that TrackMe backups are automatically scheduled as soon as Virtual Tenants are created, so TrackMe is already taking care of this for you.
Backing up TrackMe
Backing up TrackMe configuration files
This step is straightforward:
TrackMe stores all its configuration in its local directory. On the file-system, this is:
Note: we assume $SPLUNK_HOME is /opt/splunk
/opt/splunk/etc/apps/trackme/local
A simple approach is to create a compressed tarball of the full TrackMe installation directory, for instance before you perform an upgrade:
Example:
cd /opt/splunk/etc/apps
tar -cpzf /my_backups/trackme_backup_YYYYMMDD.tgz trackme
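The tarball step above can be wrapped so that the date in the file name is filled in automatically and the archive is checked right after it is written. This is a minimal sketch: backup_app_dir is a hypothetical helper name, not something shipped with TrackMe or Splunk, and the paths in the usage line are the ones assumed in this document.

```shell
#!/usr/bin/env bash
# Sketch: create a dated tarball of an app directory, then verify it is readable.
# backup_app_dir is a hypothetical helper, not part of TrackMe or Splunk.
backup_app_dir() {
    local apps_dir="$1"   # parent directory, e.g. /opt/splunk/etc/apps
    local app_name="$2"   # directory to archive, e.g. trackme
    local dest_dir="$3"   # where the tarball is written, e.g. /my_backups
    local archive
    archive="${dest_dir}/${app_name}_backup_$(date +%Y%m%d).tgz"

    # -C switches to the apps directory so the archive stores relative paths.
    tar -cpzf "$archive" -C "$apps_dir" "$app_name" || return 1

    # Listing the archive is a cheap readability check; the listing is discarded.
    tar -tzf "$archive" > /dev/null || return 1

    # Print the archive path so the caller can record it.
    printf '%s\n' "$archive"
}
```

Usage, with the paths assumed above: backup_app_dir /opt/splunk/etc/apps trackme /my_backups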
Backing up TrackMe KVstore collections data
Using Splunk level KVstore backup
Splunk can back up all KVstore collections. Consult:
https://docs.splunk.com/Documentation/Splunk/latest/Admin/BackupKVstore
For example, one would run:
/opt/splunk/bin/splunk backup kvstore -pointInTime true
Backup files are created in the kvstorebackup directory at the root of SPLUNK_DB (see: /opt/splunk/etc/splunk-launch.conf)
Example:
/opt/splunk/var/lib/kvstorebackup/
-rw------- 1 splunk splunk 278643 Apr 2 17:09 kvdump_1712077797.tar.gz
You can call the following command to get some information about a backup dump:
/opt/splunk/bin/splunk show kvstore -archiveName <file_name>
Example:
/opt/splunk/bin/splunk show kvstore -archiveName kvdump_1712077797.tar.gz
Example results:
KV Store archive info:
backup_method : mongodump
collection_count : 172
created_at : Tue Apr 2 17:09:57 2024
mongodb_version : 4.2.17-linux-splunk-v4
online_backup : yes
splunk_version : 9.2.0
storage_engine : wiredTiger
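If you want to check a dump from a script, for example to confirm the collection count before a restore, the output above can be parsed field by field. The sketch below assumes the "key : value" layout shown in the example results. kv_archive_field is a hypothetical helper name, and the real output may differ between Splunk versions.

```shell
#!/usr/bin/env bash
# Sketch: extract one field from the archive info output read on stdin.
# Assumes "key : value" lines as in the example results above.
kv_archive_field() {
    local field="$1"
    awk -v key="$field" '
    {
        # Split on the first colon only, since values such as created_at contain colons.
        pos = index($0, ":")
        if (pos == 0) next
        k = substr($0, 1, pos - 1)
        v = substr($0, pos + 1)
        # Trim surrounding spaces and tabs from both parts.
        gsub(/^[ \t]+|[ \t]+$/, "", k)
        gsub(/^[ \t]+|[ \t]+$/, "", v)
        if (k == key) { print v; exit }
    }'
}
```

Usage: /opt/splunk/bin/splunk show kvstore -archiveName kvdump_1712077797.tar.gz | kv_archive_field collection_count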
Using TrackMe level KVstore backup
TrackMe also backs up its own KVstore collections. The backup runs on a schedule, once per day. You can consult the builtin dashboard for more information about TrackMe backups:
TrackMe -> Menu -> API & Tooling -> TrackMe Backup and Restore
In a simple line of SPL, you can take a backup immediately:
| trackme url="/services/trackme/v2/backup_and_restore/backup" mode="post"
TrackMe reports which backups are available, on which servers, and where they are on the file-system. These files are located in the backup directory at the root of the TrackMe application:
Example:
splunk@trackme-demo-main:/data$ ls -ltr /opt/splunk/etc/apps/trackme/backup/
total 536
-rw------- 1 splunk splunk 170972 Apr 1 02:00 trackme-backup-20240401-020006.tgz
-rw------- 1 splunk splunk 184026 Apr 2 02:00 trackme-backup-20240402-020006.tgz
-rw------- 1 splunk splunk 189964 Apr 2 17:24 trackme-backup-20240402-182408.tgz
File-system backup + TrackMe KVstore collections backup
It is worth mentioning that if you take a backup at the file-system level, as described in "Backing up TrackMe configuration files", you also naturally include the TrackMe backup files of the KVstore collections.
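To confirm that a given file-system tarball really carries KVstore backup archives, you can list its contents and filter on the backup directory. This is a sketch under the assumption that the tarball was created from the apps directory, so that entries start with trackme/. list_embedded_kv_backups is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list the TrackMe KVstore backup archives embedded in a file-system tarball.
# The exit status is non-zero when none are found.
list_embedded_kv_backups() {
    local archive="$1"
    # File names follow the trackme-backup-YYYYMMDD-HHMMSS.tgz pattern shown earlier.
    tar -tzf "$archive" \
        | grep -E '^(\./)?trackme/backup/trackme-backup-[0-9]{8}-[0-9]{6}\.tgz$'
}
```

Usage: list_embedded_kv_backups /my_backups/trackme_backup_YYYYMMDD.tgz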
Restoring TrackMe
If you need to restore TrackMe to an earlier state, you will need to restore both the configuration files and the KVstore collections data.
In the following section, we will explore two scenarios, first restoring using Splunk level KVstore backup, and second restoring using TrackMe level KVstore backup.
We will assume that we have a ready backup of the trackme application directory available at:
/data/backups/trackme_backup_20240402.tgz
Downtime and errors while we proceed with TrackMe restoration
TrackMe may not work properly during the restoration process.
Depending on your context, TrackMe may for instance be unable to load the Virtual Tenants configuration, and will generate all sorts of errors in its logs.
This will be a temporary situation until the restoration is completed.
Restoring TrackMe configuration files using the file-system level backup
If running in a Standalone instance, stop Splunk.
If running in a Search Head Cluster, things are trickier: you will need to stop all members, remove the TrackMe directory if necessary and restore the backup, then restart all members. You will also need to ensure that the Search Head Cluster deployer publishes the right TrackMe version depending on your needs.
For simplicity, we will assume a standalone instance in the next steps.
Stop Splunk: (adapt if your systemctl service name is different)
sudo systemctl stop splunk
Clean and restore the TrackMe directory:
cd /opt/splunk/etc/apps
rm -rf trackme
tar -xpzf /data/backups/trackme_backup_20240402.tgz
Start Splunk: (adapt if your systemctl service name is different)
sudo systemctl start splunk
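Because the clean step removes the current directory before the archive is unpacked, it can be worth checking the archive first. The sketch below refuses to delete anything unless the tarball lists cleanly and contains the application directory. restore_app_dir is a hypothetical helper name, and Splunk is still expected to be stopped and started around it as shown above.

```shell
#!/usr/bin/env bash
# Sketch: restore an app directory from a tarball, validating the archive first.
# restore_app_dir is a hypothetical helper, not part of TrackMe or Splunk.
restore_app_dir() {
    local archive="$1"    # e.g. /data/backups/trackme_backup_20240402.tgz
    local apps_dir="$2"   # e.g. /opt/splunk/etc/apps
    local app_name="$3"   # e.g. trackme
    local listing

    # An unreadable or corrupt archive makes tar fail here, before any deletion.
    listing="$(tar -tzf "$archive")" || {
        echo "cannot read archive: $archive" >&2
        return 1
    }

    # The archive must contain the app directory at its top level.
    printf '%s\n' "$listing" | grep "^${app_name}/" > /dev/null || {
        echo "archive does not contain ${app_name}/" >&2
        return 1
    }

    [ -d "$apps_dir" ] || return 1

    # ${var:?} aborts if a variable is empty, which guards the rm -rf.
    rm -rf "${apps_dir:?}/${app_name:?}"
    tar -xpzf "$archive" -C "$apps_dir"
}
```

Usage: restore_app_dir /data/backups/trackme_backup_20240402.tgz /opt/splunk/etc/apps trackme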
KVstore restoration option 1: Using Splunk level KVstore backup
Review the Splunk documentation on KVstore backup and restore for further details.
Enable KVstore maintenance mode:
/opt/splunk/bin/splunk enable kvstore-maintenance-mode
Restore:
/opt/splunk/bin/splunk restore kvstore -pointInTime True -archiveName kvdump_1712077797.tar.gz
Verify:
/opt/splunk/bin/splunk show kvstore-status
Disable KVstore maintenance mode:
/opt/splunk/bin/splunk disable kvstore-maintenance-mode
Restart Splunk. This is not strictly required, but it does no harm and gives a clean start:
sudo systemctl restart splunk
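The maintenance-mode sequence above can be scripted so that maintenance mode is disabled even when the restore command fails. This is a sketch only: restore_kvstore_archive and the SPLUNK_BIN variable are hypothetical names, and the Splunk commands are the ones shown in the steps above. Depending on your setup, the CLI may prompt for credentials.

```shell
#!/usr/bin/env bash
# Sketch: run the KVstore restore between enable/disable of maintenance mode.
# SPLUNK_BIN is an assumed override point; the default matches this document.
SPLUNK_BIN="${SPLUNK_BIN:-/opt/splunk/bin/splunk}"

restore_kvstore_archive() {
    local archive_name="$1"   # e.g. kvdump_1712077797.tar.gz
    local rc=0

    # If maintenance mode cannot be enabled, do not attempt the restore.
    "$SPLUNK_BIN" enable kvstore-maintenance-mode || return 1

    # Record the restore status instead of returning early, so that
    # maintenance mode is always disabled afterwards.
    "$SPLUNK_BIN" restore kvstore -pointInTime True -archiveName "$archive_name" || rc=$?

    # The status output is informational; a failure here is ignored.
    "$SPLUNK_BIN" show kvstore-status || true

    "$SPLUNK_BIN" disable kvstore-maintenance-mode || rc=$?

    return "$rc"
}
```

Usage: restore_kvstore_archive kvdump_1712077797.tar.gz. A non-zero return means the restore or the final disable step failed and should be investigated before restarting Splunk.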
TrackMe should now be restored and operational
At this stage, TrackMe restoration is completed, the service should be up and running, and operational.
Review TrackMe behaviours, verify tenants are not degraded, and all tenants can be loaded successfully.
KVstore restoration option 2: Using TrackMe level KVstore backups
Restoring TrackMe KVstore collections using TrackMe builtin capabilities
TrackMe can restore its own collections easily, in a single line of SPL.
You can target either all TrackMe collections or a specific collection. In our global restoration context, we want to restore all collections.
These restore operations require Splunk to be up and running.
Locate the backup file you want to restore, example:
splunk@trackme-demo-main:/opt/splunk/etc/apps$ ls -ltr trackme/backup/
total 728
-rw------- 1 splunk splunk 170972 Apr 1 02:00 trackme-backup-20240401-020006.tgz
-rw------- 1 splunk splunk 184026 Apr 2 02:00 trackme-backup-20240402-020006.tgz
-rw------- 1 splunk splunk 189964 Apr 2 17:24 trackme-backup-20240402-182408.tgz
-rw------- 1 splunk splunk 194490 Apr 2 19:35 trackme-backup-20240402-203538.tgz
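Since the backup file names embed their timestamp, the most recent archive can be picked by name rather than by modification time. A small sketch, assuming the trackme-backup-YYYYMMDD-HHMMSS.tgz naming shown in the listing; latest_trackme_backup is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print the name of the most recent TrackMe backup archive in a directory.
# Relies on the timestamped file names sorting chronologically.
latest_trackme_backup() {
    local backup_dir="$1"   # e.g. /opt/splunk/etc/apps/trackme/backup
    local f latest=""

    # Globs expand in sorted order, so the last match has the newest timestamp.
    for f in "$backup_dir"/trackme-backup-*.tgz; do
        [ -e "$f" ] || continue   # the pattern matched nothing
        latest="$f"
    done

    [ -n "$latest" ] || return 1
    basename "$latest"
}
```

Usage: latest_trackme_backup /opt/splunk/etc/apps/trackme/backup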
Run the restore operation. We can first perform a dry run, which verifies the consistency of the compressed archive:
| trackme url="/services/trackme/v2/backup_and_restore/restore" mode="post" body="{'backup_archive': 'trackme-backup-20240402-203538.tgz', 'dry_run': 'true', 'target': 'all'}"
Run the actual restore operation:
| trackme url="/services/trackme/v2/backup_and_restore/restore" mode="post" body="{'backup_archive': 'trackme-backup-20240402-203538.tgz', 'dry_run': 'false', 'target': 'all'}"
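The two searches differ only in the dry_run value, and the nested quoting is easy to get wrong when editing by hand. The sketch below prints the SPL line for a given archive so it can be pasted into the Splunk search bar. trackme_restore_spl is a hypothetical helper name; the endpoint and body keys are copied from the searches above.

```shell
#!/usr/bin/env bash
# Sketch: print the TrackMe restore search for a given archive and dry_run value.
# Only prints the SPL; it does not contact Splunk.
trackme_restore_spl() {
    local archive="$1"        # e.g. trackme-backup-20240402-203538.tgz
    local dry_run="$2"        # true or false
    local target="${3:-all}"  # defaults to all collections

    # Reject anything other than the two values used in this document.
    case "$dry_run" in
        true|false) ;;
        *) echo "dry_run must be true or false" >&2; return 1 ;;
    esac

    printf "| trackme url=\"/services/trackme/v2/backup_and_restore/restore\" mode=\"post\" body=\"{'backup_archive': '%s', 'dry_run': '%s', 'target': '%s'}\"\n" \
        "$archive" "$dry_run" "$target"
}
```

Usage: run trackme_restore_spl trackme-backup-20240402-203538.tgz true for the dry run, then the same call with false for the actual restore.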
Restart Splunk. This is not strictly required, but it does no harm and gives a clean start:
sudo systemctl restart splunk
TrackMe should now be restored and operational
At this stage, TrackMe restoration is completed, the service should be up and running, and operational.
Review TrackMe behaviours, verify tenants are not degraded, and all tenants can be loaded successfully.