Hey there,
I'm in need of a bit of wisdom here:
I recently got myself a shiny new Vserver and have it running exclusively with docker containers that use regular volumes.
Now: how would I back this thing up? I mean, I have read a lot about borg for example, but I'm not sure how to do it the right way:
Do I just copy / to a repo and fiddle the data back into place if stuff goes wrong, or would it be wiser to back up the volumes, do a docker save to export the images to a folder, and then send that to the storage box?
Since the containers should be stopped during a backup to prevent data inconsistency: how do I tell Borg to do that? I've seen several approaches (Borg dockerized with some sort of access to docker's .sock, Borg set up on the host, and some even wilder approaches).
Since backups aren't something you can "try again" once you need them, I'd rather have a solution that works.
As a general rule, you should always keep in mind that you're not really looking for a backup solution but rather a restore solution. So think about what you would like to be able to restore, and how you would accomplish that.
For my own use, for example, I see very little value in backing up the docker containers themselves. They're supposed to be ephemeral and easily recreated with build scripts, so I don't use docker save or anything; I just make sure that the build code is safely tucked away in a git repository, which is itself backed up, of course. In fact, I have a weekly job that tears down and rebuilds all my containers, so my build code is tested and my containers are always up to date.
The actual data is in the volumes, so it just lives on a filesystem somewhere. I make sure to have a filesystem backup of that. For data that's in use and which may give inconsistency issues, there are several solutions:
docker stop your containers, create simple filesystem backup, docker start your containers.
Do an LVM level snapshot of the filesystem where your volumes live, and back up the snapshot.
The same, but with a btrfs snapshot (I have no experience with this; all my servers just use ext4).
If it's something like a database, you can often export with database specific tools that ensure consistency (e.g. pg_dump, mongodump, mysqldump, ... ), and then backup the resulting dump file.
Most virtualization software has functionality that lets you take snapshots of whole virtual disk images.
As for the OS itself, I guess it depends on how much configuration and tweaking you have done to it and how easy it would be to recreate the whole thing. In case of a complete disaster, I intend to just spin up a new VM, reinstall docker, restore my volumes and then build and spin up my containers. Nevertheless, I still do a full filesystem backup of / and /home as well. I don't intend to use this to recover from a complete disaster, but it can be useful to recover specific files from accidental file deletions.
For my personal stuff, I use docker compose to create bind mounts for all volumes (including databases). These are all within a docker-compose directory. Then I use a cron job to shut down the containers, then zip up the docker-compose directories and start the containers again.
It's important to shut them down so anything in RAM is written to disk.
The way to make sure your backups are working is to regularly restore from them (generally into a VM for me).
I wanted to migrate to a bigger SSD. I did it by backing up my old one with rsnapshot then restoring onto a fresh Linux install on the new SSD. I had to update /etc/fstab to mount my new SSD instead of my old one, and that was it! Easy peasy. Now I have backups and restores tested, too. And then I just set up a cron job to keep backing everything up nightly.
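As a sketch, the nightly job can be a single crontab line (the time and binary path are just an example):

```shell
# m h dom mon dow  command — run rsnapshot's daily rotation at 03:30 every night
30 3 * * *  /usr/bin/rsnapshot daily
```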
Back up the stuff you wouldn't want to have to do again.
I back up my music thrice, and don't back up my movies or TV unless it's particularly rare or hard to find.
I back up my nginx configs and my app folders, with all the docker configs and app files, manually every couple of weeks, since having a slightly out-of-date version isn't that bad.
For everyone here: I decided on restic and wrote a little script that backs up basically all docker volumes after stopping all containers. Now I just gotta collect and save all the relevant docker compose files somewhere (and incorporate all the little changes I made to the containers after deployment).
I’ve been using Kopia which runs in a docker container and backs up my data to B2. It does daily, weekly, monthly, and yearly copies. You can browse your backups on a file level inside the UI and redownload just what you want or do a full restore. It’s all encrypted in B2 as well. I’ve had to use it to download backups of corrupted SQLite files and I haven’t had a single issue with it yet.
In all these cases, for me it's sufficient to back up the folder which hosts the persistent data volume of each container. I typically don't bother stopping and restarting containers, but nothing prevents you from doing so in the backup script. If a database is involved, the best way is indeed to create a dump and back up that one file. As for tools, Borg and restic are both great. I am progressively moving from Borg to restic+rclone to take advantage of free cloud services.