<div dir="ltr"><div>Hi,<br><br>I'm evaluating backup tools to find the right one for my needs. This boils down to comparing Restic and Borg as they represent the "top shelf" solutions currently available on Linux.<br><br>Borg : 1.1.0rc3 (from the borg-linux64 binary)<br>Restic : 0.7.1 (from the restic linux_amd64 binary)<br><br>The backup data consists of a live mail repository using the maildir format and holding 139 GB (2327 dirs, 665456 files).<br><br>Keep in mind that this is the result of my own experience, for my own needs, and is in no way thorough or exhaustive.<br><br>BORG :<br>======<br><br>Note: Encryption is "repokey-blake2"<br><br>* First pass<br><br>Shell# time ./borg-linux64 create --info --stats --progress /path/to/BackupTests/Borg::{now:%Y%m%d-%H%M%S} /path/to/Mail/<br>Enter passphrase for key /path/to/BackupTests/Borg:<br>------------------------------------------------------------------------------<br>Archive name: 20170911-164308<br>Archive fingerprint: ea043fb5154c60ecdcb42e3be238cfa2ad040e03349f5ae5cab6a9f9f8fd48fe<br>Time (start): Mon, 2017-09-11 16:43:11<br>Time (end): Mon, 2017-09-11 17:58:32<br>Duration: 1 hours 15 minutes 20.67 seconds<br>Number of files: 646835<br>Utilization of max. 
archive size: 0%<br>------------------------------------------------------------------------------<br> Original size Compressed size Deduplicated size<br>This archive: 137.47 GB 112.55 GB 102.30 GB<br>All archives: 137.47 GB 112.55 GB 102.30 GB<br><br> Unique chunks Total chunks<br>Chunk index: 639574 680719<br>------------------------------------------------------------------------------<br><br>real 75m33.037s<br>user 23m22.756s<br>sys 3m51.228s<br><br>* Second pass<br><br>Shell# time ./borg-linux64 create --info --stats --progress /path/to/BackupTests/Borg::{now:%Y%m%d-%H%M%S} /path/to/Mail/<br>Enter passphrase for key /path/to/BackupTests/Borg:<br>------------------------------------------------------------------------------<br>Archive name: 20170911-181622<br>Archive fingerprint: 67e9f0e14fa092d274e99833806ca789eb88df890190ec37cedb5e4af20107a0<br>Time (start): Mon, 2017-09-11 18:16:25<br>Time (end): Mon, 2017-09-11 18:18:27<br>Duration: 2 minutes 1.73 seconds<br>Number of files: 646861<br>Utilization of max. 
archive size: 0%<br>------------------------------------------------------------------------------<br> Original size Compressed size Deduplicated size<br>This archive: 137.47 GB 112.55 GB 14.57 MB<br>All archives: 274.94 GB 225.10 GB 102.32 GB<br><br> Unique chunks Total chunks<br>Chunk index: 639728 1361461<br>------------------------------------------------------------------------------<br><br>real 2m13.070s<br>user 1m55.448s<br>sys 0m12.652s<br><br><br>RESTIC :<br>========<br><br>* First pass<br><br>Shell# time ./restic_0.7.1_linux_amd64 backup -r /path/to/BackupTests/Restic/ /path/to/Mail/<br>enter password for repository:<br>scan [/path/to/Mail]<br>scanned 2327 directories, 665464 files in 0:02<br>[1:48:16] 100.00% 21.440 MiB/s 136.009 GiB / 136.001 GiB 667813 / 667791 items 0 errors ETA 0:00<br>duration: 1:48:16, 21.44MiB/s<br>snapshot 9abedefd saved<br><br>real 108m23.314s<br>user 48m2.328s<br>sys 6m12.984s<br><br>* Second pass<br><br>Shell# time ./restic_0.7.1_linux_amd64 -r /path/to/BackupTests/Restic/ backup /path/to/Mail/<br>enter password for repository:<br>using parent snapshot 9abedefd<br>scan [/path/to/Mail]<br>scanned 2327 directories, 665575 files in 0:04<br>[0:47] 100.00% 2.855 GiB/s 136.010 GiB / 136.010 GiB 667902 / 667902 items 0 errors ETA 0:00<br>duration: 0:47, 2920.94MiB/s<br>snapshot 6c90edf6 saved<br><br>real 0m55.859s<br>user 2m10.312s<br>sys 0m9.364s<br><br></div><div>BORG vs RESTIC on Backup :<br>==========================<br><br>- Borg is much faster on the first pass (1h15m vs 1h48m) but significantly slower on the second pass (2m1s vs 47s)<br><br>- Borg repo size (103 GB) is smaller than Restic repo size (121 GB)<br><br>BORG vs RESTIC on mounted archives :<br>====================================<br><br>* Simple access to the mounted repositories :<br><br>Shell# time ls -l BorgMount/20170911-181622/<br>total 0<br>drwxr-xr-x 1 root root 0 sept. 
11 19:48 path<br><br>real 0m22.383s<br>user 0m0.000s<br>sys 0m0.000s<br><br>Shell# time ls -l ResticMount/snapshots/2017-09-11T18\:15\:18+02\:00/<br>total 0<br>drwx------ 3 mail mail 0 déc. 7 2014 Mail<br><br>real 0m0.003s<br>user 0m0.000s<br>sys 0m0.000s<br><br>- Borg needs 22 seconds to internally build the directory tree, Restic is instant.<br><br>- Interesting note : With Borg, the first visible directory is the first component of the specified backup path, "path" (/path/to/Mail), whereas Restic only keeps the last path component, "Mail" (/path/to/Mail).<br><br>* Extract some path from the mounted repositories :<br><br>- Shell# time cp -a BorgMount/20170911-181622/path/to/[...]/Trash BorgRestore/<br><br>real 3m36.534s<br>user 0m0.396s<br>sys 0m7.944s<br><br>NOTE: CPU usage was spiking at 100% when there was no disk activity (building internal listings is my guess) and jumping between 31~67% during disk activity (actual copy process)<br><br>- Shell# time cp -a ResticMount/snapshots/2017-09-11T18\:15\:18+02\:00/[...]/Trash ResticRestore/<br><br>real 6m23.970s<br>user 0m0.496s<br>sys 0m13.708s<br><br>NOTE: CPU usage never spikes and constantly jumps between 21~53% for the whole process<br><br>- The "Trash" directory holds 6.3 GB in 47945 files.<br><br>- Borg restores the exact same data about 1.8x faster (3m36s vs 6m24s) while using about 2x more CPU.<br><br><br>* Fetch deep info on mounted repositories :<br><br>Shell# time du -s --si ResticMount/snapshots/2017-09-11T18\:15\:18+02\:00/<br>147G ResticMount/snapshots/2017-09-11T18:15:18+02:00/<br><br>real 1m18.590s<br>user 0m0.800s<br>sys 0m4.036s<br><br>NOTE: CPU usage around 46% for the whole process<br><br>Shell# time du -s --si BorgMount/20170911-181622/<br>138G BorgMount/20170911-181622/<br><br>real 5m30.143s<br>user 0m0.864s<br>sys 0m4.956s<br><br>NOTE: CPU usage at 100% for the whole process<br><br>- Borg is about 4x slower (5m30s vs 1m19s) to get the same information</div><div><br></div><div><br>BORG vs RESTIC trying to backup while having mounted archives 
:<br>===============================================================<br><br>NOTE: A typical use case would be restoring a very big file from a deeply nested/complex directory hierarchy, where using the "extract/restore" command would be impractical. Retrieving said file could take so long that it overlaps with the next scheduled backup, for example.<br><br>Shell# ./borg-linux64 create --info --stats --progress /path/to/BackupTests/Borg::{now:%Y%m%d-%H%M%S} /path/to/Mail/<br>Failed to create/acquire the lock /path/to/BackupTests/Borg/lock (timeout).<br><br>Shell# ./restic_0.7.1_linux_amd64 -r /path/to/BackupTests/Restic backup /path/to/Mail/<br>enter password for repository: <br>using parent snapshot 6c90edf6<br>scan [/path/to/Mail]<br>scanned 2327 directories, 665655 files in 0:03<br>[0:38] 100.00% 3.518 GiB/s 136.039 GiB / 136.039 GiB 667982 / 667982 items 0 errors ETA 0:00<br>duration: 0:38, 3581.52MiB/s<br>snapshot 64106e49 saved<br><br>NOTE: Here the Restic design has a clear advantage. Quoting the doc : "All files in a repository are only written once and never modified afterwards. This allows accessing and even writing to the repository with multiple clients in parallel".<br><br><br>Features I like in Restic :<br>===========================<br><br>- Nothing is written outside the repository<br>- The "views" on mounted repositories (host, snapshots and tags)<br>- Multiple "keys" per repository (like LUKS)<br><br><br>Features I *DISLIKE* in Restic :<br>================================<br><br>- The "mount" command blocks the shell and waits for "CTRL-C" to end. Trying to unmount while the mountpoint is busy (i.e. cd /path/to/mountpoint in a different shell) ends badly :<br>"unable to umount (maybe already umounted?): exit status 1: fusermount: failed to unmount /path/to/BackupTests/ResticMount: Device or resource busy"<br>and needs manual intervention. 
Borg also complains, but a second invocation of the "umount" command works once the mountpoint is no longer busy.<br>- No support for sparse files (AFAIK), which makes it unusable for VM images and such.<br><br><br>Features I *DISLIKE* in Borg :<br>==============================<br><br>- Writes several files OUTSIDE the repository (~/.config/borg and ~/.cache/borg) and, AFAIK, there's no option to use other paths for these files.<br>- The "several seconds or more" delays when mounting repositories and scanning deeper directories.<br></div><div><br></div><div>I can live with the delays but I really wish there was an option to relocate the ".config" and ".cache" data. I need this because it makes it easier to copy the data offsite without forgetting anything! I know that ".cache" is disposable, but having this data available when restoring during disaster recovery is a huge time saver.<br></div><div><br>Features I *DISLIKE* in BOTH tools :<br>====================================<br><br>- Their design is geared towards "backup-and-push-to-repository", which is nice but not what I want in my environment. I need a "repository-pulls-backup-from-agent" design. 
Both tools could offer an additional "agent" command that would :<br> * Use ssh transport by default to contact a host, with the benefits of ssh keys (authorized keys, ...)<br> * Spawn a Borg/Restic instance to make the backup on the remote host (like a normal Borg call) but feed the result back to the calling Borg, which holds the repository<br> * Provide a way to securely transmit the repokey data to the remote instance so the local Borg can mount/check the local repository<br><br> Of course, it would be the administrator's responsibility to set everything up accordingly, using either one repokey for every remote host or scripting something a bit smarter to use a repokey per host or group of hosts, whatever suits the needs.<br><br> Why such a setup?<br><br> Because, in my case at least, the backup server is of critical importance and network isolated from the other hosts. I really don't want the "all-hosts-can-contact-the-backup-server" style but the "only-backup-server-can-contact-hosts" kind of behavior. This also helps to limit the strain on the backup server. Having all the hosts, with no predictable backup size, hammering the backup server at the same time (cronjob) is not desirable, especially on sites with storage on a budget :-)<br><br> For instance, I currently use a very spartan/crude system which is nevertheless rock solid and has never failed once in over two decades: a simple script which, in sequence, connects via SSH to each host and uses the remote tar command to perform the backup. SSH's piped stdout/stderr makes it possible to retrieve the tarball as well as any errors and act accordingly. This is not scalable but highly effective, battle tested and disaster recovery proven! 
Booting a new server with some rescue OS and restoring from a tarball works in ALL conditions, no matter how long it takes :-) But now I need encryption and deduplication given the huge size of the data to back up, hence my tests with Borg/Restic, which both have nice features *AND* provide a single-file binary for disaster scenarios.<br><br clear="all"><br>-- <br><div class="gmail_signature">Unix _IS_ user friendly, it's just selective about who its friends are.</div>
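<div><br>A minimal sketch of the sequential SSH + tar pull described above. This is an illustration only, not the actual script: the function name pull_backups, the SSH override variable and the HOST.tar / HOST.err file naming are all assumptions.<br><br></div>
<pre>
```shell
#!/bin/sh
# Pull-style backup sketch: the backup server contacts each host in
# sequence and streams a tarball back over SSH.
# Assumptions (not from the original script): the function name, the SSH
# override variable, and the HOST.tar / HOST.err naming.

pull_backups() {
    dest_dir=$1      # local directory that receives the tarballs
    remote_path=$2   # path to archive on every host (no spaces: ssh re-splits it)
    shift 2          # remaining arguments are the hosts to contact
    failed=0
    for host in "$@"; do
        # Remote tar writes the archive to stdout; its stderr is kept for review.
        if ${SSH:-ssh} "$host" tar -cf - "$remote_path" \
            > "$dest_dir/$host.tar.part" 2> "$dest_dir/$host.err"
        then
            # Only a complete archive gets its final name.
            mv "$dest_dir/$host.tar.part" "$dest_dir/$host.tar"
        else
            echo "backup of $host failed, see $dest_dir/$host.err"
            rm -f "$dest_dir/$host.tar.part"
            failed=1
        fi
    done
    return "$failed"
}
```
</pre>
<div>A call could look like: pull_backups /srv/backups /path/to/Mail host1 host2 (the destination directory and host names are made up). Hosts are handled one at a time, so the backup server is never hit by several transfers at once, and the return status is non-zero if any host failed.<br></div>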
</div></div>