du command slow over multiple snapshots
I'll start by acknowledging that this is not a bug in backintime. If this is the wrong place to ask this question, please consider suggesting where I should ask instead.

I've set up a Linux server to serve as a repository for remote backintime backups for several systems. For each system's backup I'm trying to create a report of the amount of disk space used for the latest snapshot (easy; I'll show the command below) and the total disk space used for all the snapshots for that system.

Getting the disk space used for the latest snapshot (or any particular snapshot) presents no difficulty. For example:

```bash
/usr/bin/du -sx /pool/backup/backintime/mysystem.example.com/user/1/last_snapshot/backup/*
```

The problem comes when I want to see the total disk use for `mysystem.example.com`. This command gives the correct answer, as far as I can tell:

```bash
/usr/bin/du -sx /pool/backup/backintime/mysystem.example.com/
```

The answer appears to be correct, but as the number of snapshots increases the du command takes longer and longer to execute. For some of the backups with a large number of files, that one du command takes hours to run.

My guess is that this has something to do with how du handles the hard links in order to get that correct answer. Based on my fiddling around, it appears that du visits every snapshot directory and goes through every file it finds, even if only a few files differ between snapshots. The result appears to be that if the first `du` command takes ten minutes due to the number of files in the snapshot, the second `du` command takes ten minutes times the number of snapshots.

Is my guess correct? Or is this due to something else? Is there any work-around?

The purpose of the report I'm creating is to understand how much actual disk space is being used over time for backintime backups. If I found that new snapshots were taking up a lot of space, it would mean the system's user was refreshing a lot of large files between snapshots, and I'd want to re-evaluate the frequency of their backups or how long backintime retained the snapshots.

AlmaLinux 9.5, backintime 1.3.2, obtained via the EPEL repository: `dnf install backintime-qt`

The disk has:

```
# df -h /pool
Filesystem             Size  Used Avail Use% Mounted on
/dev/mapper/POOL-pool   17T  5.1T   11T  33% /pool

# df -hi /pool
Filesystem             Inodes IUsed IFree IUse% Mounted on
/dev/mapper/POOL-pool    262M   58M  205M   22% /pool
```

--
Bill
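For reference, the two measurements described above side by side; the paths are the ones from the message, and `time` is just the shell builtin:

```bash
HOST_DIR=/pool/backup/backintime/mysystem.example.com

# Latest snapshot only: du walks a single snapshot's directory tree.
time /usr/bin/du -sx "${HOST_DIR}"/user/1/last_snapshot/backup/*

# All snapshots: the total is correct because du counts each hard-linked
# inode only once, but to do that it has to stat every directory entry in
# every snapshot, so runtime grows roughly linearly with snapshot count.
time /usr/bin/du -sx "${HOST_DIR}/"
```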
On 1/24/25 3:03 AM, William Seligman wrote:
Hi William,

Your guess is correct. du visits every file one by one, looks at which inode it points to, and counts it towards the total only if it is a new inode it hasn't counted before.

The thing is, a hard link is indistinguishable from a "real" file: a file is really an inode (plus its data blocks), and every directory entry for that file is a hard link pointing to that inode. Having more than one link is not a problem; when the hard-link count drops to zero, the file is essentially deleted. Symbolic links are different: they are small files that point to a path (i.e. to some hard link), an indirection in a sense.

Since there is no way to distinguish between a "file" and a "hard link" (they are the same thing), there's no workaround for this phenomenon, sorry: du has to visit every directory entry in every snapshot and check its inode.

If you want to ease the burden on your system, you can get the du counts per host (per folder, in your case) and create the report that way. If your backend is an SSD, you can parallelize these commands to get the most out of the SSD and shorten the work a bit (unrelated tip: if the SSD is an external one, watch the temperatures; these things can reach throttling temperatures when driven hard). If you feel like it, you can also look into diffoscope [0] to compare your directories.

If you have any doubts, or I failed to convey this clearly, please answer this e-mail and I'll do my best to clarify further.

Cheers,
Happy backuping,
Hakan

[0]: https://diffoscope.org/
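A small illustration of the hard-link point (GNU coreutils assumed; the /tmp path and file names are arbitrary):

```bash
# A hard link is just another directory entry for the same inode, so du
# can only avoid double-counting by remembering every inode it has seen.
mkdir -p /tmp/hardlink-demo
cd /tmp/hardlink-demo

dd if=/dev/zero of=file bs=1M count=10 status=none   # one 10 MiB file
ln file link                                         # second hard link to the same inode

stat -c 'inode=%i links=%h name=%n' file link        # same inode, link count 2 for both
du -sh .                                             # ~10M: the shared inode is counted once
du -sh file link                                     # 10M for file, 0 for link (inode already seen)
```

The last command also shows that within a single invocation du charges the space to whichever argument reaches the inode first, which is exactly the bookkeeping that makes the full-tree scan slow.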
It seems like the du command is slow over multiple snapshots because it's scanning the entire file system, including old snapshots. To speed things up, consider using the --summarize option to reduce the output. Additionally, excluding snapshot directories using --exclude might help if you know where they are.
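A sketch of those two flags with GNU du; the exclude pattern here is only a hypothetical illustration, assuming date-stamped snapshot directory names:

```bash
# -s/--summarize prints one total per argument instead of one line per
# subdirectory; --exclude makes du skip any name matching the glob.
# Neither changes how du de-duplicates hard links, so everything it still
# visits is still stat()ed file by file.
du -sx --exclude='2024*' /pool/backup/backintime/mysystem.example.com/
```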
That is exactly what I'm doing now: only summarizing the space used by the last_snapshot/backup directory. I actually go a bit further, to handle the case where a backintime rsync might complete in the middle of a du scan; I determine the actual current snapshot directory with:

```bash
snapshots=( $( ls -d ${BASE}/*/root/*/last_snapshot/backup ) )
for index in "${!snapshots[@]}" ; do
    snapshot=${snapshots[${index}]}
    realdir=$( /usr/bin/realpath ${snapshot} )
    result=$( /usr/bin/du -sx ${realdir} )
    # ... display stuff from $result
done
```

That works, and is relatively fast. But what I'd like to know is how much _actual_ disk space is being used by the historical snapshots as compared to the latest one. Arithmetically, what I'm looking for is this pseudo-code:

```bash
totalsystem=$( du -sx ${BASE}/$somesystem/root/ )
totallatest=$( du -sx ${BASE}/$somesystem/root/*/last_snapshot/backup )
snapshots_space=$(( totalsystem - totallatest ))
```

It's the "du" for totalsystem that takes an extremely long time, for the reason you gave. However, it gives the correct actual disk usage. I was hoping for some clever way to calculate this without having to recurse through the entire filesystem for each snapshot.

What I'm trying now is the "duc" utility <https://github.com/zevv/duc> to index the entire disk. The "duc info" command is very fast, but indexing the drive with "duc index" takes as long as running "du". I've set up a monthly job to run "duc index" over my entire drive, which will take many days to execute. Afterwards, I'll hopefully be able to get an accurate measure of snapshot disk-space use... but only once a month.

On 2/5/25 8:08 AM, o.stebliuk--- via Bit-dev wrote:
--
Bill
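A minimal runnable version of that pseudo-code, for reference: it assumes GNU coreutils du and the same ${BASE}/<system>/root layout used above; `somesystem` is a placeholder, and the awk step just strips the size field out of du's "SIZE PATH" output. The slow full-tree du is unchanged, so this is only the arithmetic, not a speed-up.

```bash
# Sketch only: BASE and somesystem are placeholders matching the layout
# in the pseudo-code above; sizes are in KiB (du's default 1K blocks).
BASE=/pool/backup/backintime
somesystem=mysystem.example.com

# Total for the whole system (every snapshot, hard links counted once).
totalsystem=$( /usr/bin/du -sx "${BASE}/${somesystem}/root/" | awk '{print $1}' )

# Latest snapshot(s); sum in case more than one profile matches the glob.
totallatest=$( /usr/bin/du -sx "${BASE}/${somesystem}/root/"*/last_snapshot/backup \
               | awk '{sum += $1} END {print sum}' )

# Space attributable to the historical snapshots alone.
snapshots_space=$(( totalsystem - totallatest ))
echo "${somesystem}: total=${totalsystem}K latest=${totallatest}K historical=${snapshots_space}K"
```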
participants (3)

- Hakan Bayındır
- o.stebliuk@youarelaunched.com
- William Seligman