[Borgbackup] Further development idea/proposal: Heal the damaged repository by replacing damaged blocks from other repositories created from same sources

JK qzwx2007 at gmail.com
Wed Apr 3 08:05:06 EDT 2019


Hi,

Further development idea/proposal:

So borg check --repair replaces damaged blocks with all-zero blocks but
remembers the hash calculated from the original content.

Would it be possible to "heal" the damaged repository by replacing these
damaged blocks from other repositories created from the same sources?

I keep most of my repositories on USB disks. There are several USB
disks, usually 3, which I switch daily. The most critical data is also
backed up to a local disk repository. In these critical cases the backup
always runs twice, first to the local disk repo and then to the USB disk
repo. Less critical repositories are only on USB disks.

So although the repositories on different USB disks and on the local
disk are not identical, they have lots of common content because they
all have identical source directory settings and are pruned with the
same policy.

Now imagine I find that one (or more) of these repositories is
partially corrupted. (Most likely the damage is in only one repository,
but if other repositories are damaged too, the damage most likely does
not affect the same source files or the same areas of those files.)

Would it be possible to repair the damaged repository by replacing
zeroed blocks in it with blocks from another repository when the
hashes in both repositories are identical? This way we could heal the
damaged repository, or at least decrease the number of zeroed blocks,
even if the other repository is itself partially damaged.
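
To make the idea concrete, here is a rough sketch in Python. It is not
borg code and uses no borg API. Each repository is modelled as a plain
dict mapping a chunk ID to the chunk's bytes, and chunk_id() and heal()
are hypothetical names. It assumes that every repository computes the
same ID for the same content (here a plain SHA-256); whether that holds
for real borg repositories would depend on how each repository's key
derives its chunk IDs.

```python
import hashlib


def chunk_id(data: bytes) -> str:
    """Hypothetical chunk ID: plain SHA-256 of the content (an assumption)."""
    return hashlib.sha256(data).hexdigest()


def heal(damaged: dict, missing_ids: set, donors: list) -> set:
    """Copy missing chunks from donor repositories into `damaged`.

    Returns the set of chunk IDs that are still missing afterwards.
    """
    still_missing = set()
    for cid in missing_ids:
        for donor in donors:
            data = donor.get(cid)
            # A donor may be partially damaged too, so accept its chunk
            # only if the content really hashes to the remembered ID.
            if data is not None and chunk_id(data) == cid:
                damaged[cid] = data
                break
        else:
            # No donor had a valid copy of this chunk.
            still_missing.add(cid)
    return still_missing


# Example: chunk b is corrupt in donor1 but intact in donor2;
# chunk c exists in no donor at all.
a, b, c = b"alpha", b"beta", b"gamma"
damaged = {chunk_id(a): a}
missing = {chunk_id(b), chunk_id(c)}
donor1 = {chunk_id(a): a, chunk_id(b): b"corrupt"}
donor2 = {chunk_id(b): b}
left = heal(damaged, missing, [donor1, donor2])
```

Verifying the donor's content against the remembered hash is what would
let even a partially damaged repository serve as a donor: a bad copy is
simply skipped and the next repository is tried.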

JK



On 3.4.2019 1.46, Marcin Zajączkowski wrote:
> Thanks for your comprehensive explanations!
>
> Marcin
>
>
> On 2019-04-02 23:47, Thomas Waldmann wrote:
>>> Changing a little bit my question, would I know after a repair operation
>>> or while getting my files from the backup (also after a repair
>>> operation) that those accessed files are corrupted?
>>>
>>> Or those files would be read as any other files, just having
>>> occasionally some zeros inside?
>> If you try to extract stuff from a corrupt repo, you will get exceptions
>> like ObjectNotFound or IntegrityError, so you'll definitely notice
>> something is wrong.
>>
>> borg check --repair tries to get a repo into a consistent state.
>>
>> That doesn't mean that data which is lost can be magically brought back,
>> but it will either delete corrupt archives or replace missing/corrupt
>> content blocks in files by all-zero blocks of same size (and also it
>> will remember the correct block hashes).
>>
>> Repo objects that have invalid contents (invalid crc or invalid MAC)
>> will be removed from the repo.
>>
>> If you extract such a "zero-patched" file, borg will warn you about it.
>> borg mount will reject reading such files, except when mounting with a
>> special option.
>>
>> If you do a backup again after such a repair that reproduces objects
>> which were lost / corrupted and you run borg check --repair again
>> afterwards, borg might be able to heal some "patched" files (because it
>> notices that lost blocks are there again and it still knows the correct
>> hash of the previously missing blocks).
>

