[Borgbackup] status of cache resync improvements
Dan Christensen
jdc at uwo.ca
Fri Dec 4 14:37:16 EST 2015
Thomas Waldmann <tw at waldmann-edv.de> writes:
> As it currently looks, the cache merging code will get a LOT faster in
> next release (already in github master branch) due to that fix:
>
> https://github.com/borgbackup/borg/issues/450 (this also applies to
> attic, btw.)
Ah, nice to hear! I hope to give this a try. Do you have an ETA for
the next release?
>> This repo contains 188 archives.
>
> Be careful, with borgbackup you will need a LOT of space for
> ~/.cache/borg (except if you switch off per-archive index caching, see
> the README about chunks.archive.d).
Thanks for the tip. Can you point me to this README? I may turn this
off if the other changes improve things enough.
>> Because of this, it seems to me to make sense to keep a copy of the
>> cache in the remote repo, and then copy to the local machine when borg
>> notices that the cache is out of sync. The cache would add 0.3% to the
>> repo size, in this case.
>
> We have to be careful to not disclose information about your stuff (the
> data in the repo might be encrypted, but the index stuff is not [yet?]).
Since this is information that all clients need and that is slow to
recompute, I really think that it would make sense to store it in the
repo. When a client's cache is out of date, it could either simply
download it from the server, or an rsync-like method could be used to
only download changed portions.
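To illustrate what I mean by downloading only changed portions, here is a rough sketch. It is not rsync's actual rolling-checksum algorithm, just a simplified fixed-block comparison, and the block size and function names are made up for illustration. The server would publish one digest per block of its chunks cache, and the client would fetch only the blocks whose digests differ from its own copy:

```python
import hashlib

# Arbitrary block size chosen for illustration only.
BLOCK_SIZE = 64 * 1024


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(local_digests: list, remote_digests: list) -> list:
    """Return indices of remote blocks the client would need to download.

    A block needs downloading if the client has no block at that index
    or if its digest differs from the remote one.
    """
    return [
        i
        for i, digest in enumerate(remote_digests)
        if i >= len(local_digests) or local_digests[i] != digest
    ]
```

Unlike real rsync, fixed blocks like these only help when the file changes in place or grows at the end. An insertion near the start would shift every later block and force a full download.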
I understand that this might be a problem for encrypted repos, although
I imagine it would be possible to work around this. For example, the
chunk id_hash and unencrypted size could be encrypted before being
stored in the chunks cache, if they are considered sensitive. Then the
server could keep the reference counts up to date in the chunks cache.
I personally don't mind having my cache stored unencrypted on the
server, so I might make a wrapper around borg that downloads the
cache from the server before each backup, and copies it back after
each backup. To make sure I understand things fully:
cache/files should never be out of date. It contains only data local to
the client, so it shouldn't need updating if another client updates the repo.
cache/chunks should contain nothing specific to the client, so it could
be shared between clients.
cache/config: what about this?
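For concreteness, the wrapper I have in mind would look roughly like the sketch below. It is untested against a real repo. The host, the paths and the function names are placeholders, and it assumes rsync is available on both ends. It copies only cache/chunks by default, since whether cache/config has to travel with it is exactly the question above:

```python
import subprocess
from typing import Callable, List, Sequence

# Assumption: only "chunks" is shared; add "config" here if it turns out
# that it must be kept in step with the chunks file.
SHARED_FILES = ("chunks",)


def build_steps(remote_cache: str, local_cache: str,
                borg_args: Sequence[str],
                shared_files: Sequence[str] = SHARED_FILES) -> List[List[str]]:
    """Return the commands to run, in order: download, backup, upload."""
    remote = remote_cache.rstrip("/")
    local = local_cache.rstrip("/")
    download = [["rsync", "-a", f"{remote}/{name}", f"{local}/{name}"]
                for name in shared_files]
    backup = [["borg", *borg_args]]
    upload = [["rsync", "-a", f"{local}/{name}", f"{remote}/{name}"]
              for name in shared_files]
    return download + backup + upload


def run_steps(steps: List[List[str]],
              runner: Callable = subprocess.run) -> None:
    """Run each command in turn, stopping at the first failure.

    With check=True a failed download or backup raises an exception,
    so a cache from a failed backup is never copied back to the server.
    """
    for argv in steps:
        runner(argv, check=True)
```

It would be called with something like build_steps("backuphost:borg-cache", "/home/me/.cache/borg/REPOID", ["create", "backuphost:repo::today", "/home/me"]) and the result passed to run_steps. The very first run would need the download step skipped, since there is nothing on the server to fetch yet.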
Thanks for the help! It's great to see such active development on
this project.
Dan