[Borgbackup] Deduplication of tar files - doesn't seem to be giving good performance
Sitaram Chamarty
sitaramc at gmail.com
Thu Apr 21 02:58:11 EDT 2016
On 04/21/2016 12:22 PM, William Gogan wrote:
>
>
> Sitaram Chamarty wrote:
>> On 04/21/2016 11:15 AM, William Gogan wrote:
>>> I'm trying borgbackup out, and so far it's performing really well in almost all tests.
>>>
>>> The one item where I'm seeing odd performance is for tar files. It appears not to be deduplicating except within the current archive.
>>>
>>> Background: Our VM tool kicks out a .tar file per container. It compresses (lzo) the .tar. For discussion purposes, let's pretend it's called vm.tar.lzo
>>
>> Compression changes the bytestream. You may get lucky and the changes
>> only happened to files at the end of a tar file, but that's unlikely.
>> Depending on how many files changed, the probability that something changed
>> at the beginning of the tar file is pretty high.
> Just to confirm - even though, as I mentioned, I'm piping lzop -d --to-stdout vm.tar.lzo to borg (i.e. borg is not getting a compressed file; it is being piped the uncompressed .tar file), it sounds like Borg isn't capable of handling duplicate pieces inside a file.
Oops, my apologies. I reacted too fast and did not realise that borg was
getting an uncompressed file.
I assume this means borg gets the file via STDIN? If so, maybe it has
something to do with STDIN being less amenable to dedup?
Sorry again for my previous (useless) mail!
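
For anyone who wants to see the "compression changes the bytestream" point
from earlier in this thread for themselves, here is a rough sketch. It rests
on several assumptions: zlib stands in for lzo (which is not in the Python
standard library), the data is synthetic, and the fixed 4096-byte blocks are
purely illustrative. Borg itself uses content-defined chunking, so this is
not how borg splits its input; it only illustrates how a one-byte change
near the start affects raw versus compressed data.

```python
import hashlib
import random
import zlib

BLOCK = 4096  # hypothetical block size, chosen only for this illustration


def block_hashes(data):
    """Hash each fixed-size block of data."""
    return [hashlib.sha256(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]


def shared_blocks(old, new):
    """Return (blocks of `new` already present in `old`, total blocks of `new`)."""
    seen = set(block_hashes(old))
    new_hashes = block_hashes(new)
    return sum(h in seen for h in new_hashes), len(new_hashes)


# Synthetic, compressible "tar-like" data: numbered lines of repeated words.
rng = random.Random(0)
words = [b"alpha", b"beta", b"gamma", b"delta", b"epsilon"]
old = b"".join(
    b"%d %s\n" % (i, b" ".join(rng.choice(words) for _ in range(8)))
    for i in range(20000)
)

# Flip one bit in a single byte near the beginning (length is unchanged).
changed = bytearray(old)
changed[100] ^= 0x01
new = bytes(changed)

raw_shared, raw_total = shared_blocks(old, new)
comp_shared, comp_total = shared_blocks(zlib.compress(old), zlib.compress(new))

print("uncompressed: %d of %d blocks unchanged" % (raw_shared, raw_total))
print("compressed:   %d of %d blocks unchanged" % (comp_shared, comp_total))
```

On the uncompressed data only the one block containing the changed byte
differs. On the compressed data the number of matching blocks is typically
far lower, because the change alters the encoding of what follows it.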