[borgbackup] Borg speed tuning on large files
Alex Gorbachev
ag at iss-integration.com
Sun Aug 30 23:27:29 EDT 2015
Hi Thomas,
On Sat, Aug 29, 2015 at 9:23 AM, Thomas Waldmann <tw at waldmann-edv.de> wrote:
>> Tool             Parameters         Data size (apparent)  Repo size   Hrs  Ratio  C Rat  C MB/s
>> gzip             c3                 2308843696            560376600   22    24%   4.1    7
>> Attic First Run  default            2251760621            531964928   48    24%   4.2    3
>> Attic Next Run   default            2308843696            234398336   32    10%   9.9    2
>> Borg First Run   C0,19,23,21,4095   2330579192            2354907008  26   101%   1      25
>> Borg Next Run    C0,19,23,21,4095   2270686256            1341393408  18    59%   1.7    21
>> Borg First Run   C3,19,23,21,4095   2270686256            568351360   33    25%   4      5
>> Borg Next Run    C3,19,23,21,4095   2268472600            302165632   23    13%   7.5    4
>> Borg Next Run    C1,19,23,21,4095   2247244128            422037120   24    19%   5.3    5
>
> Nice to see confirmation that we are quite a bit faster than Attic. :)
>
> Hmm, should the last line read "Borg First Run ... C1"?
Yes, I switched the [now obsolete] compression parameter to level 1 for that "next run".
>
> In general, to evaluate the speed, it might be easier to only do "first
> runs", because then a known amount of data (== all input data) gets
> processed.
But...in that case gzip beats all :).
>
> In "next run", the amount of data actually needing processing might vary
> widely, depending on how much change there is between first and next run.
Understood, though the point of dedup is to save space on
shared/unchanged data regions. In my case the data is apparently not
that similar: a 59% repo size with no compression means we only found
41% "same data", whereas I know that in these databases even 10% change
per day is high. So maybe I need to go chunk-size hunting. For others
this will likely work more efficiently.
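For anyone else hunting: the chunker parameters are
MIN_EXP,MAX_EXP,MASK_BITS,WINDOW_SIZE, so the chunk sizes a given set
produces can be checked with a quick shell calculation. The smaller
parameter set at the end is just a hypothetical starting point, not a
recommendation:

```shell
# Chunker params are CHUNK_MIN_EXP,CHUNK_MAX_EXP,HASH_MASK_BITS,WINDOW:
# chunks fall between 2^MIN_EXP and 2^MAX_EXP bytes, with a target
# average around 2^MASK_BITS bytes.  For 19,23,21,4095:
min_exp=19; max_exp=23; mask_bits=21
echo "min chunk: $((1 << min_exp)) bytes"    # 524288  (512 KiB)
echo "max chunk: $((1 << max_exp)) bytes"    # 8388608 (8 MiB)
echo "avg chunk: $((1 << mask_bits)) bytes"  # 2097152 (2 MiB)

# Smaller chunks may dedup database files better, e.g. (hypothetical):
#   borg create -C1 --chunker-params 15,19,17,4095 repo::archive data
```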
> BTW, note for other readers: the "Parameters" column can't be given that
> way to borg, it needs to be (e.g.):
> borg create -C1 --chunker-params 19,23,21,4095 repo::archive data
>
> Or in 0.25:
> borg create -C zlib,1 --chunker-params ....
>
>> Here is a picture in case the text does not come through well:
>
> Yeah, that looked better. :)
>
> BTW, what you currently have in the C MB/s column is how many compressed
> MB/s it actually writes to storage (and if that is a limiting factor, it
> would be your target storage, not borg).
Sorry, I should have commented: C is for computed, i.e. repo size
divided by elapsed time. I assume storage is not an issue, as this box
can pump uncompressed data at 50+ MB/s.
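To make the arithmetic concrete, here is the gzip row worked through,
assuming the sizes in the table are KB (du-style output); if they are
raw bytes the numbers come out far lower, so KB seems to be the unit:

```shell
# "C" columns are computed: repo size divided by elapsed time.
# gzip row: 560376600 (assumed KB) written over 22 hours.
repo_kb=560376600; hours=22
kb_per_s=$(( repo_kb / (hours * 3600) ))
echo "$(( kb_per_s / 1000 )) MB/s compressed"   # prints "7 MB/s compressed"
```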
>
> Maybe more interesting would be how much uncompressed data it can
> process per second.
>
>> Oddly, compression setting of 1 took longer than C3.
>
> Either there is a mistake in your table or your cpu is so fast that
> higher compression saves more time by avoiding I/O than it needs for the
> better compression.
That makes sense, CPU on this box is quite powerful.
>
> With 0.25.0 you could try:
> - lz4 = superfast, but low compression
> - lzma = slow/expensive, but high compression
> - none - no compression, no overhead (this is not zlib,0 any more)
Started lz4 trials tonight, will update!
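For anyone following along, the 0.25 invocations for those modes would
look roughly like this (repo and archive names are placeholders):

```shell
# borg 0.25 compression selection (repo::archive and data are placeholders):
borg create -C lz4 repo::archive data      # superfast, low compression
borg create -C zlib,6 repo::archive data   # zlib at level 6
borg create -C lzma,6 repo::archive data   # slow/expensive, high compression
borg create -C none repo::archive data     # no compression, no overhead
```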
>
>> C0 shows the actual dedup capability of this data.
>
> Doesn't seem to find significant amounts of "internal" duplication
> within a "first run". Historical dedup seems to work and help, though.
>
> Does that match your expectations considering the contents of your files?
It's a big mystery: a highly esoteric database (think MUMPS :), but I
know overall change is unlikely to exceed 10% of "business content"
per day. So I am probably not finding the right chunk size yet.
>
> In case you measure again, keep an eye on CPU load.
I see borg taking 99% of one core, load average in the 3-4 range, but
other processes are working too, so this may be a bit muddled; I will
observe again at idle times.
>
>> My business goal here is to get
>> the data in within a day, so about 12 hours or so.
>
> If you can partition your data set somehow into N pieces and use N
> separate repos, you could save some time by running N borgs in parallel
> (assuming your I/O isn't a bottleneck then).
>
> N ~= core count of your CPU
>
> At some time in the future, borg might be able to do a similar thing by
> internal multithreading, but that is not ready for production yet.
Understood, hard to do and make safe. Thanks.
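A sketch of that partitioning approach (N and the paths are
hypothetical; shown as a dry run that only prints the commands, so
remove the "echo" to actually execute):

```shell
# Hypothetical sketch: split the data into N subtrees, one repo each,
# and run N borg processes in parallel in the background.
N=4
for i in $(seq 1 "$N"); do
    echo borg create -C1 --chunker-params 19,23,21,4095 \
        "/backup/repo$i::daily" "/data/part$i" &
done
wait   # block until all N background borgs finish
```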
>
> There are also some other optimizations possible in the code (using
> different hashes, different crypto modes, ...) - we'll try making it
> much faster.
Much appreciated; I have a good high-stress real-life playground to test this in.
Alex
>
> --
>
>
> GPG ID: FAF7B393
> GPG FP: 6D5B EF9A DD20 7580 5747 B70F 9F88 FB52 FAF7 B393
>