creating size-limited tar files
andrea.crotti.0 at gmail.com
Wed Nov 7 22:52:18 CET 2012
On 11/07/2012 08:32 PM, Roy Smith wrote:
> In article <509ab0fa$0$6636$9b4e6d93 at newsspool2.arcor-online.net>,
> Alexander Blinne <news at blinne.net> wrote:
>> I don't know the best way to find the current size, I only have a
>> general remark.
>> This solution is not so good if you have to impose a hard limit on the
>> resulting file size. You could end up having a tar file of size "limit +
>> size of biggest file - 1 + overhead" in the worst case if the tar is at
>> limit - 1 and the next file is the biggest file. Of course that may be
>> acceptable in many cases or it may be acceptable to do something about
>> it by adjusting the limit.
> If you truly have a hard limit, one possible solution would be to use
> tell() to checkpoint the growing archive after each addition. If adding
> a new file unexpectedly causes you exceed your hard limit, you can
> seek() back to the previous spot and truncate the file there.
> Whether this is worth the effort is an exercise left for the reader.
So I'm not sure if it's a hard limit or not, but I'll check tomorrow.
But in general I could also take the size of each file and simply
estimate the total, pushing in as many as should fit in a tarfile.
With compression I might end up with a much smaller file, but the
compressed size would be hard to predict in advance.
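That estimate can be done greedily from the uncompressed sizes, counting one 512-byte header per member and rounding data up to 512-byte blocks (compression can only shrink the result). A minimal sketch, with an invented helper name:

```python
import os

BLOCK = 512  # tar header size and data-block alignment

def chunk_files(paths, limit):
    """Greedily group files so each group's estimated *uncompressed*
    tar size stays under `limit` (hypothetical helper)."""
    chunks, current, current_size = [], [], 0
    for p in paths:
        size = os.path.getsize(p)
        # member = one header block + data rounded up to a full block
        member = BLOCK + ((size + BLOCK - 1) // BLOCK) * BLOCK
        if current and current_size + member > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(p)
        current_size += member
    if current:
        chunks.append(current)
    return chunks
```

This ignores the end-of-archive blocks and long-name extensions, so it is an estimate rather than a guarantee.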
But the other problem is that at the moment the people that get our
chunks reassemble the file with a simple:
cat file1.tar.gz file2.tar.gz > file.tar.gz
which I suppose is not going to work if I create 2 different tar files,
since each of them would get its own header, right?
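For what it's worth, concatenated gzip streams are themselves a valid gzip stream, so the cat produces something gunzip can decompress; the catch is the tar layer, where a plain reader stops at the first end-of-archive marker. GNU tar's -i/--ignore-zeros skips those markers, and Python's tarfile has the same option. A small self-contained demonstration (the make_tgz helper is invented for the example):

```python
import io
import tarfile

def make_tgz(name, data):
    """Build a small in-memory .tar.gz with one member (hypothetical helper)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

# "cat file1.tar.gz file2.tar.gz" is just byte concatenation:
blob = make_tgz("a.txt", b"aaa") + make_tgz("b.txt", b"bbb")

# A plain read stops at the first end-of-archive marker...
with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
    first = tar.getnames()          # only ['a.txt']

# ...but ignore_zeros (like GNU tar's -i) reads past it to both members.
with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz",
                  ignore_zeros=True) as tar:
    both = tar.getnames()           # ['a.txt', 'b.txt']
```

So the consumers' plain cat would only "work" if they also extract with -i; otherwise everything after the first chunk's end-of-archive marker is silently ignored.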
So either I also provide a script to reassemble everything, or I have
to split in a more "brutal" way..
Maybe doing the final split is not too bad after all; I'll first check
whether it's actually more expensive for the filesystem (which is very,
very slow) or not a big deal...
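The "brutal" split is just a byte-level cut of the finished .tar.gz, like the Unix split command, which is exactly what makes the consumers' plain cat reassembly work. A sketch, with an invented helper name and part-naming scheme:

```python
def split_file(path, chunk_size):
    """Byte-level split of `path` into fixed-size parts; the parts
    reassemble with a simple `cat part000 part001 ... > whole`.
    (Hypothetical helper, not from the thread.)"""
    parts = []
    with open(path, "rb") as f:
        i = 0
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            part = "%s.part%03d" % (path, i)
            with open(part, "wb") as out:
                out.write(data)   # each part is at most chunk_size bytes
            parts.append(part)
            i += 1
    return parts
```

The cost is one extra pass over the data (write the big archive, then read and rewrite it as parts), which is the filesystem-expense question raised above.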