Python-list Digest, Vol 75, Issue 226

r0g aioe.org at technicalbloke.com
Wed Dec 23 15:47:07 EST 2009


Gabriel Genellina wrote:
> On Tue, 22 Dec 2009 16:30:58 -0300, r0g <aioe.org at technicalbloke.com>
> wrote:
>> Gabriel Genellina wrote:
>>> On Mon, 21 Dec 2009 16:30:13 -0300, Pulkit Agrawal
>>> <thatguypulkit at gmail.com> wrote:
>>>
>>>> I am writing a script wherein I need to merge files into existing
>>>> tar.gz files. Currently, I am using tarfile module. I extract the
>>>> tar.gz to a tempdir and copy the new file there and re-compress all
>>>> the files back into a tar.gz.  Is there a better way to do it?
>>>
>>> Since no one answered yet: no, I don't think you can avoid decompressing
>>> and recompressing those files.
>>
>> Erm, I always thought it was OK to simply cat gzipped files together...
> 
> Maybe, but still I don't think this could help the OP. As I understand
> the problem, originally there were e.g.: file1, file2, file3; they were
> tarred into file123.tar and gzipped into file123.tar.gz. And now file2
> must be replaced by a newer version. It should go into the internal .tar
> file, replacing the old one; I don't see how to do that without
> decompressing it. (Ok, once the tar is decompressed one might replace
> the old file with the newer one in-place using the tar command, but this
> cannot be done with the tarfile Python module)
> 


Oh I didn't see the original posting! I agree, if files within the
tarball need to be replaced (as opposed to more new files added) that
can only be done on uncompressed tar archives.
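
For what it's worth, the decompress/recompress round trip doesn't have to go
through a tempdir. Below is a rough sketch (untested against the OP's real
archives) that streams members from the old tar.gz straight into a new one
and swaps in fresh contents for one member. The function name
replace_member and its arguments are made up for illustration; only the
tarfile calls themselves are standard library.

```python
import copy
import io
import tarfile


def replace_member(src_path, dst_path, name, data):
    """Copy the tar.gz at src_path to dst_path, using `data` (bytes) as the
    contents of member `name`. Nothing is extracted to disk."""
    replaced = False
    with tarfile.open(src_path, "r:gz") as src, \
            tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            if member.name == name:
                # Reuse the old header (mode, owner, mtime) with the new size.
                info = copy.copy(member)
                info.size = len(data)
                dst.addfile(info, io.BytesIO(data))
                replaced = True
            elif member.isreg():
                # Regular file: stream its bytes across unchanged.
                dst.addfile(member, src.extractfile(member))
            else:
                # Directories, links, etc. carry no data of their own.
                dst.addfile(member)
        if not replaced:
            # The name was not in the old archive, so add it as a new file.
            info = tarfile.TarInfo(name)
            info.size = len(data)
            dst.addfile(info, io.BytesIO(data))
```

Something like replace_member("file123.tar.gz", "file123.new.tar.gz",
"file2", new_bytes) followed by a rename would do the job. The whole
archive still gets decompressed and recompressed; this only avoids the
intermediate files.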

If performance is an issue and the files aren't too gargantuan, it may be
best to decompress (and recompress) the gzips in memory or on a ramdisk
rather than writing it all out to disk. If the files are too large for
memory but the OP can afford a bit more storage, then storing plain
uncompressed tarballs would allow use of tar's delete, replace and append
functionality. If that would take up too much space, maybe they could use
plain tar on a block-compressed filesystem instead.
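
As a rough illustration of the uncompressed-tarball route: the tarfile
module can at least append to a plain .tar (mode "a"), even though it can't
delete or overwrite a member in place. Appending a newer copy under the same
name leaves both entries in the archive, and as far as I know
TarFile.getmember() picks the last one. The helper names append_bytes and
read_latest below are mine, purely for the example.

```python
import io
import tarfile
import time


def append_bytes(tar_path, name, data):
    """Append `data` (bytes) as member `name` to an uncompressed tar.
    Mode "a" only works without compression."""
    info = tarfile.TarInfo(name)
    info.size = len(data)
    info.mtime = int(time.time())
    with tarfile.open(tar_path, "a") as tar:
        tar.addfile(info, io.BytesIO(data))


def read_latest(tar_path, name):
    """Return the contents of the last member called `name`."""
    with tarfile.open(tar_path, "r") as tar:
        return tar.extractfile(tar.getmember(name)).read()
```

The stale copy still takes up space until the archive is rewritten, so this
suits occasional updates better than constant churn.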

Cheers,

Roger.


