ZipFile - file adding API incomplete?

Dave Angel davea at ieee.org
Tue Nov 17 09:28:01 EST 2009


Diez B. Roggisch wrote:
> Glenn Maynard wrote:
>> I want to do something fairly simple: read files from one ZIP and add
>> them to another, so I can remove and replace files.  This led me to a
>> couple things that seem to be missing from the API.
>>
>> <snip>
>>
>> The correct approach is to copy the data directly, so it's not
>> recompressed.  This would need two new API calls: rawopen(), acting
>> like open() but returning a direct file slice and not decompressing
>> data; and rawwrite(zinfo, file), to pass in pre-compressed data, where
>> the compression method in zinfo matches the compression type used.
>>
>> I was surprised that I couldn't find the former.  The latter is an
>> advanced one, important for implementing any tool that modifies large
>> ZIPs.  Short-term, at least, I'll probably implement these externally.
>
> <snip>
>
> And regarding your second idea: can that really work? Intuitively, I 
> would have thought that compression is adaptive, and based on prior 
> additions to the file. I might be wrong with this though.
>
>
I'm pretty sure that the ZIP format uses independent compression for 
each contained file (member).  You can add and remove members from an 
existing ZIP, and use several different compression methods within the 
same file.  So the adaptive tables start over for each new member.
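
A quick way to check the per-member claim with the standard zipfile 
module is sketched below.  The member names and the helper name 
build_mixed_archive are made up for illustration, and the compress_type 
argument to writestr() assumes Python 3.7 or later.

```python
import io
import zipfile


def build_mixed_archive() -> bytes:
    """Write two members with different compression methods into one ZIP."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # First member is stored without compression.
        zf.writestr("stored.txt", b"hello " * 100,
                    compress_type=zipfile.ZIP_STORED)
        # Second member is deflated; its compressor starts from scratch.
        zf.writestr("deflated.txt", b"world " * 100,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def member_methods(data: bytes) -> dict:
    """Map each member name to the compression method recorded for it."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return {info.filename: info.compress_type for info in zf.infolist()}


def read_member(data: bytes, name: str) -> bytes:
    """Decompress a single member without reading any other member."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return zf.read(name)
```

Reading "deflated.txt" alone works without touching "stored.txt", which 
is what you'd expect if each member is compressed independently.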

What isn't so convenient is that the central directory, which records 
each member's sizes and offset, sits at the end of the file.  So if 
you're trying to unzip "over the wire" you can't readily do it without 
somehow seeking to the end.  That same feature is a good thing when it 
comes to spanning zip files across multiple disks.
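
To see why a reader wants to seek to the end first, here is a rough 
sketch that locates the end-of-central-directory record in an archive 
held in memory.  The function name find_central_directory is my own, and 
the sketch assumes a small single-disk archive with no Zip64 extensions 
and no archive comment containing the signature bytes.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # marks the end-of-central-directory record


def find_central_directory(data: bytes):
    """Return (entry_count, cd_size, cd_offset) read from the EOCD record."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    # After the 4-byte signature: disk number, disk holding the directory,
    # entries on this disk, total entries, directory size, directory
    # offset, comment length (all little-endian).
    fields = struct.unpack("<HHHHIIH", data[pos + 4:pos + 22])
    _, _, _, total_entries, cd_size, cd_offset, _ = fields
    return total_entries, cd_size, cd_offset


def make_sample_archive() -> bytes:
    """Build a tiny two-member archive to inspect (names are arbitrary)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("a.txt", b"first")
        zf.writestr("b.txt", b"second")
    return buf.getvalue()
```

For an archive without a comment, the directory offset plus the 
directory size lands exactly on the trailing record, so nothing about 
the member list is known until the tail of the file has been read.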

The zip file format is documented on the net, but I haven't read the 
spec in at least 15 years.

DaveA




More information about the Python-list mailing list