[Python-Dev] zipfile still has 2GB boundary bug
bob at redivi.com
Mon Apr 25 04:32:35 CEST 2005
The "2GB bug" that was supposed to be fixed in
<http://python.org/sf/679953> was not actually fixed. The zipinfo
offsets in the structures are still packed as signed longs, so the fix
lets you write one file that extends past the 2GB boundary, but if any
later file starts past that point you are screwed.
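
For anyone who wants to see the failure mode in isolation, here is a minimal sketch. It is not the zipfile code itself, and OFFSET and fits() are made up for illustration. It only shows that a 32-bit signed struct field cannot hold an offset past 2**31 - 1, while an unsigned one can:

```python
import struct

# Hypothetical header offset of a member that starts just past the 2 GiB mark.
OFFSET = 2**31 + 1024

def fits(fmt, value):
    """Return True if value can be packed with the given struct format."""
    try:
        struct.pack(fmt, value)
        return True
    except struct.error:
        return False

signed_ok = fits("<l", OFFSET)    # 32-bit signed: tops out at 2**31 - 1
unsigned_ok = fits("<L", OFFSET)  # 32-bit unsigned: tops out at 2**32 - 1

# The unsigned field round-trips the offset unchanged.
roundtrip = struct.unpack("<L", struct.pack("<L", OFFSET))[0]
```

Unsigned fields only move the limit to 4GB; anything beyond that needs the ZIP64 extensions.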
I have opened a new bug and patch that should fix this issue
<http://python.org/sf/1189216>. This is a backport candidate to 2.4.2
and 2.3.6 (if that ever happens).
On a related note, if anyone else has a bunch of really big and
ostensibly broken zip archives created by dumb versions of the zipfile
module, I have written a script that can rebuild the central directory
in-place. Ping me off-list if you're interested and I'll clean it up.
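
To give an idea of what such a repair involves (this is a rough sketch, not the script mentioned above), the local file headers can be walked from the start of the archive to recover each member's real offset, which is what a rebuilt central directory would need. The scan_local_headers() helper is hypothetical. It assumes the members are stored back to back from byte 0, have ASCII names, and do not use data descriptors, and it reads everything into memory, which a real multi-gigabyte repair would not do:

```python
import io
import struct
import zipfile

# Fixed 30-byte part of a local file header: signature, version, flags,
# method, time, date, CRC-32, compressed size, uncompressed size,
# name length, extra length.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")
LOCAL_SIG = b"PK\x03\x04"

def scan_local_headers(data):
    """Return [(name, offset), ...] by walking local headers from byte 0."""
    found = []
    pos = 0
    while data[pos:pos + 4] == LOCAL_SIG:
        (_sig, _ver, flags, _method, _time, _date,
         _crc, csize, _usize, name_len, extra_len) = LOCAL_HEADER.unpack_from(data, pos)
        if flags & 0x08:
            # Sizes are in a trailing data descriptor; this sketch gives up.
            break
        name_start = pos + LOCAL_HEADER.size
        name = data[name_start:name_start + name_len].decode("ascii")
        found.append((name, pos))
        # Skip name, extra field, and the member's data to reach the next header.
        pos = name_start + name_len + extra_len + csize
    return found

# Build a tiny archive in memory and compare against what zipfile recorded.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
    zf.writestr("a.txt", b"alpha")
    zf.writestr("b.txt", b"bravo!")
    expected = [(zi.filename, zi.header_offset) for zi in zf.infolist()]

recovered = scan_local_headers(buf.getvalue())
```

On a broken archive the offsets found this way would be used in place of the wrapped values in the central directory.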
Someone should think about rewriting the zipfile module to be less
hideous, include a repair feature, and be up to date with the latest
Additionally, it'd be useful if someone were to include support
for Apple's "extensions" to the zip format (the __MACOSX folder and its
contents) that show up when BOM (private framework) is used to create
archives (i.e. Finder in Mac OS X 10.3+). I'm not sure if these are
documented anywhere, but I can help with reverse engineering if someone
is interested in writing the code.
On that note, Mac OS X 10.4 (Tiger) is supposed to have new APIs (or
changes to existing APIs?) to facilitate resource fork preservation,
ACLs, and Spotlight hooks in tar, cp, mv, etc. Someone should spend
some time looking at the Darwin 8 sources for these tools (when they're
publicly available in the next few weeks) to see what would need to be
done in Python to support them in the standard library (the os,
tarfile, etc. modules).