[issue10694] zipfile.py end of central directory detection not robust

Kevin Hendricks report at bugs.python.org
Mon Dec 20 15:06:36 CET 2010


Kevin Hendricks <kevin.hendricks at sympatico.ca> added the comment:

I have not looked at how other tools handle this.  They could simply ignore whatever comes after a valid end-of-central-directory record (endrecdata), they could strip it out (truncate it), or they could turn it into a final comment.  I guess for simply unpacking a zip archive, all of these are equivalent (the extra data goes unused).
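To make the "ignore what comes after" option concrete, here is a rough sketch of a tolerant search for the end record.  It is not the code in zipfile.py and not my patch; find_end_record and the constant names are made up for illustration.  It assumes the standard 22-byte end-of-central-directory layout, ignores zip64 entirely, and could be fooled if the trailing data itself happens to contain the signature bytes.

```python
import io
import struct
import zipfile

# Standard end-of-central-directory layout: signature, four 16-bit fields,
# two 32-bit fields, then the 16-bit comment length (22 bytes in total).
EOCD_SIG = b"PK\x05\x06"
EOCD_FMT = "<4s4H2LH"
EOCD_SIZE = struct.calcsize(EOCD_FMT)


def find_end_record(data):
    """Hypothetical helper: locate the last plausible end record in `data`.

    Returns (offset, fields, trailing), where `trailing` is whatever bytes
    follow the record and its comment, or None if no candidate fits.
    """
    pos = data.rfind(EOCD_SIG)
    while pos != -1:
        if pos + EOCD_SIZE <= len(data):
            fields = struct.unpack(EOCD_FMT, data[pos:pos + EOCD_SIZE])
            comment_len = fields[-1]
            end = pos + EOCD_SIZE + comment_len
            # Accept the record if its comment fits inside the file, even
            # when extra bytes follow it.
            if end <= len(data):
                return pos, fields, data[end:]
        # Otherwise try the next earlier occurrence of the signature.
        pos = data.rfind(EOCD_SIG, 0, pos)
    return None


# Usage: build a small archive in memory and append some stray bytes.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
clean = buf.getvalue()
result = find_end_record(clean + b"JUNK")
```

A reader that only unpacks can then drop the trailing bytes and carry on from the offset it found.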

But if you are copying a zip archive to another archive, the choice matters.  Ignoring or truncating the extra data may be safer in some sense (you have no idea what it is for or why it was attached), but then you are not being faithful to the original.  At the same time, you do not want to create improper zip archives.  If you change the extra data into a final comment, then at least none of the original data is actually lost (it is just moved slightly in the copied zip and protected as a comment), and it could be recovered if it turns out to have been useful.  With so many things using and producing the zip archive format (jars, normal zips, epubs, etc.), you never know what might have been left at the end of a zip file or whether it was important.
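As a sketch of the "make it into a final comment" option: everything after the fixed part of the end record (the original comment plus any stray bytes) becomes the new comment, and only the comment-length field is rewritten.  Again, fold_trailing_into_comment is a made-up name, zip64 is not handled, and the comment field is limited to 65535 bytes, so this only works when the leftover data is small.

```python
import io
import struct
import zipfile

EOCD_SIG = b"PK\x05\x06"
EOCD_FMT = "<4s4H2LH"  # standard 22-byte end record, comment length last
EOCD_SIZE = struct.calcsize(EOCD_FMT)


def fold_trailing_into_comment(data):
    """Hypothetical helper: return a copy of `data` in which all bytes after
    the end record's fixed fields are declared as the archive comment."""
    fields = None
    pos = data.rfind(EOCD_SIG)
    while pos != -1:
        if pos + EOCD_SIZE <= len(data):
            candidate = struct.unpack(EOCD_FMT, data[pos:pos + EOCD_SIZE])
            # The declared comment must fit inside the file.
            if pos + EOCD_SIZE + candidate[-1] <= len(data):
                fields = candidate
                break
        pos = data.rfind(EOCD_SIG, 0, pos)
    if fields is None:
        raise ValueError("no end of central directory record found")

    new_comment = data[pos + EOCD_SIZE:]  # old comment plus trailing bytes
    if len(new_comment) > 0xFFFF:
        raise ValueError("trailing data too large for a zip comment")

    # Re-pack the record with only the comment length changed.
    header = struct.pack(EOCD_FMT, *(fields[:-1] + (len(new_comment),)))
    return data[:pos] + header + new_comment


# Usage: an in-memory archive with stray bytes appended, then repaired.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
damaged = buf.getvalue() + b"JUNK"
repaired = fold_trailing_into_comment(damaged)
```

The result has the same length as the input, so the offsets stored in the central directory stay valid and nothing is thrown away.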

So I am not really sure how to deal with this properly.  I also know nothing about _EndRecData64 or whether it needs to be handled in a different way.

So someone who is very familiar with the code should look at this and tell us what the right thing to do is, and whether the approach I took is even correct.  It works fine for me (I have taken to including my own zipfile.py in my projects until this gets worked out), but it might not be the right thing to do.

As for a test case, I know nothing about that but will look at test_zipfile.py.  I am a Mac OS X user/developer, so all of my code is targeted at running on both Python 2.5 (Mac OS X 10.5.X) and Python 2.6 (Mac OS X 10.6.X).  Python 3.X and even Python 2.7 are not on my horizon and not even on my build machine (trying to get Mac OS X users to install either would be an exercise in frustration).  I simply looked at the source in Python 2.7 and Python 3.1.3 (from the official releases on python.org) to see whether the problem still exists (and it does).

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10694>
_______________________________________

