[New-bugs-announce] [issue28494] is_zipfile false positives

Thomas Waldmann report at bugs.python.org
Thu Oct 20 17:08:49 EDT 2016


New submission from Thomas Waldmann:

zipfile.is_zipfile produces false positives far too easily.

I have just seen this in practice when a MoinMoin wiki site with a lot of PDF attachments crashed with a 500 error. The cause was a valid PDF that just happened to contain PK\005\006 somewhere in the middle. That was enough to satisfy is_zipfile() and trigger further processing as a zipfile, which then crashed with IOError (which our code did not catch yet).

I have looked into the zipfile code: if the usual EOCD (end of central directory) structure with an empty comment is not found at EOF, the code suspects that there might be a non-empty comment and searches the last ~64K before EOF for the PK\005\006 magic. If the magic is found anywhere in that range, the file is assumed to be a zip, without any further validity check.

Attached is a failure demo that works with at least 2.7 and 3.5.
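
For illustration only (this is neither the attached isz_fail.py nor the actual zipfile code): a simplified model of the search described above, next to one possible stricter check. The function names, the sample data and the choice of extra checks are assumptions made for this sketch. The stricter variant requires that the EOCD comment runs exactly to EOF and that the central directory fits before the EOCD record; it ignores ZIP64 archives.

```python
import io
import struct
import zipfile

EOCD_MAGIC = b"PK\x05\x06"                # same bytes as PK\005\006
EOCD_FORMAT = "<4s4H2LH"                  # fixed part of the EOCD record
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes
MAX_COMMENT = 0xFFFF                      # comment length is a 16-bit field


def naive_looks_like_zip(data):
    """Simplified model: accept if the magic occurs in the last ~64K."""
    tail_start = max(len(data) - (EOCD_SIZE + MAX_COMMENT), 0)
    return data.find(EOCD_MAGIC, tail_start) != -1


def strict_looks_like_zip(data):
    """Hypothetical stricter check (an assumption, not the stdlib logic)."""
    tail_start = max(len(data) - (EOCD_SIZE + MAX_COMMENT), 0)
    pos = data.rfind(EOCD_MAGIC, tail_start)
    while pos != -1:
        record = data[pos:pos + EOCD_SIZE]
        if len(record) == EOCD_SIZE:
            fields = struct.unpack(EOCD_FORMAT, record)
            cd_size, cd_offset, comment_len = fields[5], fields[6], fields[7]
            # The comment must end exactly at EOF.
            ends_at_eof = pos + EOCD_SIZE + comment_len == len(data)
            # The central directory must lie before the EOCD record.
            cd_fits = cd_offset + cd_size <= pos
            if ends_at_eof and cd_fits:
                return True
        # Try the next candidate further towards the start of the tail.
        pos = data.rfind(EOCD_MAGIC, tail_start, pos)
    return False


def make_real_zip():
    """Build a small, valid archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", b"hello")
    return buf.getvalue()


def make_fake_pdf():
    """Made-up stand-in for a PDF that contains the magic by chance."""
    return b"%PDF-1.4\n" + b"x" * 100 + EOCD_MAGIC + b"y" * 100 + b"\n%%EOF\n"


if __name__ == "__main__":
    for name, data in (("real zip", make_real_zip()),
                       ("fake pdf", make_fake_pdf())):
        print(name, naive_looks_like_zip(data), strict_looks_like_zip(data))
```

The naive model accepts both samples, while the stricter check rejects the stand-in PDF because the bytes following the magic do not describe a comment that ends at EOF.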

https://en.wikipedia.org/wiki/Zip_(file_format)

----------
components: Library (Lib)
files: isz_fail.py
messages: 279084
nosy: Thomas.Waldmann
priority: normal
severity: normal
status: open
title: is_zipfile false positives
type: behavior
versions: Python 2.7, Python 3.5
Added file: http://bugs.python.org/file45162/isz_fail.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28494>
_______________________________________

