[Python-checkins] gh-83245: Raise BadZipFile instead of ValueError when reading a corrupt ZIP file (GH-32291)

serhiy-storchaka webhook-mailer at python.org
Mon May 23 13:59:31 EDT 2022


https://github.com/python/cpython/commit/202ed2506c84cd98e9e35621b5b2929ceb717864
commit: 202ed2506c84cd98e9e35621b5b2929ceb717864
branch: main
author: Sam Ezeh <sam.z.ezeh at gmail.com>
committer: serhiy-storchaka <storchaka at gmail.com>
date: 2022-05-23T20:59:21+03:00
summary:

gh-83245: Raise BadZipFile instead of ValueError when reading a corrupt ZIP file (GH-32291)

Co-authored-by: Serhiy Storchaka <storchaka at gmail.com>

files:
A Misc/NEWS.d/next/Library/2022-04-03-19-40-09.bpo-39064.76PbIz.rst
M Lib/test/test_zipfile.py
M Lib/zipfile.py

diff --git a/Lib/test/test_zipfile.py b/Lib/test/test_zipfile.py
index 848bf4f76d453..f4c11d88c8a09 100644
--- a/Lib/test/test_zipfile.py
+++ b/Lib/test/test_zipfile.py
@@ -1740,6 +1740,17 @@ def test_empty_file_raises_BadZipFile(self):
             fp.write("short file")
         self.assertRaises(zipfile.BadZipFile, zipfile.ZipFile, TESTFN)
 
+    def test_negative_central_directory_offset_raises_BadZipFile(self):
+        # Zip file containing an empty EOCD record
+        buffer = bytearray(b'PK\x05\x06' + b'\0'*18)
+
+        # Set the size of the central directory bytes to become 1,
+        # causing the central directory offset to become negative
+        for dirsize in 1, 2**32-1:
+            buffer[12:16] = struct.pack('<L', dirsize)
+            f = io.BytesIO(buffer)
+            self.assertRaises(zipfile.BadZipFile, zipfile.ZipFile, f)
+
     def test_closed_zip_raises_ValueError(self):
         """Verify that testzip() doesn't swallow inappropriate exceptions."""
         data = io.BytesIO()
diff --git a/Lib/zipfile.py b/Lib/zipfile.py
index 9f4437526c91f..fc6ca65e5ed1e 100644
--- a/Lib/zipfile.py
+++ b/Lib/zipfile.py
@@ -1381,6 +1381,8 @@ def _RealGetContents(self):
             print("given, inferred, offset", offset_cd, inferred, concat)
         # self.start_dir:  Position of start of central directory
         self.start_dir = offset_cd + concat
+        if self.start_dir < 0:
+            raise BadZipFile("Bad offset for central directory")
         fp.seek(self.start_dir, 0)
         data = fp.read(size_cd)
         fp = io.BytesIO(data)
diff --git a/Misc/NEWS.d/next/Library/2022-04-03-19-40-09.bpo-39064.76PbIz.rst b/Misc/NEWS.d/next/Library/2022-04-03-19-40-09.bpo-39064.76PbIz.rst
new file mode 100644
index 0000000000000..34d31527e332d
--- /dev/null
+++ b/Misc/NEWS.d/next/Library/2022-04-03-19-40-09.bpo-39064.76PbIz.rst
@@ -0,0 +1,2 @@
+:class:`zipfile.ZipFile` now raises :exc:`zipfile.BadZipFile` instead of ``ValueError`` when reading a
+corrupt zip file in which the central directory offset is negative.
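
A minimal sketch of the behavior described in the NEWS entry above, reusing the corrupt buffer from the new test. The helper name open_error is hypothetical and not part of the commit. It also catches ValueError so the sketch still runs on interpreters that predate this change, where the negative offset surfaces as a ValueError from seek().

```python
import io
import struct
import zipfile


def open_error(data: bytes):
    """Return the name of the exception raised when opening data as a ZIP,
    or None if it opens cleanly. (Hypothetical helper, for illustration only.)"""
    try:
        with zipfile.ZipFile(io.BytesIO(data)):
            return None
    except (zipfile.BadZipFile, ValueError) as exc:
        # BadZipFile with this change applied; ValueError on older versions.
        return type(exc).__name__


# An empty end-of-central-directory (EOCD) record: the 4-byte signature
# followed by 18 zero bytes. On its own this is a valid empty archive.
eocd = bytearray(b"PK\x05\x06" + b"\0" * 18)
valid_result = open_error(bytes(eocd))

# Bytes 12..15 hold the central directory size. Claiming a size of 1 in a
# file that has no room for it makes the computed start offset negative.
eocd[12:16] = struct.pack("<L", 1)
corrupt_result = open_error(bytes(eocd))

print(valid_result, corrupt_result)
```

Callers that only handle zipfile.BadZipFile for malformed input should therefore no longer need a separate ValueError branch for this particular kind of corruption once the change is present.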


