[Python-bugs-list] [ python-Bugs-679953 ] zipfile.py - pack filesize as unsigned allows files > 2 gig
SourceForge.net
noreply@sourceforge.net
Mon, 03 Feb 2003 17:54:30 -0800
Bugs item #679953, was opened at 2003-02-03 20:54
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=679953&group_id=5470
Category: Python Library
Group: Python 2.2.2
Status: Open
Resolution: None
Priority: 5
Submitted By: Jimmy Burgett (laurelboa)
Assigned to: Nobody/Anonymous (nobody)
Summary: zipfile.py - pack filesize as unsigned allows files > 2 gig
Initial Comment:
Python 2.2.2
Windows XP (all service packs installed)
Windows 2000 (all service packs installed)
The file size and compressed file size fields in the zip
header need to be packed with struct.pack as unsigned ints,
not signed ints. This allows zipfile.py to compress files
larger than 2 gigabytes. Currently, an attempt to compress
such a large file gives this error:
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "C:\Python22\lib\zipfile.py", line 426, in write
    zinfo.file_size))
OverflowError: long int too large to convert to int
where the line in question is:
    self.fp.write(struct.pack("<lll", zinfo.CRC,
                              zinfo.compress_size,
                              zinfo.file_size))
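The failure can be reproduced without a large file by packing a size above 2**31 - 1 directly. This is a minimal sketch, assuming a current Python 3 interpreter, where struct.pack raises struct.error rather than the OverflowError reported for Python 2.2.2. The helper name fits_format is hypothetical and not part of zipfile.py.

```python
import struct

def fits_format(fmt, value):
    """Return True if value can be packed with the given struct format."""
    try:
        struct.pack(fmt, value)
        return True
    except (struct.error, OverflowError):
        return False

three_gib = 3 * 1024 ** 3            # above the signed 32-bit maximum

print(fits_format("<l", three_gib))  # signed 32-bit field
print(fits_format("<L", three_gib))  # unsigned 32-bit field
```

An unsigned 32-bit field still tops out at 2**32 - 1, so this change raises the limit to just under 4 gigabytes rather than removing it.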
I believe that the four changes below are all that is
needed. This is from version 2.2.2, but zipfile.py in 2.3a1
still had the file size packed/unpacked as a signed
integer.
I have not tested whether the ziplib routines can seek
past the 2 gig boundary in order to extract a file whose
beginning is past the 2 gig boundary. My application
requires compressing very large files one at a time and
zipfile.py lets me use either WinZip or the built-in
Windows "unzip" function for extraction.
These changes allow that use.
-------------- Change Line #28
# Here are some struct module formats for reading headers
structEndArchive = "<4s4H2lH"     # 9 items, end of archive, 22 bytes
stringEndArchive = "PK\005\006"   # magic number for end of archive record
structCentralDir = "<4s4B4H3l5H2l"# 19 items, central directory, 46 bytes
to
structCentralDir = "<4s4B4HlLL5H2L"# 19 items, central directory, 46 bytes
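A quick way to check that the proposed format keeps the record layout is to compare struct.calcsize for both strings and round-trip a record with a size above 2 gigabytes. This is a sketch for a current Python 3 interpreter. The field values are made up for illustration, and the field order is an assumption based on the 19-item central directory layout.

```python
import struct

OLD_CENTRAL_DIR = "<4s4B4H3l5H2l"    # format quoted from zipfile.py 2.2.2
NEW_CENTRAL_DIR = "<4s4B4HlLL5H2L"   # format proposed in this report

old_size = struct.calcsize(OLD_CENTRAL_DIR)
new_size = struct.calcsize(NEW_CENTRAL_DIR)
print(old_size, new_size)

file_size = 3 * 1024 ** 3            # made-up size above 2 gigabytes

# 19 illustrative values; only the two size fields matter here.
fields = (
    (b"PK\x01\x02",)                 # signature
    + (20, 0, 20, 0)                 # 4B: version and system bytes
    + (0, 8, 0, 0)                   # 4H: flags, method, time, date
    + (0,)                           # l: CRC (left signed in the proposal)
    + (file_size, file_size)         # LL: compressed size, file size
    + (0, 0, 0, 0, 0)                # 5H: lengths, disk number, internal attrs
    + (0, 0)                         # 2L: external attrs, header offset
)

record = struct.pack(NEW_CENTRAL_DIR, *fields)
unpacked = struct.unpack(NEW_CENTRAL_DIR, record)
print(len(record), unpacked[11])
```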
--------------- change line #306
def printdir(self):
    """Print a table of contents for the zip file."""
    print "%-46s %19s %12s" % ("File Name", "Modified ", "Size")
    for zinfo in self.filelist:
        date = "%d-%02d-%02d %02d:%02d:%02d" % zinfo.date_time
        print "%-46s %s %12d" % (zinfo.filename, date, zinfo.file_size)
to
        print "%-46s %s %12u" % (zinfo.filename, date, zinfo.file_size)
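For comparison, current Python 3 documents the %u conversion as an obsolete alias of %d, so there the two format strings render a large size identically. The sketch below only shows Python 3 behavior. Whether Python 2.2.2 treats the two conversions the same way is not checked here.

```python
file_size = 3 * 1024 ** 3        # made-up size above 2 gigabytes

as_d = "%12d" % file_size        # conversion used in zipfile.py 2.2.2
as_u = "%12u" % file_size        # conversion proposed in this report

print(repr(as_d))
print(as_d == as_u)
```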
---------------- change line #425
# Seek backwards and write CRC and file sizes
position = self.fp.tell()       # Preserve current position in file
self.fp.seek(zinfo.header_offset + 14, 0)
self.fp.write(struct.pack("<lll", zinfo.CRC,
                          zinfo.compress_size,
                          zinfo.file_size))
to
self.fp.write(struct.pack("<lLL", zinfo.CRC,
                          zinfo.compress_size,
                          zinfo.file_size))
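The seek-and-patch step can be tried in isolation against an in-memory buffer. This is a rough sketch for Python 3, not the zipfile.py implementation: io.BytesIO stands in for the archive file, and the header contents and CRC value are made up. The offset of 14 matches the code above, since the signature (4 bytes) and five 2-byte fields precede the CRC in a local file header.

```python
import io
import struct

fp = io.BytesIO()                 # stand-in for the open archive file
header_offset = fp.tell()

# 14 bytes before the CRC: signature, version, flags, method, time, date
fp.write(struct.pack("<4s5H", b"PK\x03\x04", 20, 0, 0, 0, 0))
fp.write(struct.pack("<lLL", 0, 0, 0))   # placeholder CRC and sizes
fp.write(struct.pack("<2H", 0, 0))       # file name and extra field lengths
fp.write(b"placeholder for compressed data")

# Made-up values that would only be known after writing the data
crc = 0x1234
compress_size = file_size = 3 * 1024 ** 3

position = fp.tell()                     # preserve current position in file
fp.seek(header_offset + 14, 0)           # jump back to the CRC field
fp.write(struct.pack("<lLL", crc, compress_size, file_size))
fp.seek(position, 0)                     # return to the end of the data

patched = struct.unpack("<lLL", fp.getvalue()[14:26])
print(patched)
```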
---------------- change line #450
if zinfo.flag_bits & 0x08:
    # Write CRC and file sizes after the file data
    self.fp.write(struct.pack("<lll", zinfo.CRC,
                              zinfo.compress_size,
                              zinfo.file_size))
to
    self.fp.write(struct.pack("<lLL", zinfo.CRC,
                              zinfo.compress_size,
                              zinfo.file_size))
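One caveat when trying this on a current interpreter: in Python 3, zlib.crc32 returns an unsigned value, so packing the CRC with a signed "l" can itself fail. The sketch below is a variant under that assumption, not the change proposed above. It masks the CRC to 32 bits and packs all three fields unsigned. The function name data_descriptor and the constant name are hypothetical.

```python
import struct
import zlib

FLAG_DATA_DESCRIPTOR = 0x08       # bit 3 of the general purpose flags

def data_descriptor(flag_bits, crc, compress_size, file_size):
    """Return the trailing CRC/size bytes, or b"" if bit 3 is not set."""
    if not flag_bits & FLAG_DATA_DESCRIPTOR:
        return b""
    # Assumption: mask the CRC to unsigned 32 bits and use "L" for it,
    # unlike the "<lLL" format in the report, which keeps the CRC signed.
    return struct.pack("<LLL", crc & 0xFFFFFFFF, compress_size, file_size)

payload = b"hello"
descriptor = data_descriptor(0x08, zlib.crc32(payload), len(payload),
                             3 * 1024 ** 3)
print(len(descriptor))
```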