[Patches] [ python-Patches-576327 ] zipfile when sizeof(long) == 8

noreply@sourceforge.net
Tue, 02 Jul 2002 18:58:09 -0700


Patches item #576327, was opened at 2002-07-02 07:11
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=576327&group_id=5470

Category: Library (Lib)
Group: Python 2.2.x
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: The Written Word (Albert Chin) (tww-china)
Assigned to: Tim Peters (tim_one)
Summary: zipfile when sizeof(long) == 8

Initial Comment:
This bug also applies to Python 2.0.x and 2.1.x (most
likely every version).

When sizeof (long) == 8, like on Tru64 UNIX,
zipfile.testzip () fails due to a CRC error. The
problem is that in Lib/zipfile.py:
  crc = binascii.crc32(bytes)
converts the 32-bit binascii.crc32() return value to a
64-bit value (crc). We need to force crc to remain a
32-bit value. Attached is a patch though maybe someone
else can think of something better.
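
A minimal sketch of the "keep it 32-bit" idea, written for Python 3
(where binascii.crc32() already returns an unsigned value) rather than
the 2.x interpreters discussed here. The helper names crc32_unsigned
and same_crc32 are hypothetical and are not taken from the attached
patch; they only illustrate masking both sides to their low 32 bits
before comparing.

```python
import binascii

MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits of an integer


def crc32_unsigned(data):
    """Return the CRC-32 of data as a value in range(2**32)."""
    return binascii.crc32(data) & MASK32


def same_crc32(a, b):
    """Compare two CRC values by their low 32 bits only.

    A signed and an unsigned rendering of the same 32-bit pattern
    compare equal here, whatever the width of the platform's C long.
    """
    return (a & MASK32) == (b & MASK32)
```

With this, same_crc32(-1, 0xFFFFFFFF) is true, since both arguments
share the same low 32 bits.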


----------------------------------------------------------------------

>Comment By: Tim Peters (tim_one)
Date: 2002-07-02 21:58

Message:
Logged In: YES 
user_id=31435

Thanks for your help, Albert!  While I started my ill-spent 
computer career on 64-bit Crays, you're the only 64-bit 
platform I have anymore <wink>.

This report is Closed.

----------------------------------------------------------------------

Comment By: The Written Word (Albert Chin) (tww-china)
Date: 2002-07-02 21:30

Message:
Logged In: YES 
user_id=119770

Ok, Modules/binascii.c v2.36 works well!

----------------------------------------------------------------------

Comment By: The Written Word (Albert Chin) (tww-china)
Date: 2002-07-02 18:41

Message:
Logged In: YES 
user_id=119770

Ok, hang on. I'm doing a clean build to make sure I wasn't
using anything from an old install.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-02 18:25

Message:
Logged In: YES 
user_id=31435

Please try again.  New patch tries to force the entry 
conditions in crc32(), as well as the return value.

Modules/binascii.c; new revision: 2.36
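
The "entry conditions" matter because crc32() accepts an optional
starting CRC, so a value it returned earlier can be fed back in. The
following Python 3 sketch is not the C change in revision 2.36; it is
only an illustration, under the assumption that normalizing both the
incoming seed and the result to 32 bits is the intent. The name
crc32_chunks is hypothetical.

```python
import binascii

MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits of an integer


def crc32_chunks(chunks, start=0):
    """Compute a running CRC-32 over an iterable of bytes objects.

    The seed is masked on entry and the result is masked on exit, so
    a seed carried around as a negative (sign-extended) number gives
    the same answer as its unsigned equivalent.
    """
    crc = start & MASK32
    for chunk in chunks:
        crc = binascii.crc32(chunk, crc) & MASK32
    return crc
```

Feeding the data in pieces should give the same value as one call over
the concatenated data, whether the intermediate seed is passed as a
signed or an unsigned number.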

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-02 17:54

Message:
Logged In: YES 
user_id=31435

So what did it get, and what did it expect?  I.e., same stuff all 
over again.

----------------------------------------------------------------------

Comment By: The Written Word (Albert Chin) (tww-china)
Date: 2002-07-02 17:41

Message:
Logged In: YES 
user_id=119770

Ok, well, testing worked fine on the test file I created but
running against Lib/test/test_zipfile.py gives:
Traceback (most recent call last):
  File "test_zipfile.py", line 35, in ?
    zipTest(file, zipfile.ZIP_STORED, writtenData)
  File "test_zipfile.py", line 16, in zipTest
    readData2 = zip.read(srcname)
  File "/opt/TWWfsw/python221/lib/python2.2/zipfile.py", line 351, in read
    raise BadZipfile, "Bad CRC-32 for file %s" % name
zipfile.BadZipfile: Bad CRC-32 for file junk9630.tmp

----------------------------------------------------------------------

Comment By: The Written Word (Albert Chin) (tww-china)
Date: 2002-07-02 17:01

Message:
Logged In: YES 
user_id=119770

Tested the new Modules/binascii.c against 2.2.1 on Tru64
4.0D, 5.1, and HP-UX 11i and it works. Thanks!

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-02 16:20

Message:
Logged In: YES 
user_id=31435

No, I don't have access to a 64-bit box.

Do you have access to CVS Python?  If so, please try again.  
I patched it to try to make binascii.crc32() return the same 
result across platforms.

Modules/binascii.c; new revision: 2.35

----------------------------------------------------------------------

Comment By: The Written Word (Albert Chin) (tww-china)
Date: 2002-07-02 11:47

Message:
Logged In: YES 
user_id=119770

From zipfile.py:
  ...
  structCentralDir = "<4s4B4H3l5H2l"
  ...
  def _RealGetContents(self):
    ...
            centdir = fp.read(46)
            total = total + 46
            if centdir[0:4] != stringCentralDir:
                raise BadZipfile, "Bad magic number for central directory"
            centdir = struct.unpack(structCentralDir, centdir)

When a zipfile is created, the CRC is written with:
  def write(self, filename, arcname=None, compress_type=None):
    ...
        self.fp.write(struct.pack("<lll", zinfo.CRC, zinfo.compress_size,
              zinfo.file_size))

Changing the "3l" to "3L" or "3I" in structCentralDir is
another workaround but as we wrote with "l", we should also
read with "l" (maybe this is the real problem).
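
To see what the "l" versus "L" choice does, here is a small Python 3
sketch using the CRC value reported in this thread. With the "<"
prefix, struct uses standard sizes, so both codes occupy 4 bytes on
every platform and only the signedness of the unpacked value differs.
The variable names are illustrative only.

```python
import struct

CENTRAL_DIR_FORMAT = "<4s4B4H3l5H2l"  # structCentralDir from zipfile.py

crc = 2226205591  # value printed for binascii.crc32() in this thread

# Pack as an unsigned 32-bit little-endian field, then read the same
# four bytes back with the signed and the unsigned format codes.
raw = struct.pack("<L", crc)
as_signed = struct.unpack("<l", raw)[0]
as_unsigned = struct.unpack("<L", raw)[0]

# The central directory record is 46 bytes, matching fp.read(46).
record_size = struct.calcsize(CENTRAL_DIR_FORMAT)
```

Here as_signed comes out as -2068761705 and as_unsigned as 2226205591,
the same pair of numbers that the debugging output shows.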

----------------------------------------------------------------------

Comment By: The Written Word (Albert Chin) (tww-china)
Date: 2002-07-02 11:42

Message:
Logged In: YES 
user_id=119770

Bug #453208 indicates a similar problem.

----------------------------------------------------------------------

Comment By: The Written Word (Albert Chin) (tww-china)
Date: 2002-07-02 11:06

Message:
Logged In: YES 
user_id=119770

Do you have access to a machine where sizeof (long) == 8?
Here's what I'm getting:

$ uname -a
OSF1 duh V4.0 878 alpha
$ python
>>> import zipfile
>>> zip = zipfile.ZipFile ('/tmp/a.zip', 'w')
>>> zip.write ('/vmuniz', 'vmunix')
>>> zip.close ()
>>> zip = zipfile.ZipFile ('/tmp/a.zip', 'r')
>>> zip.testzip()
2226205591 -2068761705

I added some debugging statements to zipfile.read(). The
first number is the output of binascii.crc32() while the
second is the output of zinfo.CRC (the CRC value in the
zipfile header for 'vmuniz' in /tmp/a.zip).

Would binascii.crc32() *ever* return a negative number or
does it return an unsigned type? Looking at the source to
Modules/binascii.c, crc is an unsigned long but the value
returned is signed long.
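
The two numbers printed above differ by exactly 2**32, which suggests
they are the same 32-bit pattern read once as unsigned and once as
signed. A short Python 3 sketch of that conversion follows; to_signed32
is a hypothetical helper, not something from zipfile.py or binascii.c.

```python
def to_signed32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:  # top bit set: the signed reading is negative
        return value - 0x100000000
    return value


computed = 2226205591  # printed for binascii.crc32()
stored = -2068761705   # printed for zinfo.CRC

# Both readings agree once they are put in the same representation.
match_signed = to_signed32(computed) == stored
match_unsigned = computed == (stored & 0xFFFFFFFF)
```

Both match_signed and match_unsigned are true for these values, so the
mismatch is one of representation rather than of the checksum itself.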

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-07-02 10:44

Message:
Logged In: YES 
user_id=31435

I believe you're having a problem, but I can't tell what it is.  
Exactly how does zipfile.testzip() fail?  What did it get and 
what did it expect?

It's not possible to "force crc to remain a 32-bit value" on a
64-bit box with sizeof(long)==8 -- Python doesn't have any 32-bit
type on such a box.  So it seems most likely that some 32-bit
value either is or isn't getting sign-extended when this fails,
but I can't tell from the report which of the disagreeing values
that may be, or which it *should* be.

IOW, we need more info about how this fails.  If you're 
hacking the result of binascii.crc32() and calling that "a fix", 
chances seem high that the correct fix lies in changing what 
crc32() returns.  But not yet enough info here to say.

----------------------------------------------------------------------
