[Python-Dev] test_gzip/test_tarfile failure om AMD64

Thomas Wouters thomas at python.org
Mon May 29 02:34:45 CEST 2006


On 5/29/06, Bob Ippolito <bob at redivi.com> wrote:
>
>
> On May 28, 2006, at 4:31 AM, Thomas Wouters wrote:
> >
> > I'm seeing a dubious failure of test_gzip and test_tarfile on my
> > AMD64 machine. It's triggered by the recent struct changes, but I'd
> > say it's probably caused by a bug/misfeature in zlibmodule:
> > zlib.crc32 is the result of a zlib 'crc32' functioncall, which
> > returns an unsigned long. zlib.crc32 turns that unsigned long into
> > a (signed) Python int, which means a number beyond 1<<31 goes
> > negative on 32-bit systems and other systems with 32-bit longs, but
> > stays positive on systems with 64-bit longs:
> >
> > (32-bit)
> > >>> zlib.crc32("foobabazr")
> > -271938108
> >
> > (64-bit)
> > >>> zlib.crc32("foobabazr")
> > 4023029188
> >
> > The old structmodule coped with that:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > '\xc4\x8d\xca\xef'
> >
> > The new one does not:
> > >>> struct.pack("<l", -271938108)
> > '\xc4\x8d\xca\xef'
> > >>> struct.pack("<l", 4023029188)
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "Lib/struct.py", line 63, in pack
> >     return o.pack(*args)
> > struct.error: 'l' format requires -2147483647 <= number <= 2147483647
> >
> > The structmodule should be fixed (and a test added ;) but I'm also
> > wondering if zlib shouldn't be fixed. Now, I'm AMD64-centric, so my
> > suggested fix would be to change the PyInt_FromLong() call to
> > PyLong_FromUnsignedLong(), making zlib always return positive
> > numbers -- it might break some code on 32-bit platforms, but that
> > code is already broken on 64-bit platforms. But I guess I'm okay
> > with the long being changed into an actual 32-bit signed number on
> > 64-bit platforms, too.
>
> The struct module isn't what's broken here. All of the struct types
> have always had well defined bit sizes and alignment if you
> explicitly specify an endian, >I and >L are 32-bits everywhere, and
> >Q is supported on platforms that don't have long long. The only
> thing that's changed is that it actually checks for errors
> consistently now.


Yes. And that breaks things. I'm certain the behaviour is used in real-world
code (and I don't mean just the gzip module.) It has always, as far as I can
remember, accepted 'unsigned' values for the signed versions of ints, longs
and long-longs (but not chars or shorts.) I agree that that's wrong, but I
don't think changing struct to do the right thing should be done in 2.5. I
don't even think it should be done in 2.6 -- although 3.0 is fine.

-- 
Thomas Wouters <thomas at python.org>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/python-dev/attachments/20060529/3bc0bb68/attachment-0001.html 


More information about the Python-Dev mailing list