[Python-Dev] test_gzip/test_tarfile failure on AMD64

Mon May 29 21:27:57 CEST 2006

[Guido]
> ...
> It's really only a practical concern for 32-bit values on 32-bit
> machines, where reasonable people can disagree over whether 0xffffffff
> is -1 or 4294967295.

Then maybe we should only let that one slide <0.5 wink>.

...

[Tim]
>> So, in all, I'm 95% sure 2.4's behavior is buggy, but 50% unsure that
>> we need to warn about it before repairing it.  Since you (Thomas) want
>> warnings, and in theory it only affects the lightly-used "standard"
>> modes, I do lean in favor of leaving the standard modes that _are_
>> broken (as above, not all are) broken in 2.5 but warning that this
>> will change in 2.6.

> I'm not sure what we gain by leaving other std modules depending on
> struct's brokenness broken. But I may be misinterpreting which modules
> you're referring to.

I think you're just reading "module" where I wrote "mode".  "Standard
mode" is struct-module terminology, as in

    ">b"
    "!b"
    "<b"

are standard modes but

    "b"

is not a standard mode (it's "native mode").  But I got it backwards
-- or maybe not ;-)  It's confusing because it's so inconsistent (this
under 2.4.3 on 32-bit Windows):

>>> struct.pack(">B", -32) # std mode doesn't complain
'\xe0'
>>> struct.pack("B", -32)  # native mode does
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
struct.error: ubyte format requires 0<=number<=255

>>> struct.pack(">b", 255)  # std mode doesn't complain
'\xff'
>>> struct.pack("b", 255) # native mode does
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
struct.error: byte format requires -128<=number<=127

On the other hand, as I noted last time, some standard modes _do_
range-check -- but not correctly on some 64-bit boxes -- and not
consistently across positive and negative out-of-range values, or
across input types.  Like:

>>> struct.pack(">i", 2**32-1)  # std and native modes complain
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int
>>> struct.pack("i", 2**32-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: long int too large to convert to int

>>> struct.pack("I", -1)  # neither std nor native modes complain
'\xff\xff\xff\xff'
>>> struct.pack(">I", -1)
'\xff\xff\xff\xff'

>>> struct.pack("I", -1L)  # but both complain if the input is long
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long
>>> struct.pack(">I", -1L)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: can't convert negative value to unsigned long

In short, there's no way to explain what struct checks for in 2.4.3
short of drawing up an exhaustive table of standard-vs-native mode,
format code, "which direction" a value may be out of range, and
whether the value is given as a Python int or a long.
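The "exhaustive table" described above can be generated by probing rather than written by hand. The following is a minimal sketch, assuming a modern Python 3 interpreter (where the int/long axis no longer exists, so it is omitted); the helper names `bounds`, `probe`, and `build_table` are made up for illustration and are not part of the struct module.

```python
import struct


def bounds(fmt):
    """Return (lowest, highest) value representable by a one-code integer format."""
    bits = 8 * struct.calcsize(fmt)
    if fmt[-1].islower():  # lowercase codes are signed
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1  # uppercase codes are unsigned


def probe(fmt, value):
    """Report how struct.pack reacts: the exception's name, or 'accepted'."""
    try:
        struct.pack(fmt, value)
    except (struct.error, OverflowError) as exc:
        return type(exc).__name__
    return "accepted"


def build_table():
    """Map (format, direction) to the observed behavior, one step out of range."""
    table = {}
    for prefix in ("", "<", ">", "!", "="):  # "" is native mode, the rest are standard
        for code in "bBhHiI":
            fmt = prefix + code
            lowest, highest = bounds(fmt)
            table[fmt, "below"] = probe(fmt, lowest - 1)
            table[fmt, "above"] = probe(fmt, highest + 1)
    return table


if __name__ == "__main__":
    for (fmt, direction), outcome in sorted(build_table().items()):
        print(f"{fmt!r:6} {direction:6} {outcome}")
```

On a current CPython one would expect no cell to read "accepted"; any that did would mark the kind of inconsistency catalogued above for 2.4.3.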

At the sprint, I encouraged Bob to do complete range-checking.  That's
explainable.  If we have to back off from that, then since the new
code is consistent, I'm sure any warts he puts back in will clearly
look like warts ;-)
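What makes complete range-checking "explainable" is that one rule covers every mode, format code, and direction. A rough sketch of that rule follows, assuming the standard-mode sizes from the struct documentation; `STD_SIZES` and `check_range` are hypothetical names, and this is not the code that was written at the sprint.

```python
import struct

# Standard-mode sizes in bytes for the integer format codes.
STD_SIZES = {"b": 1, "B": 1, "h": 2, "H": 2,
             "i": 4, "I": 4, "l": 4, "L": 4, "q": 8, "Q": 8}


def check_range(code, value):
    """Return value unchanged if it fits the format code, else raise struct.error."""
    bits = 8 * STD_SIZES[code]
    if code.islower():  # signed
        lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:               # unsigned
        lowest, highest = 0, (1 << bits) - 1
    if not lowest <= value <= highest:
        raise struct.error(
            f"'{code}' format requires {lowest} <= number <= {highest}")
    return value
```

Under this rule all of the out-of-range cases shown earlier (-32 for "B", 255 for "b", 2**32-1 for "i", -1 for "I") are rejected the same way, with no dependence on the byte-order prefix or on how the value was spelled.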

> I think we should do as Thomas proposes: plan to make it an error in
> 2.6 (or 2.7 if there's a big outcry, which I don't expect) and accept
> it with a warning in 2.5.

That's what I arrived at, although 2.4.3's checking behavior is
actually so inconsistent that "it" needs some defining (what exactly
are we trying to still accept?  e.g., that -1 doesn't trigger "I"
complaints but that -1L does above?  that one's surely a bug).   To be
clear, Thomas proposed "accepting it" (whatever that means) until 3.0.
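
For illustration only, the "accept it with a warning" transition might look roughly like the sketch below for the "I" code. It assumes that "accepting" means wrapping the value modulo 2**32, which matches the -1 example above but is exactly the part that still "needs some defining"; `lenient_pack_u32` is a hypothetical helper, not a struct API.

```python
import struct
import warnings


def lenient_pack_u32(value, byteorder=">"):
    """Pack value as an unsigned 32-bit int, wrapping out-of-range input with a warning."""
    if not 0 <= value <= 0xFFFFFFFF:
        warnings.warn(
            "value out of range for 'I'; it is wrapped for now, "
            "but this will become an error",
            DeprecationWarning,
            stacklevel=2,
        )
        value &= 0xFFFFFFFF  # assumed meaning of "accept": keep the low 32 bits
    return struct.pack(byteorder + "I", value)
```

Turning the warning into an error later is then a one-line change (raise instead of warn), which is what makes a deprecation period cheap once the accepted cases have been pinned down.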