[Python-Dev] Drop support for ones' complement machines?
dickinsm at gmail.com
Tue Dec 1 16:57:30 CET 2009
On Tue, Dec 1, 2009 at 3:32 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> No, the original question really was the question that I meant to ask. :)
> Ok. Then the reference to issue 7406 is really confusing, as this is
> about undefined behavior - why does the answer to your question affect
> the resolution of this issue?
Apologies for the lack of clarity.
So in issue 7406 I'm complaining (amongst other things) that int_add
uses the expression 'x+y', where x and y are longs, and expects this
expression to wrap modulo 2**n on overflow. As you say, this is
undefined behaviour. One obvious way to fix it is to write
    (long)((unsigned long)x + (unsigned long)y)
But *here's* the problem: this still isn't a portable solution!
It no longer depends on undefined behaviour, but it *does*
depend on implementation-defined behaviour: namely, what happens
when an unsigned long that's greater than LONG_MAX is converted to
long. (See C99 6.3.1.3, paragraph 3: "Otherwise, the new type is
signed and the value cannot be represented in it; either the result is
implementation-defined or an implementation-defined signal is raised.")
It's this implementation-defined behaviour that I'd like to assume.
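To make the cast-based fix concrete, here is a minimal sketch. The names wrapping_add and add_overflows are hypothetical and do not come from the CPython source, and the sign-based overflow test is only one common way to use the wrapped result. The final conversion back to long is exactly the implementation-defined step described above, so the sketch is only meaningful on platforms where that conversion wraps.

```c
#include <limits.h>

/* Hypothetical helper: add two longs without invoking undefined behaviour.
 * The unsigned addition is fully defined by the standard and wraps modulo
 * ULONG_MAX + 1. */
long wrapping_add(long x, long y)
{
    unsigned long sum = (unsigned long)x + (unsigned long)y;

    /* Implementation-defined when sum > LONG_MAX (C99 6.3.1.3p3).
     * ASSUMPTION: the conversion wraps, as it does with two's complement
     * on mainstream compilers. */
    return (long)sum;
}

/* Hypothetical overflow test built on the wrapped result: signed overflow
 * happened exactly when the result's sign differs from the sign of both
 * operands. Relies on the same assumptions as wrapping_add. */
int add_overflows(long x, long y)
{
    long r = wrapping_add(x, y);
    return ((r ^ x) < 0) && ((r ^ y) < 0);
}
```

On a platform that satisfies those assumptions, wrapping_add(LONG_MAX, 1) gives LONG_MIN and add_overflows(LONG_MAX, 1) is nonzero, while add_overflows(1, 2) is zero.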
> I think gcc makes promises here beyond resolving implementation-defined
> behavior. For bitshift operators, C99 says (6.5.7)
Yes, I'm very well aware of the issues with shifting signed integers; I'm
not proposing making any assumptions here.
> So I'm still opposed to codifying your assumptions if that would mean
> that CPython could now start relying on left-shift to behave in a
> certain way. For right-shift, your assumptions won't help for
> speculation about the result: I think it's realistic that some
> implementations sign-extend, yet others perform the shift unsigned
> (i.e. zero-extend).
> I'd rather prefer to explicitly list what CPython assumes about the
> outcome of specific operations. If this is just about &, |, ^, and ~,
> then it's fine with me.
I'm not even interested in going this far: I only want to make explicit
the three assumptions I specified in my original post:
 - signed integers are represented using two's complement
 - for signed integers the bit pattern 100....000 is not a trap representation
 - conversion from an unsigned type to a signed type wraps modulo
   2**(width of unsigned type).
(Though I think these assumptions do in fact completely determine
the behaviour of &, |, ^, ~.)
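One way such assumptions could be written down in code is sketched below for the long type only. This is an illustration, not something proposed in the thread: the function name conversion_wraps is hypothetical, and _Static_assert requires a C11 compiler. The first two assumptions can be checked at compile time, while the conversion behaviour is probed at run time.

```c
#include <limits.h>

/* Assumption 1: two's complement. In two's complement -1 is all ones, so
 * (-1 & 3) == 3. Ones' complement would give 2, sign-magnitude would give 1. */
_Static_assert((-1 & 3) == 3, "signed integers must be two's complement");

/* Assumption 2: the bit pattern 100...000 is an ordinary value rather than
 * a trap representation, i.e. LONG_MIN == -LONG_MAX - 1. */
_Static_assert(LONG_MIN < -LONG_MAX, "100...000 must not be a trap representation");

/* Assumption 3: unsigned-to-signed conversion wraps modulo 2**width.
 * Hypothetical run-time probe; returns 1 if the two out-of-range values
 * tested convert as expected, 0 otherwise. */
int conversion_wraps(void)
{
    unsigned long just_past_max = (unsigned long)LONG_MAX + 1UL;
    unsigned long all_ones = ULONG_MAX;

    return (long)just_past_max == LONG_MIN && (long)all_ones == -1L;
}
```

If either static assertion fails, the translation unit does not compile, so an unsupported platform is rejected at build time instead of misbehaving later.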
As far as I know these are almost universally satisfied for current
C implementations, and there's little reason not to assume them,
but I didn't want to document and use these assumptions without
consulting python-dev first.