[Python-Dev] Drop support for ones' complement machines?

Tue Dec 1 16:57:30 CET 2009

On Tue, Dec 1, 2009 at 3:32 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> No, the original question really was the question that I meant to ask.  :)
>
> Ok. Then the reference to issue 7406 is really confusing, as this is
> about undefined behavior - why does the answer to your question affect
> the resolution of this issue?

Apologies for the lack of clarity.

So in issue 7406 I'm complaining (amongst other things) that int_add
uses the expression 'x+y', where x and y are longs, and expects this
expression to wrap modulo 2**n on overflow.  As you say, this is
undefined behaviour.  One obvious way to fix it is to write

  (long)((unsigned long)x + (unsigned long)y)

instead.
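
A minimal sketch of that rewrite, for illustration only. The helper names wrapping_add and add_overflowed are hypothetical and are not taken from the CPython source. The sign-based overflow test is an assumption about how the wrapped result might be used.

```c
#include <limits.h>

/* Hypothetical helper: add two longs with wraparound instead of
 * relying on signed overflow, which is undefined behaviour. */
static long wrapping_add(long x, long y)
{
    /* Unsigned addition is defined to wrap modulo ULONG_MAX + 1. */
    unsigned long sum = (unsigned long)x + (unsigned long)y;

    /* Converting back to long is the step whose result is not fixed
     * by the standard when sum > LONG_MAX. */
    return (long)sum;
}

/* Hypothetical helper: given z = wrapping_add(x, y), report whether the
 * true sum overflowed.  Overflow happened exactly when the sign of z
 * differs from the signs of both operands. */
static int add_overflowed(long x, long y, long z)
{
    return ((z ^ x) < 0) && ((z ^ y) < 0);
}
```

On a typical two's complement platform, wrapping_add(LONG_MAX, 1) yields LONG_MIN and add_overflowed reports the overflow.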

But *here's* the problem:  this still isn't a portable solution!
It no longer depends on undefined behaviour, but it *does*
depend on implementation-defined behaviour:  namely, what happens
when an unsigned long that's greater than LONG_MAX is converted to
long.  (See C99 6.3.1.3, paragraph 3:  "Otherwise, the new type is
signed and the value cannot be represented in it; either the result is
implementation-defined or an implementation-defined signal is raised.")

It's this implementation-defined behaviour that I'd like to assume.
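
For comparison, here is a sketch of what the conversion might look like if the cast were not relied upon. The function name ulong_to_long_portable is hypothetical, and the code still assumes that ULONG_MAX == 2*LONG_MAX + 1 and LONG_MIN == -LONG_MAX - 1, so it avoids only the implementation-defined conversion, not every assumption.

```c
#include <limits.h>

/* Hypothetical helper: map an unsigned long to the long that is
 * congruent to it modulo ULONG_MAX + 1, without casting an
 * out-of-range value to long. */
static long ulong_to_long_portable(unsigned long u)
{
    if (u <= (unsigned long)LONG_MAX) {
        /* Value is representable, so the conversion is fully defined. */
        return (long)u;
    }
    /* Here u > LONG_MAX.  The wanted result is u - (ULONG_MAX + 1),
     * computed as -(ULONG_MAX - u) - 1.  Under the stated range
     * assumptions, ULONG_MAX - u is at most LONG_MAX, so the cast and
     * the negation cannot overflow. */
    return -(long)(ULONG_MAX - u) - 1;
}
```

The extra branch is the price of avoiding the implementation-defined cast, which is part of why simply assuming the wrapping behaviour is attractive.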

> I think gcc makes promises here beyond resolving implementation-defined
> behavior. For bitshift operators, C99 says (6.5.7)
> [...]

Yes, I'm very well aware of the issues with shifting signed integers;  I'm
not proposing that we make any assumptions about shift behaviour here.

> So I'm still opposed to codifying your assumptions if that would mean
> that CPython could now start relying on left-shift to behave in a
> certain way. For right-shift, your assumptions won't help for
> speculation about the result: I think it's realistic that some
> implementations sign-extend, yet others perform the shift unsigned
> (i.e. zero-extend).
>
> I'd rather prefer to explicitly list what CPython assumes about the
> outcome of specific operations. If this is just about &, |, ^, and ~,
> then it's fine with me.

I'm not even interested in going this far:  I only want to make explicit
the three assumptions I specified in my original post:

 - signed integers are represented using two's complement

 - for signed integers the bit pattern 100....000 is not a trap representation

 - conversion from an unsigned type to a signed type wraps modulo
   2**(width of unsigned type).

(Though I think these assumptions do in fact completely determine
the behaviour of &, |, ^, ~.)
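
As an illustration, the three assumptions could be spot-checked for the type long roughly as follows. The function names are hypothetical, and these checks are a sanity test rather than a proof: the second one in particular only infers from the limits that the 100....000 pattern is an ordinary value.

```c
#include <limits.h>

/* Assumption 1: in two's complement, -1 has every bit set, so
 * (-1 & 3) == 3.  Ones' complement would give 2 and sign-magnitude
 * would give 1. */
static int is_twos_complement(void)
{
    return (-1 & 3) == 3;
}

/* Assumption 2 (indirect check): if the bit pattern 100....000 is a
 * value rather than a trap representation, the range is asymmetric
 * and LONG_MIN lies below -LONG_MAX. */
static int min_pattern_is_value(void)
{
    return LONG_MIN < -LONG_MAX;
}

/* Assumption 3: converting an out-of-range unsigned long to long
 * wraps modulo ULONG_MAX + 1.  Only two sample values are tested. */
static int conversion_wraps(void)
{
    unsigned long top = (unsigned long)LONG_MAX + 1UL;
    return (long)ULONG_MAX == -1L && (long)top == LONG_MIN;
}
```

A build could call these from a start-up self-test, or turn the first two into compile-time checks, and refuse to proceed if any of them fails.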

As far as I know these are almost universally satisfied for current
C implementations, and there's little reason not to assume them,
but I didn't want to document and use these assumptions without
consulting python-dev first.

Mark