[Python-3000] bug in modulus?

Thu May 4 07:24:41 CEST 2006

[Aahz]
> Well, yes, but Uncle Timmy is jumping up and down and screaming for a
> change.

That's a pretty bizarre characterization, if I say so myself.  But
since I've already been accused, I may as well live up to it ;-):

> Granted, I think we should show respect for our numerical
> elders, but I think Michael also has a point about the importance of
> making floats and ints behave the same, especially given that division
> will auto-promote floats from ints.

Look at the example I was responding to:

>>> -1e-50 % 2.0
2.0

The instant "feel good" response is "oh, no real problem, make it
return 0.0 instead".  But that would break a different invariant (and
in fact a more fundamental one) people legitimately expect from
integers:

    a = (a // b)*b + a%b

Because -1e-50 / 2.0 is very visibly less than 0.0 (this isn't an
endcase, or even special):

>>> -1e-050 / 2.0
-5e-051

the floor has to be -1.0:

>>> -1e-050 // 2.0
-1.0

Plug that into the identity above to get

    -1e-050 = -1.0*2.0 + a%b

or

    a%b = 2.0 - 1e-050

Python does the best it can to meet that, returning the closest HW
float to the infinite precision result of 2.0-1e-50.  0.0 is nowhere
close to that.  So, believe it or not, we return 2.0 here because
we're trying our best to _meet_ expectations derived from the
well-behaved unbounded-precision integer mod.
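
A minimal sketch of that derivation, assuming a modern Python with the
fractions module (not available when this was written) to stand in for
infinite-precision arithmetic.  The name exact_r is made up for
illustration:

```python
from fractions import Fraction

a, b = -1e-50, 2.0

q = a // b      # floor division
r = a % b       # Python's sign-of-divisor remainder

# Exact (infinite-precision) value the identity a == q*b + r demands.
exact_r = Fraction(a) - Fraction(q) * Fraction(b)

print(q)                      # -1.0
print(r)                      # 2.0
print(exact_r < 2)            # True: the exact value is a hair below 2
print(float(exact_r) == r)    # True: 2.0 is the nearest HW float to it
```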

It's not possible to meet _all_ expectations derived from integers
simultaneously using HW floats with this definition of %.  In this
case, we violate the expectation that abs(a%b) < abs(b).  If we
returned 0.0, we'd violate the more fundamental identity above.
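
The tradeoff can be seen side by side in a short sketch.  The variable
hypothetical_r is an assumption standing in for the "feel good" answer;
it is not something Python returns:

```python
a, b = -1e-50, 2.0
q, r = a // b, a % b

# Expectation violated by the current result:
print(abs(r) < abs(b))          # False, since r == b == 2.0

# Identity check with the result Python actually returns ...
print(q * b + r)                # 0.0   (off from a by 1e-50)

# ... and with the hypothetical 0.0 result instead.
hypothetical_r = 0.0
print(q * b + hypothetical_r)   # -2.0  (off from a by about 2)
```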

divmod gets bizarre with floats too (divmod makes perfect sense for
integers, but not for floats), like this on Windows:

>>> divmod(1e300, 1e-300)
(1.#INF, 4.891554850853602e-301)

C doesn't have these problems because it doesn't define divmod at all,
and defines (well, C99 does) % for both integers and floats in a way
that makes most sense for floats.   In C99,

    -1e-50 % 2.0 == -1e-50

exactly, and

    ((int)(a/b))*b + a%b == a

exactly in this case.  The "exactly" isn't a coincidence:  using C's
definition for floating %, the machine value returned for a%b is
exactly equal to the result of computing

    a - ((int)(a/b))*b

to infinite precision.   At base, that's why C's definition makes most
sense for floats:  it doesn't lose information when applied to floats,
and therefore also doesn't introduce surprises _due_ to information
loss.  Python's integer-derived meaning for % is less accurate for
floats, usually suffering rounding errors.  This is always harmful
when it occurs, although it's not usually so _obviously_ surprising as
rounding the infinite-precision result of 2.0-1e-50 to the machine
value 2.0.
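
A rough way to check the exactness claim from Python, assuming
math.fmod (which wraps the platform C library's fmod, the function C
uses for floating remainder) and using Fraction as a stand-in for
infinite precision.  The helper names and the sample pairs are
illustrative only:

```python
import math
from fractions import Fraction

def exact_c_style_remainder(a, b):
    """a - trunc(a/b)*b, computed in exact rational arithmetic."""
    fa, fb = Fraction(a), Fraction(b)
    return fa - math.trunc(fa / fb) * fb

def exact_floor_remainder(a, b):
    """a - floor(a/b)*b, computed in exact rational arithmetic."""
    fa, fb = Fraction(a), Fraction(b)
    return fa - math.floor(fa / fb) * fb

pairs = [(-1e-50, 2.0), (7.5, 2.0), (-7.5, 2.0), (1e300, 1e-300)]

# Does the machine result equal the infinite-precision result?
fmod_exact = [Fraction(math.fmod(a, b)) == exact_c_style_remainder(a, b)
              for a, b in pairs]
mod_exact = [Fraction(a % b) == exact_floor_remainder(a, b)
             for a, b in pairs]

print(math.fmod(-1e-50, 2.0) == -1e-50)   # True
print(all(fmod_exact))                    # True
print(mod_exact[0])                       # False: 2.0 is a rounded result
```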

So we end up with a floating % that makes little sense for floats used
_as_ floats, and that can't meet all integer-derived expectations
anyway.  What's the point?  "Think of the children!"?  They'd be
better off with rationals than floats anyway, and there are no
problems using the integer definition of % for rationals too.
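
As an illustration of the rational case, here is a sketch that assumes
the fractions.Fraction type from later Python versions as the rational
type.  Both the identity and the bound on the remainder hold exactly:

```python
from fractions import Fraction

a = Fraction(-1, 10**50)   # exactly -1e-50
b = Fraction(2)

q = a // b                 # floor division: -1
r = a % b                  # exactly 2 - 1e-50, no rounding

print(q)                                # -1
print(r == 2 - Fraction(1, 10**50))     # True
print(a == q * b + r)                   # True
print(0 <= r < b)                       # True
```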

I'd be happiest if P3K floats didn't support __mod__ or __divmod__ at
all.  Floating mod is so rare it doesn't need syntactic support, and
the try-to-be-like-integer __mod__ and __divmod__ floats support now
can deliver surprises to all users incautious enough to use them.

BTW,

>>> decimal.Decimal("-1e-50") % decimal.Decimal("2.0")
Decimal("-1E-50")
>>> any(decimal.getcontext().flags.values())
False

The point of the last line is that the Rounding and Inexact flags are
still clear:  like C's, the decimal module's notion of % is also an
exact computation even though it's working with floats.  This is
impossible for floats using the "a%b has the same sign as b"
definition.
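
To see why, here is a sketch that assumes a 28-digit decimal context
and mimics the sign-of-b definition by hand, adding b to the exact
remainder.  The variable names are illustrative:

```python
from decimal import Decimal, Inexact, Rounded, localcontext

a, b = Decimal("-1e-50"), Decimal("2.0")

with localcontext() as ctx:
    ctx.prec = 28
    ctx.clear_flags()
    r = a % b                       # sign follows a; exact
    exact_flags = (bool(ctx.flags[Inexact]), bool(ctx.flags[Rounded]))

with localcontext() as ctx:
    ctx.prec = 28
    ctx.clear_flags()
    shifted = r + b                 # what "same sign as b" would need
    shifted_flags = (bool(ctx.flags[Inexact]), bool(ctx.flags[Rounded]))

print(r == a, exact_flags)          # True (False, False)
print(shifted == 2, shifted_flags)  # True (True, True)
```

The exact answer 2 - 1e-50 needs 51 significant digits, so it cannot
survive in 28 digits any more than it can in a HW double.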

The point of the first line is that a sane meaning for floating % has
already snuck into Python ;-)