[Python-3000] bug in modulus?
Tim Peters
tim.peters at gmail.com
Thu May 4 07:24:41 CEST 2006
[Aahz]
> Well, yes, but Uncle Timmy is jumping up and down and screaming for a
> change.
That's a pretty bizarre characterization, if I say so myself. But
since I've already been accused, I may as well live up to it ;-):
> Granted, I think we should show respect for our numerical
> elders, but I think Michael also has a point about the importance of
> making floats and ints behave the same, especially given that division
> will auto-promote floats from ints.
Look at the example I was responding to:
>>> -1e-50 % 2.0
2.0
The instant "feel good" response is "oh, no real problem, make it
return 0.0 instead". But that would break a different invariant (and
in fact a more fundamental one) people legitimately expect from
integers:
a = (a // b)*b + a%b
Because -1e-50 / 2.0 is very visibly less than 0.0 (this isn't an
endcase, or even special):
>>> -1e-050 / 2.0
-5e-051
the floor has to be -1.0:
>>> -1e-050 // 2.0
-1.0
Plug that into the identity above to get
-1e-050 = -1.0*2.0 + a%b
or
a%b = 2.0 - 1e-050
Python does the best it can to meet that, returning the closest HW
float to the infinite precision result of 2.0-1e-50. 0.0 is nowhere
close to that. So, believe it or not, we return 2.0 here because
we're trying our best to _meet_ expectations derived from the
well-behaved unbounded-precision integer mod.
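The derivation above can be checked with exact rational arithmetic. This is only a sketch: the names a, b, q, r and exact_r are illustrative, and fractions.Fraction stands in for "infinite precision".

```python
from fractions import Fraction

a, b = -1e-50, 2.0

q = a // b   # floor of a tiny negative quotient
r = a % b    # what Python's float % returns

# The value the identity a == (a // b)*b + a % b demands,
# computed exactly from the machine values of a, b and q.
exact_r = Fraction(a) - Fraction(q) * Fraction(b)

print(q)                    # -1.0
print(r)                    # 2.0
print(exact_r < 2)          # True: the exact answer is just below 2
print(float(exact_r) == r)  # True: 2.0 is the nearest machine float
```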
It's not possible to meet _all_ expectations derived from integers
simultaneously using HW floats with this definition of %. In this
case, we violate the expectation that abs(a%b) < abs(b). If we
returned 0.0, we'd violate the more fundamental identity above.
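To make the trade-off concrete, here is a small sketch comparing the remainder Python actually returns with the proposed 0.0. The helper check() is hypothetical, written only for this illustration.

```python
a, b = -1e-50, 2.0
q = a // b  # -1.0

def check(r):
    """Return (how far q*b + r lands from a, whether abs(r) < abs(b))."""
    identity_error = abs((q * b + r) - a)
    bounded = abs(r) < abs(b)
    return identity_error, bounded

actual = check(a % b)     # remainder Python returns (2.0)
alternative = check(0.0)  # the "feel good" remainder

print(actual)       # (1e-50, False): identity nearly holds, bound fails
print(alternative)  # (2.0, True): bound holds, identity is off by 2.0
```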
divmod gets bizarre with floats too (divmod makes perfect sense for
integers, but not for floats), like this on Windows:
>>> divmod(1e300, 1e-300)
(1.#INF, 4.891554850853602e-301)
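A platform-neutral way to look at the same result, as a sketch: the printed spelling of infinity varies by platform and Python version, so this checks properties of the pair rather than its repr.

```python
import math

q, r = divmod(1e300, 1e-300)

# The quotient overflows to infinity, yet a small finite
# remainder is still reported alongside it.
print(math.isinf(q))
print(0.0 <= r < 1e-300)
```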
C doesn't have these problems because it doesn't define divmod at all,
and defines (well, C99 does) % for integers and its float counterpart,
fmod(), in a way that makes most sense for floats. In C99 (writing %
for fmod()),
-1e-50 % 2.0 == -1e-50
exactly, and
((int)(a/b))*b + a%b == a
exactly in this case. The "exactly" isn't a coincidence: using C's
definition for floating %, the machine value returned for a%b is
exactly equal to the result of computing
a - ((int)(a/b))*b
to infinite precision. At base, that's why C's definition makes most
sense for floats: it doesn't lose information when applied to floats,
and therefore also doesn't introduce surprises _due_ to information
loss. Python's integer-derived meaning for % is less accurate for
floats, usually suffering rounding errors. This is always harmful
when it occurs, although it's not usually so _obviously_ surprising as
rounding the infinite-precision result of 2.0-1e-50 to the machine
value 2.0.
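Python exposes the C library's fmod as math.fmod, so the exactness claim can be sketched from Python. The two helper functions below are hypothetical names for this illustration; they compute each definition's remainder exactly using fractions.Fraction.

```python
import math
from fractions import Fraction

def exact_c_remainder(a, b):
    """a - trunc(a/b)*b, computed in exact rational arithmetic."""
    fa, fb = Fraction(a), Fraction(b)
    return fa - math.trunc(fa / fb) * fb

def exact_floor_remainder(a, b):
    """a - floor(a/b)*b, computed in exact rational arithmetic."""
    fa, fb = Fraction(a), Fraction(b)
    return fa - math.floor(fa / fb) * fb

a, b = -1e-50, 2.0
c_style = math.fmod(a, b)  # C's definition: sign follows a
py_style = a % b           # Python's definition: sign follows b

print(c_style == a)                                       # True
print(Fraction(c_style) == exact_c_remainder(a, b))       # True: no rounding
print(Fraction(py_style) == exact_floor_remainder(a, b))  # False: rounded to 2.0
```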
So we end up with a floating % that makes little sense for floats used
_as_ floats, and that can't meet all integer-derived expectations
anyway. What's the point? "Think of the children!"? They'd be
better off with rationals than floats anyway, and there are no
problems using the integer definition of % for rationals too.
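For comparison, a sketch of the same computation with rationals, using the standard library's fractions.Fraction (which postdates this message) as a stand-in. Both integer-derived expectations hold at once, because nothing is rounded.

```python
from fractions import Fraction

a = Fraction(-1, 10**50)
b = Fraction(2)

q = a // b
r = a % b

print(q)                             # -1
print(r == 2 - Fraction(1, 10**50))  # True: exactly 2 - 1e-50
print(q * b + r == a)                # True: the identity holds
print(0 <= r < b)                    # True: so does the bound
```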
I'd be happiest if P3K floats didn't support __mod__ or __divmod__ at
all. Floating mod is so rare it doesn't need syntactic support, and
the try-to-be-like-integer __mod__ and __divmod__ floats support now
can deliver surprises to all users incautious enough to use them.
BTW,
>>> decimal.Decimal("-1e-50") % decimal.Decimal("2.0")
Decimal("-1E-50")
>>> any(decimal.getcontext().flags.values())
False
The point of the last line is that the Rounding and Inexact flags are
still clear: like C's, the decimal module's notion of % is also an
exact computation even though it's working with floats. This is
impossible for floats using the "a%b has the same sign as b"
definition.
The point of the first line is that a sane meaning for floating % has
already snuck into Python ;-)
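The same check can be written against a fresh context, so flags left over from earlier work in the session can't interfere. This is a sketch; ctx is just a local name, and Context.remainder is the context-method form of %.

```python
import decimal

ctx = decimal.Context()  # fresh context: all flags start clear

a = decimal.Decimal("-1e-50")
b = decimal.Decimal("2.0")
r = ctx.remainder(a, b)  # same operation as a % b, run under ctx

print(r == a)                       # True: the sign follows a, as in C
print(ctx.flags[decimal.Inexact])   # False: nothing was lost
print(ctx.flags[decimal.Rounded])   # False: nothing was rounded
```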