[Python-Dev] Floor division

Tim Peters tim.peters at gmail.com
Fri Jan 26 12:04:02 CET 2007


[Tim Peters]
>> ...
>> [it would be possible for float.__divmod__ to return the exact
>> quotient], but who would have a (sane) use for a possibly 2000-bit
>> quotient?

[Nick Maclaren]
> Well, the 'exact rounding' camp in IEEE 754 seem to think that there
> is one :-)
>
> As you can gather, I can't think of one.  Floating-point is an inherently
> inaccurate representation for anything other than small integers.

Well, bounded.  Python's decimal fp can handle millions of digits, if
you want that and are very patient ;-)
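
For instance (nothing special about the numbers here, they're just
illustrative):

from decimal import Decimal, getcontext

getcontext().prec = 1000000   # a million significant digits
root2 = Decimal(2).sqrt()     # this is where the patience comes in;
                              # the result carries a million digits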

OTOH, I am a fan of analyzing FP operations as if the inputs were in
fact exactly what they claim to be, which 754 went a long way toward
popularizing.  That largely replaced mountains of idiosyncratic
"probabilistic arguments" (and where it seemed no two debaters ever
agreed on the "proper" approach)  with a common approach that
sometimes allows surprisingly sharp analysis.  Since I spent a good
part of my early career as a professional apologist for Seymour Cray's
"creative" floating point, I'm probably much more grateful to leave
sloppy arithmetic behind than most.

...

>> This [that IBM's spec calls for an exception if remainder's inputs'
>> exponents differ by "too much"] is a bit peculiar to me, because
>> there are ways to compute "remainder" using a number of operations
>> proportional to the log of the exponent difference.  It could be that
>> people who spend their life doing floating point forget how to work
>> with integers ;-)

> Aargh!  That is indeed the key!  Given that I claim to know something
> about integer arithmetic, too, how can I have been so STUPID?

You're not alone.  I thought of this a decade ago, but have never seen
it implemented, or even mentioned.  All implementations of fmod I've
seen emulate long binary division one bit at a time; the IBM spec
seems to assume that's how it would be done for decimal too (why else
mandate an exception, instead of a result, if the exponent delta
"is large"?); and
so on.
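
For the curious, here's a rough sketch of the idea for Decimal (my own
illustration, not anyone's spec; it assumes finite, nonzero inputs and
ignores the special values):  reduce the coefficient of x modulo the
coefficient of y, using modular exponentiation for the power of 10 that
separates their exponents, so the work grows with the log of the
exponent difference rather than with the difference itself.

from decimal import Decimal

def exact_fmod(x, y):
    # Sketch only:  exact remainder with the sign of x (fmod-style),
    # assuming x and y are finite, nonzero Decimals.
    if x.copy_abs() < y.copy_abs():
        return x
    sx, dx, ex = x.as_tuple()
    _,  dy, ey = y.as_tuple()          # y's sign doesn't affect the magnitude
    mx = int("".join(map(str, dx)))    # integer coefficient of x
    my = int("".join(map(str, dy)))    # integer coefficient of y
    if ex >= ey:
        # The key step:  pow(10, ex - ey, my) is modular exponentiation,
        # so the cost is proportional to log(ex - ey), not to ex - ey.
        r = (mx % my) * pow(10, ex - ey, my) % my
        rexp = ey
    else:
        # |x| >= |y| bounds ey - ex by (roughly) the number of digits in mx.
        r = mx % (my * 10 ** (ey - ex))
        rexp = ex
    # Build the result exactly; the tuple constructor doesn't round.
    return Decimal((sx, tuple(int(c) for c in str(r)), rexp))

The total cost is a handful of integer operations on numbers about the
size of the coefficients, plus O(log(exponent gap)) modular multiplies.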

> Yes, you are right, and that is the only plausible way to calculate the
> remainder precisely.  You don't get the quotient precisely, which is
> what my (insane) specification would have provided.

And, alas, without the last bit of the quotient IBM's "remainder-near"
(== C99's "remainder" == 754's REM) can't resolve halfway cases in the
mandated way.
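
To make the halfway-case point concrete (math.remainder, added in
Python 3.7, implements the same REM, so this assumes a recent Python):

>>> import math
>>> math.remainder(5.0, 2.0)   # quotient 2.5:  tie goes to the even 2
1.0
>>> math.remainder(7.0, 2.0)   # quotient 3.5:  tie goes to the even 4
-1.0
>>> math.fmod(7.0, 2.0)        # fmod just truncates the quotient to 3
1.0

Telling those two ties apart requires knowing whether the integer
quotient is even or odd -- i.e., its last bit.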

> I would nitpick with your example, because you don't want to reduce
> modulo 3.14 but modulo pi

I didn't have pi in mind at all.  I just picked 3.14 as an arbitrary
decimal modulus with few digits, to make the example easy to follow.
Could just as well have been 2.72 (which has nothing to do with e
either ;-)).

> and therefore the modular arithmetic is rather more expensive (given
> Decimal).  However, it STILL doesn't help to make remquo useful!
>
> The reason is that pi is input only to the floating-point precision,
> and so the result of remquo for very large arguments will depend
> more on the inaccuracy of pi as input than on the mathematical
> result.  That makes remquo totally useless for the example you quote.

That's not our choice to make.  Many math libraries still use a
"small" approximation to pi for trig argument reduction, and that's
their choice.  Heck, a 66-bit value for pi is built in to the
Pentium's FSIN and FCOS instructions.

>>> math.sin(math.pi)
1.2246063538223773e-016

That was on Windows with Python 2.5, using the MSVC compiler.  The
result indicates it uses the FSIN instruction.  Same thing but under
the Cygwin Python, whose libm uses "infinite precision pi" trig
reduction:

>>> math.sin(math.pi)
1.2246467991473532e-16

That one is "correct" (close to the best double approximation to the
sine of the best double approximation to pi).  The FSIN result is off
by about 164 billion(!) ULP, but few people care.
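
For anyone who wants to check that figure, a quick back-of-the-envelope
(2.0**-105 is the spacing between doubles in [2**-53, 2**-52), which is
where both results live):

fsin    = 1.2246063538223773e-16   # FSIN-based result
correct = 1.2246467991473532e-16   # "infinite precision pi" result
ulp = 2.0 ** -105                  # ULP spacing near 1.2e-16
print((correct - fsin) / ulp)      # ~1.64e11, i.e. ~164 billion ULP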

Anyway, I simply gave cosine arg reduction as /an example/ of the kind
of reduction strategy for which remquo can be useful.  You said you
were baffled by why C99 only required the last 3 bits.  The example
was a hint about why some functions can indeed find that to be useful,
not an exhaustive rationale.  It's really off-topic for Python-Dev, so
I didn't/don't want to belabor it.
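
But to give the flavor, a toy version of cosine reduction (my own
sketch, and it uses only the double approximation to pi/2, so it's only
honest for modest arguments) shows why just the low couple of quotient
bits matter -- they select the quadrant, and nothing else about the
quotient is needed:

import math

HALF_PI = math.pi / 2.0     # only the double approximation -- see above

def toy_cos(x):
    n = round(x / HALF_PI)       # remquo would hand back n's low bits
    r = x - n * HALF_PI          # reduced argument, roughly in [-pi/4, pi/4]
    k = n & 3                    # quadrant:  the only part of n we use
    if k == 0:
        return math.cos(r)
    elif k == 1:
        return -math.sin(r)
    elif k == 2:
        return -math.cos(r)
    else:
        return math.sin(r)

A real implementation reduces against a much more precise pi/2 and gets
the quotient bits as a byproduct; remquo's last 3 bits exist so it can
play that role.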

> Yes, I have implemented 'precise' range reduction, and there is no
> substitute for using an arbitrary precision pi value :-(

I have too (for 754 doubles), coincidentally at the same time KC Ng
was implementing it for fdlibm.  I found some bugs in fdlibm's trig
functions at the time by generating billions of random inputs and
comparing his library-in-progress's results to mine.  That was fun.
My users (this was done for Kendall Square Research's libm) only
noticed that "the new" trig functions were much faster than the old
ones for small arguments -- nobody seemed to notice that the results
were much better than the old ones, or arrived at much more slowly,
for very large arguments.  How satisfying ;-)

>>> But it isn't obviously WRONG.

>> For floats, fmod(x, y) is exactly congruent to x modulo y -- I don't
>> think it's possible to get more right than exactly right ;-)

> But, as a previous example of yours pointed out, it's NOT exactly
> right.  It is also supposed to be in the range [0,y) and it isn't.
> -1%1e100 is mathematically wrong on two counts.

No, /Python's/ definition of mod is inexact for that example.  fmod
(which is not Python's definition) is always exact:  fmod(-1, 1e100) =
-1, and -1 is trivially exactly congruent to -1 modulo anything
(including modulo 1e100).  The result of fmod(x, y) has the same sign
as x; Python's x.__mod__(y) has the same sign as y; and that makes all
the difference in the world as to whether the exact result is always
exactly representable as a float.
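
Or, spelled out in a quick session (same numbers as above):

>>> import math
>>> math.fmod(-1.0, 1e100)    # exact:  -1 is congruent to -1 mod 1e100
-1.0
>>> -1.0 % 1e100              # true value 1e100 - 1 rounds up to 1e100
1e+100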

