[Python-Dev] Numerical robustness, IEEE etc.
nmm1 at cus.cam.ac.uk
Mon Jun 19 10:55:44 CEST 2006
Brett Cannon's and Neal Norwitz's replies appreciated and noted, but
responses sent by mail.
Nick Coghlan <ncoghlan at gmail.com> wrote:
> Python 2.4's decimal module is, in essence, a floating point emulator based on
> the General Decimal Arithmetic specification.
Grrk. Format and all? Because, in software, encoding, decoding and
dealing with the special cases accounts for the vast majority of the
time. Using a format and specification designed for implementation
in software is a LOT faster (often 5-20 times).
> If you want floating point mathematics that doesn't have insane platform
> dependent behaviour, the decimal module is the recommended approach. By the
> time Python 2.6 rolls around, we will hopefully have an optimized version
> implemented in C (that's being worked on already).
Yes. There is no point in building a wheel if someone else is doing it.
Please pass my name on to the people doing the optimisation, as I have
a lot of experience in this area and may be able to help. But it is a
fairly straightforward (if tricky) task.
> That said, I'm not clear on exactly what changes you'd like to make to the
> binary floating point type, so I don't know if I think they're a good idea or
> not :)
Now, here it is worth posting a response :-)
The current behaviour follows C99 (sic) with some extra checking (e.g.
division by zero raises an exception). However, this means that a LOT
of errors will give nonsense answers without comment, and there are a
lot of ways to 'lose' NaN values quietly - e.g. int(NaN). That is NOT
good software engineering. So:
Mode A: follow IEEE 754R slavishly, if and when it ever gets into print.
There is no point in following C99, as it is too ill-defined, even if it
were felt desirable. This should not be the default, because of the
flaws I mention above (see Kahan on Java).
Mode B: all numerically ambiguous or invalid operations should raise
an exception - including pow(0,0), int(NaN) etc. etc. There is a moot
point over whether overflow is such a case in an arithmetic that has
infinities, but let's skip over that one for now.
Mode C: all numerically ambiguous or invalid operations should return
a NaN (or infinity, if appropriate). Anything that would lose the error
indication would raise an exception. The selection between modes B and
C could be done by a method on the class - with mode B being selected
if any argument had it set, and mode C otherwise.
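The quiet loss of NaNs and the difference between modes B and C can be sketched in Python as it stands today. This is an illustration only, not the proposal itself: it borrows the decimal module's trap mechanism as a stand-in for a mode switch on the binary float type, and the helper names `undefined_ops` and `lose_nan` are invented for the example.

```python
import decimal
from decimal import Decimal

nan = float("nan")

# Part 1: binary floats in current CPython can still drop a NaN silently.
quiet_pow = nan ** 0        # 1.0: the NaN vanishes
quiet_max = max(1.0, nan)   # 1.0: every ordering comparison with NaN is False
# int(nan) is no longer silent; current CPython raises ValueError for it.

# Part 2: decimal traps as an approximation of modes B and C.
def undefined_ops(trap):
    """Evaluate 0/0 and 1/0 with the relevant signals trapped or untrapped."""
    with decimal.localcontext() as ctx:
        for sig in (decimal.InvalidOperation, decimal.DivisionByZero):
            ctx.traps[sig] = trap
        results = []
        for num, den in [(0, 0), (1, 0)]:
            try:
                results.append(str(Decimal(num) / Decimal(den)))
            except decimal.DecimalException as exc:
                results.append(type(exc).__name__)
        return results

def lose_nan():
    """Converting a decimal NaN to int would lose the error, so it raises."""
    try:
        return int(Decimal("NaN"))
    except ValueError:
        return "ValueError"

mode_b = undefined_ops(trap=True)    # exceptions at the point of failure
mode_c = undefined_ops(trap=False)   # NaN or infinity propagates instead
```

With the traps set, the failure surfaces where it happens (mode B); with them cleared, a NaN or infinity is returned and only the conversion that would discard it raises (mode C).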
Now, both modes B and C are traditional approaches to numerical safety,
and have the property that error indications can't be lost "by accident",
though they make no guarantees that the answers make sense. I am
agnostic about which is better, though mode B is a LOT better from the
debugging point of view, as you discover an error closer to where it occurred.
Heaven help us, there could be a mode D, which would be mode C but
with trace buffers. They are another sadly neglected software
engineering technique, but let's not add every bell and whistle on
the shelf :-)
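For readers who have not met trace buffers, a minimal sketch of the idea follows. Everything in it is hypothetical: `TRACE` and `traced_div` are names made up for the illustration, and a real mode D would hook every operation rather than one wrapped division.

```python
import math
from collections import deque

# Hypothetical ring buffer: keeps only the most recent suspicious operations.
TRACE = deque(maxlen=8)

def traced_div(x, y):
    """Divide floats mode-C style: return NaN or infinity, but record it."""
    try:
        result = x / y
    except ZeroDivisionError:
        if x == 0 or math.isnan(x):
            result = math.nan  # 0/0 and NaN/0 are invalid
        else:
            # The sign of the infinity follows both operands, so -0.0 counts.
            result = math.copysign(math.inf, x) * math.copysign(1.0, y)
    if math.isnan(result) or math.isinf(result):
        TRACE.append(("div", x, y, result))
    return result
```

After a run that ends in a NaN, inspecting `TRACE` shows the last few operations that produced exceptional values, which recovers some of mode B's debugging advantage without raising.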
"tjreedy" <tjreedy at udel.edu> wrote:
> > experience from times of yore is that emulated floating-point would
> > be fast enough that few, if any, Python users would notice.
> Perhaps you should enquire on the Python numerical and scientific computing
> lists to see how many feel differently. I don't see how someone crunching
> numbers hours per day could not notice a slowdown.
Oh, certainly, almost EVERYONE will "feel" differently! But that is
not the point. Those few of us remaining (and there are damn few) who
know how a fast emulated floating-point performs know that the common
belief that it is very slow is wrong. I have both used and implemented them.
The point is, as I mention above, you MUST use a software-friendly
format AND specification if you want performance. IEEE 754 and IBM's
decimal pantechnicon are both extremely software-hostile.
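To make the decoding cost concrete, here is a rough sketch of what software must do just to unpack one IEEE 754 double before any arithmetic starts. The function name `decode_double` and its return layout are assumptions for the example; a software-friendly format would keep the sign, significand and exponent already separated and would not need these branches.

```python
import struct

def decode_double(x):
    """Split a double into (kind, sign, significand, exponent).

    For finite values, x == (-1)**sign * significand * 2**exponent.
    """
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    if exponent == 0x7FF:
        # All-ones exponent: infinity if the fraction is zero, else NaN.
        return ("nan" if fraction else "inf", sign, None, None)
    if exponent == 0:
        if fraction == 0:
            return ("zero", sign, 0, 0)
        # Subnormal: no implicit leading 1, and a fixed exponent.
        return ("subnormal", sign, fraction, -1074)
    # Normal: restore the implicit leading 1 and remove the bias.
    return ("normal", sign, fraction | (1 << 52), exponent - 1075)
```

Four of the five return paths are special cases, and the result still has to be re-encoded afterwards.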
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679