[Python-Dev] Numerical robustness, IEEE etc.

Sun Jun 25 22:49:49 CEST 2006

jacobs at bioinformed.com wrote:
> 
> I'm not asking you to describe SC22WG14 or post detailed technical summaries
> of the long and painful road.  I'd like you to post things directly relevant
> to Python with footnotes to necessary references.  It is then incumbent on
> those that wish to respond to your post to familiarize themselves with the
> relevant background material.  However, it is really darn hard to do that
> when we don't know what you're trying to fix in Python.  The examples you
> show below are a good start in that direction.

Er, no.  Given your response, it has merely started off a hare.  The
issues you raise are merely ones of DETAIL, and I was and am trying
to tackle the PRINCIPLE (a.k.a. design).

I originally stated my objective, and asked for information so that I
could investigate in depth and produce (in some order) a sandbox and
a PEP.  That is still my plan.

This example was NOT about problems with the existing implementation;
it was to show how even the most basic numeric code that attempts to
handle errors cannot avoid tripping over these issues.  I shall respond
to your points, but shall try to refrain from following up.

> 1) The string representation of NaN is not standardized across platforms

Try what I actually used:

    x = 1.0e300
    x = (x*x)/(x*x)

I converted that to float('NaN') to avoid confusing people.  There
are actually many issues around the representation of NaNs, including
whether signalling NaNs should be separated from quiet NaNs and whether
they should be allowed to have values.  See IEEE 754, IEEE 754R and
C99 for more details (but not clarification).
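
As a rough illustration of the two lines above, assuming a current
CPython on IEEE 754 hardware (other builds and platforms may behave
differently): the product overflows to infinity, and infinity divided
by infinity is an invalid operation that yields a NaN.  The helper name
make_nan_by_overflow is purely illustrative and not part of any library.

```python
import math


def make_nan_by_overflow():
    # Hypothetical helper, named here only for illustration.
    x = 1.0e300
    big = x * x        # exceeds the double range; assumed to give inf, not raise
    return big / big   # inf / inf is an invalid operation and yields NaN


result = make_nan_by_overflow()
print(math.isnan(result))
```

Note that this route never mentions the string 'NaN' at all, so the
question of how NaN is spelled on a given platform does not arise.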

> 2) on a sane platform, int(float('NaN')) should raise a ValueError
> exception for the int() portion.

Well, I agree with you, but Java and many of the C99 people don't.
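
For comparison, a small sketch of what a current CPython does with
such conversions (an assumption about modern 3.x releases, not about
the implementation under discussion in this thread).  The helper
try_int is hypothetical.  Java, by contrast, defines a cast of NaN to
int as yielding 0 rather than an error.

```python
def try_int(value):
    # Hypothetical helper: return the converted integer, or the name of
    # the exception that int() raised for this value.
    try:
        return int(value)
    except (ValueError, OverflowError) as exc:
        return type(exc).__name__


print(try_int(float("nan")))   # NaN has no integer value
print(try_int(float("inf")))   # infinity is reported as a different error
print(try_int(2.9))            # ordinary values truncate towards zero
```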

> 3) float('NaN') == float('NaN') should be false, assuming NaN is not a
> signaling NaN, by default

Why?  Why should it not raise ValueError?  See table 4 in IEEE 754.
I could go into this one in much more depth, but let's not, at least
not now.
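
To make the alternative concrete, here is a sketch of the "quiet"
behaviour, assuming a current CPython on IEEE 754 hardware: every
comparison involving a NaN returns a plain boolean and nothing is
raised.  Whether some of these should instead signal an error is
exactly the design question; the code only shows one possible answer.

```python
nan = float("nan")

# Each comparison with a NaN operand is "unordered"; this sketch records
# the boolean result that the interpreter returns instead of raising.
comparisons = {
    "nan == nan": nan == nan,
    "nan != nan": nan != nan,
    "nan < 1.0": nan < 1.0,
    "nan > 1.0": nan > 1.0,
}

for expression, outcome in comparisons.items():
    print(expression, "->", outcome)
```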

> So the open question is how to both define the semantics of Python floating
> point operations and to implement them in a way that verifiably works on the
> vast majority of platforms without turning the code into a maze of
> platform-specific defines, kludges, or maintenance problems waiting to
> happen.

Well, in a sense, but the second is really a non-question - i.e. it
answers itself almost trivially once the first is settled.  ALL of your
above points fall into that category.  The first question to answer is
what the fundamental model should be, and I need to investigate in
more depth before commenting on that - which should tell you roughly
what I know and what I don't about the decimal model.

The best way to get a really ghastly specification is to decide on
the details before agreeing on the intent.  Committees being what they
are, that is a recipe for something that nobody else will ever get
their heads around.

Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  nmm1 at cam.ac.uk
Tel.:  +44 1223 334761    Fax:  +44 1223 334679