[Python-Dev] Numerical robustness, IEEE etc.

Tim Peters tim.peters at gmail.com
Fri Jun 23 15:24:23 CEST 2006


[Kevin Jacobs]
> ...
> A good place to start: You mentioned earlier that there were some
> nonsensical things in floatobject.c.  Can you list some of the most serious
> of these?

I suspect Nick spends way too much time reading standards ;-)  What he said is:

    If you look at floatobject.c, you will find it solid with constructions that
    make limited sense in C99 but next to no sense in C89.

And, in fact, C89 truly defines almost nothing about floating-point
semantics or pragmatics.  Nevertheless, if a thing "works" under gcc
and under MS C, then "it works" for something like 99.9% of Python's
users, and competitive pressures are huge for other compiler vendors
to play along with those two.

I don't know what specifically Nick had in mind, and join the chorus
asking for specifics.  I _expect_ he's got a keen eye for genuine
coding bugs here, but also expect I'd consider many technically
dubious bits of FP code to be fine under the "de facto standard"
dodge.

For an example much worse than anything in floatobject.c _could_ be, C
doesn't guarantee that an "int" can hold an integer larger than 32767.
I doubt Python would run _at all_ on a platform with 16-bit ints, and
the "de facto standard" Python actually tries to follow is that an int
holds at least 32 bits.  Even more fundamental than that, we use
-fno-strict-aliasing under gcc because our C code is in fact in
undefined-behavior-land all over the place when casting between
PyObject* and pointers to objects of conceptual PyObject subclasses.
Because of the _way_ these casts are done, we're almost certainly in
no real trouble under any compiler, and the real reason we use
-fno-strict-aliasing under gcc is just to shut up the annoying warning
messages.
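
To make that concrete, here's a toy illustration of the cast pattern.
The struct and field names are invented for the sketch -- they're not
CPython's actual definitions -- but the shape of the thing is the same:

#include <stdio.h>

/* Illustrative stand-ins only, not CPython's real structs.  CPython's
   PyObject_HEAD macro pastes the same leading fields into every
   object struct, much as OBJECT_HEAD does here. */
#define OBJECT_HEAD long refcount; const char *type_name;

typedef struct { OBJECT_HEAD } Object;

typedef struct {
    OBJECT_HEAD          /* same leading layout as Object */
    double value;
} FloatObject;

int main(void)
{
    FloatObject f = {1, "float", 3.14};

    /* The ubiquitous cast:  treat a FloatObject* as an Object*.
       Reading f's fields through an Object* lvalue is technically an
       aliasing violation, since Object and FloatObject are distinct
       struct types, but the identical leading layout means every
       compiler we care about does the obvious thing. */
    Object *op = (Object *)&f;
    printf("refcount=%ld type=%s\n", op->refcount, op->type_name);
    return 0;
}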

WRT floatobject.c, part of Python's assumed "de facto C standard" is
that no FP operation will raise SIGFPE, or cause "abnormal"
termination of the program by any other means.  C89 doesn't say
squat about that of any real use.  C99 kinda does, sometimes, under
some conditions.  The IEEE-754 standard mandates "no-stop mode" for
its default numeric environment, and Python effectively assumes that,
and _forces_ it on platforms where it's not the default.  The only
known platform to date on which it was necessary to do so can be
deduced from Python's main() function:

int
main(int argc, char **argv)
{
...
#ifdef __FreeBSD__
	fp_except_t m;

	m = fpgetmask();
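	/* Drop overflow from the trap mask:  Python wants 754's "no stop"
	   behavior, and FreeBSD trapped on FP overflow by default. */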
	fpsetmask(m & ~FP_X_OFL);
#endif
	return Py_Main(argc, argv);
}

So, sure, everything we do is undefined, but, no, we don't really care
:-)  If a non-trivial 100%-guaranteed-by-the-standard-to-work C
program exists, I don't think I've seen it.

Changes in float behavior really have to go thru the numerical Python
users, because they have the largest stake.  From that community, by
far the most frequent request I hear (even to the extent that people
seek me out at conferences and hint that they're willing to throw
money at it ;-)) is for a way to stop Python from raising
ZeroDivisionError on float divides.  They may or may not be aware of
the dubious justifications for the no-stop results IEEE-754 mandates
for the various div-by-0 sub-cases, but they're eager to live with
them regardless.  Even those who agree those "should be" errors (seems
to be a minority these days) would rather search result arrays for
NaNs and Infs after-the-fact than litter masses of array-syntax-driven
computation with hard-to-get-right recovery code.  For a stupid
example, in

    a = b / c

you may be dividing millions of array elements:  do you really want to
throw away all the work you've done so far just because division
#1384923 divided by 0?  "Yes" if you think the entire world is beyond
salvation if that happens, but that's rarely the case.  When writing a
_scalar_ loop you generally don't mind catching exceptions, or even
doing

    if c[i]:
        a[i] = b[i] / c[i]

You can treat each computation specially, because you're only _doing_
one computation at a time.  When computing with giant aggregates,
exceptions can be much harder to live with.
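
In C terms, the aggregate style amounts to something like this sketch
(toy data, C99's isfinite() for the after-the-fact scan, and assuming
the platform's default 754 no-stop behavior):

#include <math.h>
#include <stdio.h>

int main(void)
{
    double b[] = {1.0, -2.0, 0.0, 4.0};
    double c[] = {2.0,  0.0, 0.0, 8.0};
    double a[4];
    int i;

    /* Do all the divides first; the bad ones quietly produce an
       infinity (x/0 for nonzero x) or a NaN (0/0) instead of
       stopping the loop. */
    for (i = 0; i < 4; i++)
        a[i] = b[i] / c[i];

    /* Then hunt down the elements that didn't work out. */
    for (i = 0; i < 4; i++) {
        if (isfinite(a[i]))
            printf("a[%d] = %g\n", i, a[i]);
        else
            printf("a[%d] = %g  (from %g / %g)\n", i, a[i], b[i], c[i]);
    }
    return 0;
}

The bad elements show up as infs and NaNs you can deal with at your
leisure, instead of the first bad divide killing the whole loop.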

BTW, Nick, are you aware of Python's fpectl module?  That's
user-contributed code that attempts to catch overflow, div-by-0, and
invalid operation on 754 boxes and transform them into raising a
Python-level FloatingPointError exception.  Changes were made all over
the place to try to support this at the time.  Every time you see a
PyFPE_START_PROTECT or PyFPE_END_PROTECT macro in Python's C code,
that's the system it's trying to play nicely with.  "Normally", those
macros have empty expansions.
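
For the curious, the general scheme is roughly this: enable the
hardware traps (platform-specific, omitted here), catch the resulting
SIGFPE, and longjmp back to a point saved just before the protected
computation.  The names below are invented for the sketch, not the
real pyfpe.h machinery:

#include <setjmp.h>
#include <signal.h>
#include <stdio.h>

static jmp_buf fpe_buf;

static void fpe_handler(int sig)
{
    (void)sig;
    longjmp(fpe_buf, 1);        /* escape the faulting operation */
}

int main(void)
{
    double huge = 1e308, result = 0.0;

    signal(SIGFPE, fpe_handler);

    if (setjmp(fpe_buf)) {
        /* In CPython this is where FloatingPointError would be set. */
        printf("caught a floating-point trap\n");
    }
    else {
        result = huge * huge;   /* overflows; traps only if the
                                   platform's overflow trap is on */
        printf("result = %g\n", result);
    }
    return 0;
}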

fpectl is no longer built by default, because repeated attempts failed
to locate any but "ya, I played with it once, I think" users, and the
masses of platform-specific #ifdef'ery in fpectlmodule.c were
suffering fatal bit-rot.  No users + no maintainers means I expect
it's likely that module will go away in the foreseeable future.  You'd
probably hate its _approach_ to this anyway ;-)

