[Python-Dev] Re: marshal / unmarshal

Mon Apr 11 17:27:43 CEST 2005

[Tim]
>> The 754 standard doesn't say anything about how the difference between
>> signaling and quiet NaNs is represented.  So it's possible that a qNaN
>> on one box would "look like" an sNaN on a different box, and vice
>> versa.  But since most people run with all FPU traps disabled, and
>> Python doesn't expose a way to read the FPU status flags, they
>> couldn't tell the difference.

[mwh]
> OK.  Do you have any intuition as to whether 754 implementations
> actually *do* differ on this point?

Not anymore -- hasn't been part of my job, or a hobby, for over a
decade.  There were differences a decade+ ago.  All NaNs have all
exponent bits set, and at least one mantissa bit set, and every bit
pattern of that form represents a NaN.  That's all the standard says. 
The most popular way to distinguish quiet from signaling NaNs keyed
off the most-significant mantissa bit:  set for a qNaN, clear for an
sNaN.  It's possible that all 754 HW does that now.
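
The layout described above can be checked from Python with the struct module.  This is a minimal sketch, assuming the platform uses IEEE-754 doubles.  The names double_bits and classify_bits are made up for illustration, and the qNaN/sNaN split follows the common top-mantissa-bit convention, which the standard does not mandate:

```python
import struct

def double_bits(x):
    """Return the 64-bit pattern of a float as an unsigned integer."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits

def classify_bits(bits):
    """Classify a 64-bit pattern read as an IEEE-754 double."""
    exponent = (bits >> 52) & 0x7FF       # 11 exponent bits
    mantissa = bits & ((1 << 52) - 1)     # 52 mantissa bits
    if exponent != 0x7FF:
        return "finite"
    if mantissa == 0:
        return "infinity"
    # Assumption: the most-significant mantissa bit set means quiet.
    return "qNaN" if mantissa & (1 << 51) else "sNaN"
```

Classifying the integer bit pattern rather than a float avoids relying on the hardware to leave a signaling NaN untouched when it is loaded.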

At the very least, Pentium hardware still adds a third not-a-number
possibility: in addition to 754's quiet and signaling NaNs, it also
has "indeterminate" values.  Here w/ native Windows Python 2.4 on a
Pentium:

>>> inf = 1e300 * 1e300
>>> inf - inf   # indeterminate
-1.#IND
>>> - _  # but the negation of IND is a quiet NaN
1.#QNAN
>>>

Do the same thing under Cygwin Python on the same box and it prints "NaN" twice.

Do people care about this?  I don't know.  It seems unlikely -- in
effect, IND just gives a special string name to a single one of the
many bit patterns that represent a quiet NaN.  OTOH, Pentium hardware
still preserves this distinction, and MS library docs do too.  IND
isn't part of the 754 standard (although, IIRC, it was part of a
pre-standard draft, which Intel implemented and is now stuck with).
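
To see which bit pattern sits behind the printed name, the raw bits can be dumped.  This is a sketch only: double_bits is a made-up helper, and the output is platform-dependent.  On x86 hardware the indefinite value is documented as the pattern with the sign bit set, the top mantissa bit set, and the rest of the mantissa clear (0xFFF8000000000000).  Other hardware may produce a different NaN for the same operation:

```python
import struct

def double_bits(x):
    """Return the 64-bit pattern of a float as an unsigned integer."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    return bits

inf = 1e300 * 1e300      # overflows to +infinity
nan = inf - inf          # invalid operation: the hardware picks the NaN

# Print both patterns as 16 hex digits; results vary by platform.
print("inf - inf    : %016x" % double_bits(nan))
print("-(inf - inf) : %016x" % double_bits(-nan))
```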

>> Copying bytes works perfectly for all other cases (signed zeroes,
>> non-zero finites, infinities), because their representations are
>> wholly defined, although it's possible that a subnormal on one box
>> will be treated like a zero (with the same sign) on a
>> partially-conforming box.

> I'd find struggling to care about that pretty hard.

Me too.

>>> The question, of course, is how to tell.

>> Store a few small doubles at module initialization time and stare at

> ./configure time, surely?

Unsure.  Not all Python platforms _have_ "./configure time".  Module
initialization code is harder to screw up for that reason (the code is
in an obvious place then, self-contained, and doesn't require any
relevant knowledge of any platform porter unless/until it breaks).

>> their bits.  That's enough to settle whether a 754 format is in use,
>> and, if it is, whether it's big-endian or little-endian.

> Do you have a pointer to code that does this?

No.  Pemberton's enquire.c contains enough code to do it.  Given how
few distinct architectures still exist, it's probably enough to store
just double x = 1.5 and stare at it.
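
The "store 1.5 and stare at it" idea can be sketched in Python rather than C.  This assumes struct's native mode ("@d") hands back the bytes of the platform's C double unchanged.  The function name detect_double_format and the returned strings are made up for illustration.  A single probe value cannot rule out every oddball format, so this is a plausibility check and not a proof:

```python
import struct

# 1.5 as an IEEE-754 double is 0x3FF8000000000000: sign 0,
# biased exponent 0x3FF, and only the top mantissa bit set.
IEEE_BIG_ENDIAN = bytes.fromhex("3ff8000000000000")

def detect_double_format():
    """Guess the native double format from the bytes of 1.5."""
    native = struct.pack("@d", 1.5)   # bytes as the C double stores them
    if native == IEEE_BIG_ENDIAN:
        return "IEEE-754, big-endian"
    if native == IEEE_BIG_ENDIAN[::-1]:
        return "IEEE-754, little-endian"
    # Anything else (non-754 or mixed-endian) is reported as unknown.
    return "unknown"
```

When the result is "unknown", the existing pack routines that build a 754 representation arithmetically would remain the fallback.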

>>> [2] Exaggeration, I realize -- but how many non 754 systems are out
>>>    there?  How many will see Python 2.5?

>> No idea here.  The existing pack routines strive to do a good job of
>> _creating_ an IEEE-754-format representation regardless of platform
>> representation.  I assume that code would still be present, so
>> "oddball" platforms would be left no worse off than they are now.

> Well, yes, given the above.  The text this footnote was attached to
> was asking if just assuming 754 float formats would inconvenience
> anyone.

I think I'm still missing your intent here.  If you're asking whether
Python can blindly assume that 754 is in use, I'd say that's
undesirable but defensible if necessary.