[Python-Dev] Re: marshal / unmarshal

Tim Peters tim.peters at gmail.com
Mon Apr 11 17:27:43 CEST 2005


[Tim]
>> The 754 standard doesn't say anything about how the difference between
>> signaling and quiet NaNs is represented.  So it's possible that a qNaN
>> on one box would "look like" an sNaN on a different box, and vice
>> versa.  But since most people run with all FPU traps disabled, and
>> Python doesn't expose a way to read the FPU status flags, they
>> couldn't tell the difference.

[mwh]
> OK.  Do you have any intuition as to whether 754 implementations
> actually *do* differ on this point?

Not anymore -- hasn't been part of my job, or a hobby, for over a
decade.  There were differences a decade+ ago.  All NaNs have all
exponent bits set, and at least one mantissa bit set, and every bit
pattern of that form represents a NaN.  That's all the standard says. 
The most popular way to distinguish quiet from signaling NaNs keyed
off the most-significant mantissa bit:  set for a qNaN, clear for an
sNaN.  It's possible that all 754 HW does that now.

There's at least still that Pentium hardware adds a third not-a-number
possibility: in addition to 754's quiet and signaling NaNs, it also
has "indeterminate" values.  Here w/ native Windows Python 2.4 on a
Pentium:

>>> inf = 1e300 * 1e300
>>> inf - inf   # indeterminate
-1.#IND
>>> - _  # but the negation of IND is a quiet NaN
1.#QNAN
>>>

Do the same thing under Cygwin Python on the same box and it prints "NaN" twice.

Do people care about this?  I don't know.  It seems unlikely -- in
effect, IND just gives a special string name to a single one of the
many bit patterns that represent a quiet NaN.  OTOH, Pentium hardware
still preserves this distinction, and MS library docs do too.  IND
isn't part of the 754 standard (although, IIRC, it was part of a
pre-standard draft, which Intel implemented and is now stuck with).

>> Copying bytes works perfectly for all other cases (signed zeroes,
>> non-zero finites, infinities), because their representations are
>> wholly defined, although it's possible that a subnormal on one box
>> will be treated like a zero (with the same sign) on a
>> partially-conforming box.

> I'd find struggling to care about that pretty hard.

Me too.

>>> The question, of course, is how to tell.

>> Store a few small doubles at module initialization time and stare at

> ./configure time, surely?

Unsure.  Not all Python platforms _have_ "./configure time".  Module
initialization code is harder to screw up for that reason (the code is
in an obvious place then, self-contained, and doesn't require any
relevant knowledge of any platform porter unless/until it breaks).

>> their bits.  That's enough to settle whether a 754 format is in use,
>> and, if it is, whether it's big-endian or little-endian.

> Do you have a pointer to code that does this?

No.  Pemberton's enquire.c contains enough code to do it.  Given how
few distinct architectures still exist, it's probably enough to store
just double x = 1.5 and stare at it.

>>> [2] Exaggeration, I realize -- but how many non 754 systems are out
>>>    there?  How many will see Python 2.5?

>> No idea here.  The existing pack routines strive to do a good job of
>> _creating_ an IEEE-754-format representation regardless of platform
>> representation.  I assume that code would still be present, so
>> "oddball" platforms would be left no worse off than they are now.
 
> Well, yes, given the above.  The text this footnote was attached to
> was asking if just assuming 754 float formats would inconvenience
> anyone.

I think I'm still missing your intent here.  If you're asking whether
Python can blindly assume that 745 is in use, I'd say that's
undesirable but defensible if necessary.


More information about the Python-Dev mailing list