pickle broken: can't handle NaN or Infinity under win32

Wed Jun 22 20:46:49 EDT 2005

[with the start of US summer comes the start of 754 ranting season]

[Grant Edwards]
>>>> Negative 0 isn't a NaN, it's just negative 0.

[Scott David Daniels]
>>> Right, but it is hard to construct in standard C.

[Paul Rubin]
>> Huh?  It's just a hex constant.

[Scott David Daniels]
> Well, -0.0 doesn't work,

C89 doesn't define the result of that, but "most" C compilers these
days will create a negative 0.

> and (double)0x80000000 doesn't work,

In part because that's an integer <wink>, and in part because it's
only 32 bits.  It requires representation casting tricks (not
conversion casting tricks like the above), knowledge of the platform
endianness, and knowledge of the platform integer sizes.  Assuming the
platform uses 754 bit layout to begin with, of course.
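
As a rough sketch of the representation-casting idea, here it is in Python
rather than C. It assumes the platform's doubles really are 754 doubles.
The helper name make_negative_zero is made up for illustration, and the
explicit big-endian format codes sidestep the endianness question:

```python
import struct

def make_negative_zero():
    """Build -0.0 from its bit pattern (assumes IEEE 754 doubles)."""
    # Only the sign bit (bit 63) is set; exponent and fraction are all zero.
    bits = 0x8000000000000000
    # '>Q' packs an unsigned 64-bit int big-endian; '>d' reads the same
    # 8 bytes back as a big-endian double.
    return struct.unpack('>d', struct.pack('>Q', bits))[0]

mz = make_negative_zero()
print(repr(mz))
```

This reinterprets the bytes instead of converting the integer's value,
which is the difference between a representation cast and a conversion
cast.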

> and.... I think you have to use quirks of a compiler to create
> it.

You at least need platform knowledge.  It's really not hard, if you
can assume enough about the platform.

>  And I don't know how to test for it either, x < 0.0 is
> not necessarily true for negative 0.

If it's a 754-conforming C compiler, that's necessarily false (+0 and
-0 compare equal in 754).  Picking the bits apart is again the closest
thing to a portable test.  Across platforms with a 754-conforming
libm, the most portable way is to use atan2(!):

>>> pz = 0.0
>>> mz = -pz
>>> from math import atan2
>>> atan2(pz, pz)
0.0
>>> atan2(mz, mz)
-3.1415926535897931
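
Wrapped up as functions, the two tests mentioned above might look like
the sketch below. The names is_negative_zero_atan2 and
is_negative_zero_bits are invented here. The first assumes a
754-conforming libm, the second assumes 754 doubles:

```python
import struct
from math import atan2

def is_negative_zero_atan2(x):
    # x == 0.0 is true for both zeros; with a 754-conforming libm,
    # atan2(x, x) is -pi for -0.0 and 0.0 for +0.0.
    return x == 0.0 and atan2(x, x) < 0.0

def is_negative_zero_bits(x):
    # Pick the bits apart: -0.0 is the sign bit followed by all zeros.
    return struct.pack('>d', x) == b'\x80' + b'\x00' * 7

pz = 0.0
mz = -pz
print(is_negative_zero_atan2(pz), is_negative_zero_atan2(mz))
print(is_negative_zero_bits(pz), is_negative_zero_bits(mz))
```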

It's tempting to divide into 1, then check the sign of the infinity,
but Python stops you from doing that:

>>> 1/pz
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: float division

That can't be done at the C level either, because _some_ people run
Python with their 754 HW floating-point zero-division, overflow, and
invalid operation traps enabled, and then anything like division by 0
causes the interpreter to die.  The CPython implementation is
constrained that way.

Note that Python already has Py_IS_NAN and Py_IS_INFINITY macros in
pyport.h, and the Windows build maps them to appropriate
Microsoft-specific library functions.  I think it's stuck waiting on
others to care enough to supply them for other platforms.  If a
platform build doesn't #define them, a reasonable but cheap attempt is
made to supply "portable" code sequences for them, but, as the
pyport.h comments note, they're guaranteed to do wrong things in some
cases, and may not work at all on some platforms.  For example, the
default

#define Py_IS_NAN(X) ((X) != (X))

is guaranteed never to return true under MSVC 6.0.
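
For comparison, the same self-inequality trick can be sketched at the
Python level. It assumes a platform whose compiler and FPU give 754
comparison semantics, which is exactly the assumption that fails above,
and that float('nan') is accepted, which has not been true of every
Python build. The name is_nan is made up for the sketch:

```python
def is_nan(x):
    # Under 754 comparison rules a NaN is the only value that compares
    # unequal to itself; on a platform that breaks those rules this
    # returns the wrong answer, just like the C macro.
    return x != x

nan = float('nan')   # assumption: the platform's float() parses 'nan'
print(is_nan(nan), is_nan(0.0), is_nan(1e300))
```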

> I am not trying to say there is no way to do this.  I am
> trying to say it takes thought and effort on every detail,
> in the definition, implementations, and unit tests.

It's par for the course -- everyone thinks "this must be easy" at
first, and everyone who persists eventually gives up.  Kudos to
Michael Hudson for persisting long enough to make major improvements
here in pickle, struct and marshal for Python 2.5!