[Python-Dev] Re: marshal / unmarshal

Mon Apr 11 22:08:12 CEST 2005

I've just submitted http://python.org/sf/1180995 which adds format
codes for binary marshalling of floats if version > 1, but it doesn't
quite have the effect I expected (see below):

>>> inf = 1e308*1e308
>>> nan = inf/inf
>>> marshal.dumps(nan, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: unmarshallable object

frexp(nan, &e), it turns out, returns nan, which results in this (to
be expected if you read _PyFloat_Pack8 and know that I'm using a
new-ish GCC -- it might be different for MSVC 6).
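
For illustration, the same failure mode can be reproduced from Python with math.frexp. This is a minimal sketch, not the logic of _PyFloat_Pack8 itself: the helper name frexp_in_range is hypothetical, and it only assumes that a portable packer expects a mantissa of zero or one in [0.5, 1) before it builds the bit pattern.

```python
import math


def frexp_in_range(x):
    """Return True if frexp(x) yields a mantissa a packer could normalise.

    Hypothetical helper for illustration only; it mirrors the kind of
    range check a frexp-based packing routine has to make.
    """
    mantissa, _exponent = math.frexp(x)
    # Finite nonzero values give 0.5 <= |mantissa| < 1; zero gives 0.0.
    return mantissa == 0.0 or 0.5 <= abs(mantissa) < 1.0


inf = float("inf")
nan = inf - inf  # a NaN produced arithmetically, as in the session above

for value in (1.5, 0.0, inf, nan):
    print(repr(value), math.frexp(value), frexp_in_range(value))
```

Infinities and NaNs fall outside the range, so a packer that relies on frexp alone has to special-case them or report an error.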

Also (this is the same thing, really):

>>> struct.pack('>d', inf)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: frexp() result out of range

Although I was a little surprised by this:

>>> struct.pack('d', inf)
'\x7f\xf0\x00\x00\x00\x00\x00\x00'

(this is a big-endian system).  Again, reading the source explains the
behaviour.
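
A rough way to see the difference between the two code paths: native 'd' copies the bytes of the C double, while '>d' goes through the portable packing routine. The sketch below assumes a current CPython 3, where the portable path handles infinities and NaNs, so it will not reproduce the SystemError from the 2005 build shown above. The function name is hypothetical.

```python
import math
import struct

inf = float("inf")
nan = inf - inf


def pack_double_big_endian(x):
    """Pack x as an 8-byte big-endian IEEE 754 double (standard format)."""
    return struct.pack(">d", x)


packed_inf = pack_double_big_endian(inf)
print(packed_inf.hex())  # same bit pattern as the native result above

# Round-trip both special values through the standard format.
unpacked_inf = struct.unpack(">d", packed_inf)[0]
unpacked_nan = struct.unpack(">d", pack_double_big_endian(nan))[0]
print(unpacked_inf, unpacked_nan)
```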

Tim Peters <tim.peters at gmail.com> writes:

> ...
>
> [mwh]
>> OK, so the worst that could happen here is that moving marshal data
>> from one box to another could turn one sort of NaN into another?
>
> Right.  Assuming source and destination boxes both use 754 format, and
> the implementation adjusts endianness if necessary.

Well, I was assuming marshal would do floats little-endian-wise, as it
does for integers.
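
A minimal sketch of what "little-endian-wise" looks like in practice, assuming a current CPython in which marshal version 2 writes floats in binary form. The helper trailing_bytes is hypothetical, and the assumption that the payload sits in the final bytes of the dump is specific to these single-object cases.

```python
import marshal
import struct


def trailing_bytes(obj, count, version=2):
    """Return the last `count` bytes of marshal.dumps(obj, version).

    Hypothetical helper: for a single int or float the payload is
    assumed to follow the type code at the end of the dump.
    """
    return marshal.dumps(obj, version)[-count:]


int_payload = trailing_bytes(1, 4)
float_payload = trailing_bytes(1.5, 8)

print(int_payload.hex(), struct.pack("<i", 1).hex())
print(float_payload.hex(), struct.pack("<d", 1.5).hex())

# The dump is expected to load back to the same value on any IEEE 754 box.
round_tripped = marshal.loads(marshal.dumps(1.5, 2))
```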

> Heh.  I have a vague half-memory of _some_ box that stored the two
> 4-byte "words" in an IEEE double in one order, but the bytes within
> each word in the opposite order.  It's always something ...

I recall stories of machines that stored the bytes of a long in some
crazy order like that.  I think Python would already be broken on such
a system, but, also, don't care.

>>>>> Store a few small doubles at module initialization time and stare at
>
>>>> ./configure time, surely?
>
>>> Unsure.  Not all Python platforms _have_ "./configure time".
>  
>> But they all have pyconfig.h.
>
> Yes, and then a platform porter has to understand what to
> #define/#undefine, and why.  People doing cross-compilation may have
> an especially confusing time of it.

Well, they can always not #define HAVE_IEEE_DOUBLES and not suffer all
that much (this is what I meant by false negatives below).

> Module initialization code "just works", so I certainly understand
> why it doesn't appeal to the Unix frame of mind <wink>.

It just strikes me as silly to test at runtime something that is so
obviously not going to change between invocations.  But it's not a big
deal either way.

> ...
>
>> Something along these lines:
>>
>> double x = 1.5;
>> is_big_endian_ieee_double = sizeof(double) == 8 && \
>>       memcmp((char*)&x, "\077\370\000\000\000\000\000\000", 8) == 0;
>
> Right, it's that easy

Cool.

> -- at least under MSVC and gcc.

Huh?  Now it's my turn to be confused (for starters, under MSVC ieee
doubles really can be assumed...).
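
The runtime check sketched in the quoted C can be approximated from Python by looking at the native bytes of 1.5. This is a sketch under assumptions, not the code from the patch: detect_double_format and the strings it returns are hypothetical, and it treats anything other than the two plain byte orders (such as the mixed word order mentioned earlier) as unknown.

```python
import struct

# 1.5 as a big-endian IEEE 754 double: the octal bytes \077\370 then zeros.
BIG_ENDIAN_1_5 = bytes.fromhex("3ff8000000000000")


def detect_double_format():
    """Guess the platform's C double layout from the native bytes of 1.5."""
    if struct.calcsize("d") != 8:
        return "unknown"
    raw = struct.pack("d", 1.5)  # native mode copies the C double's bytes
    if raw == BIG_ENDIAN_1_5:
        return "IEEE, big-endian"
    if raw == BIG_ENDIAN_1_5[::-1]:
        return "IEEE, little-endian"
    return "unknown"


print(detect_double_format())
```

A check like this gives at worst a false negative: an unrecognised layout is reported as unknown, and the caller can fall back to the slower portable path.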

Cheers,
mwh 

-- 
  You sound surprised.  We're talking about a government department
  here - they have procedures, not intelligence.
                                            -- Ben Hutchings, cam.misc