[Python-Dev] Re: marshal / unmarshal

Michael Hudson mwh at python.net
Tue Apr 12 09:39:08 CEST 2005


My mail is experiencing random delays of up to a few hours at the
moment.  I wrote this before I saw your comments on my patch.

Tim Peters <tim.peters at gmail.com> writes:

> [Michael Hudson]
>> I've just submitted http://python.org/sf/1180995 which adds format
>> codes for binary marshalling of floats if version > 1, but it doesn't
>> quite have the effect I expected (see below):
>
>> >>> inf = 1e308*1e308
>> >>> nan = inf/inf
>> >>> marshal.dumps(nan, 2)
>> Traceback (most recent call last):
>>  File "<stdin>", line 1, in ?
>> ValueError: unmarshallable object
>
> I don't understand.  Does "binary marshalling" _not_ mean just copying
> the bytes on a 754 platform?

No, it means using _PyFloat_Pack8/Unpack8, like the patch description
says.  Making those functions just fiddle bytes when they can I regard
as a separate project (watch a patch manager near you, though).
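
For readers without the patch to hand, a minimal sketch of what
little-endian binary marshalling of a float amounts to.  It uses the
struct module's "<d" format as a stand-in for _PyFloat_Pack8/Unpack8;
the helper names are hypothetical and this is not the patch's code.

```python
import struct


def pack_double_le(x):
    """Return x as 8 bytes in little-endian IEEE 754 double format.

    Stand-in for _PyFloat_Pack8(x, p, 1); hypothetical helper name.
    """
    return struct.pack("<d", x)


def unpack_double_le(data):
    """Inverse of pack_double_le: decode 8 little-endian bytes to a float."""
    if len(data) != 8:
        raise ValueError("expected exactly 8 bytes")
    return struct.unpack("<d", data)[0]


if __name__ == "__main__":
    payload = pack_double_le(0.1)
    print(payload.hex(), unpack_double_le(payload))
```

The "<" prefix fixes the byte order regardless of the host, which is the
property the marshal format needs.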

> If so, that won't work.

I can tell! <wink>

>>> Right.  Assuming source and destination boxes both use 754 format, and
>>> the implementation adjusts endianness if necessary.
>
>> Well, I was assuming marshal would do floats little-endian-wise, as it
>> does for integers.
>
> Then on a big-endian 754 system, loads() will have to reverse the
> bytes in the little-endian marshal bytestring, and dumps() likewise. 

Really?  Even I had worked this out...
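
The byte reversal being discussed can be sketched as follows, assuming a
"just copy the bytes" implementation on a 754 platform.  The function
name is hypothetical; sys.byteorder stands in for whatever compile-time
check the C code would use.

```python
import struct
import sys


def native_double_to_marshal_order(native_bytes):
    """Convert the 8 native-order bytes of a double to little-endian order.

    On a little-endian host this is a plain copy; on a big-endian host
    the bytes are reversed.  loads() would apply the same swap in the
    other direction.
    """
    if len(native_bytes) != 8:
        raise ValueError("expected exactly 8 bytes")
    if sys.byteorder == "big":
        return native_bytes[::-1]
    return bytes(native_bytes)


if __name__ == "__main__":
    native = struct.pack("=d", 1.5)  # "=" means native byte order
    print(native_double_to_marshal_order(native).hex())
```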

>>> Heh.  I have a vague half-memory of _some_ box that stored the two
>>> 4-byte "words" in an IEEE double in one order, but the bytes within
>>> each word in the opposite order.  It's always something ...
>
>> I recall stories of machines that stored the bytes of long in some
>> crazy order like that.  I think Python would already be broken on such
>> a system, but, also, don't care.
>
> Python does very little that depends on internal native byte order,
> and C hides it in the absence of casting abuse.  

This surely does:

PyObject *
PyLong_FromLongLong(PY_LONG_LONG ival)
{
        PY_LONG_LONG bytes = ival;
        int one = 1;
        return _PyLong_FromByteArray(
                (unsigned char *)&bytes,
                SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1);
}
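
A rough Python analogue of the byte-order probe this function relies on.
It assumes IS_LITTLE_ENDIAN is a macro that inspects the first byte of
the local `one` (which would explain the otherwise unused variable); the
function name below is hypothetical.

```python
import struct
import sys


def is_little_endian():
    """Probe native byte order by looking at the first byte of the int 1.

    Mirrors the C idiom *(unsigned char *)&one: on a little-endian host
    the least significant byte comes first, so that byte is 1.
    """
    return struct.pack("=i", 1)[0] == 1


if __name__ == "__main__":
    print(is_little_endian(), sys.byteorder)
```

A machine with a mixed byte order would not be described correctly by a
single flag like this, which is the point of the example above.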

It occurs to me that in the IEEE case, special values can be detected
reliably -- by picking the exponent field out by force -- and a
warning emitted or exception raised.  Good idea?  Hard to say, to me.
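
A minimal sketch of picking the exponent field out by force, assuming
IEEE 754 doubles (1 sign bit, 11 exponent bits, 52 fraction bits).  The
function names are hypothetical, and raising ValueError is only one of
the two options floated above.

```python
import struct

EXPONENT_MASK = 0x7FF  # 11 exponent bits, all ones for inf and nan


def is_special_double(x):
    """Return True if x is an infinity or a NaN, judged by exponent bits."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exponent = (bits >> 52) & EXPONENT_MASK
    return exponent == EXPONENT_MASK


def check_marshallable(x):
    """Raise ValueError for special values, otherwise return x unchanged."""
    if is_special_double(x):
        raise ValueError("special float value: %r" % (x,))
    return x


if __name__ == "__main__":
    inf = float("inf")
    nan = inf / inf
    print(is_special_double(1.0), is_special_double(inf), is_special_double(nan))
```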

Cheers,
mwh

Oh, by the way: http://python.org/sf/1181301

-- 
  It is time-consuming to produce high-quality software. However,
  that should not alone be a reason to give up the high standards
  of Python development.              -- Martin von Loewis, python-dev

