[Python-Dev] Re: marshal / unmarshal

Michael Hudson mwh at python.net
Tue Apr 12 17:32:17 CEST 2005


Tim Peters <tim.peters at gmail.com> writes:

> ...
>
> [mwh]
>>>> I recall stories of machines that stored the bytes of long in some
>>>> crazy order like that.  I think Python would already be broken on such
>>>> a system, but, also, don't care.
>
> [Tim]
>>> Python does very little that depends on internal native byte order,
>>> and C hides it in the absence of casting abuse.
>
> [mwh]
>> This surely does:
>>
>> PyObject *
>> PyLong_FromLongLong(PY_LONG_LONG ival)
>> {
>>        PY_LONG_LONG bytes = ival;
>>        int one = 1;
>>        return _PyLong_FromByteArray(
>>                (unsigned char *)&bytes,
>>                               SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1);
>> }
>
> Yes, that's "casting abuse".  Python does very little of that.  If it
> becomes necessary, it's straightforward but long-winded to rewrite the
> above in wholly portable C (peel the bytes out of ival,
> least-significant first, via shifting and masking 8 times; "ival &
> 0xff" is the least-significant byte regardless of memory storage
> order; etc).

Not arguing with that.
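
As a minimal sketch of the portable approach Tim describes, the
following peels bytes out of an integer with shifts and masks only.
The helper name long_long_to_le_bytes is hypothetical and not part of
CPython, and plain "long long" stands in for PY_LONG_LONG.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not CPython API): write the low `n` bytes of
   `value` into `out`, least-significant byte first.  Only arithmetic on
   the value is used, so the result does not depend on how the machine
   lays the integer out in memory. */
static void
long_long_to_le_bytes(long long value, unsigned char *out, size_t n)
{
    /* Converting to unsigned is well defined (modulo 2**N), which gives
       the two's-complement bit pattern for negative values. */
    unsigned long long bits = (unsigned long long)value;
    size_t i;

    assert(out != NULL);
    assert(n <= sizeof(long long));

    for (i = 0; i < n; i++) {
        out[i] = (unsigned char)(bits & 0xff);  /* least-significant byte */
        bits >>= 8;                             /* expose the next byte */
    }
}
```

No pointer cast is involved, so neither the byte order of the host nor
an IS_LITTLE_ENDIAN-style probe matters here.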

> BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and
> more deeply than does the visible cast there.

I'd like to claim that was part of my point :)

There is a certain, small level of assumption in Python that
"big-endian or little-endian" is the only question to ask -- and I
don't think that's a problem!

Even so, this isn't a big deal: at least if we choose a more
interesting 'probe value' than 1.5, it will just lead to an oddball
box degrading to the non-IEEE code.
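
To make the probe idea concrete, here is a rough sketch of format
detection by comparing the bytes of a known double against the two
standard IEEE 754 layouts.  The probe value, the enum, and the function
names are illustrative assumptions, not taken from the patch.  1.5 is a
weak probe because six of its eight bytes are zero, so some unusual
byte orders would look identical to a standard one; the value below has
eight distinct bytes.

```c
#include <assert.h>
#include <string.h>

typedef enum {
    DOUBLE_FORMAT_UNKNOWN,
    DOUBLE_FORMAT_IEEE_BIG_ENDIAN,
    DOUBLE_FORMAT_IEEE_LITTLE_ENDIAN
} double_format;

/* 9006104071832581.0 is 0x433FFF0102030405 as an IEEE 754 double.
   (Illustrative probe value; any value with distinct bytes would do.) */
static const unsigned char probe_be[8] =
    {0x43, 0x3f, 0xff, 0x01, 0x02, 0x03, 0x04, 0x05};
static const unsigned char probe_le[8] =
    {0x05, 0x04, 0x03, 0x02, 0x01, 0xff, 0x3f, 0x43};

/* Classify the in-memory bytes of the probe value. */
static double_format
classify_probe_bytes(const unsigned char *bytes)
{
    assert(bytes != NULL);
    if (memcmp(bytes, probe_be, 8) == 0)
        return DOUBLE_FORMAT_IEEE_BIG_ENDIAN;
    if (memcmp(bytes, probe_le, 8) == 0)
        return DOUBLE_FORMAT_IEEE_LITTLE_ENDIAN;
    /* Anything else, including mixed byte orders, is "unknown". */
    return DOUBLE_FORMAT_UNKNOWN;
}

/* Inspect how this machine actually stores the probe. */
static double_format
detect_double_format(void)
{
    double probe = 9006104071832581.0;
    unsigned char bytes[sizeof(double)];

    if (sizeof(double) != 8)
        return DOUBLE_FORMAT_UNKNOWN;
    memcpy(bytes, &probe, sizeof(double));
    return classify_probe_bytes(bytes);
}
```

A box that stores doubles in some other order simply gets
DOUBLE_FORMAT_UNKNOWN and would take the slower, format-independent
path.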

>> It occurs that in the IEEE case, special values can be detected with
>> reliability -- by picking the exponent field out by force
>
> Right, that works for NaNs and infinities; signed zeroes are a bit
> trickier to detect.

Hmm.  Don't think they're such a big deal.

>> -- and a warning emitted or exception raised.  Good idea?  Hard to
>> say, to me.
>
> It's not possible to _create_ a NaN or infinity from finite operands
> in 754 without signaling some exceptional condition.  Once you have
> one, though, there's generally nothing exceptional about _using_ it. 
> Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. 
> Using a quiet NaN never signals; using a signaling NaN almost always
> signals.
>
> So packing a nan or inf shouldn't complain.  On a 754 box, unpacking
> one shouldn't complain either.  Unpacking a nan or inf on a non-754
> box probably should complain, since there's in general nothing it can
> be unpacked _to_ that makes any sense ("errors should never pass
> silently").

This sounds like good behaviour to me.  I'll try to update the patch
soon.

Cheers,
mwh

-- 
  BUGS   Never use this function.  This function modifies its first
         argument.   The  identity  of  the delimiting character is
         lost.  This function cannot be used on constant strings.
                                    -- the glibc manpage for strtok(3)

