[Python-Dev] Re: marshal / unmarshal

Michael Hudson mwh at python.net
Tue Apr 12 17:32:17 CEST 2005

Tim Peters <tim.peters at gmail.com> writes:

> ...
> [mwh]
>>>> I recall stories of machines that stored the bytes of long in some
>>>> crazy order like that.  I think Python would already be broken on such
>>>> a system, but, also, don't care.
> [Tim]
>>> Python does very little that depends on internal native byte order,
>>> and C hides it in the absence of casting abuse.
> [mwh]
>> This surely does:
>> PyObject *
>> PyLong_FromLongLong(PY_LONG_LONG ival)
>> {
>>        PY_LONG_LONG bytes = ival;
>>        int one = 1;
>>        return _PyLong_FromByteArray(
>>                (unsigned char *)&bytes,
>>                               SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1);
>> }
> Yes, that's "casting abuse".  Python does very little of that.  If it
> becomes necessary, it's straightforward but long-winded to rewrite the
> above in wholly portable C (peel the bytes out of ival,
> least-significant first, via shifting and masking 8 times; "ival &
> 0xff" is the least-significant byte regardless of memory storage
> order; etc).

Not arguing with that.
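
For concreteness, the shift-and-mask approach described above might look
roughly like the sketch below.  The helper name peel_bytes_le is
hypothetical (it is not a CPython function), and the sketch assumes the
value has already been converted to unsigned long long, a conversion
that is well defined in C for negative inputs too.

```c
#include <stddef.h>

/* Hypothetical helper, not part of CPython: store the low n bytes of
   ival into out, least-significant byte first, using only shifts and
   masks.  No pointer cast is involved, so the result does not depend
   on how the machine lays the integer out in memory. */
static void
peel_bytes_le(unsigned long long ival, unsigned char *out, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        /* ival & 0xff is the least-significant byte on any machine */
        out[i] = (unsigned char)(ival & 0xff);
        ival >>= 8;
    }
}
```

A caller holding a signed value would cast it first, e.g.
peel_bytes_le((unsigned long long)ival, buf, sizeof(ival)), and could
then hand buf to a byte-array consumer with the "little-endian" flag
hard-coded to true instead of probing the platform.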

> BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and
> more deeply than does the visible cast there.

I'd like to claim that was part of my point :)

There is a certain, small level of assumption in Python that
"big-endian or little-endian" is the only question to ask -- and I
don't think that's a problem!

Even this isn't a big deal: at least if we choose a more
interesting 'probe value' than 1.5, it will just lead to an oddball
box degrading to the non-ieee code.
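
As an illustration of what a more interesting probe buys, here is a
minimal sketch, not the patch itself.  The enum and function names are
hypothetical, and the probe constant is just one convenient choice:
9006104071832581.0 encodes as 43 3F FF 01 02 03 04 05 in big-endian
IEEE double format, so all eight bytes differ.  By contrast, 1.5 encodes
as 3F F8 followed by six zero bytes, so it cannot tell apart byte
orders that only shuffle the zero bytes.

```c
#include <string.h>

/* Hypothetical names for this sketch. */
typedef enum {
    DOUBLE_FORMAT_UNKNOWN,
    DOUBLE_FORMAT_IEEE_BIG_ENDIAN,
    DOUBLE_FORMAT_IEEE_LITTLE_ENDIAN
} double_format;

static double_format
detect_double_format(void)
{
    /* Probe whose IEEE encoding has eight distinct bytes. */
    double probe = 9006104071832581.0;
    static const unsigned char big[8] =
        {0x43, 0x3f, 0xff, 0x01, 0x02, 0x03, 0x04, 0x05};
    static const unsigned char little[8] =
        {0x05, 0x04, 0x03, 0x02, 0x01, 0xff, 0x3f, 0x43};

    if (sizeof(double) != 8)
        return DOUBLE_FORMAT_UNKNOWN;
    if (memcmp(&probe, big, 8) == 0)
        return DOUBLE_FORMAT_IEEE_BIG_ENDIAN;
    if (memcmp(&probe, little, 8) == 0)
        return DOUBLE_FORMAT_IEEE_LITTLE_ENDIAN;
    /* Anything else (mixed-endian, non-IEEE) takes the slow path. */
    return DOUBLE_FORMAT_UNKNOWN;
}
```

An oddball box simply gets DOUBLE_FORMAT_UNKNOWN and falls back to the
non-ieee code, which is the degradation described above.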

>> It occurs that in the IEEE case, special values can be detected with
>> reliability -- by picking the exponent field out by force
> Right, that works for NaNs and infinities; signed zeroes are a bit
> trickier to detect.

Hmm.  Don't think they're such a big deal.
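
A rough sketch of "picking the exponent field out by force", assuming
the input is the 8-byte big-endian IEEE double encoding (the packed
form, not a native double).  The type and function names are
hypothetical.  In that layout the 11 exponent bits are the low 7 bits of
byte 0 and the high 4 bits of byte 1; an all-ones exponent means
infinity if the remaining 52 bits are zero, and NaN otherwise.

```c
/* Hypothetical names for this sketch. */
typedef enum { PACKED_FINITE, PACKED_INF, PACKED_NAN } packed_class;

/* p points at 8 bytes holding a big-endian IEEE 754 double. */
static packed_class
classify_packed_double(const unsigned char *p)
{
    unsigned int exponent = ((p[0] & 0x7fu) << 4) | (p[1] >> 4);
    int i;

    if (exponent != 0x7ff)
        return PACKED_FINITE;
    /* Exponent is all ones: any nonzero fraction bit means NaN. */
    if (p[1] & 0x0f)
        return PACKED_NAN;
    for (i = 2; i < 8; i++) {
        if (p[i])
            return PACKED_NAN;
    }
    return PACKED_INF;
}

/* Negative zero: sign bit set, every other bit clear. */
static int
is_packed_negative_zero(const unsigned char *p)
{
    int i;

    if (p[0] != 0x80)
        return 0;
    for (i = 1; i < 8; i++) {
        if (p[i])
            return 0;
    }
    return 1;
}
```

Because this works on the byte string rather than on a native double,
it can run on a non-754 box as well, which is where a warning or
exception would be wanted.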

>> -- and a warning emitted or exception raised.  Good idea?  Hard to
>> say, to me.
> It's not possible to _create_ a NaN or infinity from finite operands
> in 754 without signaling some exceptional condition.  Once you have
> one, though, there's generally nothing exceptional about _using_ it. 
> Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally. 
> Using a quiet NaN never signals; using a signaling NaN almost always
> signals.
> So packing a nan or inf shouldn't complain.  On a 754 box, unpacking
> one shouldn't complain either.  Unpacking a nan or inf on a non-754
> box probably should complain, since there's in general nothing it can
> be unpacked _to_ that makes any sense ("errors should never pass
> silently").

This sounds like good behaviour to me.  I'll try to update the patch


  BUGS   Never use this function.  This function modifies its first
         argument.   The  identity  of  the delimiting character is
         lost.  This function cannot be used on constant strings.
                                    -- the glibc manpage for strtok(3)
