[Python-Dev] Re: marshal / unmarshal
Michael Hudson
mwh at python.net
Tue Apr 12 17:32:17 CEST 2005
Tim Peters <tim.peters at gmail.com> writes:
> ...
>
> [mwh]
>>>> I recall stories of machines that stored the bytes of long in some
>>>> crazy order like that. I think Python would already be broken on such
>>>> a system, but, also, don't care.
>
> [Tim]
>>> Python does very little that depends on internal native byte order,
>>> and C hides it in the absence of casting abuse.
>
> [mwh]
>> This surely does:
>>
>> PyObject *
>> PyLong_FromLongLong(PY_LONG_LONG ival)
>> {
>> PY_LONG_LONG bytes = ival;
>> int one = 1;
>> return _PyLong_FromByteArray(
>> (unsigned char *)&bytes,
>> SIZEOF_LONG_LONG, IS_LITTLE_ENDIAN, 1);
>> }
>
> Yes, that's "casting abuse". Python does very little of that. If it
> becomes necessary, it's straightforward but long-winded to rewrite the
> above in wholly portable C (peel the bytes out of ival,
> least-significant first, via shifting and masking 8 times; "ival &
> 0xff" is the least-significant byte regardless of memory storage
> order; etc).
Not arguing with that.
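For illustration, a minimal sketch of the shift-and-mask rewrite Tim describes. The function name ll_to_le_bytes is made up for this example, and plain long long stands in for PY_LONG_LONG; this is not the code in longobject.c.

```c
#include <stddef.h>

/* Hypothetical helper: store the 8 low-order bytes of ival into out[],
   least-significant byte first.  Only shifts and masks on the *value*
   are used, so the result does not depend on how the machine lays out
   integers in memory. */
void
ll_to_le_bytes(long long ival, unsigned char out[8])
{
    /* Converting to unsigned is well defined (reduction modulo 2**N)
       and yields the two's-complement bit pattern for negative values;
       right shifts of unsigned values are fully portable. */
    unsigned long long u = (unsigned long long)ival;
    size_t i;

    for (i = 0; i < 8; i++) {
        out[i] = (unsigned char)(u & 0xff);  /* least-significant byte */
        u >>= 8;                             /* move on to the next one */
    }
}
```

The resulting array could then be handed to _PyLong_FromByteArray with the little-endian flag hard-coded to 1, with no need to consult IS_LITTLE_ENDIAN.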
> BTW, the IS_LITTLE_ENDIAN macro also relies on casting abuse, and
> more deeply than does the visible cast there.
I'd like to claim that was part of my point :)
There is a certain, small level of assumption in Python that
"big-endian or little-endian" is the only question to ask -- and I
don't think that's a problem!
Even in this case it isn't a big deal, at least if we choose a more
interesting 'probe value' than 1.5: it will just lead to an oddball
box degrading to the non-IEEE code.
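A sketch of what such a probe might look like, under stated assumptions. The names (detect_double_format and the enum) are hypothetical, and the probe value is picked here only because each of its eight bytes is distinct; the patch may well choose differently.

```c
#include <string.h>

enum double_format {
    DOUBLE_UNKNOWN,             /* fall back to the non-IEEE code */
    DOUBLE_IEEE_BIG_ENDIAN,
    DOUBLE_IEEE_LITTLE_ENDIAN
};

/* Hypothetical probe: compare the in-memory bytes of a known double
   against its IEEE 754 binary64 encoding in both plain byte orders. */
enum double_format
detect_double_format(void)
{
    /* 9006104071832581.0 == 0x1FFF0102030405 is below 2**53, so it is
       exactly representable; its binary64 encoding, most significant
       byte first, is 43 3f ff 01 02 03 04 05. */
    static const unsigned char big[8] =
        {0x43, 0x3f, 0xff, 0x01, 0x02, 0x03, 0x04, 0x05};
    double probe = 9006104071832581.0;
    unsigned char raw[sizeof(double)];
    int is_little = 1;
    size_t i;

    if (sizeof(double) != 8)
        return DOUBLE_UNKNOWN;
    memcpy(raw, &probe, sizeof raw);  /* inspect bytes without pointer casts */
    if (memcmp(raw, big, 8) == 0)
        return DOUBLE_IEEE_BIG_ENDIAN;
    for (i = 0; i < 8; i++) {
        if (raw[i] != big[7 - i])
            is_little = 0;
    }
    return is_little ? DOUBLE_IEEE_LITTLE_ENDIAN : DOUBLE_UNKNOWN;
}
```

By comparison, 1.5 encodes as 3f f8 followed by six zero bytes, so a byte order that only shuffles the zero bytes would be indistinguishable from a plain big- or little-endian layout. With eight distinct bytes, any other ordering lands in DOUBLE_UNKNOWN.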
>> It occurs that in the IEEE case, special values can be detected with
>> reliability -- by picking the exponent field out by force
>
> Right, that works for NaNs and infinities; signed zeroes are a bit
> trickier to detect.
Hmm. Don't think they're such a big deal.
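To make the exponent-field idea concrete, here is a rough sketch that classifies an 8-byte big-endian IEEE 754 double purely from its bytes. The function and enum names are invented for this example.

```c
enum ieee_class { IEEE_FINITE, IEEE_NEG_ZERO, IEEE_INFINITY, IEEE_NAN };

/* Hypothetical classifier.  p holds a big-endian binary64 value:
   1 sign bit, 11 exponent bits, 52 fraction bits. */
enum ieee_class
classify_ieee_double(const unsigned char p[8])
{
    unsigned int sign = p[0] >> 7;
    unsigned int exponent = ((p[0] & 0x7fu) << 4) | (p[1] >> 4);
    int fraction_is_zero = (p[1] & 0x0f) == 0;
    int i;

    for (i = 2; i < 8; i++) {
        if (p[i] != 0)
            fraction_is_zero = 0;
    }

    /* An all-ones exponent marks the special values. */
    if (exponent == 0x7ff)
        return fraction_is_zero ? IEEE_INFINITY : IEEE_NAN;
    /* Negative zero: only the sign bit is set. */
    if (exponent == 0 && fraction_is_zero && sign)
        return IEEE_NEG_ZERO;
    return IEEE_FINITE;
}
```

On the packed bytes a signed zero is just as easy to spot as a NaN or an infinity. It is trickier on a native double, because -0.0 == 0.0 compares true.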
>> -- and a warning emitted or exception raised. Good idea? Hard to
>> say, to me.
>
> It's not possible to _create_ a NaN or infinity from finite operands
> in 754 without signaling some exceptional condition. Once you have
> one, though, there's generally nothing exceptional about _using_ it.
> Sometimes there is, like +Inf - +Inf or Inf / Inf, but not generally.
> Using a quiet NaN never signals; using a signaling NaN almost always
> signals.
>
> So packing a nan or inf shouldn't complain. On a 754 box, unpacking
> one shouldn't complain either. Unpacking a nan or inf on a non-754
> box probably should complain, since there's in general nothing it can
> be unpacked _to_ that makes any sense ("errors should never pass
> silently").
This sounds like good behaviour to me. I'll try to update the patch
soon.
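As a sketch of the policy only (not the patch itself), the unpacking check could look roughly like this. The function name and the native_is_ieee flag are assumptions made for the example.

```c
/* Hypothetical policy check for unpacking a big-endian IEEE 754 double.
   Returns 0 if unpacking may proceed, -1 if the caller should raise an
   error.  native_is_ieee says whether the platform's own double format
   was recognised as IEEE 754. */
int
check_unpack_allowed(const unsigned char p[8], int native_is_ieee)
{
    unsigned int exponent = ((p[0] & 0x7fu) << 4) | (p[1] >> 4);

    if (exponent != 0x7ff)
        return 0;       /* not a NaN or infinity: no complaint here */
    /* NaN or infinity: fine on a 754 box, but on anything else there is
       nothing sensible to unpack it to, so complain. */
    return native_is_ieee ? 0 : -1;
}
```

Packing needs no such check under this policy, since packing a NaN or infinity shouldn't complain.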
Cheers,
mwh
--
BUGS Never use this function. This function modifies its first
argument. The identity of the delimiting character is
lost. This function cannot be used on constant strings.
-- the glibc manpage for strtok(3)