[Python-Dev] Bad interaction of __index__ and sequence repeat
Nick Coghlan
ncoghlan at gmail.com
Fri Jul 28 17:29:19 CEST 2006
David Hopwood wrote:
> Armin Rigo wrote:
>> Hi,
>>
>> There is an oversight in the design of __index__() that only just
>> surfaced :-( It is responsible for the following behavior, on a 32-bit
>> machine with >= 2GB of RAM:
>>
>> >>> s = 'x' * (2**100) # works!
>> >>> len(s)
>> 2147483647
>>
>> This is because PySequence_Repeat(v, w) works by applying w.__index__ in
>> order to call v->sq_repeat. However, __index__ is defined to clip the
>> result to fit in a Py_ssize_t.
>
> Clipping the result sounds like it would *never* be a good idea. What was
> the rationale for that? It should throw an exception.

A simple demonstration of the clipping behaviour that works on machines with
limited memory:

>>> (2**100).__index__()
2147483647
>>> (-2**100).__index__()
-2147483648

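
For anyone without a 2.5 build handy, the clipping can be modelled in a few
lines of pure Python. This is only a rough sketch, not the C code in
long_index: the name clip_to_ssize_t is made up for illustration, and a
32-bit Py_ssize_t is assumed to match the machine described above.

```python
# Hypothetical pure-Python model of the clipping; NOT the CPython source.
SSIZE_BITS = 32  # assumption: 32-bit Py_ssize_t, as on the machine above
SSIZE_MAX = 2 ** (SSIZE_BITS - 1) - 1
SSIZE_MIN = -(2 ** (SSIZE_BITS - 1))


def clip_to_ssize_t(n):
    """Clamp n into [SSIZE_MIN, SSIZE_MAX] instead of reporting overflow."""
    return max(SSIZE_MIN, min(SSIZE_MAX, n))


print(clip_to_ssize_t(2 ** 100))     # clamps to SSIZE_MAX
print(clip_to_ssize_t(-(2 ** 100)))  # clamps to SSIZE_MIN
print(clip_to_ssize_t(42))           # values in range pass through unchanged
```

Any value outside the range silently becomes one of the two limits, which is
why the caller of sq_repeat never learns that the requested count was 2**100.
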
PEP 357 doesn't even mention the issue, and the comment on long_index in the
code doesn't give a rationale - it just notes that the function clips the result.
Neither the PyNumber_AsIndex nor the __index__ documentation mentions anything
about the possibility of clipping, and there's no test case to verify this
behaviour.

I'm inclined to call it a bug, too, but I've cc'ed Travis to see if he can
shed some light on the question - the implementation of long_index explicitly
suppresses the overflow error generated by _long_as_ssize_t, so the current
behaviour appears to be deliberate.
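
To make the alternative David suggests concrete, here is a rough pure-Python
sketch of a conversion that raises instead of clipping. The names
checked_ssize_t and repeat are hypothetical, a 32-bit Py_ssize_t is assumed,
and this is not a proposed patch, just an illustration of the semantics.

```python
# Hypothetical model of "raise instead of clip"; NOT the CPython source.
SSIZE_BITS = 32  # assumption: 32-bit Py_ssize_t
SSIZE_MAX = 2 ** (SSIZE_BITS - 1) - 1
SSIZE_MIN = -(2 ** (SSIZE_BITS - 1))


def checked_ssize_t(n):
    """Return n unchanged if it fits in Py_ssize_t, else raise OverflowError."""
    if not SSIZE_MIN <= n <= SSIZE_MAX:
        raise OverflowError(
            "value does not fit in a %d-bit Py_ssize_t" % SSIZE_BITS)
    return n


def repeat(seq, count):
    """Stand-in for sequence repeat that refuses an oversized count."""
    return seq * checked_ssize_t(count)


print(repeat('x', 3))
try:
    repeat('x', 2 ** 100)
except OverflowError as exc:
    print("OverflowError:", exc)
```

With this behaviour 'x' * (2**100) would fail up front rather than quietly
building a string of a different length than the one requested.
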
Cheers,
Nick.
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
---------------------------------------------------------------
http://www.boredomandlaziness.org