[Patches] [Patch #103158] Don't do unsafe arithmetic in xrange object

noreply@sourceforge.net noreply@sourceforge.net
Fri, 12 Jan 2001 12:44:58 -0800


Patch #103158 has been updated. 

Project: python
Category: core (C code)
Status: Open
Submitted by: greg_ball
Assigned to : gvanrossum
Summary: Don't do unsafe arithmetic in xrange object

Follow-Ups:

Date: 2001-Jan-12 12:44
By: gvanrossum

Comment:
In theory, there's a use for objects whose length doesn't fit in an int:
for i in xrange(...) is routinely used for loops that expect to break well
before the end but don't want to give an arbitrary upper bound.

But I agree with Tim that this serves no real purpose.

I think that if anything, the xrange() object should be scaled down.  No
repetition, no concationation, no slicing.  the only real use ought to be
"for i in xrange(...)".  But of course, *that* is bound to break somebody's
code somewhere.  So let's not go there.

But let's not complicate it any further.  Prevent breakage, but don't
stretch the functionality beyond what it currently does.

-------------------------------------------------------

Date: 2001-Jan-12 12:35
By: tim_one

Comment:
Please do reject objects whose length doesn't fit into an int.  I'll bet a
dollar that no real code in the world will break as a result.  I feel
confident making that bet, because if someone's code *does* break as a
result, it's worth a dollar to me to see Guido lecture them that their code
was already broken <wink>.

Really, it's very Pythonic to gripe about a problem just as soon as it
crops up.  It's not doing anyone a favor to delay griping until the object
is used:  at that point, it may be very difficult to figure out how the
object *got* damaged.  There's simply no payback in allowing creation of an
unusable xrange object, and, as you say, life is simpler if you don't allow
it.

-------------------------------------------------------

Date: 2001-Jan-12 12:11
By: greg_ball

Comment:
The new version of the patch makes the code a bit less complicated.  (Only
error messages have changed in terms of behaviour; regression test run
under linux and OSF1.)  If it looks ok you can ignore the rest of this
comment...

I originally made totlen an int because it is the return value from
range_length, and the argument to  PyList_New().  The current patch makes
totlen a long, but the ocde doesn't get hugely less complicated.

The code would get simpler if it rejected objects
whose length doesn't fit into a int, but that seemed too easy and there is
a small chance someone's code might break. On the other hand by rejecting
objects whose stop
can't be correctly calculated, I _might_ have broken code already.  Break
just means "now raises exception" though, nothing insidious.


-------------------------------------------------------

Date: 2001-Jan-10 10:28
By: gvanrossum

Comment:
I wonder why you made totlen an int.  This seems to complicate the code
quite a bit.  I'd suggest making it a long.  Then you have to test whether
it's > INT_MAX only in tolist() -- in that case you should raise
MemoryError.

Other than that, it looks decent!

-------------------------------------------------------

Date: 2001-Jan-09 06:31
By: greg_ball

Comment:
This is a replacement for patch 103129,
as suggested by tim_one.

Possibly interesting aspects of the patch:

- xrange objects with overflowing lengths are not gratutiously rejected. 
An error is raised only if the length becomes relevant, under len() or
negative indexing.  Bad lengths are stored as -1. Objects with bad lengths
allow any non-negative integer index.

- On 64 bit platforms, sequence multiplication seems to cause an unsafe
cast from long to int.  This patch does not address that issue.

Without my patch, on i686 Linux:
Python 2.0 (#25, Jan  2 2001, 12:18:07)
[GCC 2.95.3 19991030 (prerelease)] on linux2
Type "copyright", "credits" or "license" for more information.
>>> from sys import maxint
>>> x = xrange(maxint) * 2
>>> len(x)
-2
>>> x[-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: error return without exception set
>>> y = xrange(maxint/2)*3
>>> len(y)
-1073741827
>>> y[-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: error return without exception set                            
   
>>> y.tolist()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: listobject.c:33: bad argument to internal function
>>> m = maxint >> 16
>>> xrange(0,maxint,m)
xrange(0, -2147450883, 32767)
>>> (xrange(1)*maxint)*maxint
xrange(1)
>>>  

With my patch
Python 2.0 (#35, Jan  6 2001, 12:54:41)
[GCC 2.95.3 19991030 (prerelease)] on linux2
Type "copyright", "credits" or "license" for more information.             
   
>>> from sys import maxint
>>> x = xrange(maxint) * 2
>>> len(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: xrange object too long
>>> x[-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: xrange object too long
>>> y = xrange(maxint/2)*3
>>> len(y)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: xrange object too long
>>> y.tolist()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: xrange object too long
>>> m = maxint >> 16
>>> xrange(0,maxint,m)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: integer addition
>>> (xrange(1)*maxint) * maxint
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: integer multiplication
>>>  

Without patch on Digital Unix
Python 2.0 (#5, Jan  6 2001, 15:42:58) [C] on osf1V4
Type "copyright", "credits" or "license" for more information.
>>> from sys import maxint
>>> m=maxint >> 32
>>> x = xrange(maxint)*2
>>> x
(xrange(9223372036854775807) * 2)
>>> len(x)
-2
>>> x[-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
SystemError: error return without exception set
>>> xrange(0,maxint,m)
xrange(0, -9223372034707292163, 2147483647)
>>> xrange(1)*maxint
xrange(0)
>>> 

With patch
Python 2.0 (#7, Jan  6 2001, 15:52:21) [C] on osf1V4
Type "copyright", "credits" or "license" for more information.
>>> from sys import maxint
>>> m=maxint >> 32
>>> x = xrange(maxint)*2
>>> x
(xrange(9223372036854775807) * 2)
>>> len(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: xrange object too long
>>> x[-1]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: xrange object too long
>>> xrange(0,maxint,m)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: integer addition
>>> xrange(1)*maxint
xrange(0)
>>> 

-------------------------------------------------------

-------------------------------------------------------
For more info, visit:

http://sourceforge.net/patch/?func=detailpatch&patch_id=103158&group_id=5470