[ python-Bugs-1003935 ] xrange overflows

SourceForge.net noreply at sourceforge.net
Sun Aug 8 20:09:33 CEST 2004


Bugs item #1003935, was opened at 2004-08-05 15:16
Message generated for change (Comment added) made by hfuru
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1003935&group_id=5470

>Category: Python Interpreter Core
Group: Python 2.3
>Status: Open
>Resolution: None
Priority: 5
Submitted By: Hallvard B Furuseth (hfuru)
Assigned to: Tim Peters (tim_one)
Summary: xrange overflows

Initial Comment:
These restrictions are undocumented both in the
xrange doc string and in the reference manual
(Info node 'XRange Type'):

  >>> xrange(maxint, maxint + 10)
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  OverflowError: long int too large to convert to int

  >>> xrange(-100, maxint)
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  OverflowError: xrange() result has too many items

I hope the overflows below are bugs and not
features.  It works if 3/-3 is replaced with 1/-1:

  >>> xrange(0, maxint, 3)
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  OverflowError: integer addition

  >>> xrange(0, -maxint, -3)
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  OverflowError: integer addition

Python installation:

  Python 2.3.3 (#1, May 25 2004, 20:22:36) 
  [GCC 3.2.3] on sunos5
  Type "help", "copyright", "credits" or "license"
for    more information.
  >>> from sys import maxint
  >>> "%x" % maxint
  '7fffffff'


----------------------------------------------------------------------

>Comment By: Hallvard B Furuseth (hfuru)
Date: 2004-08-08 20:09

Message:
Logged In: YES 
user_id=726647

Here is why repr() is relevant - though the error message
was certainly weird.  With the latest CVS version:

  >>> xrange(maxint-1, maxint, 2)
  xrange(2147483646, -2147483648, 2)

Which is why I suggested to return last+step/abs(step)
as the 2nd argument.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-08-08 09:20

Message:
Logged In: YES 
user_id=31435

Changed range_new() to stop using PyRange_New().  This 
fixes a variety of bogus errors.

Documented that xrange() is intended to be simple and fast, 
and that CPython restricts its arguments, and length of its 
result sequence, to native C longs.

Added some tests that failed before the patch, and repaired 
a test that relied on a bogus OverflowError getting raised.

Doc/lib/libfuncs.tex; new revision: 1.171
Lib/test/test_xrange.py; new revision: 1.2
Objects/rangeobject.c; new revision: 2.53

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2004-08-06 07:10

Message:
Logged In: YES 
user_id=31435

It looks like there are two bugs here.  One is that the "integer 
addition" detail doesn't make sense, since the user isn't doing 
any integer addition here (sorry, no, repr() is irrelevant to 
this).

Second, it shouldn't be complaining in the last two cases at 
alll.  If the numbers truly were out of range, then 
rangeobject.c's range_new() would have raised a "too many 
items" exception.  Note:

>>> from sys import maxint as m
>>> xrange(0, m, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
OverflowError: integer addition
>>> xrange(-m, m, 2)
xrange(-2147483647, 2147483647, 2)
>>>

The second xrange() there contains twice as many items as 
the first one, but doesn't complain.  It's code in PyRange_New
() that's making the bogus complaint, and I can't figure out 
what it thinks it's doing.

The code in get_len_of_range() is correct.

The code in PyRange_New() is both overly permissive (e.g., it 
silently lets "(len - 1) * step" overflow), and overly restrictive 
(e.g, I can't see why it should matter if "last > (PyInt_GetMax
() - step))" -- the only thing in that specific branch that 
*should* matter is whether the integer addition "start + (len -
 1) * step" overflowed (which it isn't checking for correctly, 
even assuming the multiplication didn't overflow).

The obvious fix for xrange() is to speed range_new() by 
throwing away its call to the broken PyRange_New().  
range_new() is already doing a precise job of checking 
for "too big", and already knows everything it needs to 
construct the right rangeobject.

That would leave the PyRange_New() API call with broken 
overflow checking, but it's not called from anywhere else in 
the core.

----------------------------------------------------------------------

Comment By: Hallvard B Furuseth (hfuru)
Date: 2004-08-05 18:42

Message:
Logged In: YES 
user_id=726647

The only reason I can see that xrange(0, maxint, 3)
gives integer overflow is that repr() returns
'xrange(first, last + step, step)',
where last + step would be too large.

I suggest repr() is changed to return
'xrange(first, last + step/abs(step), step)'.

----------------------------------------------------------------------

Comment By: Hallvard B Furuseth (hfuru)
Date: 2004-08-05 17:14

Message:
Logged In: YES 
user_id=726647

> Do you have a real use case for this?

For the 'hopefully bugs' variants, yes:

#1: Loop forever:

  for i in xrange(x, sys.maxint, y)

That's a lot faster than

  i = x
  while True: ... i += y

#2: 'loop until optional upper bound':

  def some_loop(start, end = sys.maxint):
    for i in xrange(start, end, whatever())

> Do any real apps need to loop over more than
> sys.maxint integers?

The answer may be yes nowadays.  Even my old
Solaris can find primes up to maxint/2 in just
2 hours.  That's a loop over maxint/4 integers.
Though the remaining 3/4 come slower:-)

Still, I expect variants of the above code would
be less uncommon, like some_loop(-100).

> It would be ashamed to muckup the high
> performance implementation for something that
> does not arise in practice.

I hope you do not mean xrange(0, maxint, 3).

If you mean xrange(-100, maxint): Maybe xrange
could be split in several types (fast and slower)
and the xrange() operator would return one of
these, similar to how int() now can return long?


----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2004-08-05 17:12

Message:
Logged In: YES 
user_id=21627

The OverflowErrors are definitely intentional, and by
design. xrange() is documented as working the same way as
range(), so any error in the documentation is an error in
the range() documentation.

Reclassifying this as a documentation bug.

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-08-05 15:49

Message:
Logged In: YES 
user_id=80475

Do you have a real use case for this?  Do any real apps need
to loop over more than sys.maxint integers?  It would be
ashamed to muckup the high performance implementation for
something that does not arise in practice.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1003935&group_id=5470


More information about the Python-bugs-list mailing list