[Python-bugs-list] [ python-Bugs-556025 ] list(xrange(1e9)) --> seg fault
noreply@sourceforge.net
noreply@sourceforge.net
Tue, 13 Aug 2002 04:43:58 -0700
Bugs item #556025, was opened at 2002-05-14 08:41
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=556025&group_id=5470
Category: Python Interpreter Core
Group: None
>Status: Closed
Resolution: Accepted
Priority: 5
Submitted By: Raymond Hettinger (rhettinger)
Assigned to: Jason Tishler (jlt63)
Summary: list(xrange(1e9)) --> seg fault
Initial Comment:
>From c.lang.py:
'''
Python 2.2.1 (#2, Apr 21 2002, 22:22:55)
[GCC 2.96 20000731 (Mandrake Linux 8.1 2.96-0.62mdk)]
on linux2
Type "help", "copyright", "credits" or "license" for
more information.
>>> list(xrange(1e9))
Segmentation fault
'''
I've reproduced the same fault on Windows ME using
Py2.2.0 and Py2.3a.
----------------------------------------------------------------------
>Comment By: Jason Tishler (jlt63)
Date: 2002-08-13 03:43
Message:
Logged In: YES
user_id=86216
Committed as Lib/test/test_b1.py 1.51.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-08-12 05:38
Message:
Logged In: YES
user_id=33168
You should probably add that it fails due to a bug in newlib
and not python. Go ahead and check it in and close the bug
report. Thanks.
----------------------------------------------------------------------
Comment By: Jason Tishler (jlt63)
Date: 2002-08-12 03:26
Message:
Logged In: YES
user_id=86216
rhettinger wrote:
> My thought is to close the bug, but add a
> unittest that says in effect: if os is cygwin
> and version(cygwin) >= 1.3.13 and the bug still
> exists, then fail with a message saying that SF
> 556025 was not successfully resolved.
Do we really want to add cruft that is not only
Cygwin specific but Cygwin version specific?
nnorwitz wrote:
> I'm not sure if the test should fail, be
> skipped, or not run for cygwin < 1.3.13.
Agreed.
> But let's at least put a comment in the test and
> close this bug report. Jason, can you do that?
Yes, but I only have pre-approved commit access to
the CVS repository. Can you approve the attached
patch?
> Good persistence Jason!
Thanks.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-08-10 06:27
Message:
Logged In: YES
user_id=33168
I'm not sure if the test should fail, be skipped, or not run
for cygwin < 1.3.13. But let's at least put a comment in
the test and close this bug report. Jason, can you do that?
Good persistence Jason!
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-08-09 19:08
Message:
Logged In: YES
user_id=80475
My thought is to close the bug, but add a unittest that
says in effect: if os is cygwin and version(cygwin) >=
1.3.13 and the bug still exists, then fail with a message
saying that SF 556025 was not successfully resolved.
This way, we can close the bug (since it is not a python
bug) and still get a regression test to raise the concern if
the expected solution either doesn't materialize or
sometime later dematerializes.
----------------------------------------------------------------------
Comment By: Jason Tishler (jlt63)
Date: 2002-08-09 17:02
Message:
Logged In: YES
user_id=86216
I guess that my perseverance paid off:
http://sources.redhat.com/ml/newlib/2002/msg00391.html
Hence, this bug will be fixed in Cygwin 1.3.13.
Can we close this bug now? Or, should we wait until
Cygwin 1.3.13 is released?
----------------------------------------------------------------------
Comment By: Jason Tishler (jlt63)
Date: 2002-08-09 10:09
Message:
Logged In: YES
user_id=86216
Thanks for the sympathy.
I've tried, tried again:
http://sources.redhat.com/ml/newlib/2002/msg00390.html
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-08-08 16:56
Message:
Logged In: YES
user_id=33168
I looked at the links. I don't know what I can do to help.
It seems like you proposed a reasonable solution and even
if it wasn't perfect, you still demonstrated a problem. I
suppose I can only commiserate with you.
----------------------------------------------------------------------
Comment By: Jason Tishler (jlt63)
Date: 2002-08-08 04:11
Message:
Logged In: YES
user_id=86216
> Jason, can you test/replicate this?
Yes, I've already been working on this one. See
the following mailing list threads for the details:
http://cygwin.com/ml/cygwin-developers/2002-07/msg00124.html
http://sources.redhat.com/ml/newlib/2002/msg00369.html
To summarize the above, the problem is actually
in newlib which provides Cygwin's libc (and
libm). Unfortunately, Chris Falyor (the Cygwin
lead developer) has not been able to convince
the newlib maintainer to fix this problem.
Additionally, my first patch has been rejected.
I will continue my efforts to get this problem
resolved. Any assistance will be greatly
appreciated. I never expected to become an
expert in Doug Lea's malloc routines. Sigh...
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-08-07 12:11
Message:
Logged In: YES
user_id=33168
Actually, I think Jason may be more appropriate, since this
is a cygwin problem. Jason, can you test/replicate this?
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-08-07 10:20
Message:
Logged In: YES
user_id=33168
Hmmm, Tim can you reproduce this? I luckily don't have a
windows box. :-)
----------------------------------------------------------------------
Comment By: Steve Holden (holdenweb)
Date: 2002-08-07 09:28
Message:
Logged In: YES
user_id=88157
I hope re-opening this is the right thing to do (I'm new here).
Current CVS fails under Win2000+Cygwin with a
segmentation fault in the updated test_b1.py. Easily
reproduced:
$ ./python.exe
Python 2.3a0 (#1, Aug 7 2002, 12:18:38)
[GCC 2.95.3-5 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more
information.
>>> import sys
>>> list(xrange(sys.maxint/4))
Segmentation fault (core dumped)
This does seem to be size-related, as:
>>> s = sys.maxint/8
>>> list(xrange(s))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
MemoryError
which is as expected in test_b1.py
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-05-22 15:20
Message:
Logged In: YES
user_id=33168
Checked in as:
listobject.c 2.106
test_b1.py 1.46
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-21 04:59
Message:
Logged In: YES
user_id=80475
Good plan! Thx for squashing this bug.
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-05-21 04:52
Message:
Logged In: YES
user_id=33168
How about using sys.maxint / 4? Does that make more sense
than 1e9? This assumption is a little better, that the data
and address sizes are the same. I can add a comment to this
effect.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-20 19:37
Message:
Logged In: YES
user_id=80475
When you load the fix, please commit the regression test
patch also. Thx, Raymond
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-05-20 18:51
Message:
Logged In: YES
user_id=31435
Yup, lookin' better. Python does assume 8-bit bytes in
several places, and also 2's-complement integers. Since
size_t is guaranteed (by C) to be an unsigned type, the
largest value of type size_t is more easily expressed as
(~(size_t)0)
The C part of the patch looks fine then. The test is a
little dubious: who says the machine can't create a
billion-integer list? The idea that 1e9 necessarily
overflows in this context is a 32-bit address-space
assumption. But I'm willing to delay fixing that until a
machine with a usable larger address space appears <wink>.
So marked Accepted and assigned to you for checkin. Thanks!
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-05-20 07:43
Message:
Logged In: YES
user_id=33168
Ok, there were other problems, too:
* Need to divide by the size of the type,
not >> 4 which was completely broken.
* There was a missing PyErr_NoMemory().
I uploaded a new patch.
I'm not sure the size_t fix is correct.
I hope we can at least assume 8-bit machines: :-)
if (_new_size <= ((1 << (sizeof(size_t)*8 - 1)) / sizeof(type)))
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-05-19 20:48
Message:
Logged In: YES
user_id=31435
The code patch has some problems: you can't assume any
relation between a size_t and an unsigned long; C simply
doesn't define how big size_t is, and relative sizes do
vary on 64-bit platforms. However that gets fixed, if you
decide it's "too big", var should be set to NULL (not 0 --
this is a "Guido thing" <wink>), and no exception should be
set. It's the caller's responsibility to check var for
NULL after the macro is invoked, and set an appropriate
exception. listobject.c sometimes doesn't check the result
for NULL, but that should only be when it knows it's
*shrinking* a memory area, so that realloc can't fail.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-17 16:12
Message:
Logged In: YES
user_id=80475
Added a patch to add this bug to the testbank.
Shallow code review: Patch compiles okay (applied to
Py2.3a0). Fixes the original problem. Passes the smell
test. Macro style good (only the "var" operand is used
more than once; no side-effects except setting "var" to
zero upon a resize error). Passes the standard regression
tests. Passes regression testing versus my personal (real
code) testbank.
Will give it a deeper look this week-end.
One other thought: should the ValueError be replaced with a
MemoryError to make the message consistent with PyList_New?
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-05-17 13:03
Message:
Logged In: YES
user_id=33168
Ok, this time I have a patch. The patch only fixes listobject.
I looked over the other uses of PyMem_R{ESIZE,ALLOC}() and
they don't appear to be nearly as problematic as list. For
example, if the grammar has 1e9 nodes, there are going to be
other problems well before then (ie, memory will probably be
gone).
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-05-17 12:25
Message:
Logged In: YES
user_id=33168
Oops, sorry about that last comment. That was something I
was playing with. The CVS version is fine for [x]range(float).
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-05-17 12:22
Message:
Logged In: YES
user_id=33168
Note, this problem is even more generic in CVS version:
>>> range(1.1)
Segmentation fault (core dumped)
>>> xrange(1.1)
Segmentation fault (core dumped)
[x]xrange(float) work fine in 2.2.1.
----------------------------------------------------------------------
Comment By: Raymond Hettinger (rhettinger)
Date: 2002-05-14 16:44
Message:
Logged In: YES
user_id=80475
Would there be some merit to converting PyMem_RESIZE to a
function with an overflow check?
I would guess that the time spent on a realloc dwarfs the
overhead of a short wrapper function.
OTOH, I'll bet this core dump only occurs in toy examples
and never in real code.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-05-14 14:15
Message:
Logged In: YES
user_id=31435
Heh. This is an instance of a general flaw: the
PyMem_RESIZE macro doesn't check for int overflow in its
(n) * sizeof(type)
subexpression. The basic deal is that 1000000000 fits in an
int, but 4 times that (silently) overflows. In more
detail, for this specific test case, listobject.c'.s
roundupsize rounds 1e9 up to 0x40000000, which silently
underflows to 0!() when PyMem_RESIZE multiplies it by 4.
Hard to know how to fix this in general; PyMem_RESIZE
callers don't generally worry about overflow now.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=556025&group_id=5470