Re: [Python-Dev] stack check on Unix: any suggestions?

Does anyone have suggestions for how to detect unbounded recursion in the Python core on Unix platforms?
I just submitted patch 101352, at http://sourceforge.net/patch/?func=detailpatch&patch_id=101352&group_id=5470 This patch works on the realistic assumption that reliable stack usage is not available through getrusage on most systems, so it uses an estimate instead. The upper stack boundary is determined on thread creation; the lower stack boundary inside the check. This must allow for initial stack frames (main, _entry, etc), and for pages allocated by on the stack by the system. At least on Linux, argv and env pages count towards the stack limit. If some systems are known to return good results from getrusage, that should be used instead. I have tested this patch on a Linux box to detect recursion in both the example of bug 112943, as well as the foo() recursion; the latter would crash with stock CVS python only when I reduced the stack limit from 8MB to 1MB. Since the patch uses a heuristic to determine stack exhaustion, it is probably possible to find cases where it does not work. I.e. it might diagnose exhaustion, where it could run somewhat longer (rather, deeper), or it fails to diagnose exhaustion when it is really out of stack. It is also likely that there are better heuristics. Overall, I believe this patch is an improvement. While this patch claims to support all of Unix, it only works where getrlimit(RLIMIT_STACK) works. Unix(tm) does guarantee this API; it should work on *BSD and many other Unices as well. Comments? Martin

"Martin v. Loewis" wrote:
Does anyone have suggestions for how to detect unbounded recursion in the Python core on Unix platforms?
I just submitted patch 101352, at
http://sourceforge.net/patch/?func=detailpatch&patch_id=101352&group_id=5470
This patch works on the realistic assumption that reliable stack usage is not available through getrusage on most systems, so it uses an estimate instead. The upper stack boundary is determined on thread creation; the lower stack boundary inside the check. This must allow for initial stack frames (main, _entry, etc), and for pages allocated by on the stack by the system. At least on Linux, argv and env pages count towards the stack limit.
If some systems are known to return good results from getrusage, that should be used instead.
I have tested this patch on a Linux box to detect recursion in both the example of bug 112943, as well as the foo() recursion; the latter would crash with stock CVS python only when I reduced the stack limit from 8MB to 1MB.
Since the patch uses a heuristic to determine stack exhaustion, it is probably possible to find cases where it does not work. I.e. it might diagnose exhaustion, where it could run somewhat longer (rather, deeper), or it fails to diagnose exhaustion when it is really out of stack. It is also likely that there are better heuristics. Overall, I believe this patch is an improvement.
While this patch claims to support all of Unix, it only works where getrlimit(RLIMIT_STACK) works. Unix(tm) does guarantee this API; it should work on *BSD and many other Unices as well.
Comments?
See my comments in the patch manager... the patch looks fine except for two things: getrlimit() should be tested for usability in the configure script and the call frequency of PyOS_CheckStack() should be lowered to only use it for potentially recursive programs. Apart from that, this looks like the best alternative so far :-) -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

mal wrote:
See my comments in the patch manager... the patch looks fine except for two things: getrlimit() should be tested for usability in the configure script and the call frequency of PyOS_CheckStack() should be lowered to only use it for potentially recursive programs.
the latter would break windows and mac versions of Python, where Python can run on very small stacks (not to mention embedded systems...) for those platforms, CheckStack is designed to work with an 8k safety margin (PYOS_STACK_MARGIN) ::: one way to address this is to introduce a scale factor, so that you can add checks based on the default 8k limit, but auto- magically apply them less often platforms where the safety margin is much larger... /* checkstack, but with a "scale" factor */ #if windows or mac /* default safety margin */ #define PYOS_CHECKSTACK(v, n)\ (((v) % (n) == 0) && PyOS_CheckStack()) #elif linux /* at least 10 times the default safety margin */ #define PYOS_CHECKSTACK(v, n)\ (((v) % ((n)*10) == 0) && PyOS_CheckStack()) #endif if (PYOS_CHECKSTACK(tstate->recursion_depth, 10) ... </F>

Fredrik Lundh wrote:
mal wrote:
See my comments in the patch manager... the patch looks fine except for two things: getrlimit() should be tested for usability in the configure script and the call frequency of PyOS_CheckStack() should be lowered to only use it for potentially recursive programs.
the latter would break windows and mac versions of Python, where Python can run on very small stacks (not to mention embedded systems...)
for those platforms, CheckStack is designed to work with an 8k safety margin (PYOS_STACK_MARGIN)
Ok, I don't mind calling it every ten levels deep, but I'd rather not have it start at level 0. The reason is that many programs probably don't make much use of recursion anyway and have a maximum call depth of around 10-50 levels (Python programs usually using shallow class hierarchies). These programs should not be bothered by calling PyOS_CheckStack() all the time. Recursive programs will easily reach the 100 mark -- those should call PyOS_CheckStack often enough to notice the stack problems. So the check would look something like this: if (tstate->recursion_depth >= 50 && tstate->recursion_depth%10 == 0 && PyOS_CheckStack()) { PyErr_SetString(PyExc_MemoryError, "Stack overflow"); return NULL; }
:::
one way to address this is to introduce a scale factor, so that you can add checks based on the default 8k limit, but auto- magically apply them less often platforms where the safety margin is much larger...
/* checkstack, but with a "scale" factor */ #if windows or mac /* default safety margin */ #define PYOS_CHECKSTACK(v, n)\ (((v) % (n) == 0) && PyOS_CheckStack()) #elif linux /* at least 10 times the default safety margin */ #define PYOS_CHECKSTACK(v, n)\ (((v) % ((n)*10) == 0) && PyOS_CheckStack()) #endif
if (PYOS_CHECKSTACK(tstate->recursion_depth, 10) ...
I'm not exactly sure how large the safety margin is with Martin's patch, but this seems a good idea. -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

So the check would look something like this:
if (tstate->recursion_depth >= 50 && tstate->recursion_depth%10 == 0 && PyOS_CheckStack()) { PyErr_SetString(PyExc_MemoryError, "Stack overflow"); return NULL; }
That sounds like a good solution to me. A recursion depth of 50 should be guaranteed on most systems supported by Python.
I'm not exactly sure how large the safety margin is with Martin's patch, but this seems a good idea.
I chose 3% of the rlimit, which must accomodate the space above the known start of stack plus a single page. That number was chosen arbitarily; on my Linux system, the stack limit is 8MB, so 3% give 200k. Given the maximum limitation of environment pages and argv pages, I felt that this is safe enough. OTOH, if you've used more than 7MB of stack, it is likely that the last 200k won't help, either. Regards, Martin

"Martin v. Loewis" wrote:
So the check would look something like this:
if (tstate->recursion_depth >= 50 && tstate->recursion_depth%10 == 0 && PyOS_CheckStack()) { PyErr_SetString(PyExc_MemoryError, "Stack overflow"); return NULL; }
That sounds like a good solution to me. A recursion depth of 50 should be guaranteed on most systems supported by Python.
Jeremy: Could get at least this patch into 2.0b1 ?
I'm not exactly sure how large the safety margin is with Martin's patch, but this seems a good idea.
I chose 3% of the rlimit, which must accomodate the space above the known start of stack plus a single page. That number was chosen arbitarily; on my Linux system, the stack limit is 8MB, so 3% give 200k. Given the maximum limitation of environment pages and argv pages, I felt that this is safe enough. OTOH, if you've used more than 7MB of stack, it is likely that the last 200k won't help, either.
Looks like I don't have any limits set on my dev-machine... Linux has no problems offering me 3GB of (virtual) stack space even though it only has 64MB real memory and 200MB swap space available ;-) I guess the proposed user settable recursion depth limit is the best way to go. Testing for the right limit is rather easy by doing some trial and error processing using Python. At least for my Linux installation a limit of 9000 seems reasonable. Perhaps everybody on the list could do a quick check on their platform ? Here's a sample script: i = 0 def foo(x): global i print i i = i + 1 foo(x) foo(None) -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

On Thu, Aug 31, 2000 at 02:33:28PM +0200, M.-A. Lemburg wrote:
At least for my Linux installation a limit of 9000 seems reasonable. Perhaps everybody on the list could do a quick check on their platform ?
On BSDI, which has a 2Mbyte default stack limit (but soft limit: users can set it higher even without help from root, and much higher with help) I can go as high as 8k recursions of the simple python-function type, and 5k recursions of one involving a C call (like a recursive __str__()). I don't remember ever seeing a system with less than 2Mbyte stack, except for seriously memory-deprived systems. I do know that the 2Mbyte stacklimit on BSDI is enough to cause 'pine' (sucky but still popular mailprogram) much distress when handling large mailboxes, so we usually set the limit higher anyway. Mutt-forever-ly y'rs, -- Thomas Wouters <thomas@xs4all.net> Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

Thomas Wouters wrote:
On Thu, Aug 31, 2000 at 02:33:28PM +0200, M.-A. Lemburg wrote:
At least for my Linux installation a limit of 9000 seems reasonable. Perhaps everybody on the list could do a quick check on their platform ?
On BSDI, which has a 2Mbyte default stack limit (but soft limit: users can set it higher even without help from root, and much higher with help) I can go as high as 8k recursions of the simple python-function type, and 5k recursions of one involving a C call (like a recursive __str__()).
Ok, this give us a 5000 limit as default... anyone with less ;-) (Note that with the limit being user settable making a lower limit the default shouldn't hurt anyone.) -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

On Thu, Aug 31, 2000 at 03:58:55PM +0200, M.-A. Lemburg wrote:
Thomas Wouters wrote:
On BSDI, which has a 2Mbyte default stack limit (but soft limit: users can set it higher even without help from root, and much higher with help) I can go as high as 8k recursions of the simple python-function type, and 5k recursions of one involving a C call (like a recursive __str__()).
Ok, this give us a 5000 limit as default... anyone with less ;-)
I would suggest going for something a lot less than 5000, tho, to account for 'large' frames. Say, 2000 or so, max.
(Note that with the limit being user settable making a lower limit the default shouldn't hurt anyone.)
Except that it requires yet another step ... ;P It shouldn't hurt anyone if it isn't *too* low. However, I have no clue how 'high' it would have to be for, for instance, Zope, or any of the other 'large' Python apps. -- Thomas Wouters <thomas@xs4all.net> Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

On Thu, Aug 31, 2000 at 02:33:28PM +0200, M.-A. Lemburg wrote:
... At least for my Linux installation a limit of 9000 seems reasonable. Perhaps everybody on the list could do a quick check on their platform ?
Here's a sample script:
i = 0 def foo(x): global i print i i = i + 1 foo(x)
foo(None)
10k iterations on my linux box -g -- Greg Stein, http://www.lyra.org/

"GS" == Greg Stein <gstein@lyra.org> writes:
GS> 10k iterations on my linux box 9143 on mine. I'll note that Emacs has a similar concept, embodied in max-lisp-eval-depth. The documentation for this variable clearly states that its purpose is to avoid infinite recursions that would overflow the C stack and crash Emacs. On my XEmacs 21.1.10, max-lisp-eval-depth is 500. Lisp tends to be more recursive than Python, but it's also possible that there are fewer ways to `hide' lots of C stack between Lisp function calls. So random.choice(range(500, 9143)) seems about right to me <wink>. -Barry

Here's a sample script:
i = 0 def foo(x): global i print i i = i + 1 foo(x)
foo(None)
Please try this again on various platforms with this version: i = 0 class C: def __getattr__(self, name): global i print i i += 1 return self.name # common beginners' mistake C() # This tries to get __init__, triggering the recursion I get 5788 iterations on Red Hat Linux 6.2 (ulimit -c says 8192; I have no idea what units). --Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)

That would be Kb. But -c controls core-file size, not stack. You wanted -s. ulimit -a shows all the limits.
Typo. I did use ulimit -s. ulimit -a confirms that it's 8192 kbytes. --Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)

Guido van Rossum wrote:
Here's a sample script:
i = 0 def foo(x): global i print i i = i + 1 foo(x)
foo(None)
Please try this again on various platforms with this version:
i = 0 class C: def __getattr__(self, name): global i print i i += 1 return self.name # common beginners' mistake
C() # This tries to get __init__, triggering the recursion
I get 5788 iterations on Red Hat Linux 6.2 (ulimit -c says 8192; I have no idea what units).
8192 refers to kB, i.e. 8 MB. I get 6053 on SuSE Linux 6.2 without resource stack limit set. Strange enough if I put the above inside a script, the class isn't instantiated. The recursion only starts when I manually trigger C() in interactive mode or do something like 'print C()'. Is this a bug or a feature ? -- Marc-Andre Lemburg ______________________________________________________________________ Business: http://www.lemburg.com/ Python Pages: http://www.lemburg.com/python/

Please try this again on various platforms with this version:
i = 0 class C: def __getattr__(self, name): global i print i i += 1 return self.name # common beginners' mistake
C() # This tries to get __init__, triggering the recursion
I get 5788 iterations on Red Hat Linux 6.2 (ulimit -c says 8192; I have no idea what units).
8192 refers to kB, i.e. 8 MB.
I get 6053 on SuSE Linux 6.2 without resource stack limit set.
Strange enough if I put the above inside a script, the class isn't instantiated. The recursion only starts when I manually trigger C() in interactive mode or do something like 'print C()'. Is this a bug or a feature ?
Aha. I was wrong -- it's happening in repr(), not during construction. --Guido van Rossum (home page: http://www.pythonlabs.com/~guido/)

Guido van Rossum writes:
Please try this again on various platforms with this version:
i = 0 class C: def __getattr__(self, name): global i print i i += 1 return self.name # common beginners' mistake
C() # This tries to get __init__, triggering the recursion
I get 5788 iterations on Red Hat Linux 6.2 (ulimit -c says 8192; I have no idea what units).
I get a core dump after 4824 iterations on a not-quite-Red-Hat box, with an 8MB stack limit. What about the idea that was suggested to use a sigsegv catcher? Or reading info from /proc (yes, there is a lot of overhead here, but if we do in infrequently enough we might just get away with it. It could be a configure-time option disable by default). I still think there are even more tricks possible here, and we should pursue this after 2.0b1. I volunteer to help work on it ;-)

Guido van Rossum wrote:
Please try this again on various platforms with this version:
i = 0 class C: def __getattr__(self, name): global i print i i += 1 return self.name # common beginners' mistake
C() # This tries to get __init__, triggering the recursion
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Are you sure? Although strange, this is not the case and instantiating C succeeds (try "python rec.py", where rec.py is the above code). A closer look at the code shows that Instance_New goes on calling getattr2 which calls class_lookup, which returns NULL, etc, etc, but the presence of __getattr__ is not checked in this path. -- Vladimir MARANGOZOV | Vladimir.Marangozov@inrialpes.fr http://sirac.inrialpes.fr/~marangoz | tel:(+33-4)76615277 fax:76615252

On Thu, Aug 31, 2000 at 10:58:49AM -0500, Guido van Rossum wrote:
C() # This tries to get __init__, triggering the recursion
I get 5788 iterations on Red Hat Linux 6.2 (ulimit -c says 8192; I have no idea what units).
That's odd... On BSDI, with a 2Mbyte stacklimit (ulimit -s says 2048) I get almost as many recursions: 5136. That's very much not what I would expect... With a stack limit of 8192, I can go as high as 19997 recursions! I wonder why that is... Wait a minute... The Linux SEGV isn't stacksize related at all! Observe: centurion:~ > limit stacksize 8192 centurion:~ > python teststack.py | tail -3 5134 5135 5136 Segmentation fault (core dumped) centurion:~ > limit stacksize 65536 centurion:~ > python teststack.py | tail -3 5134 5135 5136 Segmentation fault (core dumped) centurion:~ > limit stacksize 2048 centurion:~ > python teststack.py | tail -3 5134 5135 5136 Segmentation fault (core dumped) centurion:~ > limit stacksize 128 centurion:~ > python teststack.py | tail -3 Segmentation fault (core dumped) centurion:~ > limit stacksize 1024 centurion:~ > python teststack.py | tail -3 2677 2678 26Segmentation fault (core dumped) centurion:~ > limit stacksize 1500 centurion:~ > python teststack.py | tail -3 3496 3497 349Segmentation fault (core dumped) I don't have time to pursue this, however. I'm trying to get my paid work finished tomorrow, so that I can finish my *real* work over the weekend: augassign docs & some autoconf changes :-) -- Thomas Wouters <thomas@xs4all.net> Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
participants (9)
-
bwarsaw@beopen.com
-
Charles G Waldman
-
Fredrik Lundh
-
Greg Stein
-
Guido van Rossum
-
M.-A. Lemburg
-
Martin v. Loewis
-
Thomas Wouters
-
Vladimir.Marangozov@inrialpes.fr