Trap SIGSEGV and SIGFPE
Hi,

I published a new version of my fault handler: it installs a handler for signals SIGFPE and SIGSEGV. Using it, it's possible to catch them and continue the execution of your Python program. Example:

    try:
        call_evil_code()
    except MemoryError:
        print "A segfault? Haha, I don't care!"
    print "continue the execution"

(yes, it's possible to continue the execution after a segmentation fault!)

Handled errors:
- Segmentation fault:
  * invalid memory read
  * invalid memory write
  * stack overflow (stack pointer outside the stack memory)
- SIGFPE
  * division by zero
  * floating point error?

Such errors may occur in external libraries (written in C)... or Python builtin libraries (eg. imageop).

The handler is now only used in Py_EvalFrameEx(), but it could be used anywhere.

The patch uses sigsetjmp() in Py_EvalFrameEx() to set a "check point", and siglongjmp() in the signal handler to go back to the check point. It also uses a separate stack for the signal handler, because on stack overflow you cannot use the stack (ex: unable to call any function!). With MAXDEPTH=100, the memory footprint is ~20 KB. If you call Py_EvalFrameEx() more than MAXDEPTH times, the handler will go back to frame #MAXDEPTH on error (you lose the last entries in the Python traceback).

sigsetjmp()/siglongjmp() should be available on many OSes. I only know that it works perfectly on Linux. sigaltstack() is needed to recover after a stack overflow, but other errors can be caught without it.

I didn't run any benchmark yet, but it would be interesting ;-) Changing the MAXDEPTH constant may change the speed with many recursive calls (eg. MAXDEPTH=1 only sets a check point for the first call to Py_EvalFrameEx()).

I would appreciate a review, especially for the patch in Python/ceval.c.

-- Victor Stinner aka haypo http://www.haypocalc.com/blog/
On Wed, Dec 10, 2008 at 4:06 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
Hi,
I published a new version of my fault handler: it installs an handler for signals SIGFPE and SIGSEGV. Using it, it's possible to catch them and continue the execution of your Python program. Example:
This will of course leave the program in an undefined state. It is very likely to crash again, emit garbage, hang, or otherwise be useless. sigsetjmp() is only safe for code explicitly designed for it. That will never be the case for CPython, let alone all the arbitrary libraries that may be used with it. -- Adam Olsen, aka Rhamphoryncus
Oh, I forgot the issue URL: http://bugs.python.org/issue3999 I also attached an example of catching segfaults.
I published a new version of my fault handler: it installs an handler for signals SIGFPE and SIGSEGV. Using it, it's possible to catch them and continue the execution of your Python program. Example:
This will of course leave the program in an undefined state. It is very likely to crash again, emit garbage, hang, or otherwise be useless.
Recovering after a segfault is dangerous, but my first goal was to get the Python backtrace instead of just one line: "Segmentation fault". It helps a lot for debugging!

I didn't try it on a real-world application, but with a small script the program continues its execution without any problem. But yes, there is a big risk of:
- memory leaks
- deadlocks
- context problems, eg. for the GIL, I call PyGILState_Ensure()
- etc.

I chose the exceptions MemoryError and ArithmeticError, but we could use specific exceptions based on BaseException instead of Exception to avoid catching them with "except Exception: ...".

-- Victor Stinner aka haypo http://www.haypocalc.com/blog/
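The point about BaseException can be illustrated with a small sketch. The class name SegfaultError and the helper run_with_broad_handler() are hypothetical names chosen for this illustration; they are not part of the patch or of Python. The "crash" is simulated by raising the exception directly.

```python
class SegfaultError(BaseException):
    """Hypothetical exception for a trapped SIGSEGV (not a real built-in)."""


def call_evil_code():
    # Stand-in for C code that crashed; here we simply raise the exception.
    raise SegfaultError("simulated segmentation fault")


def run_with_broad_handler():
    try:
        try:
            call_evil_code()
        except Exception:
            # Not reached: SegfaultError does not inherit from Exception.
            return "swallowed by except Exception"
    except SegfaultError:
        # Only a handler that names the fault explicitly sees it.
        return "reached the explicit handler"


print(run_with_broad_handler())
```

Deriving from BaseException is the same choice Python makes for KeyboardInterrupt and SystemExit, so generic "except Exception" blocks would let such a fault propagate.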
On Wed, Dec 10, 2008 at 12:37 PM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
Oh, I forgot the issue URL: http://bugs.python.org/issue3999
I also attached an example of catching segfaults.
I published a new version of my fault handler: it installs an handler for signals SIGFPE and SIGSEGV. Using it, it's possible to catch them and continue the execution of your Python program. Example:
This will of course leave the program in an undefined state. It is very likely to crash again, emit garbage, hang, or otherwise be useless.
Recover after a segfault is dangerous, but my first goal was to get the Python backtrace instead just one line: "Segmentation fault". It helps a lot for debug!
Exactly! That's why it doesn't belong in the Python core. We can't guarantee anything about its effects or encourage it.
I didn't try on real world application, but with a small script the program continues its execution without any problem.
But as you say, it would be used on real world programs! -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner."
Benjamin Peterson wrote:
On Wed, Dec 10, 2008 at 12:37 PM, Victor Stinner
This will of course leave the program in an undefined state. It is very likely to crash again, emit garbage, hang, or otherwise be useless. Recover after a segfault is dangerous, but my first goal was to get the Python backtrace instead just one line: "Segmentation fault". It helps a lot for debug!
Exactly! That's why it doesn't belong in the Python core. We can't guarantee anything about its affects or encourage it.
Would it be safe to catch SIGSEGV, output a trace, and then exit? IE, make the 'first goal' the only goal?
On Wednesday 10 December 2008 20:04:00, Terry Reedy wrote:
Recover after a segfault is dangerous, but my first goal was to get the Python backtrace instead just one line: "Segmentation fault". It helps a lot for debug!
Exactly! That's why it doesn't belong in the Python core. We can't guarantee anything about its affects or encourage it.
Would it be safe to catch SIGSEGV, output a trace, and then exit? IE, make the 'first goal' the only goal?
Oh yeah, good idea :-)

Does it mean that the Python interpreter can't be used to display the trace? It would be nice to -at least- use the Python stderr (which is written in pure Python for Python3). It would be better if the user could set up a callback, like sys.excepthook. But if -as many people wrote- Python is totally broken after a segfault, it is maybe not a good idea :-)

I guess that the sigsetjmp() and siglongjmp() hack can be avoided in Py_EvalFrameEx(), so ceval.c could be unchanged.

New pseudocode:

    set checkpoint
    if error:
        get the backtrace
        display the backtrace
        fast exit (eg. don't call atexit, don't free memory, ...)
    else:
        normal execution

-- Victor Stinner aka haypo http://www.haypocalc.com/blog/
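For readers on a current interpreter, the "display the backtrace, then exit" behaviour can be tried with the faulthandler module that ships in the standard library since Python 3.3. The sketch below is not the patch discussed in this thread. It assumes a POSIX system, and it simulates the crash by having a child process send SIGSEGV to itself; the names CHILD_SOURCE and run_crashing_child() are made up for the illustration.

```python
import signal
import subprocess
import sys

# Child program: enable the handler, then simulate a crash by sending
# SIGSEGV to itself (an assumption made to keep the demo deterministic).
CHILD_SOURCE = """
import faulthandler, os, signal
faulthandler.enable()  # dump the Python traceback on fatal signals

def crash():
    os.kill(os.getpid(), signal.SIGSEGV)

crash()
"""


def run_crashing_child():
    """Run the child interpreter and return (returncode, stderr text)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    returncode, report = run_crashing_child()
    # On POSIX, a negative return code is the number of the fatal signal.
    print("killed by SIGSEGV:", returncode == -signal.SIGSEGV)
    print(report)
```

The child still dies from the signal after the traceback is written, so nothing tries to keep running in a possibly corrupted process.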
On Thu, Dec 11, 2008 at 1:34 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
But if -as many people wrote- Python is totally broken after a segfault, it is maybe not a good idea :-)
While it's true that after a segfault or unexpected longjmp, there are no guarantees whatsoever about the state of the python program, the program will often just happen to work, and there are at least some programs I've worked on that would rather take the risk in order to try to shut down gracefully. For example, an interactive app may want to give the user a chance to save her (not necessarily corrupted) work into a new file rather than unconditionally losing it. Or a webserver might want to catch the segfault, finish replying to the other requests that were in progress at the time, maybe reply to the request that caused the segfault, and then restart. Yes there's a possibility that the events around the segfault exposed some secret internal data (and they may do so even without segfaulting), but when the alternative is not replying to the users at all, this may be a risk the app wants to take. It would be nice for Python to at least expose the option so that developers (who are consenting adults, remember) can make their own decisions. It should _not_ be on by default, but something like sys.dangerous_turn_C_crashes_into_exceptions() would be useful. Jeffrey
On Thu, Dec 11, 2008 at 10:08 AM, Jeffrey Yasskin <jyasskin@gmail.com> wrote:
On Thu, Dec 11, 2008 at 1:34 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
But if -as many people wrote- Python is totally broken after a segfault, it is maybe not a good idea :-)
While it's true that after a segfault or unexpected longjmp, there are no guarantees whatsoever about the state of the python program, the program will often just happen to work, and there are at least some programs I've worked on that would rather take the risk in order to try to shut down gracefully. For example, an interactive app may want to give the user a chance to save her (not necessarily corrupted) work into a new file rather than unconditionally losing it. Or a webserver might want to catch the segfault, finish replying to the other requests that were in progress at the time, maybe reply to the request that caused the segfault, and then restart. Yes there's a possibility that the events around the segfault exposed some secret internal data (and they may do so even without segfaulting), but when the alternative is not replying to the users at all, this may be a risk the app wants to take. It would be nice for Python to at least expose the option so that developers (who are consenting adults, remember) can make their own decisions. It should _not_ be on by default, but something like sys.dangerous_turn_C_crashes_into_exceptions() would be useful.
Trying to recover (or save work etc.) is incredibly unpredictable, though. It could very well end up making the situation worse! I'm -1 on putting this in the core. -- Cheers, Benjamin Peterson "There's nothing quite as beautiful as an oboe... except a chicken stuck in a vacuum cleaner."
On Dec 11, 2008, at 11:08 AM, Jeffrey Yasskin wrote:
On Thu, Dec 11, 2008 at 1:34 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
But if -as many people wrote- Python is totally broken after a segfault, it is maybe not a good idea :-)
While it's true that after a segfault or unexpected longjmp, there are no guarantees whatsoever about the state of the python program, the program will often just happen to work, and there are at least some programs I've worked on that would rather take the risk in order to try to shut down gracefully.
I ran an interactive game for years (written in C, mind you, not python), where the SIGSEGV handler simply recursively reinvoked the main loop, after disabling the command that caused a SEGV if it had caused a SEGV twice already. It almost always worked and continued running without issue. YMMV, of course. :) James
On Thu, Dec 11, 2008 at 2:34 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
On Wednesday 10 December 2008 20:04:00, Terry Reedy wrote:
Recover after a segfault is dangerous, but my first goal was to get the Python backtrace instead just one line: "Segmentation fault". It helps a lot for debug!
Exactly! That's why it doesn't belong in the Python core. We can't guarantee anything about its affects or encourage it.
Would it be safe to catch SIGSEGV, output a trace, and then exit? IE, make the 'first goal' the only goal?
Oh yeah, good idea :-) Does it mean that Python interpreter can't be used to display the trace? It would be nice to -at least- use the Python stderr (which is written in pure Python for Python3). It would be better if the user can setup a callback, like sys.excepthook. But if -as many people wrote- Python is totally broken after a segfault, it is maybe not a good idea :-)
You have to use the low-level stderr, nothing that invokes Python. I'd hate to get a second segfault while printing the first. Just think about how indirect refcounting bugs tend to be. Another example is messing up GIL handling. There's heaps of things for which we'd want good stack traces, which can't be done from Python. -- Adam Olsen, aka Rhamphoryncus
On Thu, Dec 11, 2008 at 12:15 PM, Adam Olsen <rhamph@gmail.com> wrote:
You have to use the low-level stderr, nothing that invokes Python. I'd hate to get a second segfault while printing the first.
Just think about how indirect refcounting bugs tend to be. Another example is messing up GIL handling. There's heaps of things for which we'd want good stack traces, which can't be done from Python.
+1 on functionality to print a stack trace on a fault
-1 on translating the fault into an exception

I suggest exposing some functions to control the functionality. Here are some things the user may wish to control:

1. Disable/enable the functionality altogether
2. Set the file descriptor that the stack trace should be written to
3. Set a file name that should be created and written to instead
4. Specify whether a core dump should be generated
5. Specify a program to run after the stack trace has been printed

#3 combined with #5 would be very useful for automated bug reporting.

For what it's worth, the functionality could be implemented under Windows using Structured Exception Handling.

-- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>
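Items 1 to 3 of this list roughly correspond to what the standard library's faulthandler module (Python 3.3 and later) offers today. The sketch below only illustrates those three controls; items 4 and 5 are not covered. The helper names enable_crash_log() and disable_crash_log() and the log location are assumptions for the example.

```python
import faulthandler
import os
import tempfile


def enable_crash_log(path):
    """Route fatal-signal tracebacks to `path`; return the open file.

    The caller must keep the returned file open while the handler is enabled.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


def disable_crash_log(log_file):
    """Turn the handler off again and close its log file."""
    faulthandler.disable()
    log_file.close()


if __name__ == "__main__":
    # Hypothetical log location, chosen only for this demo.
    log_path = os.path.join(tempfile.mkdtemp(), "crash.log")
    handle = enable_crash_log(log_path)
    print("enabled:", faulthandler.is_enabled())
    disable_crash_log(handle)
    print("enabled:", faulthandler.is_enabled())
```

The handler writes through the file descriptor, not through Python-level I/O, which is why the file object has to stay open for as long as the handler is installed.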
On 2008-12-11 19:15, Adam Olsen wrote:
On Thu, Dec 11, 2008 at 2:34 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
On Wednesday 10 December 2008 20:04:00, Terry Reedy wrote:
Recover after a segfault is dangerous, but my first goal was to get the Python backtrace instead just one line: "Segmentation fault". It helps a lot for debug! Exactly! That's why it doesn't belong in the Python core. We can't guarantee anything about its affects or encourage it. Would it be safe to catch SIGSEGV, output a trace, and then exit? IE, make the 'first goal' the only goal? Oh yeah, good idea :-) Does it mean that Python interpreter can't be used to display the trace? It would be nice to -at least- use the Python stderr (which is written in pure Python for Python3). It would be better if the user can setup a callback, like sys.excepthook. But if -as many people wrote- Python is totally broken after a segfault, it is maybe not a good idea :-)
You have to use the low-level stderr, nothing that invokes Python. I'd hate to get a second segfault while printing the first.
Just think about how indirect refcounting bugs tend to be. Another example is messing up GIL handling. There's heaps of things for which we'd want good stack traces, which can't be done from Python.
Experience with mx.Tools.safecall() shows that there's a lot you can still do after a segfault in some library, including print the traceback in Python, so things are not as bad. However, I'd disable such functionality in Python per default, if it should ever get introduced. This has got to stay an expert option, unless we want to risk messing up user systems completely, e.g. by having some logging manager unintentionally overwrite important files on the disk. -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 11 2008)
Python/Zope Consulting and Support ... http://www.egenix.com/ mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/ mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
2008-12-02: Released mxODBC.Connect 1.0.0 http://python.egenix.com/ ::: Try our new mxODBC.Connect Python Database Interface for free ! :::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/
On Wed, Dec 10, 2008 at 11:37 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
Oh, I forgot the issue URL: http://bugs.python.org/issue3999
I also attached an example of catching segfaults.
I published a new version of my fault handler: it installs an handler for signals SIGFPE and SIGSEGV. Using it, it's possible to catch them and continue the execution of your Python program. Example:
This will of course leave the program in an undefined state. It is very likely to crash again, emit garbage, hang, or otherwise be useless.
Recover after a segfault is dangerous, but my first goal was to get the Python backtrace instead just one line: "Segmentation fault". It helps a lot for debug!
It's possible to print the Python stack purely from C, without invoking any Python code. Even better, you could print the C stack while you're at it! Doing that in a signal handler, and then killing the process, could be seriously considered. Take a look at http://www.linuxjournal.com/article/6391 . You'll probably need #ifdef's to only use it on certain supported platforms, and probably disable it by default anyway (configure option? Not sure). Still, it'd be useful to have it there. -- Adam Olsen, aka Rhamphoryncus
On Wed, Dec 10, 2008 at 8:37 PM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
Recover after a segfault is dangerous, but my first goal was to get the Python backtrace instead just one line: "Segmentation fault". It helps a lot for debug!
This would be extremely useful. I've had PyGTK segfault on me a number of times in an app I'm writing and I keep meaning to try to get to the bottom of the issue but it happens infrequently and somehow I never get around to it. Some indication of what Python was executing when the segfault occurred would help narrow down the possibilities rapidly.

Schiavo
Simon
Simon> Some indication of what Python was executing when the segfault
Simon> occurred would help narrow down the possibilities rapidly.

The Python distribution comes with a Misc/gdbinit file (you can grab it from the Subversion source tree via the web as well) that defines a pystack command. It will work with core files as well as running processes and should give you a very good idea where your Python code was executing when the segfault occurred.

-- Skip Montanaro - skip@pobox.com - http://smontanaro.dyndns.org/
<skip <at> pobox.com> writes:
The Python distribution comes with a Misc/gdbinit file (you can grab it from the Subversion source tree via the web as well) that defines a pystack command. It will work with core files as well as running processes and should give you a very good idea where your Python code was executing when the segfault occurred.
Still, it would be much better if the stack trace could be printed by Python itself rather than having to resort to gdb wizardry. Especially if the problem is reported by one of your non-developer users.
Antoine> Still, it would be much better if the stack trace could be
Antoine> printed by Python itself rather than having to resort to gdb
Antoine> wizardry. Especially if the problem is reported by one of your
Antoine> non-developer users.

I understand. The guy has a problem today for which there is a solution that I posted. If he's "been meaning to look into the problem" and he's posting to python-dev I presume he knows at least a little about running gdb if he's operating in a Unix environment. These two gdb commands

    source .gdbinit
    pystack

shouldn't be too much of a barrier.

Skip
<skip <at> pobox.com> writes:
I understand. The guy has a problem today for which there is a solution that I posted. If he's "been meaning to look into the problem" and he's posting to python-dev I presume he knows at least a little about running gdb if he's operating in a Unix environment. These two gdb commands
    source .gdbinit
    pystack
shouldn't be too much of a barrier.
Well, but sometimes you don't have a core file (because you didn't run ulimit before launching Python and the crash wasn't expected; if the crash is very erratic, by the time you've fixed the system limits, you don't manage to reproduce it anymore, or it takes hours because it's at the end of a very long workload). Sometimes you don't have the gdbinit file around (for example, Mandriva doesn't ship it with any Python-related package). Sometimes you are under Windows. etc. :-)
On Thursday 11 December 2008 13:57:03, skip@pobox.com wrote:
Simon> Some indication of what Python was executing when the segfault
Simon> occurred would help narrow down the possibilities rapidly.
The Python distribution comes with a Misc/gdbinit file
Hum, do you really run *all* programs in gdb? Most of the time, you don't expect a crash (because you trust your software). You will have to try to reproduce the crash, but sometimes it's very hard (eg. Heisenbugs!).

My new proposal is to display the backtrace instead of just the message "segmentation fault". It's not a problem if displaying the backtrace produces a new fault because it's already better than just the message "segmentation fault". Even with my SIGSEGV handler, you can still use gdb because gdb catches the signal before the program.

-- Victor Stinner aka haypo http://www.haypocalc.com/blog/
>> The Python distribution comes with a Misc/gdbinit file Victor> Hum, do you really run *all* programs in gdb? Most of the time, Victor> you don't expect a crash (because you trust your softwares). You Victor> will have to try to reproduce the crash, but sometimes it's very Victor> hard (eg. Heisenbugs!). Please folks! Get real. I was trying to help out a guy who responded to this thread saying that he gets intermittent segfaults in his PyGTK programs. I don't presume that he runs his app in gdb. If he has a core file this will work. I apologize profusely for any implication that a set of gdb commands is in any way superior to your patch. OTOH, it works today if you have a core file and are running Python at least as far back as 2.4. It doesn't require any changes to the interpreter. I use it frequently at work (a couple times a month anyway). We get notifications of all core files dropped each day. I make at least a cursory check of all core files dumped by Python. For that I use the pystack command defined in Misc/gdbinit. Victor> My new proposition is to display the backtrace instead of just Victor> the message "segmentation fault". It's not a problem if Victor> displaying the backtrace produces new fault because it's already Victor> better than just the message "segmentation fault". Even with my Victor> SIGSEVG handler, you can still use gdb because gdb catchs the Victor> signal before the program. Again, I meant no disrespect to your proposal. I was *simply trying to help the guy out*. Skip
The Python distribution comes with a Misc/gdbinit file
Hum, do you really run *all* programs in gdb? Most of the time, you don't expect a crash (because you trust your softwares). You will have to try to reproduce the crash, but sometimes it's very hard (eg. Heisenbugs!).
You don't have to run the program in gdb. You can also use the core dump that the operating system will generate, and study the crash after it happened. Regards, Martin
One thing I think it would be useful for in the real world is unit testing extension modules. You can't profitably write unit tests for segfaults because that breaks the test harness. In situations like those, recovering would be likely (caveat emptor of course).

2008/12/10, Adam Olsen <rhamph@gmail.com>:
On Wed, Dec 10, 2008 at 4:06 AM, Victor Stinner <victor.stinner@haypocalc.com> wrote:
Hi,
I published a new version of my fault handler: it installs an handler for signals SIGFPE and SIGSEGV. Using it, it's possible to catch them and continue the execution of your Python program. Example:
This will of course leave the program in an undefined state. It is very likely to crash again, emit garbage, hang, or otherwise be useless.
sigsetjmp() is only safe for code explicitly designed for it. That will never be the case for CPython, let alone all the arbitrary libraries that may be used with it.
-- Adam Olsen, aka Rhamphoryncus
-- Best regards, Björn
On Wed, Dec 10, 2008 at 12:22 PM, BJörn Lindqvist <bjourne@gmail.com> wrote:
One thing i think it would be useful for in the real world is for unittesting extension modules. You cant profitably write unit tests for segfaults because that breaks the test harness. In situations like those, recovering would be likely (caveat emptor of course).
The only safe option there is a subprocess. -- Adam Olsen, aka Rhamphoryncus
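A minimal sketch of the subprocess approach, assuming a POSIX system and a current Python 3. The test harness runs each risky test in a child interpreter and only inspects how the child ended. The helper run_isolated() and the two test strings are hypothetical, and the crash is simulated by the child sending SIGSEGV to itself.

```python
import signal
import subprocess
import sys


def run_isolated(source):
    """Run Python `source` in a child interpreter and classify the outcome."""
    proc = subprocess.run([sys.executable, "-c", source])
    if proc.returncode < 0:
        # On POSIX a negative return code means "killed by that signal".
        return "killed by " + signal.Signals(-proc.returncode).name
    return "exited with status %d" % proc.returncode


# Simulated crash: the child sends SIGSEGV to itself (an assumption for the demo).
CRASHING_TEST = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
PASSING_TEST = "assert 1 + 1 == 2"

if __name__ == "__main__":
    print(run_isolated(CRASHING_TEST))
    print(run_isolated(PASSING_TEST))
```

Because the crash happens in a separate address space, the harness itself is never left in an undefined state, which is the property the in-process handler cannot offer.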
On 2008-12-10 21:05, Adam Olsen wrote:
On Wed, Dec 10, 2008 at 12:22 PM, BJörn Lindqvist <bjourne@gmail.com> wrote:
One thing i think it would be useful for in the real world is for unittesting extension modules. You cant profitably write unit tests for segfaults because that breaks the test harness. In situations like those, recovering would be likely (caveat emptor of course).
The only safe option there is a subprocess.
True, but that still makes it a little difficult to report the errors found in the module.

mxTools has an optional safecall() function that allows calling functions which potentially segfault and still returns control back to the calling application:

    http://www.egenix.com/products/python/mxBase/mxTools/

It's not (yet) documented, but fairly straightforward to use once you've enabled it in egenix_mx_base.py:

    result = mx.Tools.safecall(callable, args, kws)

Using such a function is handy in situations where you have a multi-process application setup that sometimes needs to call out to external libraries of varying quality - a situation that's not uncommon in real life.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Dec 10 2008)
I would appreciate a review, especially for the patch in Python/ceval.c.
In this specific case, it is not clear for what objective you want such review. For inclusion into Python? Several people already said (essentially) that: -1. I don't think such code should be added to the Python core, no matter how smart or correct it is. Regards, Martin
On Wed, Dec 10, 2008 at 6:12 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
I would appreciate a review, especially for the patch in Python/ceval.c.
In this specific case, it is not clear for what objective you want such review. For inclusion into Python?
Even if it does not result in an inclusion into Python, I personally would be quite interested in following this thread if discussion of Victor's patch continues. It may quite possibly yield some improvements to python development tools (core and libraries' development). Graceful handling of hard errors is an unsolved problem in Python and it has become more important since ctypes made it to the standard library and therefore it has become possible to easily trigger a hard error from pure python code.
Several people already said (essentially) that: -1. I don't think such code should be added to the Python core, no matter how smart or correct it is.
Looking up the thread, I don't see anyone taking such an extreme position: never recover from SEGV even if it can be done 100% correctly. The sentiment that I see and the one that I share is that it is extremely difficult (and maybe impossible) to do correctly. However, if someone comes up with a smart solution, I would be very much interested to see it.

While by the time you get a SIGSEGV, your process is likely to be beyond recovery, I don't think the same applies to SIGFPE. It may also be possible to get rid of the arbitrary recursion limit on Linux (I've heard this problem is solved on Windows) by being smart about handling SIGSEGV.

Finally, providing some diagnostic before exiting on hard errors is not without precedent: I believe R has such a feature. It may be worthwhile to compare Victor's approach to what is done in R.

It may, however, be better to move further discussion to the tracker (I understand that the patch is at <http://bugs.python.org/issue3999>).
On Wed, Dec 10, 2008 at 5:21 PM, Alexander Belopolsky <alexander.belopolsky@gmail.com> wrote:
On Wed, Dec 10, 2008 at 6:12 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
Several people already said (essentially) that: -1. I don't think such code should be added to the Python core, no matter how smart or correct it is.
Looking up the thread, I don't see anyone taking such an extreme position: never recover from SEGV even if it can be done 100% correctly. The sentiment that I see and the one that I share is that it is extremely difficult (and maybe impossible) to do correctly. However, if someone comes up with a smart solution, I would be very much interested to see it.
It is impossible to do in general, and I am -1 on any misguided attempts to do so.
While by the time you get a SIGSEGV, you process is likely to be beyond recovery, I don't think the same applies to SIGFPE.
No, it's as much about the context as it is the error. We could write our own floating point code that can recover from SIGFPE (which isn't portable, but still mostly doable), but enabling it for arbitrary third-party libraries is completely unsafe. Printing a stack trace and then aborting would be possible and useful though.
It may also be possible to get rid of the arbitrary recursion limit on Linux (I've heard this problem is solved on Windows) by being smart about handling SIGSEGV.
If we could calculate how much stack is left we'd have a much more robust way of doing recursion limits. I suppose this could be done by reading a byte from each page with a temporary SIGSEGV handler installed, but I'm not convinced you can't ask the platform directly somehow. I'd also be concerned about thread-safety.
Finally, providing some diagnostic before exiting on hard errors is not without precedent: I believe R has such a feature. It may be worthwhile to compare Victor's approach to what is done in R.
It may, however, be better to move further discussion to the tracker (I understand that the patch is at <http://bugs.python.org/issue3999>).
-- Adam Olsen, aka Rhamphoryncus
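On the question of asking the platform directly: on Unix, the configured stack size limit can at least be read from Python through the resource module. This is only a partial sketch. It reports the limit that applies to the main thread's stack, not how much of it is still free, and describe_stack_budget() is a hypothetical helper name.

```python
import resource
import sys


def describe_stack_budget():
    """Return the stack size limit and the interpreter's recursion limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        # None stands for "unlimited" (RLIM_INFINITY).
        "stack_soft_limit_bytes": None if soft == resource.RLIM_INFINITY else soft,
        "recursion_limit": sys.getrecursionlimit(),
    }


print(describe_stack_budget())
```

Turning this into "how much is left" would still require knowing the current stack pointer, which is not available from pure Python.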
On Wed, Dec 10, 2008 at 8:01 PM, Adam Olsen <rhamph@gmail.com> wrote: ..
It is impossible to do in general, and I am -1 on any misguided attempts to do so.
I agree, recovering from segfaults caused by buggy third party C modules is a losing proposition, but for a limited number of conditions that can be triggered from python code running on a non-buggy interpreter (hopefully ctypes included, but that would be hard), converting signals into exceptions may be possible. ..
Printing a stack trace and then aborting would be possible and useful though.
Even a simple dialog in the interactive session, such as "Python has encountered a segfault, would you like to dump core? y/n", would be quite useful.
If we could calculate how much stack is left we'd have a much more robust way of doing recursion limits. I suppose this could be done by reading a byte from each page with a temporary SIGSEGV handler installed, but I'm not convinced you can't ask the platform directly somehow. I'd also be considered about thread-safety.
It's only as hard as taking the address of a local variable at the beginning of the program and again at any arbitrary point. Of course, 'how much is left' requires some additional arithmetic.

Cheers,
fijal
Hi Martin, On Dec 11, 2008, at 12:12 AM, Martin v. Löwis wrote:
Several people already said (essentially) that: -1. I don't think such code should be added to the Python core, no matter how smart or correct it is.
does your -1 apply only to attempts to resume execution after SIGSEGV, or also to the idea of dumping the stack and immediately exiting? The former strikes me as crazy talk, while the latter is genuinely useful. Cheers, -- Ivan Krstić <krstic@solarsail.hcs.harvard.edu> | http://radian.org
On Dec 11, 2008, at 12:12 AM, Martin v. Löwis wrote:
Several people already said (essentially) that: -1. I don't think such code should be added to the Python core, no matter how smart or correct it is.
does your -1 apply only to attempts to resume execution after SIGSEGV, or also to the idea of dumping the stack and immediately exiting? The former strikes me as crazy talk, while the latter is genuinely useful.
Only to the former. If it is actually possible to print a stack trace, that could be useful indeed. I'm then skeptical that this is possible in the general case (i.e. displaying the full C stack), but displaying (parts of) the Python stack might be possible. I think it should still proceed to dump core, so that you can then inspect the core with a proper debugger. Regards, Martin
On Dec 11, 2008, at 3:05 PM, Martin v. Löwis wrote:
If it is actually possible to print a stack trace, that could be useful indeed. I'm then skeptical that this is possible in the general case (i.e. displaying the full C stack), but displaying (parts of) the Python stack might be possible. I think it should still proceed to dump core, so that you can then inspect the core with a proper debugger.
+1. Victor, any interest in attempting to retool your patch in this direction? -- Ivan Krstić <krstic@solarsail.hcs.harvard.edu> | http://radian.org
participants (16)
- "Martin v. Löwis"
- Adam Olsen
- Alexander Belopolsky
- Antoine Pitrou
- Benjamin Peterson
- BJörn Lindqvist
- Daniel Stutzbach
- Ivan Krstić
- James Y Knight
- Jeffrey Yasskin
- M.-A. Lemburg
- Maciej Fijalkowski
- Simon Cross
- skip@pobox.com
- Terry Reedy
- Victor Stinner