I recently came up with a fix for thread support in Python under Cygwin. Jason Tishler and Norman Vine are looking it over, but I'm pretty sure something similar should be used for the Cygwin Python port.

This is easily done--simply add a few lines to thread.c and create a new thread_cygwin.h (context diff and new file both provided).

But there is a larger issue:

The thread interface code in thread_pthread.h uses mutexes and condition variables to emulate semaphores, which are then used to provide Python "lock" and "sema" services.

I know this is a common practice since those two thread synchronization primitives are defined in "pthread.h". But it comes with quite a bit of overhead. (And in the case of Cygwin causes race conditions, but that's another matter.)

POSIX does define semaphores, though. (In fact, it's in the standard just before Mutexes and Condition Variables.) According to POSIX, they are found in <semaphore.h> and _POSIX_SEMAPHORES should be defined if they work as POSIX expects.

If they are available, it seems like providing direct semaphore services would be preferable to emulating them using condition variables and mutexes.

thread_posix.h.diff-c is a context diff that can be used to convert thread_pthread.h into a more general POSIX version that will use semaphores if available.

thread_cygwin.h would no longer be needed then, since all it does is use POSIX semaphores directly rather than mutexes/condition vars. Changing the interface to POSIX threads should bring a performance improvement to any POSIX platform that supports semaphores directly.

Does this sound like a good idea? Should I create a more thorough set of patch files and submit them?

(I haven't been accepted to the python-dev list yet, so please CC me. Thanks.)

-Jerry

-O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O-
-O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O-
-O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O-
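For readers unfamiliar with the emulation being described, here is a minimal sketch of a lock built from a mutex, a condition variable, and a flag. It only illustrates the general pattern; the names (`emulated_lock`, `emulated_lock_acquire`, and so on) are hypothetical and are not taken from thread_pthread.h.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical emulated lock: a flag guarded by a mutex, plus a
   condition variable that waiters sleep on until the flag clears. */
typedef struct {
    pthread_mutex_t mut;
    pthread_cond_t  released;
    int             locked;      /* 1 while held, 0 while free */
} emulated_lock;

/* Returns 0 on success, -1 if either primitive could not be created. */
static int emulated_lock_init(emulated_lock *l)
{
    l->locked = 0;
    if (pthread_mutex_init(&l->mut, NULL) != 0)
        return -1;
    if (pthread_cond_init(&l->released, NULL) != 0) {
        pthread_mutex_destroy(&l->mut);
        return -1;
    }
    return 0;
}

/* Returns 1 if the lock was acquired, 0 if it was busy and
   waitflag was 0 (non-blocking attempt). */
static int emulated_lock_acquire(emulated_lock *l, int waitflag)
{
    int got;

    pthread_mutex_lock(&l->mut);
    /* The loop re-checks the flag, which also covers spurious wakeups. */
    while (l->locked && waitflag)
        pthread_cond_wait(&l->released, &l->mut);
    got = !l->locked;
    if (got)
        l->locked = 1;
    pthread_mutex_unlock(&l->mut);
    return got;
}

static void emulated_lock_release(emulated_lock *l)
{
    pthread_mutex_lock(&l->mut);
    l->locked = 0;
    pthread_mutex_unlock(&l->mut);
    /* Wake one waiter, if any. */
    pthread_cond_signal(&l->released);
}
```

Every acquire and release here takes the inner mutex and may touch the condition variable as well, which is the overhead the message refers to; a native semaphore would need a single call for each operation.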
[Gerald S. Williams]
I recently came up with a fix for thread support in Python under Cygwin. Jason Tishler and Norman Vine are looking it over, but I'm pretty sure something similar should be used for the Cygwin Python port.
This is easily done--simply add a few lines to thread.c and create a new thread_cygwin.h (context diff and new file both provided).
But there is a larger issue:
The thread interface code in thread_pthread.h uses mutexes and condition variables to emulate semaphores, which are then used to provide Python "lock" and "sema" services.
Please use current CVS Python for patches. For example, all the "sema" code no longer exists (it was undocumented and unused).
I know this is a common practice since those two thread synchronization primitives are defined in "pthread.h". But it comes with quite a bit of overhead. (And in the case of Cygwin causes race conditions, but that's another matter.)
POSIX does define semaphores, though. (In fact, it's in the standard just before Mutexes and Condition Variables.)
Semaphores weren't defined by POSIX at the time this code was written; IIRC, they were first introduced in the later and then-rarely implemented POSIX realtime extensions. How stable are they? Some quick googling didn't inspire a lot of confidence, but maybe I was just bumping into early bug reports.
According to POSIX, they are found in <semaphore.h> and _POSIX_SEMAPHORES should be defined if they work as POSIX expects.
This may be a nightmare; for example, I don't see anything in the Single UNIX Specification about this symbol, and as far as I'm concerned POSIX as a distinct standard is a DSW (dead standard walking <wink>). That's one for the Unixish geeks to address.
If they are available, it seems like providing direct semaphore services would be preferable to emulating them using condition variables and mutexes.
They could be hugely better on Linux, but I don't know: there's anecdotal evidence that Linux scheduling of threads competing for a mutex can get itself into a vastly unfair state. Provided Linux implements semaphores properly, semaphore contention can be tweaked (and Python should do so), as befits a realtime gimmick, to guarantee fairness (SCHED_FIFO and SCHED_RR).
thread_posix.h.diff-c is a context diff that can be used to convert thread_pthread.h into a more general POSIX version that will use semaphores if available.
I believe your PyThread_acquire_lock() code has two holes:

1. sem_trywait() is not checked for an error return.
2. sem_wait() and sem_trywait() can be interrupted by signal, and that's not an error condition.

So these calls should be stuck in a loop:

    do {
        ... call the right one ...
    } while (status < 0 && errno == EINTR);
    if (status < 0) {
        /* an unexpected exceptional return */
        ...
    }
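The loop above can be filled in along the following lines. This is a sketch under assumptions, not the code from the patch: the helper name `sem_lock_acquire` and its return convention are made up for illustration, and only the standard `sem_wait()` and `sem_trywait()` calls are used.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Hypothetical helper (not the patch's code).
   Returns 1 if the semaphore was acquired,
           0 if waitflag was 0 and the semaphore was busy,
          -1 on an unexpected error (errno is left set). */
static int sem_lock_acquire(sem_t *lock, int waitflag)
{
    int status;

    /* Retry when a signal handler interrupted the call: EINTR is
       not an error condition for a lock acquire. */
    do {
        status = waitflag ? sem_wait(lock) : sem_trywait(lock);
    } while (status < 0 && errno == EINTR);

    if (status == 0)
        return 1;
    /* sem_trywait() reports a busy semaphore with EAGAIN. */
    if (!waitflag && errno == EAGAIN)
        return 0;
    return -1;
}
```

Distinguishing EAGAIN from other failures covers the first hole (an unchecked `sem_trywait()` return), and the `do`/`while` covers the second.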
... Does this sound like a good idea?
Yes, provided it works <wink>.
Should I create a more thorough set of patch files and submit them?
I'd like that, but please don't email patches -- they'll just be forgotten. Upload patches to the Python patch manager instead: http://sf.net/tracker/?group_id=5470&atid=305470 Discussion about the patches remains appropriate on Python-Dev.
"Tim Peters" <tim@zope.com> writes:
Semaphores weren't defined by POSIX at the time this code was written; IIRC, they were first introduced in the later and then-rarely implemented POSIX realtime extensions. How stable are they?
They are in Single UNIX V2 (1997), so anybody claiming conformance to Single UNIX has implemented them:
- AIX 4.3.1 and later
- Tru64 UNIX V5.1A and later
- Solaris 7 and later
[from the list of certified Unix98 systems]

In addition, the following implementations document support for sem_init:
- LinuxThreads since glibc 2.0 (1996)
- IRIX at least since 6.5 (a patch for 6.2 is available since 1996)
According to POSIX, they are found in <semaphore.h> and _POSIX_SEMAPHORES should be defined if they work as POSIX expects.
This may be a nightmare; for example, I don't see anything in the Single UNIX Specification about this symbol, and as far as I'm concerned POSIX as a distinct standard is a DSW (dead standard walking <wink>). That's one for the Unixish geeks to address.
You didn't ask google for _POSIX_SEMAPHORES, right? The first hit brings you to http://www.opengroup.org/onlinepubs/7908799/xsh/feature.html _POSIX_SEMAPHORES Implementation supports the Semaphores option. A quick check shows that both Solaris 8 and glibc 2.2 do indeed define the symbol.
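A compile-time check for the symbol might look like the sketch below. The macro name `USE_SEMAPHORES` and the function `lock_backend()` are hypothetical. Treating a value of -1 as "not supported" follows later revisions of the standard and is an assumption here; a value of 0 in those revisions means support has to be confirmed at run time, which this sketch does not attempt.

```c
#include <assert.h>
#include <unistd.h>   /* defines the _POSIX_* option macros */

/* USE_SEMAPHORES is a hypothetical name chosen for this sketch.
   The "+ 0" keeps the test valid if the macro is defined but empty. */
#if defined(_POSIX_SEMAPHORES) && (_POSIX_SEMAPHORES + 0) != -1
#  define USE_SEMAPHORES 1
#else
#  define USE_SEMAPHORES 0
#endif

/* Reports which lock implementation the build would select. */
static const char *lock_backend(void)
{
    return USE_SEMAPHORES ? "POSIX semaphores"
                          : "mutex + condition variable";
}
```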
They could be hugely better on Linux, but I don't know: there's anecdotal evidence that Linux scheduling of threads competing for a mutex can get itself into a vastly unfair state.
For glibc 2.1, semaphores have been reimplemented; they now provide FIFO wakeup (sorted by thread priority). Same for mutexes: the highest-priority oldest-waiting thread will be resumed.
do { ... call the right one ... } while (status < 0 && errno == EINTR);
Shouldn't EINTR check for KeyboardInterrupt? Regards, Martin
[Martin v. Loewis]
... You didn't ask google for _POSIX_SEMAPHORES, right? The first hit brings you to
http://www.opengroup.org/onlinepubs/7908799/xsh/feature.html
_POSIX_SEMAPHORES Implementation supports the Semaphores option.
Good catch! I didn't get a hit from the Open Group's SUS search box: http://www.opengroup.org/onlinepubs/7908799/
A quick check shows that both Solaris 8 and glibc 2.2 do indeed define the symbol.
Cool.
... For glibc 2.1, semaphores have been reimplemented; they now provide FIFO wakeup (sorted by thread priority). Same for mutexes: the highest-priority oldest-waiting thread will be resumed.
My impression is that some at Zope Corp would find it hard to believe that works.
do { ... call the right one ... } while (status < 0 && errno == EINTR);
Shouldn't EINTR check for KeyboardInterrupt?
Sorry, too much a can of worms for me -- the question and the possible answers are irrelevant on my box <wink>. Complications include that interrupts weren't able to break out of a wait on a Python lock before (so you'd change endcase semantics). If you don't care about that, how would you go about "checking for KeyboardInterrupt"? Note thread.c's initial comment:

    /* Thread package. This is intended to be usable independently from Python.

That's why there are no calls to Python runtime functions in thread_pthread.h (etc) files now; e.g., they call malloc() and free() directly, and don't reference any PyExc_XXX symbols. That's a lot to overcome just to break existing code <wink>.
Tim Peters wrote:
Please use current CVS Python for patches. For example, all the "sema" code no longer exists (it was undocumented and unused).
DOH! Sorry, I thought of that after pressing SEND. I had been using a specific Cygwin version to relay and test the proposed changes. DOH again! I just realized that a thread_nt.h patch that I submitted to the patch manager has the same problem! I'd better go get the latest CVS sources before commenting any further about the code... You and Martin have good points about the implementation, some of which I had intended to address once I knew which implementation to target. It sounds like I'll be targeting the general POSIX thread version of Python's thread interface code. I'd definitely at least check for _POSIX_SEMAPHORES before changing the behavior, though. One question left is whether to continue calling the file thread_pthread.h or to rename it thread_posix.h.
/* Thread package. This is intended to be usable independently from Python.
That's why there are no calls to Python runtime functions in thread_pthread.h (etc) files now; e.g., they call malloc() and free() directly, and don't reference any PyExc_XXX symbols. That's a lot to overcome just to break existing code <wink>.
Actually, this isn't true. The current thread_nt.h creates a Python dictionary to keep track of thread handles. This was what my earlier patch was for--the dictionary isn't even used (and creates a memory leak to boot). I proposed removing it entirely (along with the #include <Python.h>). I'll update my previous patch with one based on current CVS sources. -Jerry -O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O- -O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O- -O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O-
I submitted another patch to the POSIX threads code. It is under SourceForge as patch number 533681. This tells Python to explicitly use the POSIX semaphore code for Cygwin. I had inadvertently left a remnant in my version of thread.c that forced _POSIX_SEMAPHORES to be defined for Cygwin. It turns out _POSIX_SEMAPHORES is only set if __rtems__ is defined. At the time I didn't know what that meant but thought it must have been defined since I got the correct code. If you'd prefer, I could provide a patch to thread.c instead. Sorry about that. I really meant to wrap it all up in the first patch. -Jerry -O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O- -O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O- -O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O-
[Gerald S. Williams]
I submitted another patch to the POSIX threads code. It is under SourceForge as patch number 533681.
This tells Python to explicitly use the POSIX semaphore code for Cygwin. I had inadvertently left a remnant in my version of thread.c that forced _POSIX_SEMAPHORES to be defined for Cygwin.
It turns out _POSIX_SEMAPHORES is only set if __rtems__ is defined. At the time I didn't know what that meant but thought it must have been defined since I got the correct code.
I don't understand. If Cygwin requires _rtems_ in order that _POSIX_SEMAPHORES be defined, then either Cygwin has a bug here, or Cygwin *needs* _rtems_ if you want to use real-time gimmicks like semaphores. In either case, I don't think it's Python's place to second-guess the Cygwin team: report it as a bug to Cygwin, or do whatever they recommend to get _rtems_ defined in the Cygwin build.
Tim Peters wrote:
I don't understand. If Cygwin requires _rtems_ in order that _POSIX_SEMAPHORES be defined, then either Cygwin has a bug here, or Cygwin *needs* _rtems_ if you want to use real-time gimmicks like semaphores. In either case, I don't think it's Python's place to second-guess the Cygwin team: report it as a bug to Cygwin, or do whatever they recommend to get _rtems_ defined in the Cygwin build.
_rtems_ is actually from newlib, not Cygwin. Here's the comment I added to SourceForge:

Before _POSIX_SEMAPHORES is specified by default for Cygwin, it will probably have to be shown that it is 100% compliant with POSIX. Whether or not this is the case, the POSIX semaphore implementation is the one that should be used for Cygwin (it has been verified and approved by the Cygwin Python maintainer, etc.).

Prior to this, threading had been disabled for Cygwin Python, so this is really more of a port-to-Cygwin than a workaround. This could have been implemented in a new file (thread_cygwin.h), although during implementation it was discovered that the change for Cygwin would also benefit POSIX semaphore users in general.

The threading module overall is highly platform-specific, especially with regard to redefining POSIX symbols for specific platforms. In particular, this is done for the following platforms:

    __DGUX
    __sgi
    __ksr__
    anything using SOLARIS_THREADS
    __MWERKS__

However, except for those using SOLARIS_THREADS, these are specified in thread.c. I will therefore resubmit the patch as a change to thread.c instead.

The reference to __rtems__ actually comes from newlib, which Cygwin uses. It doesn't apply to Cygwin.

-O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O-
-O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O-
-O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O-
"Gerald S. Williams" <gsw@agere.com> writes:
Before _POSIX_SEMAPHORES is specified by default for Cygwin, it will probably have to be shown that it is 100% compliant with POSIX.
Please don't guess in such matters; this is not very convincing. The Posix spec says this:

    An implementation that claims conformance to this Feature Group shall set the macro _XOPEN_REALTIME to a value other than -1. An implementation that does not claim conformance shall set _XOPEN_REALTIME to -1.

    The POSIX Realtime Extension defines the following symbolic constants and their meaning: ...

    _POSIX_SEMAPHORES Implementation supports the Semaphores option.

So the right way to not claim conformance is to set _XOPEN_REALTIME to -1, not to leave _POSIX_SEMAPHORES undefined.
The threading module overall is highly platform-specific, especially with regard to redefining POSIX symbols for specific platforms.
Yes, this is a terrible thing. I think most of it should be ripped out, since nobody can verify anymore which of this #ifdef mess is still in use, and still doing the right thing on the platforms on which it is activated. Standards are there to implement them if they are useful, and to simplify life of users of these standards; anybody not following standards when they exist deserves to lose. In any case, this #ifdef maze should not be extended. Regards, Martin
Martin v. Loewis wrote:
Before _POSIX_SEMAPHORES is specified by default for Cygwin, it will probably have to be shown that it is 100% compliant with POSIX.
Please don't guess in such matters; this is not very convincing.
Sorry about the wording. I'm looking at ISO/IEC 9945-1-1996 (i.e., ANSI/IEEE Std 1003.1, the POSIX API spec), and the section on semaphores does give quite specific requirements that have to be met if you define _POSIX_SEMAPHORES. <Sigh> I'll work with Jason and get _POSIX_SEMAPHORES defined in Cygwin or (if this isn't possible) present an alternative that doesn't imply POSIX support. -Jerry -O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O- -O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O- -O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O-
Jerry, On Fri, Mar 22, 2002 at 04:16:30PM -0500, Tim Peters wrote:
[Gerald S. Williams]
I submitted another patch to the POSIX threads code. It is under SourceForge as patch number 533681.
[snip]
After I saw your patch, I got started thinking...
I don't understand. If Cygwin requires _rtems_ in order that _POSIX_SEMAPHORES be defined, then either Cygwin has a bug here, or Cygwin *needs* _rtems_ if you want to use real-time gimmicks like semaphores. In either case, I don't think it's Python's place to second-guess the Cygwin team: report it as a bug to Cygwin, or do whatever they recommend to get _rtems_ defined in the Cygwin build.
I agree with Tim (and Martin) that this is a Cygwin (i.e., newlib) issue and should not be "fixed" or worked around in Python. Would you be willing to submit the attached (untested) patch to newlib after giving it a spin? I think that Rob Collins forgot to add this missing #define when he implemented Cygwin semaphore support. If you wish, I can post to cygwin-developers to verify. Thanks, Jason
On Sat, Mar 23, 2002 at 11:08:23AM -0500, Jason Tishler wrote:
I agree with Tim (and Martin) that this is a Cygwin (i.e., newlib) issue and should not be "fixed" or worked around in Python. Would you be willing to submit the attached (untested) patch to newlib after giving it a spin?
Just a FYI... This issue has been resolved in Cygwin (i.e., newlib). If interested, see the following for details: http://sources.redhat.com/ml/newlib/2002/msg00122.html I will (finally) release a threaded Cygwin Python as soon as Cygwin 1.3.11 is released. Jason
Tim Peters [mailto:tim@zope.com] wrote:
They could be hugely better on Linux, but I don't know: there's anecdotal evidence that Linux scheduling of threads competing for a mutex can get itself into a vastly unfair state. Provided Linux implements semaphores properly, semaphore contention can be tweaked (and Python should do so), as befits a realtime gimmick, to guarantee fairness (SCHED_FIFO and SCHED_RR).
I submitted patch request 525532 that will enable semaphore use in thread_pthread.h if _POSIX_SEMAPHORES is defined. It includes proper checking of error codes and looping if EINTR is received (as you rightly pointed out).

I didn't add any specific checks for a keyboard interrupt. Checks could be added in the loop for specific platforms as needed. I'm not sure if this is an issue anyway. To quote the POSIX standard (ISO/IEC 9945-1: 1996 aka ANSI/IEEE Std 1003.1, 1996 Edition):

    If a signal is delivered to a thread waiting for a mutex, upon return from the signal handler the thread shall resume waiting for the mutex as if it was not interrupted.

and:

    If a signal is delivered to a thread waiting for a condition variable, upon return from the signal handler the thread shall resume waiting for the condition variable as if it was not interrupted, or it shall return zero due to spurious wakeup.

(Spurious wakeup of cond_wait is another whole can of worms.)

-----

I haven't done much mucking around with scheduling policies under UNIX, but the only portable way I found to modify the way semaphores contend is to change the thread scheduling policy. I have a patchfile for that as well. It mostly boils down to adding the following code before the thread is created:

    #ifdef USE_EXPLICIT_SCHED
        pthread_attr_setinheritsched(&attrs, PTHREAD_EXPLICIT_SCHED);
    #ifdef PTHREAD_THREAD_RR_SCHED_POLICY_SUPPORTED
        pthread_attr_setschedpolicy(&attrs, SCHED_RR);
    #else /* PTHREAD_THREAD_FIFO_SCHED_POLICY_SUPPORTED */
        pthread_attr_setschedpolicy(&attrs, SCHED_FIFO);
    #endif
    #endif

Although as you see it requires some code to decide whether to use explicit round-robin or FIFO scheduling at all. Not all platforms allow SCHED_RR to be supported. This probably ought to be determined in the configure script. I created the PTHREAD_THREAD_*_SCHED_POLICY_SUPPORTED defines as a way to specify this.
I think this should be treated as a separate patch, since the current code doesn't specify any scheduling policies. -Jerry -O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O- -O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O- -O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O-
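As a rough illustration of the scheduling change described above, the sketch below requests round-robin scheduling for a new thread and falls back to the default attributes if the request is refused (for example, for lack of privileges). The helper `create_thread_prefer_rr` and the demo thread body `set_flag` are hypothetical names, and the fallback behaviour is an assumption of this sketch rather than something the patch is known to do.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>

/* Hypothetical demo thread body: stores 1 through its argument. */
static void *set_flag(void *arg)
{
    *(int *)arg = 1;
    return NULL;
}

/* Hypothetical helper: try to start the thread under SCHED_RR and,
   if any step fails, start it with default attributes instead.
   Returns 0 on success or an errno-style error number. */
static int create_thread_prefer_rr(pthread_t *tid,
                                   void *(*func)(void *), void *arg)
{
    pthread_attr_t attrs;
    struct sched_param param;
    int err;

    err = pthread_attr_init(&attrs);
    if (err != 0)
        return err;

    memset(&param, 0, sizeof(param));
    param.sched_priority = sched_get_priority_min(SCHED_RR);

    /* Without PTHREAD_EXPLICIT_SCHED the policy set below would be
       ignored and the creator's policy inherited. */
    err = pthread_attr_setinheritsched(&attrs, PTHREAD_EXPLICIT_SCHED);
    if (err == 0)
        err = pthread_attr_setschedpolicy(&attrs, SCHED_RR);
    if (err == 0)
        err = pthread_attr_setschedparam(&attrs, &param);
    if (err == 0)
        err = pthread_create(tid, &attrs, func, arg);
    pthread_attr_destroy(&attrs);

    /* Fall back to the default policy if the request was refused. */
    if (err != 0)
        err = pthread_create(tid, NULL, func, arg);
    return err;
}
```

A caller would use it like `pthread_create()`: pass a `pthread_t`, a thread function, and its argument, then `pthread_join()` the result as usual.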
"Gerald S. Williams" <gsw@agere.com> writes:
I think this should be treated as a separate patch, since the current code doesn't specify any scheduling policies.
Indeed. Also, on some systems, certain privileges are needed to establish a scheduling policy. So I think this should *not* automatically be activated. Regards, Martin
[Gerald S. Williams Sent: Monday, March 04, 2002 10:44 AM ]
I submitted patch request 525532 that will enable semaphore use in thread_pthread.h if _POSIX_SEMAPHORES is defined. It includes proper checking of error codes and looping if EINTR is received (as you rightly pointed out).
Cool! I gave it a +1, but I'm not on a pthreads platform and someone who is needs to continue the process.
I didn't add any specific checks for a keyboard interrupt. Checks could be added in the loop for specific platforms as needed.
I'm deadly opposed to letting a keyboard interrupt break out of a wait for a Python lock.
I'm not sure if this is an issue anyway. To quote the POSIX standard (ISO/IEC 9945-1: 1996 aka ANSI/IEEE Std 1003.1, 1996 Edition): If a signal is delivered to a thread waiting for a mutex, upon return from the signal handler the thread shall resume waiting for the mutex as if it was not interrupted. and: If a signal is delivered to a thread waiting for a condition variable, upon return from the signal handler the thread shall resume waiting for the condition variable as if it was not interrupted, or it shall return zero due to spurious wakeup.
Sorry, I don't grasp what the point of this quoting was, unless it was a roundabout way of merely confirming that keyboard interrupts can't break out of a wait for a Python lock today (which was known and said before).
I was trying to avoid taking sides on the keyboard interrupt issue. I do prefer only having to type ^C once, but Python doesn't always do that now (at least not on my platforms), and there are certainly issues trying to do it portably. I wouldn't want to be involved in that effort.

The point of the quote was to show that Mutexes/Condition variables have (or at least can have) the same behavior wrt interrupts as this:

    do {
        status = sem_wait(lock);
    } while ((status == -1) && (errno == EINTR));

So we can treat keyboard interrupts as a separate issue.

-Jerry

-O Gerald S. Williams, 22Y-103GA : mailto:gsw@agere.com O-
-O AGERE SYSTEMS, 555 UNION BLVD : office:610-712-8661 O-
-O ALLENTOWN, PA, USA 18109-3286 : mobile:908-672-7592 O-

Tim Peters wrote:
I'm deadly opposed to letting a keyboard interrupt break out of a wait for a Python lock. [...]
If a signal is delivered to a thread waiting for a mutex, upon return from the signal handler the thread shall resume waiting for the mutex as if it was not interrupted. and: If a signal is delivered to a thread waiting for a condition variable, upon return from the signal handler the thread shall resume waiting for the condition variable as if it was not interrupted, or it shall return zero due to spurious wakeup.
Sorry, I don't grasp what the point of this quoting was, unless it was a roundabout way of merely confirming that keyboard interrupts can't break out of a wait for a Python lock today (which was known and said before).
[Tim]
I'm deadly opposed to letting a keyboard interrupt break out of a wait for a Python lock.
I've heard this before. Can you explain why? It would not be incorrect -- if the SIGINT had arrived a microsecond before the wait started, the program would have been interrupted in exactly the same state. It's one of the few places where code can be blocked in a system call (if you want to call a lock wait a system call -- it's close enough for me) and a ^C doesn't stop it, and that can be annoying at times. Of course, if it's not the main thread, signals including SIGINT shouldn't do anything, but that's a separate issue. Breaking the main thread IMO is useful behavior for interactive programs and for scripts invoked from the command line. (In practice, this is probably only interesting for interactive use -- if you hang your main thread on a deadlock, there's no way to get your >>> prompt back, and that may mean no way to figure out what went wrong or save stuff you wanted to save.) --Guido van Rossum (home page: http://www.python.org/~guido/)
[Tim]
I'm deadly opposed to letting a keyboard interrupt break out of a wait for a Python lock.
[Guido]
I've heard this before. Can you explain why?
The code would be a mess, it wouldn't work across platforms (e.g., interrupts are invisible to a wait on a POSIX condvar), and it would change current behavior.
It would not be incorrect -- if the SIGINT had arrived a microsecond before the wait started, the program would have been interrupted in exactly the same state.
Ya, and if my program has been deadlocked for 4 hours, it's exactly the same as if it had simply been swapped out for that long without deadlock, and then the interrupt occurred right before it was about to grab the fatal lock. Try explaining that to a user base that thinks time.sleep() is an advanced synchronization technique <wink>.
It's one of the few places where code can be blocked in a system call (if you want to call a lock wait a system call -- it's close enough for me)
I'd be more upset about that if it weren't the *purpose* of lock.acquire() to block <wink>. If a user doesn't want to block, they should poll, acquire-with-timeout, or fix their bad assumptions.
and a ^C doesn't stop it, and that can be annoying at times.
Of course, if it's not the main thread, signals including SIGINT shouldn't do anything, but that's a separate issue.
Why should the main thread act differently?
Breaking the main thread IMO is useful behavior for interactive programs and for scripts invoked from the command line.
Being able to interrupt any thread may be useful. I guess I don't see what's especially useful about breaking the main thread. If my program is hosed, I'd just as soon kill the whole thing.
(In practice, this is probably only interesting for interactive use -- if you hang your main thread on a deadlock, there's no way to get your >>> prompt back, and that may mean no way to figure out what went wrong or save stuff you wanted to save.)
Hmm. The "save stuff" use may be most valuable for non-interactive long-running jobs, assuming that it's possible to save stuff despite that the rest of your threads remain deadlocked (implying they're all holding *some* lock). I suppose if you can guess *which* lock the main thread broke out of, you could at least release that one and hope for some progress ... I don't know. If the possibility were there, I expect one could, with care, rely on its details to build some more-or-less useful scheme on top of it -- at least on platforms where it worked. It's really not all that attractive on its own; maybe learning how to build efficient interruptible locks x-platform could lead to a more general gimmick, though.
It's one of the few places where code can be blocked in a system call (if you want to call a lock wait a system call -- it's close enough for me)
I'd be more upset about that if it weren't the *purpose* of lock.acquire() to block <wink>. If a user doesn't want to block, they should poll, acquire-with-timeout, or fix their bad assumptions.
I was thinking of the situation where a user learning about threads and locks gets in trouble in an interactive session, by typing something that grabs a lock that won't ever be released. Telling them they shouldn't have done that isn't going to help them. IMO this is the same kind of situation as comparing a recursive list to itself: this used to crash due to a stack overflow, and we cared enough about this "don't do that then" situation to fix it.
and a ^C doesn't stop it, and that can be annoying at times.
Of course, if it's not the main thread, signals including SIGINT shouldn't do anything, but that's a separate issue.
Why should the main thread act differently?
By fiat, only the main thread in Python is supposed to get signals.
Breaking the main thread IMO is useful behavior for interactive programs and for scripts invoked from the command line.
Being able to interrupt any thread may be useful. I guess I don't see what's especially useful about breaking the main thread. If my program is hosed, I'd just as soon kill the whole thing.
Interactively, the main thread is important.
(In practice, this is probably only interesting for interactive use -- if you hang your main thread on a deadlock, there's no way to get your >>> prompt back, and that may mean no way to figure out what went wrong or save stuff you wanted to save.)
Hmm. The "save stuff" use may be most valuable for non-interactive long-running jobs, assuming that it's possible to save stuff despite that the rest of your threads remain deadlocked (implying they're all holding *some* lock). I suppose if you can guess *which* lock the main thread broke out of, you could at least release that one and hope for some progress ...
I wasn't thinking of long-running non-interactive jobs. If you design one of those, you should know what you are doing. I was thinking of the poor interactive user who hung their interpreter by accident.
I don't know. If the possibility were there, I expect one could, with care, rely on its details to build some more-or-less useful scheme on top of it -- at least on platforms where it worked. It's really not all that attractive on its own; maybe learning how to build efficient interruptible locks x-platform could lead to a more general gimmick, though.
Yeah, unfortunately the only implementation technique I have to offer right now is to turn all acquire calls internally into a loop around an acquire call with a timeout in the order of 1 second, and to check signals each time around the loop. :-( --Guido van Rossum (home page: http://www.python.org/~guido/)
On Mon, Mar 18, 2002, Guido van Rossum wrote:
Yeah, unfortunately the only implementation technique I have to offer right now is to turn all acquire calls internally into a loop around an acquire call with a timeout in the order of 1 second, and to check signals each time around the loop. :-(
-1 The problem with this is that you really start tying yourself in knots. If you do a time.sleep() for one second, that can't be interrupted for a full second, either on keyboard interrupt or on releasing the lock. The finer-grained you go on time.sleep(), the more overhead you consume, and the coarser-grained you go, the more you limit throughput. I'm not aware of any good cross-platform technique for managing thread timeouts that is also "live". -- Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/ The best way to get information on Usenet is not to ask a question, but to post the wrong information. --Aahz
[Guido]
I was thinking of the situation where a user learning about threads and locks gets in trouble in an interactive session, by typing something that grabs a lock that won't ever be released. Telling them they shouldn't have done that isn't going to help them.
It will if it teaches them not to do it again <wink>.
IMO this is the same kind of situation as comparing a recursive list to itself: this used to crash due to a stack overflow, and we cared enough about this "don't do that then" situation to fix it.
A prime difference is that I've never seen anyone report this problem, and I still pay attention to the Help and Tutor lists. Reports of recursive compare blowups came around several times each year.
... Yeah, unfortunately the only implementation technique I have to offer right now is to turn all acquire calls internally into a loop around an acquire call with a timeout in the order of 1 second, and to check signals each time around the loop. :-(
I think a practical solution would need a different lock implementation at the lowest level. The new POSIX semaphore-based lock implementation is more-or-less friendly to this, although it looks painful even so. I'm not sure what could be done on Windows, but am not inclined to take it seriously before Orlijn reports the bug.
participants (7): Aahz, Gerald S. Williams, Guido van Rossum, Jason Tishler, martin@v.loewis.de, Tim Peters, Tim Peters