[Python-Dev] Signals+Threads (PyGTK waking up 10x/sec).

Sun Dec 9 02:30:09 CET 2007

On Dec 8, 2007 5:21 PM, Guido van Rossum <guido at python.org> wrote:
>
> On Dec 8, 2007 3:57 PM, Adam Olsen <rhamph at gmail.com> wrote:
> >
> > On Dec 8, 2007 4:28 PM, Guido van Rossum <guido at python.org> wrote:
> > >
> > > On Dec 8, 2007 2:36 PM, Adam Olsen <rhamph at gmail.com> wrote:
> > > > On Dec 8, 2007 2:56 PM,  <glyph at divmod.com> wrote:
> > > > > On 05:20 pm, guido at python.org wrote:
> > > > > >The best solution I can think of is to add a new API that takes a
> > > > > >signal and a file descriptor and registers a C-level handler for that
> > > > > >signal which writes a byte to the file descriptor. You can then create
> > > > > >a pipe, connect the signal handler to the write end, and add the read
> > > > > >end to your list of file descriptors passed to select() or poll(). The
> > > > > >handler must be written in C in order to avoid the race condition
> > > > > >referred to by Glyph (signals arriving after the signal check in the
> > > > > >VM main loop but before the select()/poll() system call is entered
> > > > > >will not be noticed until the select()/poll() call completes).
> > > > >
> > > > > This paragraph jogged my memory.  I remember this exact solution being
> > > > > discussed now, a year ago when I was last talking about these issues.
> > > > >
> > > > > There's another benefit to implementing a write-a-byte C signal handler.
> > > > > Without this feature, it wouldn't make sense to have passed the
> > > > > > SA_RESTART flag to sigaction, because GUIs written in Python could
> > > > > have spent an indefinite amount of time waiting to deliver their signal
> > > > > to Python code.  So, if you had to handle SIGCHLD in Python, for
> > > > > example, calls like file().write() would suddenly start raising a new
> > > > > exception (EINTR).  With it, you could avoid a whole class of subtle
> > > > > error-handling code in Twisted programs.
> > > >
> > > > SA_RESTART still isn't useful.  The low-level poll call (not write!)
> > > > must stop and call back into python.  If that doesn't indicate an
> > > > error you can safely restart your poll call though, and follow it with
> > > > a (probably non-blocking) write.
> > >
> > > Can't say I understand all of this, but it does reiterate that there
> > > are more problems with signals than just the issue that Gustavo is
> > > trying to squash. The possibility of having *any* I/O interrupted is
> > > indeed a big worry. Though perhaps this could be alleviated by rigging
> > > things so that signals get delivered (at the C level) to the main
> > > thread and the rest of the code runs in a non-main thread?
> >
> > That's the approach my threading patch will take, although reversed
> > (signals are handled by a background thread, leaving the main thread
> > as the *main* thread.)
>
> Hm... Does this mean you're *always* creating an extra thread to handle signals?

Yup, Py_Initialize will do it.
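
For illustration only, here is a rough Python sketch of the "dedicated signal thread" idea using signal.pthread_sigmask() and signal.sigwait() from current Python 3 on a POSIX system. It is not the threading patch itself; the helper name start_signal_thread and the "return False to stop" convention are assumptions made up for this example.

```python
import signal
import threading


def start_signal_thread(signums, callback):
    """Service `signums` from a helper thread instead of async handlers.

    Hypothetical helper: the name and the callback convention are
    assumptions for this sketch, not part of any proposed patch.
    """
    # Block the signals in the calling thread first. Threads created
    # afterwards inherit this mask, so none of them is interrupted.
    signal.pthread_sigmask(signal.SIG_BLOCK, signums)

    def loop():
        while True:
            # Sleeps until one of the blocked signals is pending,
            # then consumes it and returns its number.
            signum = signal.sigwait(signums)
            # Assumed convention: a callback returning False ends the loop.
            if callback(signum) is False:
                return

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Because the signals stay blocked everywhere, no syscall in any other thread is interrupted by them; the callback runs as ordinary Python code in the helper thread.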

> > I share your concern about interrupting whatever random syscalls (not
> > even limited to I/O!) that a library happens to use.
> >
> >
> > > > Note that the only reason to use C for a low-level handler here is
> > > > to give access to sig_atomic_t and avoid needing locks.  If you ran the
> > > > signal handler in a background thread (using sigwait to trigger them)
> > > > you could use a python handler.
> > >
> > > I haven't seen Gustavo's patch yet, but *my* reason for using a C
> > > handler was different -- it was because writing a byte to a pipe in
> > > Python would do nothing to fix Gustavo's issue.
> > >
> > > Looking at the man page for sigwait(), it could be an alternative
> > > solution, but I'm not sure how it would actually allow PyGTK to catch
> > > KeyboardInterrupt.
> >
> > My mail at [1] was referring to this.  Option 1 involved writing to a
> > pipe that gets polled while option 2 requires we generate a new signal
> > targeting the specific thread we want to interrupt.
> >
> > I'd like to propose an interim solution though: pygtk could install
> > their own SIGINT handler during the gtk mainloop (or all gtk code?),
> > have it write to a pipe monitored by gtk, and have gtk raise
> > KeyboardInterrupt if it gets used.  This won't allow custom SIGINT
> > handlers or any other signal handlers to run promptly, but it should
> > be good enough for OLPC's use case.
> >
> >
> > [1] http://mail.python.org/pipermail/python-dev/2007-December/075607.html
>
> Since OLPC has to use 2.5 they don't really have another choice
> besides this or making the timeout (perhaps much) larger -- I'm not
> going to risk a change as big as anything proposed here for 2.5.2, so
> nothing will change before 2.6.
>
> I've got to say that all the cross-referencing and asynchronous
> discussion here makes it *very* difficult to wrap my head around the
> various proposals. It also doesn't help that different participants
> appear to have different use cases in mind. E.g. do we care about
> threads started directly from C++ code? (These happen all the time at
> Google, but we don't care much about signals.) And what about
> restarting system calls (like Glyph brought up)?
>
> I've seen references to bug #1643738 which got a thumbs up from Tim
> Peters -- Adam, what do you think of that? I know it doesn't address
> Gustavo's issue but it seems useful in its own right.

It's a step in the right direction (and I don't think it will break
anything), but I don't think it's enough to make anything entirely
correct either.

Hrm.  If we replaced Py_AddPendingCall with a single flag (what
is_tripped is now), and had the main thread check it directly, I think
that'd avoid the corruption risks.  That's with bug #1643738, and
assuming sig_atomic_t functions sanely.
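
As a toy model of the single-flag idea (plain Python, not the C code in the signal module; the class and method names are invented for this sketch), the handler side does nothing but plain stores, and the main thread polls one flag:

```python
import signal


class TripFlags:
    """Toy model: one flag per signal plus a global is_tripped flag.

    In C these would be sig_atomic_t variables; this Python class only
    illustrates the control flow and makes no atomicity guarantees.
    """

    def __init__(self):
        self.tripped = [False] * signal.NSIG
        self.is_tripped = False

    def record(self, signum):
        # Handler side: two plain stores, no queue and no allocation.
        self.tripped[signum] = True
        self.is_tripped = True

    def check(self, dispatch):
        # Main-thread side: cheap test of a single flag.
        if not self.is_tripped:
            return 0
        # Clear the global flag before scanning, so a signal recorded
        # during the scan sets it again and is seen by the next check.
        self.is_tripped = False
        ran = 0
        for signum in range(1, signal.NSIG):
            if self.tripped[signum]:
                self.tripped[signum] = False
                dispatch(signum)
                ran += 1
        return ran
```

Repeated deliveries of the same signal before a check collapse into one dispatch, which matches how ordinary (non-realtime) pending signals behave anyway.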

To summarize, there are two problems to be solved:
1) low-level corruption in the signal handlers as they record a new
signal, such as in Py_AddPendingCall
2) high-level wakeup race: "check for pending signals, have a signal
come in, then call a blocking syscall/library (oblivious to the new
signal)."
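
To make the pipe-based fix for problem 2 concrete, here is a minimal sketch using signal.set_wakeup_fd() as it exists in current Python 3 on POSIX, where the C-level handler writes a byte to a file descriptor. The helper names make_wakeup_pipe and wait_for_events are assumptions for this example, and it must run in the main thread.

```python
import os
import select
import signal


def make_wakeup_pipe():
    """Create a pipe and register its write end as the wakeup fd.

    Hypothetical helper. Returns (read_fd, write_fd, previous_wakeup_fd).
    """
    read_fd, write_fd = os.pipe()
    # Both ends non-blocking: the C handler must never block on a full
    # pipe, and draining must never block on an empty one.
    os.set_blocking(read_fd, False)
    os.set_blocking(write_fd, False)
    previous = signal.set_wakeup_fd(write_fd)
    return read_fd, write_fd, previous


def wait_for_events(read_fds, wakeup_fd, timeout=None):
    """select() on read_fds plus the wakeup pipe.

    Returns (ready_fds, signalled). A signal that arrives after the
    interpreter's last signal check but before select() is entered has
    already left a byte in the pipe, so select() returns immediately
    instead of sleeping through it.
    """
    ready, _, _ = select.select(list(read_fds) + [wakeup_fd], [], [], timeout)
    signalled = wakeup_fd in ready
    if signalled:
        try:
            os.read(wakeup_fd, 512)  # drain so the next wait can sleep
        except BlockingIOError:
            pass
    return [fd for fd in ready if fd != wakeup_fd], signalled
```

Note that a byte is only written for signals that have a handler installed through signal.signal(); the pipe wakes the loop up, and the Python-level handler still runs from the interpreter's normal signal check.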

> Gustavo, at some point you suggested making changes to Python so that
> all signals are blocked in all threads except for the main thread. I
> think I'd be more inclined to give that the green light than the patch
> using pipes for all signal handling, as long as we can make sure that
> this blocking of all signals isn't inherited by fork()'ed children --
> we had serious problems with that in 2.4 where child processes were
> unkillable (except for SIGKILL). I'd also be OK with a patch that
> leaves the existing signal handling code intact but *adds* a way to
> have a signal handler written in C that writes one byte to one end of
> a pipe -- where the pipe is provided by Python code.

I'm not sure this helps anything (without a dedicated signal-handler
thread).  It doesn't avoid interrupting random syscalls, and the
current code should already ensure only the main thread does any real
processing.

The "if (getpid() == main_pid) {" could use a more precise comment
though.  If I understand correctly, POSIX requires that to always
evaluate to true.  The only time it returns false is on LinuxThreads
(pre-NPTL), where it cancels out another bug of sending the signal to
ALL threads (rather than picking one at random.)

> Does any of this make sense still?
>
> Anyway, I would still like to discuss this on #python-dev Monday.
> Adam, in what time zone are you? (I'm PST.) Who else is interested?

MST.

-- 
Adam Olsen, aka Rhamphoryncus