Questions about signal handling.

Hi all, I've got a pretty good sense of how signal handling works in the runtime (i.e. via a dance with the eval loop), but still have some questions:

1. Why do we restrict calls to signal.signal() to the main thread?
2. Why must signal handlers run in the main thread?
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?

More details are below. My interest in the topic relates to improving in-process interpreter isolation.

#1 & #2
-----------

Toward the top of signalmodule.c we find the following comment [1] (written in 1994):

/*
   NOTES ON THE INTERACTION BETWEEN SIGNALS AND THREADS

   When threads are supported, we want the following semantics:

   - only the main thread can set a signal handler
   - any thread can get a signal handler
   - signals are only delivered to the main thread

   I.e. we don't support "synchronous signals" like SIGFPE (catching
   this doesn't make much sense in Python anyway) nor do we support
   signals as a means of inter-thread communication, since not all
   thread implementations support that (at least our thread library
   doesn't).

   We still have the problem that in some implementations signals
   generated by the keyboard (e.g. SIGINT) are delivered to all
   threads (e.g. SGI), while in others (e.g. Solaris) such signals are
   delivered to one random thread (an intermediate possibility would
   be to deliver it to the main thread -- POSIX?).  For now, we have
   a working implementation that works in all three cases -- the
   handler ignores signals if getpid() isn't the same as in the main
   thread.  XXX This is a hack.
*/

At the very top of the file we see another relevant comment:

/* XXX Signals should be recorded per thread, now we have thread state. */

That one was written in 1997, right after PyThreadState was introduced.

So is the constraint about the main thread just a historical artifact? If not, what would be an appropriate explanation for why signals must be strictly bound to the main thread?

#3
-----

Regarding the use of Py_MakePendingCalls() for signal handling, I can imagine the history there. However, is there any reason signal handling couldn't be separated from the "pending calls" machinery at this point? As far as I can tell there is no longer any strong relationship between the two.

-eric

[1] https://github.com/python/cpython/blob/master/Modules/signalmodule.c#L71
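(To make the "dance with the eval loop" concrete, here is a minimal sketch of the general pattern in plain C. It is illustrative only, not CPython's actual code: the C-level handler just sets a flag, and the interpreter loop checks that flag at safe points and only then runs the real handling logic.)

/* Illustrative sketch (not CPython code): defer signal handling to a
   flag that the main loop polls at safe points. */
#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t signal_pending = 0;

static void c_handler(int signo)
{
    (void)signo;
    signal_pending = 1;   /* setting a sig_atomic_t flag is about all
                             that is async-signal-safe here */
}

int main(void)
{
    signal(SIGINT, c_handler);
    raise(SIGINT);                /* simulate an incoming signal */

    for (;;) {                    /* stand-in for the eval loop */
        if (signal_pending) {
            signal_pending = 0;
            /* CPython would run the Python-level handler here,
               via PyErr_CheckSignals(). */
            printf("handling deferred signal\n");
            break;
        }
        /* ... execute the next "instruction" ... */
    }
    return 0;
}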

On Sat, Sep 22, 2018 at 01:05, Eric Snow <ericsnowcurrently@gmail.com> wrote:
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?
Signals can be received anytime, between two instructions at the machine code level. But the Python code base is rarely reentrant. Moreover, you can get the signal while you don't hold the GIL :-)

Victor

On Sat, 22 Sep 2018 at 09:14, Victor Stinner <vstinner@redhat.com> wrote:
On Sat, Sep 22, 2018 at 01:05, Eric Snow <ericsnowcurrently@gmail.com> wrote:
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?
Signals can be received anytime, between two instructions at the machine code level. But the Python code base is rarely reentrant. Moreover, you can get the signal while you don't hold the GIL :-)
This would actually be the main reason to keep the current behaviour: at least some folks are running their applications in subthreads as a workaround to avoid https://bugs.python.org/issue29988 and the general fact that Ctrl-C handling and deterministic resource cleanup generally don't get along overly well.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
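(For illustration, here is roughly what that workaround looks like at the POSIX C level; the Python version uses the threading module plus a main thread that waits and handles KeyboardInterrupt. sigwait() and pthread_sigmask() are standard POSIX APIs; everything else is invented for the sketch.)

/* Sketch of the "run the app in a subthread" workaround (POSIX;
   compile with -pthread).  Not anyone's real code. */
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t stop = 0;

static void *app_main(void *arg)
{
    (void)arg;
    while (!stop)
        sleep(1);             /* ... the application's real work ... */
    /* Deterministic cleanup runs here, not inside a signal handler. */
    printf("app thread: cleaning up\n");
    return NULL;
}

int main(void)
{
    sigset_t set;
    pthread_t app;
    int signo;

    /* Block SIGINT before spawning so the app thread inherits the
       mask and only the main thread (via sigwait) sees Ctrl-C. */
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    pthread_sigmask(SIG_BLOCK, &set, NULL);

    pthread_create(&app, NULL, app_main, NULL);
    sigwait(&set, &signo);    /* main thread just waits for Ctrl-C */
    stop = 1;                 /* ask the app to shut down cleanly */
    pthread_join(app, NULL);
    return 0;
}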

On Fri, Sep 21, 2018 at 5:11 PM Victor Stinner <vstinner@redhat.com> wrote:
On Sat, Sep 22, 2018 at 01:05, Eric Snow <ericsnowcurrently@gmail.com> wrote:
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?
Signals can be received anytime, between two instructions at the machine code level. But the Python code base is rarely reentrant. Moreover, you can get the signal while you don't hold the GIL :-)
Sorry, I wasn't clear. I'm not suggesting that signals should be handled outside the interpreter. Instead, why do we call PyErr_CheckSignals() in Py_MakePendingCalls() rather than distinctly, right before we call Py_MakePendingCalls()?

-eric

On Fri, Sep 21, 2018 at 6:10 PM, Victor Stinner <vstinner@redhat.com> wrote:
Moreover, you can get the signal while you don't hold the GIL :-)
Note that, in Windows, SIGINT and SIGBREAK are implemented in the C runtime and linked to the corresponding console control events in a console application, such as python.exe. Console control events are delivered on a new thread (i.e. no Python thread state) that starts at CtrlRoutine in kernelbase.dll. The session server (csrss.exe) creates this thread remotely upon request from the console host process (conhost.exe).
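(A minimal Windows sketch of what eryk sun describes; SetConsoleCtrlHandler and CTRL_C_EVENT are the real Win32 APIs, while the rest is invented for the demo. Printing the thread ids shows the handler running on its own thread.)

/* Minimal sketch (Windows): the console control handler runs on a
   thread of its own, distinct from main(). */
#include <windows.h>
#include <stdio.h>

static volatile LONG got_ctrl_c = 0;

static BOOL WINAPI ctrl_handler(DWORD event)
{
    if (event == CTRL_C_EVENT) {
        /* This prints a different thread id than main() does. */
        printf("handler thread: %lu\n", GetCurrentThreadId());
        InterlockedExchange(&got_ctrl_c, 1);
        return TRUE;   /* handled; skip the default handler */
    }
    return FALSE;
}

int main(void)
{
    printf("main thread: %lu\n", GetCurrentThreadId());
    SetConsoleCtrlHandler(ctrl_handler, TRUE);
    while (!got_ctrl_c)
        Sleep(100);    /* spin until Ctrl-C arrives */
    printf("Ctrl-C received; exiting\n");
    return 0;
}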

On Fri, Sep 21, 2018 at 7:04 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
Hi all,
I've got a pretty good sense of how signal handling works in the runtime (i.e. via a dance with the eval loop), but still have some questions:
1. Why do we restrict calls to signal.signal() to the main thread?
2. Why must signal handlers run in the main thread?
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?
Here's my take on this:

Handling signals in a multi-threaded program is hard. Some signals can be delivered to an arbitrary thread, some to the one that caused them. POSIX provides lots of mechanisms to tune how signals are received (or blocked) by individual threads, but (a) Python doesn't expose those APIs, and (b) using those APIs correctly is insanely hard. By restricting signal delivery to the main thread, we remove all that complexity.

Restricting signal.signal() so that it can only be called from the main thread just makes the API more consistent (and also IIRC avoids weird sigaction() behaviour when it is called from different threads within one program).

Next, you can only call reentrant functions in your signal handlers. For instance, the printf() function isn't safe to use. Therefore one common practice is to set a flag that a signal was received and check it later (exactly what we do with the pending calls machinery).

Therefore, IMO, the current way we handle signals in Python is the safest, most predictable, and most cross-platform option there is. And changing how the Python signals API works with threads in any way will actually break the world.

Yury
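(For context, a sketch of the kind of POSIX per-thread tuning Yury mentions, which Python deliberately doesn't expose: pthread_sigmask() blocks SIGINT in a worker thread so delivery falls to the main thread. Compile with -pthread; illustrative only.)

/* Sketch (not CPython code): per-thread signal masking on POSIX. */
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static void *worker(void *arg)
{
    sigset_t set;
    (void)arg;
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    /* Block SIGINT in this thread only; the kernel then delivers it
       to a thread that leaves it unblocked (here, the main thread). */
    pthread_sigmask(SIG_BLOCK, &set, NULL);
    for (;;)
        pause();              /* never sees SIGINT */
    return NULL;
}

static void on_sigint(int signo)
{
    (void)signo;              /* async-signal-safe: do nothing */
}

int main(void)
{
    pthread_t tid;
    signal(SIGINT, on_sigint);
    pthread_create(&tid, NULL, worker, NULL);
    pause();                  /* SIGINT interrupts the main thread */
    printf("main thread got SIGINT\n");
    return 0;
}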

On Mon, Sep 24, 2018 at 11:14 AM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Fri, Sep 21, 2018 at 7:04 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
1. Why do we restrict calls to signal.signal() to the main thread?
2. Why must signal handlers run in the main thread?
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?
Here's my take on this:
Handling signals in a multi-threaded program is hard. Some signals can be delivered to an arbitrary thread, some to the one that caused them. POSIX provides lots of mechanisms to tune how signals are received (or blocked) by individual threads, but (a) Python doesn't expose those APIs, and (b) using those APIs correctly is insanely hard. By restricting signal delivery to the main thread, we remove all that complexity.
Just to be clear, I'm *not* suggesting that we allow folks to specify in which Python (or kernel) thread a Python-level signal handler is called.

The reason I've asked about signals is because of the semantics under subinterpreters (where there is no "main" thread). However, I don't have any plans to introduce per-interpreter signal handlers. Mostly I want to understand about the "main" thread restriction for the possible impact on subinterpreters.

FWIW, I'm also mildly curious about the value of the "main" thread restriction currently. From what I can tell the restriction was made early on and there are hints in the C code that it's no longer needed. I suspect we still have the restriction solely because no one has bothered to change it. However, I wasn't sure so I figured I'd ask. :)
Restricting signal.signal() so that it can only be called from the main thread just makes the API more consistent
Yeah, that's what I thought.
(and also IIRC avoids weird sigaction() behaviour when it is called from different threads within one program).
Is there a good place where this weirdness is documented?
Next, you can only call reentrant functions in your signal handlers. For instance, the printf() function isn't safe to use. Therefore one common practice is to set a flag that a signal was received and check it later (exactly what we do with the pending calls machinery).
We don't actually use the pending calls machinery for signals though. The only thing we do is *always* call PyErr_CheckSignals() before making any pending calls. Wouldn't it be equivalent if we called PyErr_CheckSignals() at the beginning of the eval loop right before calling Py_MakePendingCalls()?

This matters to me because I'd like to use "pending" calls for subinterpreters, which means dealing with signals *in* Py_MakePendingCalls() is problematic. Pulling the PyErr_CheckSignals() call out would eliminate that problem.
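(Concretely, the proposed ordering might look something like the sketch below. This is illustrative only, not the actual ceval.c code; PyErr_CheckSignals() and Py_MakePendingCalls() are the real C-API calls, while the surrounding loop is invented.)

/* Sketch of the proposed separation: check signals first, then run
   pending calls, as two distinct steps at the top of the loop. */
#include <Python.h>

static int eval_like_loop(void)
{
    for (int i = 0; i < 3; i++) {
        /* Step 1: run Python-level signal handlers if a signal
           flag was set by the C handler. */
        if (PyErr_CheckSignals() < 0)
            return -1;
        /* Step 2: run pending calls, now signal-free. */
        if (Py_MakePendingCalls() < 0)
            return -1;
        /* ... dispatch the next bytecode instruction ... */
    }
    return 0;
}

int main(void)
{
    Py_Initialize();
    if (eval_like_loop() < 0)
        PyErr_Print();
    Py_Finalize();
    return 0;
}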
Therefore, IMO, the current way we handle signals in Python is the safest, most predictable, and most cross-platform option there is. And changing how the Python signals API works with threads in any way will actually break the world.
I agree that the way we deal with signals (i.e. set a flag that is later handled in PyErr_CheckSignals(), protected by the GIL) shouldn't change. My original 3 questions do not relate to that. Looks like that wasn't terribly clear. :)

-eric

On Mon, Sep 24, 2018 at 4:19 PM Eric Snow <ericsnowcurrently@gmail.com> wrote: [..]
Is there a good place where this weirdness is documented?
I'll need to look through uvloop & libuv commit log to remember that; will try to find time tonight/tomorrow. [..]
This matters to me because I'd like to use "pending" calls for subinterpreters, which means dealing with signals *in* Py_MakePendingCalls() is problematic. Pulling the PyErr_CheckSignals() call out would eliminate that problem.
Py_MakePendingCalls is a public API, even though it's not documented. If we change it to not call PyErr_CheckSignals, and if there are C extensions that block pure Python code execution for a long time (but call Py_MakePendingCalls explicitly), such extensions would stop reacting to ^C.

Maybe a better workaround would be to introduce a concept of a "main" sub-interpreter? We can then fix Py_MakePendingCalls to only check for signals when it's called from the main interpreter.

Yury
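(The kind of extension Yury describes might look like this sketch; the module and function names are hypothetical, while Py_MakePendingCalls() is the real call that today also checks for signals.)

/* Hypothetical extension function illustrating the pattern above:
   long-running C code that stays responsive to ^C by explicitly
   draining pending calls. */
#include <Python.h>

static PyObject *crunch(PyObject *self, PyObject *args)
{
    long n;
    if (!PyArg_ParseTuple(args, "l", &n))
        return NULL;
    for (long i = 0; i < n; i++) {
        /* ... expensive C-only work, no bytecode executed ... */
        if (i % 100000 == 0) {
            /* Without this, KeyboardInterrupt would be delayed
               until we return to the eval loop. */
            if (Py_MakePendingCalls() < 0)
                return NULL;   /* e.g. KeyboardInterrupt was raised */
        }
    }
    Py_RETURN_NONE;
}

static PyMethodDef methods[] = {
    {"crunch", crunch, METH_VARARGS, "Long-running C loop."},
    {NULL, NULL, 0, NULL}
};

static struct PyModuleDef module = {
    PyModuleDef_HEAD_INIT, "demo", NULL, -1, methods
};

PyMODINIT_FUNC PyInit_demo(void)
{
    return PyModule_Create(&module);
}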

On Mon, Sep 24, 2018 at 3:10 PM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Mon, Sep 24, 2018 at 4:19 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
This matters to me because I'd like to use "pending" calls for subinterpreters, which means dealing with signals *in* Py_MakePendingCalls() is problematic. Pulling the PyErr_CheckSignals() call out would eliminate that problem.
Py_MakePendingCalls is a public API, even though it's not documented. If we change it to not call PyErr_CheckSignals, and if there are C extensions that block pure Python code execution for a long time (but call Py_MakePendingCalls explicitly), such extensions would stop reacting to ^C.
Maybe a better workaround would be to introduce a concept of a "main" sub-interpreter? We can then fix Py_MakePendingCalls to only check for signals when it's called from the main interpreter.
I'm planning on making Py_MakePendingCalls() a backward-compatible wrapper around a new private _Py_MakePendingCalls() which supports per-interpreter operation. Then the eval loop will call the new internal function. So nothing would change for users.

-eric

Please don't rely on this ugly API. *By design*, Py_AddPendingCall() tries 100 times to acquire the lock: if it fails to acquire the lock, it does nothing... your callback is ignored... (A defensive-usage sketch follows the quoted message below.)

By the way, recently, we had to fix yet another bug in signal handling. A new function has been added:

void
_PyEval_SignalReceived(void)
{
    /* bpo-30703: Function called when the C signal handler of Python
       gets a signal. We cannot queue a callback using
       Py_AddPendingCall() since that function is not
       async-signal-safe. */
    SIGNAL_PENDING_CALLS();
}

If you want to exchange commands between two interpreters, which I see as two threads, I suggest using two queues and something to consume them.

Victor

On Mon, Sep 24, 2018 at 22:23, Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Mon, Sep 24, 2018 at 11:14 AM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Fri, Sep 21, 2018 at 7:04 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
1. Why do we restrict calls to signal.signal() to the main thread?
2. Why must signal handlers run in the main thread?
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?
Here's my take on this:
Handling signals in a multi-threaded program is hard. Some signals can be delivered to an arbitrary thread, some to the one that caused them. POSIX provides lots of mechanisms to tune how signals are received (or blocked) by individual threads, but (a) Python doesn't expose those APIs, and (b) using those APIs correctly is insanely hard. By restricting signal delivery to the main thread, we remove all that complexity.
Just to be clear, I'm *not* suggesting that we allow folks to specify in which Python (or kernel) thread a Python-level signal handler is called.
The reason I've asked about signals is because of the semantics under subinterpreters (where there is no "main" thread). However, I don't have any plans to introduce per-interpreter signal handlers. Mostly I want to understand about the "main" thread restriction for the possible impact on subinterpreters.
FWIW, I'm also mildly curious about the value of the "main" thread restriction currently. From what I can tell the restriction was made early on and there are hints in the C code that it's no longer needed. I suspect we still have the restriction solely because no one has bothered to change it. However, I wasn't sure so I figured I'd ask. :)
Restricting signal.signal() so that it can only be called from the main thread just makes the API more consistent
Yeah, that's what I thought.
(and also IIRC avoids weird sigaction() behaviour when it is called from different threads within one program).
Is there a good place where this weirdness is documented?
Next, you can only call reentrant functions in your signal handlers. For instance, the printf() function isn't safe to use. Therefore one common practice is to set a flag that a signal was received and check it later (exactly what we do with the pending calls machinery).
We don't actually use the pending calls machinery for signals though. The only thing we do is *always* call PyErr_CheckSignals() before making any pending calls. Wouldn't it be equivalent if we called PyErr_CheckSignals() at the beginning of the eval loop right before calling Py_MakePendingCalls()?
This matters to me because I'd like to use "pending" calls for subinterpreters, which means dealing with signals *in* Py_MakePendingCalls() is problematic. Pulling the PyErr_CheckSignals() call out would eliminate that problem.
Therefore, IMO, the current way we handle signals in Python is the safest, most predictable, and most cross-platform option there is. And changing how the Python signals API works with threads in any way will actually break the world.
I agree that the way we deal with signals (i.e. set a flag that is later handled in PyErr_CheckSignals(), protected by the GIL) shouldn't change. My original 3 questions do not relate to that. Looks like that wasn't terribly clear. :)
-eric
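(Returning to Victor's warning about Py_AddPendingCall(): a sketch of defensive usage from a non-Python thread, checking the return value since a failed call silently drops the callback. Py_AddPendingCall() is the real API; the callback and retry policy here are invented.)

/* Sketch (not CPython internals): scheduling work in the main thread
   from a foreign thread via Py_AddPendingCall().  The callback later
   runs in the main thread with the GIL held. */
#include <Python.h>
#include <stdio.h>

static int my_callback(void *arg)
{
    /* The GIL is held here, so touching Python objects is safe. */
    printf("pending call ran with arg %p\n", arg);
    return 0;     /* 0 = success; -1 with an exception set = failure */
}

static void schedule_from_any_thread(void *arg)
{
    /* Py_AddPendingCall() returns -1 on failure (e.g. it could not
       acquire the internal lock), so check the result and retry
       rather than assuming the callback was queued. */
    while (Py_AddPendingCall(my_callback, arg) < 0) {
        /* back off and retry; a real program would sleep here */
    }
}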

On Sep 25, 2018, at 03:44, Victor Stinner <vstinner@redhat.com> wrote:
By the way, recently, we had to fix yet another bug in signal handling. A new function has been added:
void
_PyEval_SignalReceived(void)
{
    /* bpo-30703: Function called when the C signal handler of Python
       gets a signal. We cannot queue a callback using
       Py_AddPendingCall() since that function is not
       async-signal-safe. */
    SIGNAL_PENDING_CALLS();
}
Is anybody else concerned about the proliferation of undocumented private C API functions? I’m fine with adding leading underscore functions and macros when it makes sense, but what concerns me is that they don’t appear in the Python C API documentation (AFAICT). That means they are undiscoverable, and their existence and utility is buried in institutional knowledge and obscure places within the C code. And yet, the interpreter relies heavily on them.

Maybe this is better off discussed in doc-sig but I think we need to consider documenting the private C API.

-Barry

On Tue, 25 Sep 2018 10:01:56 -0400 Barry Warsaw <barry@python.org> wrote:
On Sep 25, 2018, at 03:44, Victor Stinner <vstinner@redhat.com> wrote:
By the way, recently, we had to fix yet another bug in signal handling. A new function has been added:
void
_PyEval_SignalReceived(void)
{
    /* bpo-30703: Function called when the C signal handler of Python
       gets a signal. We cannot queue a callback using
       Py_AddPendingCall() since that function is not
       async-signal-safe. */
    SIGNAL_PENDING_CALLS();
}
Is anybody else concerned about the proliferation of undocumented private C API functions? I’m fine with adding leading underscore functions and macros when it makes sense, but what concerns me is that they don’t appear in the Python C API documentation (AFAICT).
Not really. Many are just like "static" (i.e. module-private) functions, except that they need to be shared by two or three different C modules. It's definitely the case for _PyEval_SignalReceived().

Putting them in the C API documentation risks making the docs harder to browse through for third-party users. I think it's enough if there's a comment in the .h file explaining the given function.

Regards

Antoine.

On Sep 25, 2018, at 10:18, Antoine Pitrou <solipsis@pitrou.net> wrote:
Not really. Many are just like "static" (i.e. module-private) functions, except that they need to be shared by two or three different C modules. It's definitely the case for _PyEval_SignalReceived().
Purely static functions which appear only in the file they are defined in are probably fine not to document, although I do still think we should take care to comment on their semantics and external behaviors (i.e. reference counting). But if they’re used in multiple C files, then I think they *can* deserve placement within the documentation.
Putting them in the C API documentation risks making the docs harder to browse through for third-party users. I think it's enough if there's a comment in the .h file explaining the given function.
It’s a trade-off for sure. I don’t have any great ideas about how to balance that, and I don’t know what documentation techniques would help, but it does often bother me that I can’t search for them on docs.python.org.

Cheers,
-Barry

On Tue, Sep 25, 2018 at 8:30 AM Barry Warsaw <barry@python.org> wrote:
On Sep 25, 2018, at 10:18, Antoine Pitrou <solipsis@pitrou.net> wrote:
Putting them in the C API documentation risks making the docs harder to browse through for third-party users. I think it's enough if there's a comment in the .h file explaining the given function.
It’s a trade-off for sure. I don’t have any great ideas about how to balance that, and I don’t know what documentation techniques would help, but it does often bother me that I can’t search for them on docs.python.org.
FWIW, I've run into the same issue. Perhaps we could have a single dedicated page in the C-API docs for internal API. It could be just a big table with a bold red warning at the top (e.g. "These are internal-only APIs, here for the benefit of folks working on CPython itself."). We *could* even have a CI check to ensure that new internal API (which doesn't happen often) gets added to the table.

-eric

On Wed, 26 Sep 2018 at 00:33, Barry Warsaw <barry@python.org> wrote:
On Sep 25, 2018, at 10:18, Antoine Pitrou <solipsis@pitrou.net> wrote:
Not really. Many are just like "static" (i.e. module-private) functions, except that they need to be shared by two or three different C modules. It's definitely the case for _PyEval_SignalReceived().
Purely static functions which appear only in the file they are defined in are probably fine not to document, although I do still think we should take care to comment on their semantics and external behaviors (i.e. reference counting). But if they’re used in multiple C files, then I think they *can* deserve placement within the documentation.
We run into this problem with the test.support helpers as well (we have more helpers than just those in the docs, but the others tend to rely on contributors and/or PR reviewers having looked at other tests that already use them).

Fleshing out the "internals" docs idea that some folks have mentioned:

1. Call it "Doc/_internals" and keep the leading underscore in the published docs
2. Use it to cover both C internals and Python internals (such as test.support)
3. Permit use of autodoc tools that we don't allow in the main docs (as these docs would be for CPython contributors, so the intended audience for the docs is the same as the audience for the code)
4. Potentially pull in some specific files and sections from the source code as literal include blocks (as per http://docutils.sourceforge.net/docs/ref/rst/directives.html#include) rather than rewriting them

Cheers,
Nick.

P.S. While it wouldn't be usable directly, https://github.com/jnikula/hawkmoth at least demonstrates the principle of extracting Sphinx API docs from C source files.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Tue, Sep 25, 2018 at 10:04 AM Barry Warsaw <barry@python.org> wrote:
Is anybody else concerned about the proliferation of undocumented private C API functions?
I am concerned about that too. In my opinion, having all those semi-private undocumented C APIs just contributes to the noise and artificially inflates the grey area of the C API that alternative implementations *have to* support.

We already have a mechanism for private header files: the "Include/internal/" directory. I think it should be mandatory to always put private C API-like functions/structs there.

Yury

On Tue, Sep 25, 2018 at 9:16 AM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
We already have a mechanism for private header files: the "Include/internal/" directory. I think it should be mandatory to always put private C API-like functions/structs there.
+1

This is the main reason I created that directory. (Victor had a similar idea about the same time.)

Having the separate "Include/internal" directory definitely makes it much easier to discover internal APIs, to distinguish between public and private, and to keep extension authors (and embedders) from using internal APIs without knowing what they're doing.

That said, having docs for the internal APIs would still be awesome!

-eric

Nobody should use _PyEval_SignalReceived(). It should only be used by the C signal handler.

But if we have separate documentation for CPython internals, why not document private functions there. At least, I would prefer not to put it in the same place as the *public* C API. (At least, a different directory.)

Victor

On Tue, Sep 25, 2018 at 16:05, Barry Warsaw <barry@python.org> wrote:
On Sep 25, 2018, at 03:44, Victor Stinner <vstinner@redhat.com> wrote:
By the way, recently, we had to fix yet another bug in signal handling. A new function has been added:
void
_PyEval_SignalReceived(void)
{
    /* bpo-30703: Function called when the C signal handler of Python
       gets a signal. We cannot queue a callback using
       Py_AddPendingCall() since that function is not
       async-signal-safe. */
    SIGNAL_PENDING_CALLS();
}
Is anybody else concerned about the proliferation of undocumented private C API functions? I’m fine with adding leading underscore functions and macros when it makes sense, but what concerns me is that they don’t appear in the Python C API documentation (AFAICT). That means they are undiscoverable, and their existence and utility is buried in institutional knowledge and obscure places within the C code. And yet, the interpreter relies heavily on them.
Maybe this is better off discussed in doc-sig but I think we need to consider documenting the private C API.
-Barry

On Sep 25, 2018, at 11:28, Victor Stinner <vstinner@redhat.com> wrote:
But if we have separate documentation for CPython internals, why not document private functions there. At least, I would prefer not to put it in the same place as the *public* C API. (At least, a different directory.)
I like the idea of an “internals” C API documentation, separate from the public API.

-Barry

On Tue, Sep 25, 2018 at 8:55 AM Barry Warsaw <barry@python.org> wrote:
On Sep 25, 2018, at 11:28, Victor Stinner <vstinner@redhat.com> wrote:
But if we have separate documentation for CPython internals, why not document private functions there. At least, I would prefer not to put it in the same place as the *public* C API. (At least, a different directory.)
I like the idea of an “internals” C API documentation, separate from the public API.
Right. IMO it should be physically separate from the public C API docs. I.e. reside in a different subdirectory of Doc, and be published at a different URL (perhaps not even under docs.python.org), since the audience here is exclusively people who want to modify the CPython interpreter, *not* people who want to write extension modules for use with CPython. And we should fully reserve the right to change their behavior incompatibly, even in bugfix releases.

--
--Guido van Rossum (python.org/~guido)

On Tue, Sep 25, 2018 at 11:55 AM Barry Warsaw <barry@python.org> wrote:
On Sep 25, 2018, at 11:28, Victor Stinner <vstinner@redhat.com> wrote:
But if we have separate documentation for CPython internals, why not document private functions there. At least, I would prefer not to put it in the same place as the *public* C API. (At least, a different directory.)
I like the idea of an “internals” C API documentation, separate from the public API.
For that we can just document them in the code, right? Like this one, from Include/internal/pystate.h:

/* Initialize _PyRuntimeState.
   Return NULL on success, or return an error message on failure. */
PyAPI_FUNC(_PyInitError) _PyRuntime_Initialize(void);

My main concern with maintaining a *separate* documentation of internals is that it would make it harder to keep it in sync with the actual implementation. We often struggle to keep the comments in the code in sync with that code.

Yury

On Sep 25, 2018, at 12:09, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
My main concern with maintaining a *separate* documentation of internals is that it would make it harder to keep it in sync with the actual implementation. We often struggle to keep the comments in the code in sync with that code.
Well, my goal is that the internal API would show up when I search for function names on docs.python.org. Right now, I believe the “quick search” box does search the entire documentation suite. I don’t care too much whether they would reside in a separate section in the current C API, or in a separate directory, listed or not under “Parts of the documentation” on the front landing page. But I agree they shouldn’t be intermingled with the public C API.

Cheers,
-Barry

On Tue, Sep 25, 2018 at 3:27 PM Barry Warsaw <barry@python.org> wrote:
On Sep 25, 2018, at 12:09, Yury Selivanov <yselivanov.ml@gmail.com> wrote:
My main concern with maintaining a *separate* documentation of internals is that it would make it harder to keep it in sync with the actual implementation. We often struggle to keep the comments in the code in sync with that code.
Well, my goal is that the internal API would show up when I search for function names on docs.python.org. Right now, I believe the “quick search” box does search the entire documentation suite. I don’t care too much whether they would reside in a separate section in the current C API, or in a separate directory, listed or not under “Parts of the documentation” on the front landing page. But I agree they shouldn’t be intermingled with the public C API.
An idea: it would be cool to have something like Sphinx autodoc for C headers to pull this documentation from source.

Yury

Barry Warsaw writes:
I like the idea of an “internals” C API documentation, separate from the public API.
FWIW, this worked well for XEmacs ("came for flamewars, stayed for the internals manual"). Much of the stuff we inherited from GNU only got documented when there was massive bugginess (yes, that happened ;-), but we were pretty good at getting *something* (if only a "Here be Dragons" in Portuguese) in for most new global identifiers, and that was generally enough.

On Tue, Sep 25, 2018 at 1:45 AM Victor Stinner <vstinner@redhat.com> wrote:
Please don't rely on this ugly API. *By design*, Py_AddPendingCall() tries 100 times to acquire the lock: if it fails to acquire the lock, it does nothing... your callback is ignored...
Yeah, there are issues with pending calls as implemented. Furthermore, I'm not clear on why it was made a public API in the first place. Ultimately I'd like to deprecate Py_AddPendingCall and Py_MakePendingCalls (but that's not my priority right now). Regardless, the underlying machinery matches what I need for interpreter isolation right now. See below.
By the way, recently, we had to fix yet another bug in signal handling. A new function has been added: [snip]
I saw that. If anything, it's more justification for separating signals from the pending calls machinery. :)
If you want to exchange commands between two interpreters, which I see as two threads, I suggest using two queues and something to consume them.
Sure. However, to make that work I'd end up with something that's almost identical to the existing pending calls machinery. Since the function-to-run may call Python code, there must be an active PyThreadState and the GIL (or, eventually, target interpreter's lock) must be held. That is what Py_MakePendingCalls gives us already. Injecting the pending call into the eval loop (as we do today) means we don't have to create a new thread just for handling pending calls. [1]

In the short term I'd like to stick with existing functionality as much as possible. Doing so has a number of benefits. That's been one of my guiding principles as I've worked toward the multi-core Python goal. That said, I agree that the pending calls machinery has some deficiencies that should be fixed or something better should replace it. I just don't want that to get in the way of short-term goals, especially as I have limited time for this.

-eric

[1] FWIW, a separate thread for "asynchronous" operations (ones that interrupt the eval loop, e.g. "pending" calls and signals) might be a viable approach now that we require platforms to support threading. The way we interrupt the eval loop currently seems more complex (and inefficient) than necessary. I was thinking about this last week and plan to explore it further at some point in the future. For now, though, I'd like to keep the focus on more immediate needs. :)
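(A sketch of the dedicated-thread idea from the footnote: a consumer thread drains a queue of operations and runs each one under the GIL. PyGILState_Ensure()/PyGILState_Release() are the real APIs; the queue and callback types are invented, and the runtime is assumed to already be initialized.)

/* Sketch only (not CPython's implementation): a dedicated consumer
   thread for "asynchronous" operations, instead of interrupting the
   eval loop.  Compile with -pthread. */
#include <Python.h>
#include <pthread.h>

typedef void (*op_func)(void *);

/* Hypothetical blocking queue; implementation omitted. */
extern int queue_pop(op_func *func, void **arg);

static void *consumer(void *unused)
{
    op_func func;
    void *arg;
    (void)unused;
    while (queue_pop(&func, &arg) == 0) {
        /* Acquire a thread state and the GIL before calling into
           Python; assumes Py_Initialize() has already run. */
        PyGILState_STATE gil = PyGILState_Ensure();
        func(arg);
        PyGILState_Release(gil);
    }
    return NULL;
}

static int start_consumer(void)
{
    pthread_t tid;
    return pthread_create(&tid, NULL, consumer, NULL);
}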

On Tue, 25 Sep 2018 09:09:26 -0600 Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Tue, Sep 25, 2018 at 1:45 AM Victor Stinner <vstinner@redhat.com> wrote:
Please don't rely on this ugly API. *By design*, Py_AddPendingCall() tries 100 times to acquire the lock: if it fails to acquire the lock, it does nothing... your callback is ignored...
Yeah, there are issues with pending calls as implemented. Furthermore, I'm not clear on why it was made a public API in the first place.
I don't know, but I think Eve Online used the API at some point (not sure they're still Python-based nowadays). Perhaps Kristján may confirm if he's reading this.

Regards

Antoine.

I'm afraid Kristjan left CCP some time ago, and may not subscribe to this list any more.

Steve Holden

On Tue, Sep 25, 2018 at 4:23 PM Antoine Pitrou <solipsis@pitrou.net> wrote:
On Tue, 25 Sep 2018 09:09:26 -0600 Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Tue, Sep 25, 2018 at 1:45 AM Victor Stinner <vstinner@redhat.com> wrote:
Please don't rely on this ugly API. *By design*, Py_AddPendingCall() tries 100 times to acquire the lock: if it fails to acquire the lock, it does nothing... your callback is ignored...
Yeah, there are issues with pending calls as implemented. Furthermore, I'm not clear on why it was made a public API in the first place.
I don't know, but I think Eve Online used the API at some point (not sure they're still Python-based nowadays). Perhaps Kristján may confirm if he's reading this.
Regards
Antoine.

On Mon, Sep 24, 2018 at 1:20 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
On Mon, Sep 24, 2018 at 11:14 AM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
On Fri, Sep 21, 2018 at 7:04 PM Eric Snow <ericsnowcurrently@gmail.com> wrote:
1. Why do we restrict calls to signal.signal() to the main thread?
2. Why must signal handlers run in the main thread?
3. Why does signal handling operate via the "pending calls" machinery and not distinctly?
Here's my take on this:
Handling signals in a multi-threaded program is hard. Some signals can be delivered to an arbitrary thread, some to the one that caused them. POSIX provides lots of mechanisms to tune how signals are received (or blocked) by individual threads, but (a) Python doesn't expose those APIs, and (b) using those APIs correctly is insanely hard. By restricting signal delivery to the main thread, we remove all that complexity.
Just to be clear, I'm *not* suggesting that we allow folks to specify in which Python (or kernel) thread a Python-level signal handler is called.
The reason I've asked about signals is because of the semantics under subinterpreters (where there is no "main" thread). However, I don't have any plans to introduce per-interpreter signal handlers. Mostly I want to understand about the "main" thread restriction for the possible impact on subinterpreters.
FWIW, I'm also mildly curious about the value of the "main" thread restriction currently. From what I can tell the restriction was made early on and there are hints in the C code that it's no longer needed. I suspect we still have the restriction solely because no one has bothered to change it. However, I wasn't sure so I figured I'd ask. :)
We can't change the API of the main thread being where signal handlers are executed by default. If a signal handler raised an exception in a daemon thread, the process would not die when it goes uncaught (i.e. why KeyboardInterrupt works). The daemon thread ends and the rest of the process is unaware of that. Many existing Python signal handlers expect to only be called from the main thread.

If we wanted to change this, we'd probably want to have users declare which thread(s) are allowed to execute which signal handlers at signal handler registration time, and whether they are executed by only one of those threads or by all of them. Those aren't semantics I expect most people are used to, because I'm not aware of any other language doing that. But I don't see a compelling use case for implementing such complexity.

Maybe something like that would make sense for subinterpreter delegation only? I'm not sure. I'd start without signals at all in subinterpreters before making such a decision. Python keeping signals simple has long been cited as a feature of the VM.

-gps
Restricting signal.signal() so that it can only be called from the main thread just makes the API more consistent
Yeah, that's what I thought.
(and also IIRC avoids weird sigaction() behaviour when it is called from different threads within one program).
Is there a good place where this weirdness is documented?
Next, you can only call reentrant functions in your signal handlers. For instance, the printf() function isn't safe to use. Therefore one common practice is to set a flag that a signal was received and check it later (exactly what we do with the pending calls machinery).
We don't actually use the pending calls machinery for signals though. The only thing we do is *always* call PyErr_CheckSignals() before making any pending calls. Wouldn't it be equivalent if we called PyErr_CheckSignals() at the beginning of the eval loop right before calling Py_MakePendingCalls()?
This matters to me because I'd like to use "pending" calls for subinterpreters, which means dealing with signals *in* Py_MakePendingCalls() is problematic. Pulling the PyErr_CheckSignals() call out would eliminate that problem.
Therefore, IMO, the current way we handle signals in Python is the safest, most predictable, and most cross-platform option there is. And changing how the Python signals API works with threads in any way will actually break the world.
I agree that the way we deal with signals (i.e. set a flag that is later handled in PyErr_CheckSignals(), protected by the GIL) shouldn't change. My original 3 questions do not relate to that. Looks like that wasn't terribly clear. :)
-eric

On Tue, Sep 25, 2018 at 9:30 AM Gregory P. Smith <greg@krypto.org> wrote:
We can't change the API of the main thread being where signal handlers are executed by default.
If a signal handler raised an exception in a daemon thread, the process would not die when it goes uncaught (i.e. why KeyboardInterrupt works). The daemon thread ends and the rest of the process is unaware of that. Many existing Python signal handlers expect to only be called from the main thread.
Ah, that's good to know. Thanks, Greg!
If we wanted to change this, we'd probably want to have users declare which thread(s) are allowed to execute which signal handlers at signal handler registration time, and whether they are executed by only one of those threads or by all of them. Those aren't semantics I expect most people are used to, because I'm not aware of any other language doing that. But I don't see a compelling use case for implementing such complexity.
That's similar to what I imagined, based on how signals and posix threads interact. Likewise I consider it not nearly worth doing. :)
Maybe something like that would make sense for subinterpreter delegation only? I'm not sure. I'd start without signals at all in subinterpreters before making such a decision.
Python keeping signals simple has long been cited as a feature of the VM.
Exactly. For now I was planning on keeping signals main-thread-only (consequently main-interpreter-only). There's the possibility we could support per-interpreter signal handlers, but I don't plan on exploring that idea until well after the more important stuff is finished.

-eric
participants (11): Antoine Pitrou, Barry Warsaw, Eric Snow, eryk sun, Gregory P. Smith, Guido van Rossum, Nick Coghlan, Stephen J. Turnbull, Steve Holden, Victor Stinner, Yury Selivanov