... Reading through more of the code, I realized that I greatly underestimated the number of interruptible operations.
That said, the meta-question still applies: Are there things which are generally intended *not* to be interruptible by signals, and if so, is there some consistent way of indicating this?
On Wed, Jun 24, 2020 at 2:34 PM Yonatan Zunger firstname.lastname@example.org wrote:
I'm in the process of writing some code to defer signals during critical regions, which has involved a good deal of reading through the CPython implementation to understand the behaviors. Something I've found is that there appears to be a lot of thoughtfulness about where the signal handlers can be triggered, but this thoughtfulness is largely undocumented. I've put together a working list of behaviors from staring at the code, but what I'd like to figure out is which of these behaviors the devs think of as intended to be invariants, versus which are just accidents of how the code currently works and might change unpredictably.
And if there are things which are intended to be genuine invariants, would it be reasonable to document these formally and make them part of the language, rather than leaving them as knowledge internal to the CPython codebase?
What appears to be true is this:
- Signal handlers are only invoked in the main thread (documented with the signal library)
- High-level: Signal handlers may be invoked at any instruction boundary. External C libraries *may* invoke them as well, but there are no general guarantees. (Documented with the signal library)
- Low-level: Certain functions can be described as "interruptible," and signal handlers may be invoked whenever these functions are called.
- Signal handlers are thus partially reentrant: a signal handler may be interrupted by another signal iff it invokes an interruptible function.
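To make the last point concrete, here is a minimal sketch of the kind of experiment I mean. It assumes a Unix platform, CPython 3.8 or later (for `signal.raise_signal`), and that it runs in the main thread. The names `on_usr1`, `on_alarm`, `events`, and `run_demo` are made up for illustration. The SIGUSR1 handler calls `time.sleep`, one of the interruptible functions, and a SIGALRM timer fires while it is sleeping.

```python
import signal
import threading
import time

events = []            # hypothetical log of what ran, in order
ran_in_main = []       # whether each handler ran in the main thread


def on_alarm(signum, frame):
    events.append("alarm")
    ran_in_main.append(threading.current_thread() is threading.main_thread())


def on_usr1(signum, frame):
    events.append("usr1-enter")
    ran_in_main.append(threading.current_thread() is threading.main_thread())
    # Arm a one-shot timer, then call an interruptible function from inside
    # the handler. SIGALRM arrives while we are still in this handler.
    signal.setitimer(signal.ITIMER_REAL, 0.05)
    time.sleep(0.3)
    events.append("usr1-exit")


def run_demo():
    events.clear()
    ran_in_main.clear()
    old_alarm = signal.signal(signal.SIGALRM, on_alarm)
    old_usr1 = signal.signal(signal.SIGUSR1, on_usr1)
    try:
        signal.raise_signal(signal.SIGUSR1)
        # The Python-level handler may run slightly after raise_signal
        # returns, so wait (with a timeout) until it has finished.
        deadline = time.monotonic() + 2.0
        while "usr1-exit" not in events and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending timer
        signal.signal(signal.SIGALRM, old_alarm)
        signal.signal(signal.SIGUSR1, old_usr1)
    return list(events)
```

If the reading above is right, the "alarm" entry should land between "usr1-enter" and "usr1-exit", and every entry in `ran_in_main` should be true. This only probes current behavior; it says nothing about whether that behavior is intended.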
In particular, the thing whose intentionality I'm not sure about is whether the notion of an interruptible function or instruction is meant to be an actual property of the language and/or of the CPython runtime, or whether it's actually intended that only the "high-level" rule above be true, and that all signal handlers should be considered to be fully reentrant at all times. The comments in sysmodule.c about avoiding triggering PyErr_CheckSignals() suggest that there definitely is some thinking about this within the CPython code itself.
The reason it would be useful to document this is so that if I'm trying to write a fairly generic library that handles signals (like the one I'm doing now) I can reason about where I need to be defensive about an instruction being interrupted by yet another signal, and maybe avoid calls to certain functions which are known to be interruptible, much like I would avoid calling malloc() in a C signal handler.
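For context, one simple way to defer signals around a critical region is to block them at the OS level with `signal.pthread_sigmask`. The sketch below is not the library I'm writing, just an illustration of the idea. `defer_signals` is a hypothetical name, and it assumes Unix and a single-threaded program. The mask is per-thread, so in a multi-threaded process another thread can still receive the signal and the Python handler would still run in the main thread.

```python
import contextlib
import signal


@contextlib.contextmanager
def defer_signals(signums=(signal.SIGINT, signal.SIGTERM)):
    """Block the given signals in the calling thread for the with-block.

    Signals that arrive meanwhile stay pending in the kernel and are
    delivered once the previous mask is restored.
    """
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signums)
    try:
        yield
    finally:
        # Restoring the old mask lets any pending signal through.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


received = []  # hypothetical record of handled signal numbers


def demo():
    received.clear()
    old = signal.signal(signal.SIGUSR1, lambda s, f: received.append(s))
    try:
        with defer_signals([signal.SIGUSR1]):
            signal.raise_signal(signal.SIGUSR1)
            seen_inside = list(received)  # handler has not run yet
        seen_after = list(received)       # handler ran once unblocked
    finally:
        signal.signal(signal.SIGUSR1, old)
    return seen_inside, seen_after
```

Note that standard (non-realtime) signals do not queue: if the same signal arrives several times while blocked, the handler will typically run only once afterwards.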
In the current implementation, the interruptible functions and instructions are:
- Any function which calls PyErr_SetFromErrno, *if* errno == EINTR. (Catalogue needs to be made of these -- it's a much smaller set than the set of all calls to PyErr_SetFromErrno)
- Basically any open, read, or write method of a raw or buffered file
- Likewise, any open, read, or write method on a socket.
- In any interactive console readline, or in input().
- object.__str__, object.__repr__, and PyObject_Print, and anything that falls back to these.
- Multiplication, division, or stringification of long integers.
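As an illustration of the file-read case, here is a small sketch, assuming Unix and the main thread. `read_with_interrupt` is a hypothetical name. A raw read blocks on an empty pipe, a SIGALRM timer fires, and the handler itself writes the byte that lets the read complete. The read only returns because the handler ran in the middle of the call.

```python
import os
import signal


def read_with_interrupt():
    read_fd, write_fd = os.pipe()
    calls = []

    def on_alarm(signum, frame):
        calls.append("handler")
        # Unblock the pending read by supplying the byte it is waiting for.
        os.write(write_fd, b"x")

    old = signal.signal(signal.SIGALRM, on_alarm)
    try:
        signal.setitimer(signal.ITIMER_REAL, 0.05)
        # Unbuffered binary mode gives a raw FileIO object.
        with open(read_fd, "rb", buffering=0, closefd=False) as raw:
            data = raw.read(1)  # blocks until the handler writes to the pipe
        calls.append("read-returned")
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # cancel any pending timer
        signal.signal(signal.SIGALRM, old)
        os.close(read_fd)
        os.close(write_fd)
    return data, calls
```

Since the handler returns normally, the read is retried rather than failing with EINTR, which matches the retry behavior described in PEP 475.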
More specific functions:
- In `multiprocessing.shared_memory`, SharedMemory.__init__, .close,
- In `multiprocessing.semaphore`, Semaphore.acquire. (But interestingly, *not* threading.Semaphore.acquire)
- In `signal`, pause, signal, sigwaitinfo, sigtimedwait, pthread_kill,
- In `fcntl`, fcntl and ioctl.
- In `traceback`, any of the print methods.
- In `faulthandler`, dump_traceback
- In `select`, all of the methods. (select, epoll, etc)
- In `time`, sleep.
- In `curses`, whenever you look for key input.
- In `tkinter`, during the main loop of a Tcl/Tk app.
- During an SSL handshake.