Threading, atexit, and logging
In bug #1566280 somebody reported that he gets an exception where the logging module tries to write to a closed file descriptor. Upon investigation, it turns out that the file descriptor is closed because the logging atexit handler is invoked. This is surprising, as the program is far from exiting at this point.
Investigating further, I found that the logging atexit handler is registered *after* the threading atexit handler, so it gets invoked *before* the threading's atexit. Now, threading's atexit is the one that keeps the application running, by waiting for all non-daemon threads to shut down. As this application does all its work in non-daemon threads, it keeps running for quite a while - except that the logging module gives errors.
The real problem here is that atexit handlers are invoked even though the program logically doesn't exit yet (it's not just that the threading atexit is invoked after the logging atexit - this could happen to any atexit handler that gets registered).
I added a patch to this report which makes the MainThread __exitfunc a sys.exitfunc, chaining what is there already. This will work fine for atexit (as atexit is still imported explicitly to register its sys.exitfunc), but it might break if other applications still insist on installing a sys.exitfunc. What do you think about this approach?
Regards, Martin
P.S. There is another issue reported about the interpreter crashing; I haven't been able to reproduce this yet.
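The ordering described in this message follows from atexit running handlers in reverse order of registration. A minimal sketch, assuming a current Python 3 interpreter and using made-up handler names that merely stand in for the threading and logging handlers (neither module is actually involved here):

```python
import subprocess
import sys
import textwrap

# Child program: registers two handlers in the order the message describes.
# Both handler names are illustrative stand-ins, not real library functions.
CHILD = textwrap.dedent("""
    import atexit

    def threading_like_exit():
        print("threading-like handler: wait for worker threads")

    def logging_like_exit():
        print("logging-like handler: close log files")

    atexit.register(threading_like_exit)   # registered first
    atexit.register(logging_like_exit)     # registered second
    print("main script done")
""")

# Run the child in a fresh interpreter so its exit handlers really fire.
result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True, text=True, check=True,
)
lines = result.stdout.splitlines()
for line in lines:
    print(line)
# main script done
# logging-like handler: close log files
# threading-like handler: wait for worker threads
```

The handler registered last runs first, so a handler that closes files can run before one that is still waiting for threads that write to those files.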
[Martin v. Löwis]
In bug #1566280 somebody reported that he gets an exception where the logging module tries to write to a closed file descriptor.
Upon investigation, it turns out that the file descriptor is closed because the logging atexit handler is invoked. This is surprising, as the program is far from exiting at this point.
But the main thread is done, right? None of this appears to make sense unless we got into Py_Finalize(), and that doesn't happen until the main thread has nothing left to do.
Investigating further, I found that the logging atexit handler is registered *after* the threading atexit handler, so it gets invoked *before* the threading's atexit.
Ya, and that sucks. Can't recall details now, but it's not the first time the vagaries of atexit ordering bit a threaded program. IMO, `threading` shouldn't use atexit at all.
Now, threading's atexit is the one that keeps the application running, by waiting for all non-daemon threads to shut down. As this application does all its work in non-daemon threads, it keeps running for quite a while - except that the logging module gives errors.
The real problem here is that atexit handlers are invoked even though the program logically doesn't exit, yet (it's not just that the threading atexit is invoked after logging atexit - this could happen to any atexit handler that gets registered).
I added a patch to this report which makes the MainThread __exitfunc a sys.exitfunc, chaining what is there already. This will work fine for atexit (as atexit is still imported explicitly to register its sys.exitfunc), but it might break if other applications still insist on installing a sys.exitfunc.
Well, that's been officially deprecated since 2.4, but who knows?
What do you think about this approach?
It's expedient :-) So was using atexit for this to begin with. Probably "good enough". I'd rather, e.g., that `threading` stuff an exit function into a module global, and change Py_Finalize() to look for that and run it (if present) before invoking call_sys_exitfunc(). That is, break all connection between the core's implementation of threading and the user-visible `atexit` machinery. `atexit` is a hack specific to "don't care about order" finalization functions, and it gets increasingly obscure to try to force it to respect a specific ordering sometimes (e.g., now you have a patch to try to fix it by relying on an obscure deprecated feature and hoping users don't screw with that too -- probably "good enough", but still sucky).
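The ordering proposed here can be modeled in pure Python. This is only a toy sketch: `finalize`, `register`, `exit_handlers`, and `thread_shutdown_hook` are hypothetical names standing in for Py_Finalize(), the atexit registry, and the proposed module global; it is not the interpreter's actual code.

```python
import threading
import time

events = []                  # records the order in which things happen
exit_handlers = []           # stand-in for the user-visible atexit registry
thread_shutdown_hook = None  # stand-in for the proposed private module global


def register(func):
    """Toy equivalent of registering an ordinary exit handler."""
    exit_handlers.append(func)


def finalize():
    """Toy model of the proposed finalization order."""
    # 1. Run the dedicated thread hook first, whatever the registration order.
    if thread_shutdown_hook is not None:
        thread_shutdown_hook()
    # 2. Only then run ordinary exit handlers, last registered first.
    while exit_handlers:
        exit_handlers.pop()()


def worker():
    time.sleep(0.1)
    events.append("worker wrote to log")


t = threading.Thread(target=worker)
t.start()


def wait_for_threads():
    t.join()
    events.append("threads joined")


thread_shutdown_hook = wait_for_threads
register(lambda: events.append("log closed"))

finalize()
print(events)
# ['worker wrote to log', 'threads joined', 'log closed']
```

Because the thread hook is not in the handler list at all, no later registration can push a cleanup handler ahead of the wait.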
Tim Peters wrote:
Upon investigation, it turns out that the file descriptor is closed because the logging atexit handler is invoked. This is surprising, as the program is far from exiting at this point.
But the main thread is done, right?
Wrong. main.py (which is the __main__ script in the demo code) is done, yes. However, threading.py has machinery to not terminate the main thread as long as there are non-daemon threads. In that sense, the main thread is not done: it still has to .join() all the other threads (rather, they join the main thread).
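A small sketch of the situation described here: the `__main__` script finishes, yet the process stays alive until the non-daemon thread ends. It assumes a current CPython 3 interpreter, where the wait for non-daemon threads happens before atexit handlers run, so the output below shows the behavior after the problem discussed in this thread was addressed rather than the behavior reported in the bug.

```python
import subprocess
import sys
import textwrap

# Child program: the main script ends immediately, but a non-daemon
# worker thread is still running at that point.
CHILD = textwrap.dedent("""
    import atexit
    import threading
    import time

    def worker():
        time.sleep(0.3)
        print("worker finished")

    atexit.register(lambda: print("atexit handler ran"))
    threading.Thread(target=worker).start()   # non-daemon by default
    print("main script done")
""")

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True, text=True, check=True,
)
lines = result.stdout.splitlines()
for line in lines:
    print(line)
# main script done
# worker finished
# atexit handler ran
```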
Ya, and that sucks. Can't recall details now, but it's not the first time the vagaries of atexit ordering bit a threaded program.
In this case, logging/__init__.py imports threading, so that will register its atexit first - even if the application imports logging before threading.
IMO, `threading` shouldn't use atexit at all.
That is (in a way) my proposal (although I suggest using sys.exitfunc instead).
It's expedient :-) So was using atexit for this to begin with. Probably "good enough". I'd rather, e.g., that `threading` stuff an exit function into a module global, and change Py_Finalize() to look for that and run it (if present) before invoking call_sys_exitfunc().
Ok, that's what I'll do then. Yet another alternative would be to have the "daemonic" thread feature in the thread module itself (along with keeping track of a list of all running non-daemonic threads).
Regards, Martin
[Martin v. Löwis]
Upon investigation, it turns out that the file descriptor is closed because the logging atexit handler is invoked. This is surprising, as the program is far from exiting at this point.
[Tim Peters]
But the main thread is done, right?
[Martin]
Wrong. main.py (which is the __main__ script in the demo code) is done, yes.
Fine, but the main thread /has/ entered Py_Finalize(). That's key here, and wasn't clear originally.
However, threading.py has machinery to not terminate the main thread as long as there are non-daemon threads.
Right. ...
IMO, `threading` shouldn't use atexit at all.
That is (in a way) my proposal (although I suggest using sys.exitfunc instead).
Same thing to me. I'd rather thread cleanup, which is part of the Python core, not rely on any of the user-visible (hence also user-screwable) "do something at shutdown" gimmicks. Thread cleanup is only vaguely related to that concept because "cleanup" here implies waiting for an arbitrarily long time until all the threads decide on their own to quit. That's not something to be cleaned up /at/ shutdown time, it's waiting (potentially forever!) /for/ shutdown time, and that mismatch is really the source of the problem.
It's expedient :-) So was using atexit for this to begin with. Probably "good enough". I'd rather, e.g., that `threading` stuff an exit function into a module global, and change Py_Finalize() to look for that and run it (if present) before invoking call_sys_exitfunc().
Ok, that's what I'll do then.
Yet another alternative would be to have the "daemonic" thread feature in the thread module itself (along with keeping track of a list of all running non-daemonic threads).
Sorry, I couldn't follow the intent there. Not obvious to me how moving this stuff from `threading` into `thread` would make it easier(?) for the implementation to wait for non-daemon threads to finish.
Tim Peters wrote:
Sorry, I couldn't follow the intent there. Not obvious to me how moving this stuff from `threading` into `thread` would make it easier(?) for the implementation to wait for non-daemon threads to finish.
Currently, if you create a thread through the thread module (rather than threading), it won't participate in the "wait until all threads are done" algorithm - you have to use the threading module. Moving it into the thread module would allow it to cover all threads. Also, if the interpreter invokes, say, threading._shutdown(): that's also "user-screwable", as a user may put something else into threading._shutdown. To make it non-visible, it has to be in C, not Python (and even then it might be still visible to extension modules).
Regards, Martin
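The first point can be illustrated with the low-level module, which is named `_thread` in Python 3 (the thread talks about its Python 2 predecessor, `thread`). A hedged sketch, assuming a current CPython 3 interpreter: a thread started with `_thread.start_new_thread` is not waited for at exit, so the process ends before the worker gets to print.

```python
import subprocess
import sys
import textwrap

# Child program: starts a worker through the low-level module only.
CHILD = textwrap.dedent("""
    import _thread
    import time

    def worker():
        time.sleep(1.0)
        print("low-level worker finished")

    _thread.start_new_thread(worker, ())
    print("main script done")
""")

result = subprocess.run(
    [sys.executable, "-c", CHILD],
    capture_output=True, text=True,
)
lines = result.stdout.splitlines()
print(lines)
```

On the interpreters this sketch assumes, the worker's line does not appear in the output, whereas the same worker started with `threading.Thread` would be waited for.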
[Tim Peters]
Sorry, I couldn't follow the intent there. Not obvious to me how moving this stuff from `threading` into `thread` would make it easier(?) for the implementation to wait for non-daemon threads to finish.
[Martin v. Löwis]
Currently, if you create a thread through the thread module (rather than threading), it won't participate in the "wait until all threads are done" algorithm - you have to use the threading module. Moving it into the thread module would allow it to cover all threads.
True, but that doesn't appear to have any bearing on the bug originally discussed. You introduced this as "yet another alternative" in the context of how to address the original complaint, but if that was the intent, I still don't get it. Regardless, I personally view the `thread` module as being "de facto" deprecated. If someone /wants/ the ability to create a non-daemon thread, that the ability is only available via `threading` is an incentive to move to the newer, saner module. Besides, if the daemon distinction were grafted on to `thread` threads too, it would have to default to daemonic (a different default than `threading` threads) else be incompatible with current `thread` thread behavior. I personally don't want to add new features to `thread` threads in any case.
Also, if the interpreter invokes, say, threading._shutdown(): that's also "user-screwable", as a user may put something else into threading._shutdown. To make it non-visible, it has to be in C, not Python (and even then it might be still visible to extension modules).
The leading underscore makes it officially invisible <0.7 wink>, and users would have to both seek it out and go out of their way to screw it. If some user believes they have a need to mess with threading._shutdown(), that's fine by me too. The problem with atexit and sys.exitfunc is that users can get in trouble now simply by using them in documented ways, because `threading` also uses them (but inappropriately so, IMO). Indeed, that's all the `logging` module did here. While this next is also irrelevant to the original complaint, I think it was also a minor mistake to build the newer atexit gimmicks on top of sys.exitfunc (same reason: a user can destroy the atexit functionality quite innocently if they happen to use sys.exitfunc after they (or code they build on) happen to import `atexit`).
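The last point, that a single exit-function slot is easy to clobber innocently, can be modeled without the real `sys.exitfunc`, which no longer exists in Python 3. Everything in this sketch is a hypothetical stand-in: `slot` plays the role of sys.exitfunc and `run_handlers` plays the role of the function atexit installed there.

```python
def scenario(chain):
    """Simulate interpreter exit with a single exit-function slot."""
    ran = []
    handlers = []
    slot = {"exitfunc": None}   # stand-in for the one sys.exitfunc slot

    def run_handlers():
        # Stand-in for the runner atexit installs: last registered, first run.
        while handlers:
            handlers.pop()()

    # "import atexit" installs its runner, then a library registers a handler.
    slot["exitfunc"] = run_handlers
    handlers.append(lambda: ran.append("atexit handler"))

    def user_cleanup():
        ran.append("user cleanup")

    if chain:
        # Careful user: remember what was there and call it afterwards.
        previous = slot["exitfunc"]

        def chained():
            user_cleanup()
            previous()

        slot["exitfunc"] = chained
    else:
        # Innocent user: plain assignment silently drops the atexit runner.
        slot["exitfunc"] = user_cleanup

    slot["exitfunc"]()          # simulated interpreter exit
    return ran


clobbered = scenario(chain=False)
chained = scenario(chain=True)
print(clobbered)  # ['user cleanup']
print(chained)    # ['user cleanup', 'atexit handler']
```

In the unchained case every handler registered through the atexit-style list is lost, even though the user did nothing that looks related to it.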
Martin v. Löwis wrote:
Also, if the interpreter invokes, say, threading._shutdown(): that's also "user-screwable", as a user may put something else into threading._shutdown.
Although it would require being somewhat more deliberate, since threading._shutdown clearly has something to do with threading, whereas the atexit mechanism could get screwed up by someone who wasn't even thinking about what effect it might have on threading.
--
Greg Ewing, Computer Science Dept, University of Canterbury, Christchurch, New Zealand
greg.ewing@canterbury.ac.nz
Carpe post meridiem! (I'm not a morning person.)
participants (3): Martin v. Löwis, Greg Ewing, Tim Peters