[Python-Dev] Stable buildbots
Trent Nelson
trent at snakebite.org
Tue Nov 23 09:40:50 CET 2010
On 14-Nov-10 3:48 AM, David Bolen wrote:
> This is a completely separate issue, though probably around just as
> long, and like the popup problem its frequency changes over time. By
> "hung" here I'm referring to cases where something must go wrong with
> a test and/or its cleanup such that a python_d process remains
> running, usually several of them at the same time.
My guess: the "hung" (single-threaded) Python process has called
select() without a timeout in order to wait for some data. However, the
data never arrives (due to a broken/failed test), and the select() never
returns.
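To illustrate the guess above, here is a minimal Python sketch of the difference a timeout makes. The helper name wait_for_data and the socketpair setup are made up for illustration and are not taken from any actual test in the suite. With timeout=None the first call would block forever, which is the suspected wedge.

```python
import select
import socket


def wait_for_data(sock, timeout):
    """Return True if sock became readable within `timeout` seconds.

    Passing timeout=None would block indefinitely, which is the
    suspected cause of the wedged processes.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


# A connected pair on which nothing is sent stands in for the broken test.
reader, writer = socket.socketpair()

# No data was ever sent, so select() gives up after the timeout.
timed_out = not wait_for_data(reader, 0.2)

# Once the peer sends something, select() reports the socket as readable.
writer.send(b"x")
got_data = wait_for_data(reader, 0.2)

reader.close()
writer.close()
```

A test written this way can fail cleanly on a timeout instead of leaving a process behind.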
On Windows, processes seem harder to kill when they get into this state.
If I purposely wedge a Windows process by calling select() from the
interactive interpreter, ctrl-c has absolutely no effect (whereas on
Unix, ctrl-c will interrupt the select()).
As for why kill_python.exe doesn't seem to be able to kill said wedged
processes, the MSDN documentation on TerminateProcess[1] states the
following:
    The terminated process cannot exit until all
    pending I/O has been completed or canceled. (sic)
It's not unreasonable to assume a wedged select() constitutes pending
I/O, so that's a possible explanation as to why kill_python.exe isn't
able to terminate the processes.
(Also, kill_python currently assumes TerminateProcess() always works;
perhaps this optimism is misplaced. Also note the XXX TODO regarding
the fact that we don't kill processes that have loaded our python*.dll,
but may not be named python_d.exe. I don't think that's the issue here,
though.)
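As a rough sketch of what "not assuming the kill worked" could look like, the Python snippet below terminates a child and then waits a bounded time to confirm it really exited. This is not how kill_python is written; the function name terminate_and_confirm and the 5 second grace period are assumptions chosen for illustration. On Windows, Popen.terminate() calls TerminateProcess().

```python
import subprocess
import sys


def terminate_and_confirm(proc, grace=5.0):
    """Request termination, then verify the process actually exited.

    Returns True if the process is gone within `grace` seconds and
    False if it is still alive (e.g. stuck on pending I/O).
    """
    proc.terminate()  # TerminateProcess() on Windows, SIGTERM on Unix
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        return False
    return True


# A sleeping child stands in for a hung python_d process.
child = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"]
)
confirmed = terminate_and_confirm(child)
```

A False result would be the cue to log the failure or escalate to another tool rather than carry on silently.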
On 14-Nov-10 5:32 AM, David Bolen wrote:
> "Martin v. Löwis"<martin at v.loewis.de> writes:
>
>> This is what kill_python.exe is supposed to solve. So I recommend to
>> investigate why it fails to kill the hanging Pythons.
>
> Yeah, I know, and I can't say I disagree in principle - not sure why
> Windows doesn't let the kill in that module work (or if there's an
> issue actually running it under all conditions).
>
> At the moment though, I do know that using the sysinternals pskill
> utility externally (which is what I currently do interactively)
> definitely works so to be honest,
That's interesting. (That kill_python.exe doesn't kill the wedged
processes, but pskill does.) kill_python is pretty simple: it just
calls TerminateProcess() after acquiring a handle with the relevant
PROCESS_TERMINATE access right. That being said, that's the recommended
way to kill a process -- I doubt pskill would be going about it any
differently (although, it is sysinternals... you never know what kind of
crazy black magic it's doing behind the scenes).
Are you calling pskill with the -t flag, i.e. kill the process and all
its descendants? That might be the ticket, especially if killing the
child process that the wedged select() is waiting on causes the
select() to return, and thus makes the parent killable.
Otherwise, if it happens again, can you try kill_python.exe first, then
pskill, and confirm if the former fails but the latter succeeds?
Trent.
[1]: http://msdn.microsoft.com/en-us/library/ms686714(VS.85).aspx