[Python-Dev] Status of thread cancellation
nmm1 at cus.cam.ac.uk
Mon Mar 19 17:38:12 CET 2007
Jon Ribbens <jon+python-dev at unequivocal.co.uk> wrote:
> Can you elaborate on this? You can get zombie entries in the process
> table if nobody's called 'wait()' on them, and you can (extremely
> rarely) get unkillable processes in 'disk-wait' state (usually due to
> hardware failure or a kernel bug, I suspect), but I've never heard
> of a process on a Unix-like system being unkillable due to something
> to do with sockets (or any other kind of file descriptor for that
> matter). How could a socket be 'jammed'? What does that even mean?
Well, I have seen it hundreds of times on a dozen different Unices;
it is very common. You don't always SEE the stuck process - sometimes
the 'kill -9' causes the pid to become invisible to ps etc., and
just occasionally it can continue to use CPU until the system is
rebooted. That is rare, however, and it normally just hangs onto
locks, memory and other such resources. Very often its vampiric
status is visible only because such things haven't been freed,
or when you poke through kernel structures.
Sockets get jammed because they are used to connect to subprocesses
or kernel threads, which in turn access unreliable I/O devices. If
there is a glitch on the device, the error recovery very often fails
to work cleanly, and may wait for an event that will never occur
or go into a loop (usually a sleep/poll loop). Typically, a HIGHER
level then times out the failing error recovery, so that the normal
programmer doesn't notice. But it very often fails to kill the
lower level code.
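
To illustrate that pattern at user level, here is a minimal sketch. It is
an analogy in Python threads, not the kernel mechanism itself; the names
lower_level_recovery and run_with_timeout are made up for the example.
The "lower level" waits for an event that never occurs, the "higher
level" times it out and carries on, and the lower level is left running.

```python
import threading


def lower_level_recovery(never_set: threading.Event) -> None:
    # Stands in for error recovery waiting on an event that never occurs.
    never_set.wait()


def run_with_timeout(timeout: float):
    never_set = threading.Event()
    # daemon=True only so that this demo can exit; the stuck thread
    # itself is never cancelled.
    worker = threading.Thread(
        target=lower_level_recovery, args=(never_set,), daemon=True
    )
    worker.start()
    # The higher level gives up after `timeout` seconds...
    worker.join(timeout)
    # ...but giving up does nothing to the lower level.
    timed_out = worker.is_alive()
    return timed_out, worker, never_set


timed_out, worker, never_set = run_with_timeout(0.1)
print("higher level gave up:", timed_out)
print("lower level still stuck:", worker.is_alive())
```

The caller sees a clean timeout, while the worker goes on holding
whatever it had acquired.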
As far as applications are concerned, a jammed socket is one where
the higher level recovery has NOT timed out, and is waiting for the
lower level to complete - which it isn't going to do!
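
Seen from the application, that looks like a read that never returns.
As a rough sketch only: the snippet below simulates a silent peer with
socket.socketpair() and supplies its own timeout, since nothing below
it will. The helper read_with_timeout is hypothetical. Note that this
only gets the caller unstuck; it does nothing for a lower level that
is genuinely jammed.

```python
import socket


def read_with_timeout(sock, timeout):
    # Returns the data read, or None if nothing arrived within `timeout`.
    sock.settimeout(timeout)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None


# The peer never sends anything, standing in for a lower level that
# is not going to complete.
ours, peer = socket.socketpair()
try:
    reply = read_with_timeout(ours, 0.1)
    print("timed out" if reply is None else "got data")
finally:
    ours.close()
    peer.close()
```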
The other effect that ordinary programmers notice is a system very
gradually starting to run down after days/weeks/months of continual
operation. The state is cleared by rebooting.