Portable way to tell if a process is still alive
Laszlo Nagy
gandalf at shopzeus.com
Thu Jan 28 05:50:16 EST 2010
>> Suppose we have a program that writes its process id into a pid file.
>> Usually the program deletes the pid file when it exits... But in some
>> cases (for example, killed with kill -9 or TerminateProcess) the pid file is
>> left there. I would like to know if a process (given with its process
>> id) is still running or not. I know that this can be done with OS
>> specific calls. But that is not portable. It can also be done by
>> executing "ps -p 23423" with subprocess module, but that is not portable
>> either. Is there a portable way to do this?
>>
>> If not, would it be a good idea to implement this (I think very
>> primitive) function in the os module?
>>
>
> Not only is there no way to do it portably, there is no way to do it
> reliably for the general case. The problem is that processes do not have
> unique identifiers. A PID only uniquely identifies a running process; once
> the process terminates, its PID becomes available for re-use.
>
Non-general case: the process is a service and only one instance should
be running. There could be a pid file left on the disk. It is possible
to, e.g., mount procfs and check whether the given PID belongs to the
command line / executable in question. It cannot be guaranteed
that a service will always delete its pid file when it exits. This
happens, for example, when somebody kills it with "kill -9" or it exits
on signal 11, etc. It actually did happen to me, and then the service
could not be restarted because the PID file was there. (It is an error
to run two instances of the same service, but it is also an error to not
run it....) What I would like to do upon startup is to check if the
process is already running. This way I could create a "guardian" that
checks other services, and (re)starts them if they stopped working.
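A minimal sketch of such a startup check, assuming a POSIX system with a
Linux-style procfs mounted at /proc. The helper names (read_pid,
pid_is_running, pid_matches_command) are hypothetical, and this is not
portable: on Windows os.kill behaves differently and there is no /proc.

```python
import os


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it is missing or invalid."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_running(pid):
    """POSIX only: signal 0 checks for existence without sending a signal."""
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no process with this PID
    except PermissionError:
        return True   # process exists but belongs to another user
    return True


def pid_matches_command(pid, expected):
    """Linux procfs only: True if `expected` occurs in /proc/<pid>/cmdline.

    Guards against a stale pid file whose PID was re-used by an
    unrelated process.
    """
    try:
        with open("/proc/%d/cmdline" % pid, "rb") as f:
            args = f.read().split(b"\0")
    except OSError:
        return False
    needle = expected.encode()
    return any(needle in arg for arg in args)
```

On startup the service (or a guardian) could call read_pid on its pid
file and treat the file as stale unless both pid_is_running and
pid_matches_command return True. There is still a small race between
the check and any action taken on its result.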
And no, it is not a solution to write a "good" service that will never
stop, because:
1. It is particularly not possible in my case - there is a software bug
in a third party library that causes my service to exit on various weird
signals.
2. It is not possible anyway. There are users killing processes
accidentally, and other unforeseen bugs.
3. In a mission critical environment, I would use a guardian even if
guarded services are not likely to stop.
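One possible shape for such a guardian, sketched under a different
assumption than the pid-file check: here the guardian launches the
services itself as child processes, so it can ask the OS directly
whether each child has exited. The function name guard and its
parameters are hypothetical, and max_rounds exists only so the sketch
can terminate.

```python
import subprocess
import time


def guard(commands, poll_interval=1.0, max_rounds=None):
    """Start each command and restart any that has exited.

    commands maps a service name to its argv list.
    Returns a dict mapping each name to its restart count.
    """
    procs = {name: subprocess.Popen(argv) for name, argv in commands.items()}
    restarts = {name: 0 for name in commands}
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        time.sleep(poll_interval)
        for name in commands:
            # poll() returns None while the child is still running.
            if procs[name].poll() is not None:
                procs[name] = subprocess.Popen(commands[name])
                restarts[name] += 1
        rounds += 1
    # Clean up children that are still running when the loop ends.
    for proc in procs.values():
        if proc.poll() is None:
            proc.terminate()
        proc.wait()
    return restarts
```

Because the guardian is the parent of each service, it does not depend
on pid files at all. It does not cover services that were started by
something else, and the guardian itself can still be killed.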
I understand that this is a whole different question now, and possibly
there is no portable way to do it. I just wonder if there are others
facing a similar problem here. Any thoughts or comments - is what I
would like to achieve a bad idea? Is there a better approach?
Thanks,
Laszlo