[Python-Dev] popen2.py strangeness

Sjoerd Mullender sjoerd@acm.org
Mon, 02 Jun 2003 21:07:39 +0200


On Mon, Jun 2 2003 Guido van Rossum wrote:

> [Sjoerd]
> > > > I just noticed the following behavior (current CVS):
> > > > 
> > > > + python
> > > > Python 2.3b1+ (#37, Jun  2 2003, 11:34:36)
> > > > [GCC 3.2 20020903 (Red Hat Linux 8.0 3.2-7)] on linux2
> > > > Type "help", "copyright", "credits" or "license" for more information.
> > > > >>> import popen2
> > > > >>> a = popen2.Popen3('sleep 1', 1, -1)
> > > > >>> # wait more than one second to give the process time to exit
> > > > ...
> > > > >>> b = popen2.Popen3('sleep 1', 1, -1)
> > > > >>> a.wait()
> > > > Traceback (most recent call last):
> > > >   File "<stdin>", line 1, in ?
> > > >   File "/ufs/sjoerd/src/python/Lib/popen2.py", line 86, in wait
> > > >     pid, sts = os.waitpid(self.pid, 0)
> > > > OSError: [Errno 10] No child processes
> > > > >>>
> > > > 
> > > > I would expect the a.wait() to return the exit status (or None) of the
> > > > first "sleep 1" command, but instead I get a traceback.
> > > > 
> > > > The reason is clear: in popen2.Popen3.__init__ the very first thing that
> > > > happens is a call to _cleanup() which does a poll() call on all active
> > > > instances.  If the process has died, the os.waitpid() call that poll()
> > > > does removes it from the OS's process table, so that a next os.waitpid()
> > > > on the same process id will fail.
> > > > 
> > > > I understand why the call to _cleanup() happens: defunct processes need
> > > > to be cleaned up, and if you only ever use the popen[234] factory
> > > > functions, the library needs to call os.waitpid for the created
> > > > processes.
> > > > 
> > > > However, it seems to me the above sequence of events is legitimate and
> > > > should be supported.  The question is: how?
> 
> [me]
> > > Call os.waitpid() with an explicit pid argument?
> 
> [Sjoerd]
> > popen2 does that, that's not the problem.  The problem is, after the
> > second call to popen2.Popen3 or popen2.Popen4 (possibly through the use
> > of the factory functions) the call to os.waitpid with explicit pid is
> > already done, so there is then no way to get the exit status anymore.
> > You can also not call os.waitpid (possibly through the poll() method)
> > just before calling popen2.Popen[34] because of race conditions (the
> > process might end just after your explicit call and just before the
> > implicit call in popen2.Popen[34]).
> 
> Hm, it looks like wait() should add an "if self.sts < 0" around most
> of its body, like poll().  In particular, this patch solves the
> problem for me:
> 
> Index: popen2.py
> ===================================================================
> RCS file: /cvsroot/python/python/dist/src/Lib/popen2.py,v
> retrieving revision 1.25
> diff -c -c -r1.25 popen2.py
> *** popen2.py	3 Jun 2002 15:58:31 -0000	1.25
> --- popen2.py	2 Jun 2003 17:24:53 -0000
> ***************
> *** 83,92 ****
>   
>       def wait(self):
>           """Wait for and return the exit status of the child process."""
> !         pid, sts = os.waitpid(self.pid, 0)
> !         if pid == self.pid:
> !             self.sts = sts
> !             _active.remove(self)
>           return self.sts
>   
>   
> --- 83,93 ----
>   
>       def wait(self):
>           """Wait for and return the exit status of the child process."""
> !         if self.sts < 0:
> !             pid, sts = os.waitpid(self.pid, 0)
> !             if pid == self.pid:
> !                 self.sts = sts
> !                 _active.remove(self)
>           return self.sts
>   
>   
> 
> Should I check it in, or is there more to it?

It looks like this does the trick, so yes, check it in.

Is this still in time for 2.2.3?  Is it a problem in the 2.2 branch?  I
don't have it available at the moment.

-- Sjoerd Mullender <sjoerd@acm.org>