Python subprocesses experience mysterious delay in receiving stdin EOF
Yang Zhang
yanghatespam at gmail.com
Wed Feb 9 21:37:52 EST 2011
On Thu, Feb 10, 2011 at 12:28 AM, Jean-Michel Pichavant
<jeanmichel at sequans.com> wrote:
> Yang Zhang wrote:
>>
>> On Wed, Feb 9, 2011 at 11:01 AM, MRAB <python at mrabarnett.plus.com> wrote:
>>
>>>
>>> On 09/02/2011 01:59, Yang Zhang wrote:
>>>
>>>>
>>>> I reduced a problem I was seeing in my application down into the
>>>> following test case. In this code, a parent process concurrently
>>>> spawns 2 (you can spawn more) subprocesses that read a big message
>>>> from the parent over stdin, sleep for 5 seconds, and write something
>>>> back. However, there's unexpected waiting happening somewhere, causing
>>>> the code to complete in 10 seconds instead of the expected 5.
>>>>
>>>> If you set `verbose=True`, you can see that the straggling subprocess
>>>> is receiving most of the messages, then waiting for the last chunk of
>>>> 3 chars---it's not detecting that the pipe has been closed.
>>>> Furthermore, if I simply don't do anything with the second process
>>>> (`doreturn=True`), the first process will *never* see the EOF.
>>>>
>>>> Any ideas what's happening? Further down is some example output.
>>>> Thanks in advance.
>>>>
>>>> from subprocess import *
>>>> from threading import *
>>>> from time import *
>>>> from traceback import *
>>>> import sys
>>>> verbose = False
>>>> doreturn = False
>>>> msg = (20*4096+3)*'a'
>>>> def elapsed(): return '%7.3f' % (time() - start)
>>>> if sys.argv[1:]:
>>>> start = float(sys.argv[2])
>>>> if verbose:
>>>> for chunk in iter(lambda: sys.stdin.read(4096), ''):
>>>> print>> sys.stderr, '..', time(), sys.argv[1], 'read',
>>>> len(chunk)
>>>> else:
>>>> sys.stdin.read()
>>>> print>> sys.stderr, elapsed(), '..', sys.argv[1], 'done reading'
>>>> sleep(5)
>>>> print msg
>>>> else:
>>>> start = time()
>>>> def go(i):
>>>> print elapsed(), i, 'starting'
>>>> p = Popen(['python','stuckproc.py',str(i), str(start)],
>>>> stdin=PIPE, stdout=PIPE)
>>>> if doreturn and i == 1: return
>>>> print elapsed(), i, 'writing'
>>>> p.stdin.write(msg)
>>>> print elapsed(), i, 'closing'
>>>> p.stdin.close()
>>>> print elapsed(), i, 'reading'
>>>> p.stdout.read()
>>>> print elapsed(), i, 'done'
>>>> ts = [Thread(target=go, args=(i,)) for i in xrange(2)]
>>>> for t in ts: t.start()
>>>> for t in ts: t.join()
>>>>
>>>> Example output:
>>>>
>>>> 0.001 0 starting
>>>> 0.003 1 starting
>>>> 0.005 0 writing
>>>> 0.016 1 writing
>>>> 0.093 0 closing
>>>> 0.093 0 reading
>>>> 0.094 1 closing
>>>> 0.094 1 reading
>>>> 0.098 .. 1 done reading
>>>> 5.103 1 done
>>>> 5.108 .. 0 done reading
>>>> 10.113 0 done
>>>>
>>>>
>>>
>>> I changed 'python' to the path of python.exe and 'stuckproc.py' to its
>>> full path and tried it with Python 2.7 on Windows XP Pro. It worked as
>>> expected.
>>>
>>
>> Good point - I didn't specify that I'm seeing this on Linux (Ubuntu
>> 10.04's Python 2.6).
>>
>>
>
> python test.py 0.000 0 starting
> 0.026 0 writing
> 0.026 0 closing
> 0.026 0 reading
> 0.029 .. 0 done reading
> 0.030 1 starting
> 0.038 1 writing
> 0.058 1 closing
> 0.058 1 reading
> 0.061 .. 1 done reading
> 5.026 0 done
> 5.061 1 done
>
> on debian lenny (Python 2.5)
>
> JM
>
FWIW, this is consistently reproduce-able across all the Ubuntu 10.04s
I've tried. You may need to increase the message size so that it's
large enough for your system.
--
Yang Zhang
http://yz.mit.edu/
More information about the Python-list
mailing list