[Python-Dev] PEP 3145 (With Contents)

exarkun at twistedmatrix.com exarkun at twistedmatrix.com
Tue Sep 15 20:24:56 CEST 2009


On 04:25 pm, eric.pruitt at gmail.com wrote:
>I'm bumping this PEP again in hopes of getting some feedback.
>
>Thanks,
>Eric
>
>On Tue, Sep 8, 2009 at 23:52, Eric Pruitt <eric.pruitt at gmail.com> wrote:
>>PEP: 3145
>>Title: Asynchronous I/O For subprocess.Popen
>>Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
>>Type: Standards Track
>>Content-Type: text/plain
>>Created: 04-Aug-2009
>>Python-Version: 3.2
>>
>>Abstract:
>>
>>   In its present form, the subprocess.Popen implementation is prone to
>>   dead-locking and blocking of the parent Python script while waiting on
>>   data from the child process.
>>
>>Motivation:
>>
>>   A search for "python asynchronous subprocess" will turn up numerous
>>   accounts of people wanting to execute a child process and communicate
>>   with it from time to time reading only the data that is available
>>   instead of blocking to wait for the program to produce data [1] [2]
>>   [3].  The current behavior of the subprocess module is that when a
>>   user sends or receives data via the stdin, stderr and stdout file
>>   objects, dead locks are common and documented [4] [5].  While
>>   communicate can be used to alleviate some of the buffering issues, it
>>   will still cause the parent process to block while attempting to read
>>   data when none is available to be read from the child process.
>>
>>Rationale:
>>
>>   There is a documented need for asynchronous, non-blocking
>>   functionality in subprocess.Popen [6] [7] [2] [3].  Inclusion of the
>>   code would improve the utility of the Python standard library that can
>>   be used on Unix based and Windows builds of Python.  Practically every
>>   I/O object in Python has a file-like wrapper of some sort.  Sockets
>>   already act as such and for strings there is StringIO.  Popen can be
>>   made to act like a file by simply using the methods attached to the
>>   subprocess.Popen.stderr, stdout and stdin file-like objects.  But when
>>   using the read and write methods of those options, you do not have the
>>   benefit of asynchronous I/O.  In the proposed solution the wrapper
>>   wraps the asynchronous methods to mimic a file object.
>>
>>Reference Implementation:
>>
>>   I have been maintaining a Google Code repository that contains all of
>>   my changes including tests and documentation [9] as well as a blog
>>   detailing the problems I have come across in the development process
>>   [10].
>>
>>   I have been working on implementing non-blocking asynchronous I/O in
>>   the subprocess.Popen module as well as a wrapper class for
>>   subprocess.Popen that makes it so that an executed process can take
>>   the place of a file by duplicating all of the methods and attributes
>>   that file objects have.

"Non-blocking" and "asynchronous" are actually two different things. 
From the rest of this PEP, I think only a non-blocking API is being
introduced.  I haven't looked beyond the PEP, though, so I might be 
missing something.
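
To make the distinction concrete: a non-blocking call returns immediately
with whatever can be done right now (possibly nothing), whereas an
asynchronous API starts an operation and reports its result later. The
following is only a minimal sketch of the non-blocking case, assuming a
POSIX pipe and Python 3.5 or later (for os.set_blocking). The helper name
read_available is made up for illustration and is not part of the PEP or
its implementation.

```python
import os


def read_available(fd, max_bytes=4096):
    """Return whatever bytes are ready on fd without waiting.

    Returns b"" if nothing is available right now and None at end of file.
    (Hypothetical helper, not taken from the PEP.)
    """
    os.set_blocking(fd, False)
    try:
        data = os.read(fd, max_bytes)
    except BlockingIOError:
        # No data yet; the caller decides if and when to try again.
        return b""
    # An empty read on a pipe means every writer has closed its end.
    return data if data else None


r, w = os.pipe()
first = read_available(r)    # nothing has been written yet
os.write(w, b"hello")
second = read_available(r)   # the five bytes written above
os.close(w)
third = read_available(r)    # writer is closed, so end of file
os.close(r)
```

Nothing here notifies the caller when data arrives; the caller has to
poll. An asynchronous design would add that notification step.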
>>   There are two base functions that have been added to the
>>   subprocess.Popen class: Popen.send and Popen._recv, each with two
>>   separate implementations, one for Windows and one for Unix based
>>   systems.  The Windows implementation uses ctypes to access the
>>   functions needed to control pipes in the kernel32 DLL in an
>>   asynchronous manner.  On Unix based systems, the Python interface for
>>   file control serves the same purpose.  The different implementations
>>   of Popen.send and Popen._recv have identical arguments to make code
>>   that uses these functions work across multiple platforms.

Why does the method for non-blocking read from a pipe start with an "_"? 
This is the convention (widely used) for a private API.  The name also 
doesn't suggest that this is the non-blocking version of reading. 
Similarly, the name "send" doesn't suggest that this is the non-blocking 
version of writing.
>>   When calling the Popen._recv function, it requires the pipe name be
>>   passed as an argument so there exists the Popen.recv function that
>>   selects stdout as the pipe for Popen._recv by default.  Popen.recv_err
>>   selects stderr as the pipe by default. "Popen.recv" and
>>   "Popen.recv_err" are much easier to read and understand than
>>   "Popen._recv('stdout' ..." and "Popen._recv('stderr' ..."
>>   respectively.

What about reading from other file descriptors?  subprocess.Popen allows 
arbitrary file descriptors to be used.  Is there any provision here for 
reading and writing non-blocking from or to those?
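
As an illustration of what such a case looks like, here is a hedged
sketch in which the child writes to an extra pipe that is neither stdout
nor stderr, and the parent reads it without blocking. It assumes a POSIX
system and a modern Python: the pass_fds argument was added to Popen in
Python 3.2 and os.set_blocking in 3.5, so neither was available when the
PEP was drafted. The child program is an invented one-liner.

```python
import os
import subprocess
import sys

# Extra channel between child and parent, separate from stdout/stderr.
r, w = os.pipe()

# The child receives the descriptor number on its command line and
# writes a short message to it.
child_code = "import os, sys; os.write(int(sys.argv[1]), b'extra channel')"
proc = subprocess.Popen(
    [sys.executable, "-c", child_code, str(w)],
    pass_fds=(w,),          # keep this descriptor open in the child
)
os.close(w)                 # the parent only needs the read end
proc.wait()

# Read without blocking; the child has exited, so its data is buffered.
os.set_blocking(r, False)
data = os.read(r, 4096)
os.close(r)
```

The recv and recv_err names described in the PEP cover only stdout and
stderr, so a descriptor like this one would need some other entry point.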
>>   Since the Popen._recv function does not wait on data to be produced
>>   before returning a value, it may return empty bytes.  Popen.asyncread
>>   handles this issue by returning all data read over a given time
>>   interval.

Oh.  Popen.asyncread?   What's that?  This is the first time the PEP 
mentions it.
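
Going only by the one-sentence description quoted above ("returning all
data read over a given time interval"), the behaviour might look roughly
like the sketch below. This is a guess at the semantics, not the PEP's
implementation: the name read_for, its parameters, and the use of
select.select on a POSIX pipe are all assumptions.

```python
import os
import select
import time


def read_for(fd, interval, chunk=4096):
    """Collect all bytes that arrive on fd within `interval` seconds.

    Hypothetical stand-in for the behaviour the PEP attributes to
    Popen.asyncread; returns b"" if nothing arrives in time.
    """
    deadline = time.monotonic() + interval
    pieces = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # Wait until fd is readable or the rest of the interval elapses.
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            break                   # interval ran out with no more data
        data = os.read(fd, chunk)
        if not data:
            break                   # end of file: the writer closed
        pieces.append(data)
    return b"".join(pieces)


r, w = os.pipe()
os.write(w, b"abc")
os.write(w, b"def")
collected = read_for(r, 0.2)   # both writes arrive within the interval
nothing = read_for(r, 0.05)    # pipe is now empty, so this times out
os.close(w)
os.close(r)
```

Whether the real method returns early at end of file, or how it reports a
timeout, is exactly the kind of detail the PEP would need to spell out.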
>>   The ProcessIOWrapper class uses the asyncread and asyncwrite functions
>>   to allow a process to act like a file so that there are no blocking
>>   issues that can arise from using the stdout and stdin file objects
>>   produced from a subprocess.Popen call.

What's the ProcessIOWrapper class?  And what's the asyncwrite function? 
Again, this is the first time it's mentioned.

So, to sum up, I think my main comment is that the PEP seems to be 
missing a significant portion of the details of what it's actually 
proposing.  I suspect that this information is present in the 
implementation, which I have not looked at, but it probably belongs in 
the PEP.

Jean-Paul

