[Python-Dev] PEP 3145 (With Contents)

Gregory P. Smith greg at krypto.org
Sun Dec 9 05:17:52 CET 2012


I'm really not sure what this PEP is trying to get at given that it
contains no examples and sounds from the descriptions to be adding a
complicated API on top of something that already, IMNSHO, has too much in it
(subprocess.Popen).

Regardless, any user can use the stdout/err/in file objects with their own
code that handles them asynchronously (yes that can be painful but that is
what is required for _any_ socket or pipe I/O you don't want to block on).
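
A minimal sketch of what such hand-rolled handling can look like, assuming a
POSIX system and Python 3.5+ (for os.set_blocking). The helper name
read_available and its timeout/chunk_size parameters are made up for
illustration; they are not part of subprocess or of the PEP:

```python
import os
import selectors
import subprocess
import sys


def read_available(proc, timeout=0.1, chunk_size=4096):
    """Return the bytes proc.stdout has ready within `timeout` seconds.

    Returns b"" if nothing arrived in time and None once the pipe is at EOF.
    Hypothetical helper; POSIX only, since it selects on a pipe descriptor.
    """
    fd = proc.stdout.fileno()
    os.set_blocking(fd, False)  # a read can no longer stall the parent
    with selectors.DefaultSelector() as sel:
        sel.register(fd, selectors.EVENT_READ)
        if not sel.select(timeout):
            return b""  # no data yet; the caller keeps control
    data = os.read(fd, chunk_size)
    return data if data else None  # empty read after readiness means EOF


if __name__ == "__main__":
    child = [sys.executable, "-c", "print('hello from the child')"]
    proc = subprocess.Popen(child, stdout=subprocess.PIPE)
    while True:
        chunk = read_available(proc, timeout=1.0)
        if chunk is None:
            break
        if chunk:
            sys.stdout.write(chunk.decode())
    proc.stdout.close()
    proc.wait()
```

The helper reads from the file descriptor directly, so it should not be mixed
with calls to proc.stdout.read() on the same pipe.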

It *sounds* to me like this entire PEP could be written and released as a
third party module on PyPI that offers a subprocess.Popen subclass adding
some more convenient non-blocking APIs.  That's where I'd start if I were
interested in this as a future feature.
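
As a rough sketch of what such a subclass might look like (POSIX only, since
select does not work on pipes on Windows): the class name NonBlockingPopen and
the helper _read_ready are hypothetical, and while recv, recv_err and send
borrow their names from the PEP, the behavior shown here is my assumption, not
the PEP's reference implementation:

```python
import os
import select
import subprocess
import sys


class NonBlockingPopen(subprocess.Popen):
    """Hypothetical Popen subclass with non-blocking helpers (POSIX only)."""

    def _read_ready(self, pipe, maxsize):
        # None: pipe missing or closed; b"": nothing ready; else: the data.
        if pipe is None or pipe.closed:
            return None
        ready, _, _ = select.select([pipe], [], [], 0)  # zero timeout: poll
        if not ready:
            return b""
        data = os.read(pipe.fileno(), maxsize)
        if not data:  # readable but empty means the child closed its end
            pipe.close()
            return None
        return data

    def recv(self, maxsize=4096):
        return self._read_ready(self.stdout, maxsize)

    def recv_err(self, maxsize=4096):
        return self._read_ready(self.stderr, maxsize)

    def send(self, data):
        """Write what the pipe accepts right now; return the byte count."""
        if self.stdin is None or self.stdin.closed:
            return None
        _, ready, _ = select.select([], [self.stdin], [], 0)
        if not ready:
            return 0
        # A writable pipe takes at least PIPE_BUF bytes without blocking.
        return os.write(self.stdin.fileno(), data[:select.PIPE_BUF])


if __name__ == "__main__":
    child = "import sys; sys.stdout.write(sys.stdin.readline().upper())"
    proc = NonBlockingPopen(
        [sys.executable, "-c", child],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    proc.send(b"hello\n")
    proc.wait()
    print(proc.recv())
    proc.stdin.close()
    proc.stdout.close()
```

Because send may write only part of its argument, a caller would need to loop
over the unsent remainder.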

-gps



On Fri, Dec 7, 2012 at 5:10 PM, anatoly techtonik <techtonik at gmail.com> wrote:

> On Tue, Sep 15, 2009 at 9:24 PM, <exarkun at twistedmatrix.com> wrote:
>
>> On 04:25 pm, eric.pruitt at gmail.com wrote:
>>
>>> I'm bumping this PEP again in hopes of getting some feedback.
>>>
>>
> This is useful, indeed. The ActiveState recipe for this has 10 votes, which
> is high for ActiveState (and for such a hardcore topic, FWIW).
>
>
>> On Tue, Sep 8, 2009 at 23:52, Eric Pruitt <eric.pruitt at gmail.com> wrote:
>>>
>>>> PEP: 3145
>>>> Title: Asynchronous I/O For subprocess.Popen
>>>> Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
>>>> Type: Standards Track
>>>> Content-Type: text/plain
>>>> Created: 04-Aug-2009
>>>> Python-Version: 3.2
>>>>
>>>> Abstract:
>>>>
>>>>    In its present form, the subprocess.Popen implementation is prone to
>>>>    dead-locking and blocking of the parent Python script while waiting
>>>>    on data from the child process.
>>>>
>>>> Motivation:
>>>>
>>>>    A search for "python asynchronous subprocess" will turn up numerous
>>>>    accounts of people wanting to execute a child process and communicate
>>>>    with it from time to time reading only the data that is available
>>>>    instead of blocking to wait for the program to produce data [1] [2]
>>>>    [3].  The current behavior of the subprocess module is that when a
>>>>    user sends or receives data via the stdin, stderr and stdout file
>>>>    objects, dead locks are common and documented [4] [5].  While
>>>>    communicate can be used to alleviate some of the buffering issues, it
>>>>    will still cause the parent process to block while attempting to read
>>>>    data when none is available to be read from the child process.
>>>>
>>>> Rationale:
>>>>
>>>>    There is a documented need for asynchronous, non-blocking
>>>>    functionality in subprocess.Popen [6] [7] [2] [3].  Inclusion of the
>>>>    code would improve the utility of the Python standard library that
>>>>    can be used on Unix based and Windows builds of Python.  Practically
>>>>    every I/O object in Python has a file-like wrapper of some sort.
>>>>    Sockets already act as such and for strings there is StringIO.  Popen
>>>>    can be made to act like a file by simply using the methods attached
>>>>    to the subprocess.Popen.stderr, stdout and stdin file-like objects.
>>>>    But when using the read and write methods of those options, you do
>>>>    not have the benefit of asynchronous I/O.  In the proposed solution
>>>>    the wrapper wraps the asynchronous methods to mimic a file object.
>>>>
>>>> Reference Implementation:
>>>>
>>>>    I have been maintaining a Google Code repository that contains all of
>>>>    my changes including tests and documentation [9] as well as a blog
>>>>    detailing the problems I have come across in the development process
>>>>    [10].
>>>>
>>>>    I have been working on implementing non-blocking asynchronous I/O in
>>>>    the subprocess.Popen module as well as a wrapper class for
>>>>    subprocess.Popen that makes it so that an executed process can take
>>>>    the place of a file by duplicating all of the methods and attributes
>>>>    that file objects have.
>>>>
>>>
>> "Non-blocking" and "asynchronous" are actually two different things. From
>> the rest of this PEP, I think only a non-blocking API is being introduced.
>>  I haven't looked beyond the PEP, though, so I might be missing something.
>
>
> I suggest renaming http://www.python.org/dev/peps/pep-3145/ to
> 'Non-blocking I/O for subprocess' and continuing. IMHO this is the stage
> where examples of deadlocks that occur with the current subprocess
> implementation are badly needed.
>
>>>>    There are two base functions that have been added to the
>>>>    subprocess.Popen class: Popen.send and Popen._recv, each with two
>>>>    separate implementations, one for Windows and one for Unix based
>>>>    systems.  The Windows implementation uses ctypes to access the
>>>>    functions needed to control pipes in the kernel32 DLL in an
>>>>    asynchronous manner.  On Unix based systems, the Python interface for
>>>>    file control serves the same purpose.  The different implementations
>>>>    of Popen.send and Popen._recv have identical arguments to make code
>>>>    that uses these functions work across multiple platforms.
>>>>
>>>
>> Why does the method for non-blocking read from a pipe start with an "_"?
>> This is the convention (widely used) for a private API.  The name also
>> doesn't suggest that this is the non-blocking version of reading.
>> Similarly, the name "send" doesn't suggest that this is the non-blocking
>> version of writing.
>
>
> The implementation is based on http://code.activestate.com/recipes/440554/
> which more clearly illustrates the integrated functionality.
>
> _recv() is a private base function, which takes stdout or stderr as a
> parameter. The corresponding user-level functions to read from stdout and
> stderr are .recv() and .recv_err().
>
> I thought about renaming the API to .asyncread() and .asyncwrite(), but
> that may suggest that you call a method and the result then asynchronously
> starts to fill some buffer, which is not the case here.
>
> Then I thought about .check_read() and .check_write(), literally meaning
> 'check and read' or 'check and return' for non-blocking calls if there is
> nothing. But then again, the poor naming convention of subprocess uses
> .check_output() for a blocking read until the command completes.
>
> Currently, subprocess.Popen doesn't have .read and .write methods. Those
> may be the best option:
>   .write(what)  to pipe more stuff into input buffer of child process.
>   .read(from)  where `from` is either subprocess.STDOUT or STDERR
> Both functions should be marked as non-blocking in docs and return None
> if pipe is closed.
>
>>>>    When calling the Popen._recv function, it requires the pipe name be
>>>>    passed as an argument so there exists the Popen.recv function that
>>>>    selects stdout as the pipe for Popen._recv by default.  Popen.recv_err
>>>>    selects stderr as the pipe by default. "Popen.recv" and
>>>>    "Popen.recv_err" are much easier to read and understand than
>>>>    "Popen._recv('stdout' ..." and "Popen._recv('stderr' ..."
>>>>    respectively.
>>>>
>>>
>> What about reading from other file descriptors?  subprocess.Popen allows
>> arbitrary file descriptors to be used.  Is there any provision here for
>> reading and writing non-blocking from or to those?
>
>
> On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is
> select. Of course a test is needed, but why should it not just work?
>
>
>>>>    Since the Popen._recv function does not wait on data to be produced
>>>>    before returning a value, it may return empty bytes.  Popen.asyncread
>>>>    handles this issue by returning all data read over a given time
>>>>    interval.
>>>>
>>>
>> Oh.  Popen.asyncread?   What's that?  This is the first time the PEP
>> mentions it.
>
>
> I guess that's for a blocking read with a timeout.
> Among the most popular questions about Python, it ranks at about number
> 500:
> http://stackoverflow.com/questions/1191374/subprocess-with-timeout
>
>
>>>>    The ProcessIOWrapper class uses the asyncread and asyncwrite
>>>>    functions to allow a process to act like a file so that there are no
>>>>    blocking issues that can arise from using the stdout and stdin file
>>>>    objects produced from a subprocess.Popen call.
>>>>
>>>
>> What's the ProcessIOWrapper class?  And what's the asyncwrite function?
>> Again, this is the first time it's mentioned.
>>
>
> Oh. That's a wrapper to access subprocess pipes with the familiar file API.
> It is interesting:
>
> http://code.google.com/p/subprocdev/source/browse/subprocess.py?name=python3k
>
>
>> So, to sum up, I think my main comment is that the PEP seems to be
>> missing a significant portion of the details of what it's actually
>> proposing.  I suspect that this information is present in the
>> implementation, which I have not looked at, but it probably belongs in the
>> PEP.
>>
>> Jean-Paul
>>
>
> Writing PEPs is definitely a job, and a hard one for developers. Too bad a
> good idea *and* implementation (tests needed) are put on hold because there
> is nobody who can help with that part.
>
> IMHO the PEP needs to expand on user stories. Even though there is a
> significant number of cited sources, a practical summary and an
> illustration of the problem by examples are missing.