[Python-Dev] PEP 3145 (With Contents)
Gregory P. Smith
greg at krypto.org
Sun Dec 9 05:17:52 CET 2012
I'm really not sure what this PEP is trying to get at, given that it
contains no examples and, from the descriptions, sounds like it is adding a
complicated API on top of something that already, IMNSHO, has too much in it
(subprocess.Popen).
Regardless, any user can use the stdout/err/in file objects with their own
code that handles them asynchronously (yes that can be painful but that is
what is required for _any_ socket or pipe I/O you don't want to block on).
It *sounds* to me like this entire PEP could be written and released as a
third party module on PyPI that offers a subprocess.Popen subclass adding
some more convenient non-blocking APIs. That's where I'd start if I were
interested in this as a future feature.
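
A minimal sketch of what such a subclass might look like, assuming a POSIX
system and Python 3.5+ (for os.set_blocking). The class name NonBlockingPopen
is hypothetical, and the recv/recv_err/_recv names only mirror the ones
mentioned in the PEP text quoted further down; this is not the PEP's reference
implementation. The return convention (b"" when nothing is available, None
once the pipe is closed) is likewise an assumption made for illustration.

```python
import os
import subprocess
import sys
import time


class NonBlockingPopen(subprocess.Popen):
    """Hypothetical Popen subclass; name and methods are illustrative only."""

    def _recv(self, pipe, maxsize):
        # pipe is self.stdout or self.stderr; None means it was not redirected.
        if pipe is None:
            return None
        fd = pipe.fileno()
        os.set_blocking(fd, False)  # do not block when the pipe is empty
        try:
            data = os.read(fd, maxsize)
        except BlockingIOError:
            return b""  # nothing available right now
        return data if data else None  # empty read: the child closed the pipe

    def recv(self, maxsize=4096):
        return self._recv(self.stdout, maxsize)

    def recv_err(self, maxsize=4096):
        return self._recv(self.stderr, maxsize)


proc = NonBlockingPopen(
    [sys.executable, "-c", "import time; time.sleep(0.5); print('done')"],
    stdout=subprocess.PIPE,
)
first = proc.recv()  # the child is still sleeping, so nothing is available
chunks = []
while True:
    data = proc.recv()
    if data is None:  # pipe closed: the child has finished writing
        break
    if data:
        chunks.append(data)
    else:
        time.sleep(0.05)  # the parent is free to do other work here
proc.stdout.close()
proc.wait()
```

The sketch reads the descriptor directly with os.read and never mixes that
with proc.stdout.read(), since the buffered file object and the raw reads
would otherwise compete for the same data.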
-gps
On Fri, Dec 7, 2012 at 5:10 PM, anatoly techtonik <techtonik at gmail.com> wrote:
> On Tue, Sep 15, 2009 at 9:24 PM, <exarkun at twistedmatrix.com> wrote:
>
>> On 04:25 pm, eric.pruitt at gmail.com wrote:
>>
>>> I'm bumping this PEP again in hopes of getting some feedback.
>>>
>>
> This is useful, indeed. The ActiveState recipe for this has 10 votes, which
> is high for ActiveState (and for such a hardcore topic, FWIW).
>
>
>> On Tue, Sep 8, 2009 at 23:52, Eric Pruitt <eric.pruitt at gmail.com> wrote:
>>>
>>>> PEP: 3145
>>>> Title: Asynchronous I/O For subprocess.Popen
>>>> Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
>>>> Type: Standards Track
>>>> Content-Type: text/plain
>>>> Created: 04-Aug-2009
>>>> Python-Version: 3.2
>>>>
>>>> Abstract:
>>>>
>>>> In its present form, the subprocess.Popen implementation is prone to
>>>> dead-locking and blocking of the parent Python script while waiting
>>>> on data
>>>> from the child process.
>>>>
>>>> Motivation:
>>>>
>>>> A search for "python asynchronous subprocess" will turn up numerous
>>>> accounts of people wanting to execute a child process and
>>>> communicate with
>>>> it from time to time reading only the data that is available instead
>>>> of
>>>> blocking to wait for the program to produce data [1] [2] [3]. The
>>>> current
>>>> behavior of the subprocess module is that when a user sends or
>>>> receives
>>>> data via the stdin, stderr and stdout file objects, dead locks are
>>>> common
>>>> and documented [4] [5]. While communicate can be used to alleviate
>>>> some of
>>>> the buffering issues, it will still cause the parent process to
>>>> block while
>>>> attempting to read data when none is available to be read from the
>>>> child
>>>> process.
>>>>
>>>> Rationale:
>>>>
>>>> There is a documented need for asynchronous, non-blocking
>>>> functionality in
>>>> subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would
>>>> improve the
>>>> utility of the Python standard library that can be used on Unix
>>>> based and
>>>> Windows builds of Python. Practically every I/O object in Python
>>>> has a
>>>> file-like wrapper of some sort. Sockets already act as such and for
>>>> strings there is StringIO. Popen can be made to act like a file by
>>>> simply
>>>> using the methods attached to the subprocess.Popen.stderr, stdout
>>>> and
>>>> stdin file-like objects. But when using the read and write methods
>>>> of
>>>> those objects, you do not have the benefit of asynchronous I/O. In
>>>> the
>>>> proposed solution the wrapper wraps the asynchronous methods to
>>>> mimic a
>>>> file object.
>>>>
>>>> Reference Implementation:
>>>>
>>>> I have been maintaining a Google Code repository that contains all
>>>> of my
>>>> changes including tests and documentation [9] as well as a blog
>>>> detailing
>>>> the problems I have come across in the development process [10].
>>>>
>>>> I have been working on implementing non-blocking asynchronous I/O in
>>>> the
>>>> subprocess.Popen module as well as a wrapper class for
>>>> subprocess.Popen
>>>> that makes it so that an executed process can take the place of a
>>>> file by
>>>> duplicating all of the methods and attributes that file objects have.
>>>>
>>>
>> "Non-blocking" and "asynchronous" are actually two different things. From
>> the rest of this PEP, I think only a non-blocking API is being introduced.
>> I haven't looked beyond the PEP, though, so I might be missing something.
>
>
> I suggest renaming http://www.python.org/dev/peps/pep-3145/ to
> 'Non-blocking I/O for subprocess' and continuing. IMHO this is the stage
> where examples of deadlocks that occur with the current subprocess
> implementation are badly needed.
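
One such deadlock can be sketched as follows. The child writes more than an
OS pipe buffer is assumed to hold (commonly 64 KiB or less, but the exact
capacity is platform dependent), while the parent waits for it to exit without
reading. The helper name wait_without_reading and the 1,000,000-byte payload
are made up for this illustration, and a timeout (Popen.wait(timeout=...),
available since Python 3.3) is used so that the demonstration returns instead
of hanging.

```python
import subprocess
import sys

# The child writes far more than a typical pipe buffer can hold.
CHILD_CODE = "import sys; sys.stdout.write('x' * 1000000)"


def wait_without_reading(timeout=1.0):
    """Return (timed_out, bytes_drained) for a child the parent does not read."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
    )
    try:
        # The parent waits for the child to exit while the child is blocked
        # writing to a full pipe; without the timeout this would never return.
        proc.wait(timeout=timeout)
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    # communicate() drains the pipe, which finally lets the child finish.
    out, _ = proc.communicate()
    return timed_out, len(out)
```

Calling wait_without_reading() is expected to report a timeout, because the
child cannot exit until somebody reads its output.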
>
>>>> There are two base functions that have been added to the
>>>> subprocess.Popen
>>>> class: Popen.send and Popen._recv, each with two separate
>>>> implementations,
>>>> one for Windows and one for Unix based systems. The Windows
>>>> implementation uses ctypes to access the functions needed to control
>>>> pipes
>>>> in the kernel32 DLL in an asynchronous manner. On Unix based
>>>> systems,
>>>> the Python interface for file control serves the same purpose. The
>>>> different implementations of Popen.send and Popen._recv have
>>>> identical
>>>> arguments to make code that uses these functions work across multiple
>>>> platforms.
>>>>
>>>
>> Why does the method for non-blocking read from a pipe start with an "_"?
>> This is the convention (widely used) for a private API. The name also
>> doesn't suggest that this is the non-blocking version of reading.
>> Similarly, the name "send" doesn't suggest that this is the non-blocking
>> version of writing.
>
>
> The implementation is based on http://code.activestate.com/recipes/440554/
> which more clearly illustrates the integrated functionality.
>
> _recv() is a private base function, which takes stdout or stderr as a
> parameter. The corresponding user-level functions to read from stdout and
> stderr are .recv() and .recv_err().
>
> I thought about renaming the API to .asyncread() and .asyncwrite(), but that
> may suggest that you call the method and the result then asynchronously
> starts to fill some buffer, which is not the case here.
>
> Then I thought about .check_read() and .check_write(), literally meaning
> 'check and read' or 'check and return' for non-blocking calls if there is
> nothing. But then again, subprocess, with its poor naming convention,
> already uses .check_output() for a blocking read until the command completes.
>
> Currently, subprocess doesn't have .read and .write methods. They may be the
> best option:
> .write(what) to pipe more stuff into the input buffer of the child process.
> .read(from) where `from` is either subprocess.STDOUT or STDERR
> Both functions should be documented as non-blocking and as returning None
> if the pipe is closed.
>
>>>> When calling the Popen._recv function, it requires the pipe name be
>>>> passed as an argument so there exists the Popen.recv function that
>>>> selects stdout as the pipe for Popen._recv by default.
>>>> Popen.recv_err
>>>> selects stderr as the pipe by default. "Popen.recv" and
>>>> "Popen.recv_err"
>>>> are much easier to read and understand than "Popen._recv('stdout'
>>>> ..." and
>>>> "Popen._recv('stderr' ..." respectively.
>>>>
>>>
>> What about reading from other file descriptors? subprocess.Popen allows
>> arbitrary file descriptors to be used. Is there any provision here for
>> reading and writing non-blocking from or to those?
>
>
> On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is
> select. Of course a test is needed, but why should it not just work?
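
For the POSIX side, a sketch of a select-based check on an arbitrary pipe
descriptor, not only a child's stdout or stderr. The helper name
read_available and its return convention (b"" when nothing is ready, None at
end of file) are assumptions for this illustration. On Windows, select accepts
only sockets, so the PeekNamedPipe route would be needed there and is not
shown.

```python
import os
import select


def read_available(fd, maxsize=4096):
    """Return bytes ready on fd, b"" if none are ready, or None at end of file."""
    ready, _, _ = select.select([fd], [], [], 0)  # timeout 0: poll, never wait
    if not ready:
        return b""
    data = os.read(fd, maxsize)
    return data if data else None


# Any pipe descriptor works the same way as a child's stdout would.
read_fd, write_fd = os.pipe()
empty = read_available(read_fd)    # nothing has been written yet
os.write(write_fd, b"hello")
ready = read_available(read_fd)    # data is waiting now
os.close(write_fd)
closed = read_available(read_fd)   # writer is gone: end of file
os.close(read_fd)
```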
>
>
>>>> Since the Popen._recv function does not wait on data to be produced
>>>> before returning a value, it may return empty bytes. Popen.asyncread
>>>> handles this issue by returning all data read over a given time
>>>> interval.
>>>>
>>>
>> Oh. Popen.asyncread? What's that? This is the first time the PEP
>> mentions it.
>
>
> I guess that's for blocking read with timeout.
> Among the most popular questions about Python, this is question number
> ~500.
> http://stackoverflow.com/questions/1191374/subprocess-with-timeout
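
If "returning all data read over a given time interval" is read as a blocking
read bounded by a deadline, it could look roughly like this on POSIX. The
function name read_for and its parameters are hypothetical and are not taken
from the PEP or its reference implementation.

```python
import os
import select
import subprocess
import sys
import time


def read_for(fd, interval, maxsize=4096):
    """Collect whatever arrives on fd within `interval` seconds, or until EOF."""
    deadline = time.monotonic() + interval
    chunks = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            break  # the interval elapsed with nothing further to read
        data = os.read(fd, maxsize)
        if not data:
            break  # end of file: the writer closed the pipe
        chunks.append(data)
    return b"".join(chunks)


proc = subprocess.Popen(
    [sys.executable, "-c", "print('ready')"],
    stdout=subprocess.PIPE,
)
output = read_for(proc.stdout.fileno(), 5.0)  # returns early at end of file
proc.stdout.close()
proc.wait()
```

The call never waits longer than the interval, and it returns sooner when the
child closes its end of the pipe.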
>
>
>>>> The ProcessIOWrapper class uses the asyncread and asyncwrite
>>>> functions to
>>>> allow a process to act like a file so that there are no blocking
>>>> issues
>>>> that can arise from using the stdout and stdin file objects produced
>>>> from
>>>> a subprocess.Popen call.
>>>>
>>>
>> What's the ProcessIOWrapper class? And what's the asyncwrite function?
>> Again, this is the first time it's mentioned.
>>
>
> Oh. That's a wrapper to access subprocess pipes with the familiar file API.
> It is interesting:
>
> http://code.google.com/p/subprocdev/source/browse/subprocess.py?name=python3k
>
>
>> So, to sum up, I think my main comment is that the PEP seems to be
>> missing a significant portion of the details of what it's actually
>> proposing. I suspect that this information is present in the
>> implementation, which I have not looked at, but it probably belongs in the
>> PEP.
>>
>> Jean-Paul
>>
>
> Writing PEPs is definitely a job, and a hard one for developers. Too bad a
> good idea *and* implementation (tests needed) are put on hold because there
> is nobody who can help with that part.
>
> IMHO the PEP needs to expand on user stories. Even though there is a
> significant number of cited sources, a practical summary and an illustration
> of the problem by examples are missing.
>