[Python-Dev] Status of PEP 3145 - Asynchronous I/O for subprocess.popen

Josiah Carlson josiah.carlson at gmail.com
Fri Mar 28 06:09:47 CET 2014


By digging into the internals of a subprocess produced by Popen(), you can
write in a blocking manner to the stdin pipe, and read in a blocking manner
from the stdout/stderr pipe(s). For scripting most command-line operations,
timeouts and the ability to *stop* trying to read are as important as being
able to spawn an external process. Their absence kind-of kills that side of
the usefulness of Python as a tool for scripting.
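
To make the blocking behaviour concrete, here is a minimal sketch of talking
to a child through its pipes. The child script (an upper-casing echo loop)
and the name blocking_exchange() are made up for illustration; only Popen()
and the pipe file objects come from the subprocess module itself.

```python
import subprocess
import sys

# Hypothetical child process: echoes each input line back in upper case.
CHILD = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


def blocking_exchange(message):
    """Send one line to the child and wait for one line back."""
    with subprocess.Popen(
        [sys.executable, "-c", CHILD],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ) as proc:
        proc.stdin.write(message + "\n")
        proc.stdin.flush()
        # readline() takes no timeout argument: if the child never answers,
        # this call never returns and the script cannot give up.
        reply = proc.stdout.readline()
    # Leaving the with-block closes the pipes and waits for the child.
    return reply.rstrip("\n")
```

This works as long as the child answers promptly. There is no way to say
"wait at most 5 seconds" on the readline() call, which is the gap being
discussed.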

The question is not whether or not a user of Python can dig into the
internals, make some calls, then get it to be non-blocking - the existence
of two different patches to do so (the most recent of which is from 4 1/2
years ago) shows that it *can* be done. The question is whether or not the
desire for the functionality warrants having functions or methods to
perform these operations in the standard library.

I and others have claimed that it should go into the standard library.
Heck, there was enough of a push that Eric got paid to write his version of
the functionality for a GSoC project in 2009. There has even been activity
on the bug itself unrelated to deferring discussions as recently as May
2012 (after which activity seems to have paused for reasons I don't know).
Some people have raised reasonable questions about the API and
implementation, but no one is willing to offer an alternative API that they
think would be better, so discussions about implementation of a
non-existent API for inclusion are moot.


But honestly, I have approximately zero faith that what I say or do will
lead to the inclusion of any changes to the subprocess module. Which is why
I'm offering to write a short example that uses asyncio for inclusion in
the docs. It's not what I've wanted for almost 9 years, but at least it has
a chance of actually happening. I'll take a chance at updating the docs
over 3 to 9 months of bikeshedding that just leads to rejection any day.
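
For a sense of scale, the following is a rough sketch of the kind of example
being offered: "read line", "read X bytes", "read what is available", and
"write this data", each with a timeout. It is an illustration under stated
assumptions, not the docs example itself. It uses async/await and
asyncio.run(), which are newer than this message, and the helper names
(send, recv_line, recv_exactly, recv_available, demo) and the echoing child
are hypothetical.

```python
import asyncio
import sys

# Hypothetical child process: echoes each input line back in upper case.
CHILD = (
    "import sys\n"
    "while True:\n"
    "    line = sys.stdin.readline()\n"
    "    if not line:\n"
    "        break\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


async def send(proc, data, timeout):
    """Write bytes to the child, waiting at most `timeout` for the flush."""
    proc.stdin.write(data)
    await asyncio.wait_for(proc.stdin.drain(), timeout)


async def recv_line(proc, timeout):
    """Read one line, raising asyncio.TimeoutError if none arrives in time."""
    return await asyncio.wait_for(proc.stdout.readline(), timeout)


async def recv_exactly(proc, n, timeout):
    """Read exactly n bytes or raise asyncio.TimeoutError."""
    return await asyncio.wait_for(proc.stdout.readexactly(n), timeout)


async def recv_available(proc, timeout, limit=65536):
    """Return whatever arrives within `timeout`, or b'' if nothing does."""
    try:
        # read(limit) returns as soon as at least one byte is buffered.
        return await asyncio.wait_for(proc.stdout.read(limit), timeout)
    except asyncio.TimeoutError:
        return b""


async def demo():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", CHILD,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    try:
        await send(proc, b"hello\n", timeout=5)
        line = await recv_line(proc, timeout=5)
        # Nothing else was sent, so the child stays silent and this read
        # gives up after 0.2 seconds instead of blocking forever.
        try:
            await recv_line(proc, timeout=0.2)
            timed_out = False
        except asyncio.TimeoutError:
            timed_out = True
    finally:
        proc.stdin.close()
        await proc.wait()
    return line, timed_out
```

Calling asyncio.run(demo()) drives the whole exchange from ordinary
synchronous code, so a script does not have to be restructured around an
event loop to use it.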


So yeah. Someone want to make a decision? Tell me to write the docs, I
will. Tell me to go take a long walk off a short pier, I'll thank you for
your time and leave you alone.

 - Josiah



On Thu, Mar 27, 2014 at 7:18 PM, Terry Reedy <tjreedy at udel.edu> wrote:

> On 3/27/2014 9:16 PM, Josiah Carlson wrote:
>
>> You don't understand the point because you don't understand the feature
>> request or PEP. That is probably my fault for not communicating the
>> intent better in the past. The feature request and PEP were written to
>> offer something like the below (or at least enough that the below could
>> be built with minimal effort):
>>
>> def do_login(...):
>>      proc = subprocess.Popen(...)
>>      current = proc.recv(timeout=5)
>>      last_line = current.rstrip().rpartition('\n')[-1]
>>      if last_line.endswith('login:'):
>>          proc.send(username)
>>          if proc.readline(timeout=5).rstrip().endswith('password:'):
>>              proc.send(password)
>>              if 'welcome' in proc.recv(timeout=5).lower():
>>                  return proc
>>      proc.kill()
>>
>> The API above can be very awkward (as shown :P ), but that's okay. From
>> those building blocks a (minimally) enterprising user would add
>> functionality to suit their needs. The existing subprocess module only
>> offers two methods for *any* amount of communication over pipes with the
>> subprocess: check_output() and communicate(), only the latter of which
>> supports sending data (once, limited by system-level pipe buffer
>> lengths). Neither allows for nontrivial interactions from a single
>> subprocess.Popen() invocation.
>>
>
> According to my reading of the doc, one should (in the absence of
> deadlocks, and without having timeouts) be able to use proc.stdin.write and
> proc.stdout.read. Do those not actually work?
>
>
>
>> The purpose was to be able to communicate
>> in a bidirectional manner with a subprocess without blocking, or
>> practically speaking, blocking with a timeout. That's where the "async"
>> term comes from. Again, there was never any intent to have the
>> functionality be part of asyncore or any other asynchronous sockets
>> framework, which is why there are no handle_*() methods, readable(),
>> writable(), etc.
>>
>> Your next questions will be: But why bother at all? Why not just build
>> the piece you need *inside* asyncio? Why does this need anything more?
>> The answer to those questions is wants and needs. If I'm a user that
>> needs interactive subprocess handling, I want to be able to do something
>> like the code snippet above. The last thing I need is to have to rewrite
>> the way my application/script/whatever handles *everything* just because
>> a new asynchronous IO library has been included in the Python standard
>> library - it's a bit like selling you a $300 bicycle when you need a $20
>> wheel for your scooter.
>>
>> That there *now* exists the ability to have async subprocesses as part
>> of asyncio is a fortunate happenstance, as the necessary underlying
>> tools for building the above now exist in the standard library. It's a
>> matter of properly embedding the asyncio-related bits inside a handful
>> of functions to provide something like the above, which is what I was
>> offering to write. But why not keep working on the subprocess module?
>> Yep. Tried that. Coming up on 9 years since I created the feature
>> request and original ActiveState recipe. Going that route would take
>> 2-3 times as much work as has already been dedicated just to get
>> somewhere remotely acceptable for inclusion in Python 3.5, and the more
>> likely outcome is another rejection for the same reasons it has been in
>> limbo.
>>
>> But here's the thing: I can build enough using asyncio in 30-40 lines of
>> Python to offer something like the above API. The problem is that it
>> really has no natural home. It uses asyncio, so makes no sense to put in
>> subprocess. It doesn't fit the typical asyncio behavior, so doesn't make
>> sense to put in asyncio. The required functionality isn't big enough to
>> warrant a submodule anywhere. Heck, it's even way too small to toss into
>> an external PyPI module. But in the docs? It would show an atypical, but
>> not wholly unreasonable use of asyncio (the existing example already
>> shows what I would consider to be an atypical use of asyncio). It would
>> provide a good starting point for someone who just wants/needs something
>> like the snippet above. It is *yet another* use-case for asyncio. And it
>> could spawn a larger library for offering a more fleshed-out
>> subprocess-related API, though that is probably more wishful thinking on
>> my part than anything.
>>
>>   - Josiah
>>
>>
>>
>> On Thu, Mar 27, 2014 at 4:24 PM, Victor Stinner
>> <victor.stinner at gmail.com> wrote:
>>
>>     2014-03-27 22:52 GMT+01:00 Josiah Carlson <josiah.carlson at gmail.com>:
>>
>>      > * Because it is example docs, maybe a multi-week bikeshedding
>>      > discussion about API doesn't need to happen (as long as "read
>>      > line", "read X bytes", "read what is available", and "write this
>>      > data" - all with timeouts - are shown, people can build everything
>>      > else they want/need)
>>
>>     I don't understand this point. Using asyncio, you can read and write a
>>     single byte or a whole line. Using functions like asyncio.wait_for(),
>>     it's easy to add a timeout on such operation.
>>
>>     Victor
>>
>>
>>
>>
>>
>
> --
> Terry Jan Reedy
>