By digging into the internals of a subprocess produced by Popen(), you can write to the stdin pipe and read from the stdout/stderr pipe(s), but only in a blocking manner. For scripting most command-line operations, having timeouts - the ability to *stop* trying to read - is as important as being able to spawn the external process in the first place. Their absence kind-of kills that side of Python's usefulness as a scripting tool.
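
Concretely, the blocking approach looks something like this (the command here is hypothetical):

import subprocess

proc = subprocess.Popen(['some_interactive_tool'],   # hypothetical command
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)
proc.stdin.write(b'status\n')
proc.stdin.flush()
# Blocks until the child writes a full line; there is no way to say
# "give up after 5 seconds" without digging into the pipe yourself.
line = proc.stdout.readline()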

The question is not whether a user of Python can dig into the internals, make some calls, and get non-blocking behavior - the existence of two different patches that do exactly that (the most recent of which is from 4 1/2 years ago) shows that it *can* be done. The question is whether the desire for the functionality warrants having functions or methods in the standard library to perform these operations.

I and others have claimed that it should go into the standard library. Heck, there was enough of a push that Eric got paid to write his version of the functionality for a GSoC project in 2009. There has even been activity on the bug itself - beyond discussions about deferring it - as recently as May 2012 (after which things seem to have stalled for reasons I don't know). Some people have raised reasonable questions about the API and implementation, but no one has been willing to offer an alternative API they think would be better, so discussing the implementation of a non-existent alternative API is moot.


But honestly, I have approximately zero faith that what I say or do will lead to the inclusion of any changes to the subprocess module. Which is why I'm offering to write a short example that uses asyncio for inclusion in the docs. It's not what I've wanted for almost 9 years, but at least it has a chance of actually happening. I'll take updating the docs over 3 to 9 months of bikeshedding that ends in rejection, any day.


So yeah. Someone want to make a decision? Tell me to write the docs, I will. Tell me to go take a long walk off a short pier, I'll thank you for your time and leave you alone.

 - Josiah



On Thu, Mar 27, 2014 at 7:18 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 3/27/2014 9:16 PM, Josiah Carlson wrote:
You don't understand the point because you don't understand the feature
request or PEP. That is probably my fault for not communicating the
intent better in the past. The feature request and PEP were written to
offer something like the below (or at least enough that the below could
be built with minimal effort):

def do_login(...):
    proc = subprocess.Popen(...)
    current = proc.recv(timeout=5)
    last_line = current.rstrip().rpartition('\n')[-1]
    if last_line.endswith('login:'):
        proc.send(username)
        if proc.readline(timeout=5).rstrip().endswith('password:'):
            proc.send(password)
            if 'welcome' in proc.recv(timeout=5).lower():
                return proc
    proc.kill()

The API above can be very awkward (as shown :P ), but that's okay. From
those building blocks a (minimally) enterprising user could add
functionality to suit their needs. The existing subprocess module only
offers two ways to perform *any* amount of communication over pipes with
the subprocess: check_output() and communicate(), and only the latter
supports sending data (once, limited by system-level pipe buffer
lengths). Neither allows for nontrivial interaction from a single
subprocess.Popen() invocation.
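
For comparison, here is roughly what communicate() permits today (the
command is hypothetical): all of the input is handed over up front, and
the call does not return until the process exits, so there is no "look at
the output, then decide what to send next":

import subprocess

proc = subprocess.Popen(['some_interactive_tool'],   # hypothetical command
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
# One shot: input is written, then we wait for the process to finish
# (or for the 5 second timeout to raise TimeoutExpired).
out, err = proc.communicate(input=b'first command\n', timeout=5)
# out and err now hold everything the child wrote; the pipes are closed
# and the process has been waited on, so no further interaction is possible.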

According to my reading of the doc, one should (in the absence of deadlocks, and without having timeouts) be able to use proc.stdin.write and proc.stdout.read. Do those not actually work?



 The purpose was to be able to communicate
in a bidirectional manner with a subprocess without blocking, or
practically speaking, blocking with a timeout. That's where the "async"
term comes from. Again, there was never any intent to have the
functionality be part of asyncore or any other asynchronous sockets
framework, which is why there are no handle_*() methods, readable(),
writable(), etc.

Your next questions will be: But why bother at all? Why not just build
the piece you need *inside* asyncio? Why does this need anything more?
The answer to those questions is wants and needs. If I'm a user who
needs interactive subprocess handling, I want to be able to do something
like the code snippet above. The last thing I need is to have to rewrite
the way my application/script/whatever handles *everything* just because
a new asynchronous IO library has been included in the Python standard
library - it's a bit like selling you a $300 bicycle when you need a $20
wheel for your scooter.

That there *now* exists the ability to have async subprocesses as part
of asyncio is a fortunate happenstance, as the necessary underlying
tools for building the above now exist in the standard library. It's a
matter of properly embedding the asyncio-related bits inside a handful
of functions to provide something like the above, which is what I was
offering to write. But why not keep working on the subprocess module?
Yep. Tried that. Coming up on 9 years since I created the feature
request and original ActiveState recipe. Going that route would take 2-3
times as much work as has already been put in just to get something
remotely acceptable for inclusion in Python 3.5 - and, more likely, it
would end in rejection for the same reasons it has been in limbo.

But here's the thing: I can build enough using asyncio in 30-40 lines of
Python to offer something like the above API. The problem is that it
really has no natural home. It uses asyncio, so it makes no sense to put
it in subprocess. It doesn't fit typical asyncio behavior, so it doesn't
make sense to put it in asyncio. The required functionality isn't big enough to
warrant a submodule anywhere. Heck, it's even way too small to toss into
an external PyPI module. But in the docs? It would show an atypical, but
not wholly unreasonable use of asyncio (the existing example already
shows what I would consider to be an atypical use of asyncio). It would
provide a good starting point for someone who just wants/needs something
like the snippet above. It is *yet another* use-case for asyncio. And it
could spawn a larger library for offering a more fleshed-out
subprocess-related API, though that is probably more wishful thinking on
my part than anything.
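
For concreteness, a rough sketch of what those 30-40 lines might look like
(the names and details here are purely illustrative, not a settled API;
assumes Python 3.4's asyncio, and on Windows a ProactorEventLoop would be
needed for subprocess support):

import asyncio

class InteractiveProcess:
    """Illustrative wrapper: blocking-with-timeout calls built on asyncio."""

    def __init__(self, *cmd):
        self.loop = asyncio.get_event_loop()
        self.proc = self.loop.run_until_complete(
            asyncio.create_subprocess_exec(
                *cmd,
                stdin=asyncio.subprocess.PIPE,
                stdout=asyncio.subprocess.PIPE))

    def _run(self, coro, timeout, default=b''):
        # Drive a coroutine to completion, giving up after `timeout` seconds.
        try:
            return self.loop.run_until_complete(
                asyncio.wait_for(coro, timeout))
        except asyncio.TimeoutError:
            return default

    def recv(self, maxsize=65536, timeout=None):
        # Read whatever is available, up to maxsize bytes; b'' on timeout.
        return self._run(self.proc.stdout.read(maxsize), timeout)

    def readline(self, timeout=None):
        return self._run(self.proc.stdout.readline(), timeout)

    def send(self, data, timeout=None):
        self.proc.stdin.write(data)
        return self._run(self.proc.stdin.drain(), timeout, default=None)

    def kill(self):
        self.proc.kill()

With something like that in place, the do_login() snippet above becomes a
nearly mechanical translation (bytes instead of str, explicit newlines on
send()).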

  - Josiah



On Thu, Mar 27, 2014 at 4:24 PM, Victor Stinner
<victor.stinner@gmail.com> wrote:

    2014-03-27 22:52 GMT+01:00 Josiah Carlson <josiah.carlson@gmail.com>:

    > * Because it is example docs, maybe a multi-week bikeshedding discussion
    > about API doesn't need to happen (as long as "read line", "read X bytes",
    > "read what is available", and "write this data" - all with timeouts - are
    > shown, people can build everything else they want/need)

    I don't understand this point. Using asyncio, you can read and write a
    single byte or a whole line. Using functions like asyncio.wait_for(),
    it's easy to add a timeout on such an operation.
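
    For instance, roughly (inside a coroutine, with proc created by
    asyncio.create_subprocess_exec(), and an arbitrary 5-second timeout):

        line = yield from asyncio.wait_for(proc.stdout.readline(), timeout=5.0)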

    Victor






--
Terry Jan Reedy