I'm really not sure what this PEP is trying to get at, given that it contains no examples and, from the descriptions, sounds like it is adding a complicated API on top of something that already, IMNSHO, has too much in it (subprocess.Popen). Regardless, any user can use the stdout/err/in file objects with their own code that handles them asynchronously (yes, that can be painful, but that is what is required for _any_ socket or pipe I/O you don't want to block on). It *sounds* to me like this entire PEP could be written and released as a third-party module on PyPI that offers a subprocess.Popen subclass adding some more convenient non-blocking APIs. That's where I'd start if I were interested in this as a future feature.

-gps

On Fri, Dec 7, 2012 at 5:10 PM, anatoly techtonik <techtonik@gmail.com> wrote:
On Tue, Sep 15, 2009 at 9:24 PM, <exarkun@twistedmatrix.com> wrote:
On 04:25 pm, eric.pruitt@gmail.com wrote:
I'm bumping this PEP again in hopes of getting some feedback.
This is useful, indeed. The ActiveState recipe for this has 10 votes, which is high for ActiveState (and for such a hardcore topic, FWIW).
On Tue, Sep 8, 2009 at 23:52, Eric Pruitt <eric.pruitt@gmail.com> wrote:
PEP: 3145
Title: Asynchronous I/O For subprocess.Popen
Author: (James) Eric Pruitt, Charles R. McCreary, Josiah Carlson
Type: Standards Track
Content-Type: text/plain
Created: 04-Aug-2009
Python-Version: 3.2
Abstract:
In its present form, the subprocess.Popen implementation is prone to deadlocking and to blocking the parent Python script while waiting on data from the child process.
Motivation:
A search for "python asynchronous subprocess" will turn up numerous accounts of people wanting to execute a child process and communicate with it from time to time, reading only the data that is available instead of blocking to wait for the program to produce data [1] [2] [3]. The current behavior of the subprocess module is that when a user sends or receives data via the stdin, stderr and stdout file objects, deadlocks are common and documented [4] [5]. While communicate can be used to alleviate some of the buffering issues, it will still cause the parent process to block while attempting to read data when none is available to be read from the child process.
Rationale:
There is a documented need for asynchronous, non-blocking functionality in subprocess.Popen [6] [7] [2] [3]. Inclusion of the code would improve the utility of the Python standard library on both Unix-based and Windows builds of Python. Practically every I/O object in Python has a file-like wrapper of some sort. Sockets already act as such, and for strings there is StringIO. Popen can be made to act like a file by simply using the methods attached to the subprocess.Popen.stderr, stdout and stdin file-like objects. But when using the read and write methods of those objects, you do not have the benefit of asynchronous I/O. In the proposed solution the wrapper wraps the asynchronous methods to mimic a file object.
Reference Implementation:
I have been maintaining a Google Code repository that contains all of my changes, including tests and documentation [9], as well as a blog detailing the problems I have come across in the development process [10].
I have been working on implementing non-blocking asynchronous I/O in the subprocess.Popen module as well as a wrapper class for subprocess.Popen that makes it so that an executed process can take the place of a file by duplicating all of the methods and attributes that file objects have.
"Non-blocking" and "asynchronous" are actually two different things. From the rest of this PEP, I think only a non-blocking API is being introduced. I haven't looked beyond the PEP, though, so I might be missing something.
I suggest renaming http://www.python.org/dev/peps/pep-3145/ to 'Non-blocking I/O for subprocess' and continuing. IMHO this is the stage where examples of the deadlocks that occur with the current subprocess implementation are badly needed.
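For illustration, a minimal sketch of the classic deadlock (not from the PEP; Unix-only since it uses cat) could look like this:

    import subprocess

    # The child echoes stdin to stdout.  Once the OS pipe buffers on both
    # sides fill up (typically 64 KiB), the parent blocks in write() while
    # the child blocks in its own write(), and neither side ever proceeds.
    proc = subprocess.Popen(['cat'], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE)
    payload = b'x' * (1 << 22)        # 4 MiB, far larger than a pipe buffer
    proc.stdin.write(payload)         # deadlocks here
    proc.stdin.close()
    print(len(proc.stdout.read()))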
There are two base functions that have been added to the
subprocess.Popen class: Popen.send and Popen._recv, each with two separate implementations, one for Windows and one for Unix-based systems. The Windows implementation uses ctypes to access the functions needed to control pipes in the kernel32 DLL in an asynchronous manner. On Unix-based systems, the Python interface for file control serves the same purpose. The different implementations of Popen.send and Popen._recv have identical arguments to make code that uses these functions work across multiple platforms.
Why does the method for non-blocking read from a pipe start with an "_"? This is the convention (widely used) for a private API. The name also doesn't suggest that this is the non-blocking version of reading. Similarly, the name "send" doesn't suggest that this is the non-blocking version of writing.
The implementation is based on http://code.activestate.com/recipes/440554/ which more clearly illustrates the integrated functionality.
_recv() is a private base function, which takes stdout or stderr as a parameter. The corresponding user-level functions to read from stdout and stderr are .recv() and .recv_err().
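On the Unix side, a rough sketch of what that base function could look like, following the fcntl approach described above (my guess; the actual reference implementation may differ):

    import errno, fcntl, os

    def _recv(self, which, maxsize=1024):
        # 'which' is the pipe name, 'stdout' or 'stderr'
        pipe = getattr(self, which)
        if pipe is None:
            return None
        fd = pipe.fileno()
        flags = fcntl.fcntl(fd, fcntl.F_GETFL)
        fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)
        try:
            return os.read(fd, maxsize)            # b'' on EOF
        except OSError as e:
            if e.errno == errno.EAGAIN:            # nothing available right now
                return b''
            raise
        finally:
            fcntl.fcntl(fd, fcntl.F_SETFL, flags)  # restore blocking mode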
I thought about renaming the API to .asyncread() and .asyncwrite(), but that may imply that you call the method and the result then asynchronously starts to fill some buffer, which is not the case here.
Then I thought about .check_read() and .check_write(), literally meaning 'check and read', or 'check and return' for non-blocking calls when there is nothing to read. But then again, the poor naming convention of subprocess uses .check_output() for a blocking read until the command completes.
Currently, subprocess doesn't have .read and .write methods. That may be the best option: .write(what) to pipe more stuff into the input buffer of the child process, and .read(from) where `from` is either subprocess.STDOUT or STDERR. Both functions should be marked as non-blocking in the docs and should return None if the pipe is closed.
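A hypothetical usage sketch of that proposal (none of these methods exist in the stdlib today, and note that `from` is a Python keyword, so the argument would have to be positional or renamed; a STDERR constant would also have to be added):

    import subprocess

    proc = subprocess.Popen(['some-command'], stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    proc.write(b'request\n')               # non-blocking push into child stdin
    chunk = proc.read(subprocess.STDOUT)   # whatever is available right now,
                                           # or None if the pipe is closed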
When calling the Popen._recv function, it requires the pipe name be
passed as an argument, so there exists the Popen.recv function, which selects stdout as the pipe for Popen._recv by default. Popen.recv_err selects stderr as the pipe by default. "Popen.recv" and "Popen.recv_err" are much easier to read and understand than "Popen._recv('stdout' ..." and "Popen._recv('stderr' ..." respectively.
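Presumably these are just thin shortcuts over the base function, along the lines of (my guess, consistent with the _recv sketch above):

    def recv(self, maxsize=1024):
        # read whatever is currently available on the child's stdout
        return self._recv('stdout', maxsize)

    def recv_err(self, maxsize=1024):
        # read whatever is currently available on the child's stderr
        return self._recv('stderr', maxsize)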
What about reading from other file descriptors? subprocess.Popen allows arbitrary file descriptors to be used. Is there any provision here for reading and writing non-blocking from or to those?
On Windows it is WriteFile/ReadFile and PeekNamedPipe. On Linux it is select. Of course a test is needed, but why should it not just work?
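For instance, on the Unix side a readiness check via select works on any pipe file descriptor, not just the child's stdout/stderr, which is why I would expect arbitrary descriptors to work (untested assumption):

    import select

    def readable(pipe, timeout=0):
        # True if a read on this pipe would not block right now
        ready, _, _ = select.select([pipe.fileno()], [], [], timeout)
        return bool(ready)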
Since the Popen._recv function does not wait on data to be produced
before returning a value, it may return empty bytes. Popen.asyncread handles this issue by returning all data read over a given time interval.
Oh. Popen.asyncread? What's that? This is the first time the PEP mentions it.
I guess that's for a blocking read with a timeout. Among the most popular questions about Python, it is question number ~500. http://stackoverflow.com/questions/1191374/subprocess-with-timeout
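Something along these lines, I assume (my own approximation of a "collect whatever arrives within a time interval" helper, not the PEP's actual asyncread; Unix-only):

    import os, select, time

    def read_for(proc, timeout):
        # Gather everything the child writes to stdout within `timeout` seconds.
        deadline = time.time() + timeout
        fd = proc.stdout.fileno()
        chunks = []
        while True:
            remaining = deadline - time.time()
            if remaining <= 0:
                break
            ready, _, _ = select.select([fd], [], [], remaining)
            if not ready:                  # nothing more arrived before the deadline
                break
            data = os.read(fd, 4096)
            if not data:                   # EOF, child closed its stdout
                break
            chunks.append(data)
        return b''.join(chunks)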
The ProcessIOWrapper class uses the asyncread and asyncwrite
functions to allow a process to act like a file so that there are no blocking issues that can arise from using the stdout and stdin file objects produced from a subprocess.Popen call.
What's the ProcessIOWrapper class? And what's the asyncwrite function? Again, this is the first time it's mentioned.
Oh. That's a wrapper to access subprocess pipes through the familiar file API. It is interesting:
http://code.google.com/p/subprocdev/source/browse/subprocess.py?name=python3...
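Usage would presumably look something like this (the constructor signature is my guess, not verified against the repository):

    # Hypothetical: drive the child through the ordinary file API instead of
    # touching the raw stdin/stdout pipes directly.
    wrapped = ProcessIOWrapper(['grep', 'needle'])
    wrapped.write('a haystack with a needle in it\n')
    line = wrapped.readline()      # should not deadlock on an empty pipe
    wrapped.close()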
So, to sum up, I think my main comment is that the PEP seems to be missing a significant portion of the details of what it's actually proposing. I suspect that this information is present in the implementation, which I have not looked at, but it probably belongs in the PEP.
Jean-Paul
Writing PEPs is definitely a job, and a hard one for developers. Too bad a good idea *and* implementation (tests needed) are put on hold because there is nobody who can help with that part.
IMHO the PEP needs to expand on user stories; even though there is a significant number of cited sources, a practical summary and an illustration of the problem by examples are missing.