[Python-ideas] Hooks into the IO system to intercept raw file reads/writes

Guido van Rossum guido at python.org
Mon Feb 2 18:47:58 CET 2015


Perhaps you would be better off using the subprocess machinery in asyncio?
It uses the subprocess module to manage the subprocess itself (both for
Windows and for Unix-ish systems) but uses the async I/O machinery from the
asyncio module (which is itself based on select or its better brethren).
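
As a rough illustration of that suggestion, here is a minimal sketch of a
"tee" built on the asyncio subprocess support. It uses the async/await
syntax of later Python versions, and the names tee_stream and run_and_tee
are made up for this example; they are not part of asyncio or subprocess.

```python
import asyncio
import sys


async def tee_stream(reader, sink, chunks):
    """Copy bytes from an asyncio StreamReader to sink, keeping a copy."""
    while True:
        data = await reader.read(4096)
        if not data:  # b"" means the child closed its end of the pipe
            break
        chunks.append(data)
        sink.write(data)
        sink.flush()


async def run_and_tee(argv, out_sink=None, err_sink=None):
    """Run argv, echo its output as it arrives, and return what was captured.

    out_sink and err_sink are hypothetical parameters for this sketch; they
    default to the interpreter's own binary stdout and stderr.
    """
    out_sink = sys.stdout.buffer if out_sink is None else out_sink
    err_sink = sys.stderr.buffer if err_sink is None else err_sink
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    out, err = [], []
    # Drain both pipes concurrently so neither one can fill up and block.
    await asyncio.gather(
        tee_stream(proc.stdout, out_sink, out),
        tee_stream(proc.stderr, err_sink, err),
    )
    returncode = await proc.wait()
    return returncode, b"".join(out), b"".join(err)
```

It could be driven with something like
asyncio.run(run_and_tee([sys.executable, "-c", "print('hello')"])), which
returns the exit status together with the captured stdout and stderr bytes.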

On Mon, Feb 2, 2015 at 9:10 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> On 2 February 2015 at 16:31, Guido van Rossum <guido at python.org> wrote:
> > I'm all for flexible I/O processing, but I worry that the idea brought up
> > here feels a little half-baked. First of all, it seems to mention two
> > separate cases of subclassing (both io.RawIOBase and subprocess.Popen).
> > These days, subclassing(*) is often an anti-pattern: unless done with
> > considerable foresight, every detail of the base class implementation
> > essentially becomes part of the interface that the subclass relies upon,
> > and now the base class becomes too constrained in its evolution. In my
> > experience, a well-done API is usually much easier to evolve than even a
> > very-well-done base class.
> >
> > The other thing is that I can't actually imagine the details of your
> > proposal. Is the idea that you subclass RawIOBase to implement "tee"
> > behavior? Why can't you do that at the receiving end? Is perhaps the
> > proposal to assign the base object a work-around for an interface design
> > in the Popen class? (I'm sure that class is far from perfect -- but it's
> > also super constrained by the need to support Windows process creation.)
>
> The idea is certainly a little half-baked :-( And you're absolutely
> right that it's strongly linked to a fight to work around limitations
> of subprocess.Popen. The suggestion originally came out of a couple of
> things I've been working on, one of which was trying to make a Popen
> call that captured the stdout/stderr streams while still displaying
> them (as you say, a "tee" type of mechanism).
>
> It's certainly possible to do the "tee" at the receiving end, but
> (because of the aforementioned Popen limitations) doing so requires
> ignoring the convenience of communicate() and writing your own capture
> code. That's not *too* hard using threads, but Popen avoids threads on
> Unix, using a select loop instead, and I'm not clear why, and whether
> my solution will break in the situations the Popen code is covering
> via the select loop. Also, getting corner cases in the capture code
> right (around encodings in particular) is something I'd prefer to
> leave to subprocess :-) The original issue was for a PR for a project
> that works on a lot of platforms I don't have access to, so I may well
> have been worrying too much about "not breaking stuff" :-)
>
> This proposal basically came from a feeling that if only I could "see"
> the data as it flows through the buffers of an existing io stream, I
> wouldn't have all these problems. Originally I was going to suggest a
> "buffer filled" type of callback. With such a hook, though, I was
> thinking I could do
>
>     p = Popen(..., stdout=PIPE, stderr=PIPE)
>     # Not sure if these need to be at the Raw IO level or the buffered IO
>     # level. Should be called every time an OS read happens.
>     p.stdout.buffer.add_buffer_watcher(
>         lambda buf: os.write(sys.stdout.fileno(), buf))
>     p.stderr.buffer.add_buffer_watcher(
>         lambda buf: os.write(sys.stderr.fileno(), buf))
>
> I guess that's a cleaner proposal, although I pretty much assumed that
> the overhead of such a hook being checked for on every buffer read
> would be unacceptable. So I came up with a clumsier approach based on
> trying to make it so you only paid the cost if you used the feature.
> Overall, that was probably a mistake :-(
>
> I hope it's clearer now.
>
> Paul
>
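
For comparison, the thread-based "tee at the receiving end" described in the
quoted message might look roughly like the sketch below. It works on raw
bytes and deliberately sidesteps the encoding corner cases mentioned above.
The name tee_with_threads and its sink parameters are invented for this
example and are not part of subprocess.

```python
import subprocess
import sys
import threading


def _pump(pipe, sink, chunks):
    """Read from pipe until EOF, echoing to sink and keeping a copy."""
    with pipe:
        # read1() returns as soon as some data is available; b"" means EOF.
        for data in iter(lambda: pipe.read1(4096), b""):
            chunks.append(data)
            sink.write(data)
            sink.flush()


def tee_with_threads(argv, out_sink=None, err_sink=None):
    """Run argv, echo stdout/stderr live, and return the captured bytes."""
    out_sink = sys.stdout.buffer if out_sink is None else out_sink
    err_sink = sys.stderr.buffer if err_sink is None else err_sink
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    out, err = [], []
    # One thread per pipe, so a full stderr pipe cannot stall stdout
    # (or the other way around).
    threads = [
        threading.Thread(target=_pump, args=(proc.stdout, out_sink, out)),
        threading.Thread(target=_pump, args=(proc.stderr, err_sink, err)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    returncode = proc.wait()
    return returncode, b"".join(out), b"".join(err)
```

This gives up communicate(), which is exactly the trade-off described above:
the capture loop becomes the caller's responsibility.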



-- 
--Guido van Rossum (python.org/~guido)