[Python-ideas] Hooks into the IO system to intercept raw file reads/writes

Guido van Rossum guido at python.org
Mon Feb 2 17:31:53 CET 2015


I'm all for flexible I/O processing, but I worry that the idea brought up
here feels a little half-baked. First of all, it seems to mention two
separate cases of subclassing (both io.RawIOBase and subprocess.Popen).
These days, subclassing(*) is often an anti-pattern: unless done with
considerable foresight, every detail of the base class implementation
essentially becomes part of the interface that the subclass relies upon,
and now the base class becomes too constrained in its evolution. In my
experience, a well-done API is usually much easier to evolve than even a
very-well-done base class.

The other thing is that I can't actually imagine the details of your
proposal. Is the idea that you subclass RawIOBase to implement "tee"
behavior? Why can't you do that at the receiving end? Is perhaps the
proposal to assign the base object a work-around for an interface design in
the Popen class? (I'm sure that class is far from perfect -- but it's also
super constrained by the need to support Windows process creation.)
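
To make "at the receiving end" concrete, here is a minimal sketch, not a
definitive implementation. The names run_and_tee and echo are hypothetical
and chosen only for illustration; the sketch assumes the child's stdout is
a pipe that the parent reads line by line, echoing each chunk as it arrives
while also keeping a copy.

```python
import subprocess
import sys


def _echo_to_console(chunk):
    # Default echo target (an assumption): the parent's own stdout, as bytes.
    sys.stdout.buffer.write(chunk)
    sys.stdout.buffer.flush()


def run_and_tee(cmd, echo=_echo_to_console):
    """Run cmd, pass each line of its stdout to echo, and also capture it.

    Returns (returncode, captured_bytes). No subclassing of Popen or of any
    io class is involved; the "tee" happens in the code that reads the pipe.
    """
    captured = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
        # Iterating over the pipe yields bytes lines until EOF.
        for line in proc.stdout:
            echo(line)
            captured.append(line)
    # Leaving the with-block closes the pipe and waits for the child.
    return proc.returncode, b"".join(captured)
```

This only covers stdout. Teeing stderr at the same time would need threads
or a selector to avoid blocking on one pipe while the other fills up.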

_____
(*) I'm talking about subclassing as an API mechanism. A set of
interrelated classes can work well if they are all part of the same
package, so their implementations can evolve together as needed. But when
proposing APIs which serve as important abstractions, it's much better if
new abstractions are built by combining and wrapping objects rather than by
subclassing.
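
As a rough illustration of "combining and wrapping" for the tee case, the
sketch below wraps any object that has a read() method instead of
subclassing RawIOBase. TeeReader is a hypothetical name, not an existing
class in the io module, and the sketch assumes a binary stream.

```python
import io


class TeeReader:
    """Wrap a readable binary stream and copy everything read to a sink."""

    def __init__(self, source, sink):
        self._source = source
        self._sink = sink

    def read(self, size=-1):
        data = self._source.read(size)
        if data:
            # Copy whatever was actually read before handing it on.
            self._sink.write(data)
        return data

    def __getattr__(self, name):
        # Anything not defined here (close, fileno, ...) goes to the source.
        return getattr(self._source, name)


# Small demonstration with in-memory streams.
source = io.BytesIO(b"hello world")
copy = io.BytesIO()
tee = TeeReader(source, copy)
first = tee.read(5)
rest = tee.read()
```

Only read() is intercepted in this sketch. Calls such as readline() are
delegated straight to the wrapped object and bypass the copy, so a fuller
wrapper would have to cover each method it cares about.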

On Mon, Feb 2, 2015 at 6:53 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> There's a lot of flexibility in the new layered IO system, but one
> thing it doesn't allow is any means of adding "hooks" to the data
> flow, or manipulation of an already-created io object.
>
> For example, when a subprocess.Popen object uses a pipe for the
> child's stdout, the data is captured instead of writing it to the
> console. Sometimes it would be nice to capture it, but still write to
> the console. That would be easy to do if we could wrap the underlying
> RawIOBase object and intercept read() calls[1]. A subclass of
> RawIOBase can do this trivially, but there's no way of replacing the
> class on an existing stream.
>
> The obvious approach would be to reassign the "raw" attribute of the
> BufferedIOBase object, but that's readonly. Would it be possible to
> make it read/write? Or provide another way of replacing the raw IO
> object underlying an io object?
>
> I'm sure there are buffer integrity issues to work out, but are there
> any more fundamental problems with this approach?
>
> Paul
>
> [1] Actually, it's *not* that easy, because subprocess.Popen objects
> are insanely hard to subclass - there are no hooks into the pipe
> creation process, and no way to intercept the object before the
> subprocess gets run (that happens in the __init__ method). But that's
> a separate issue, and also the subject of a different thread here.



-- 
--Guido van Rossum (python.org/~guido)