[Python-3000] Draft PEP for New IO system

Tue Feb 27 19:51:47 CET 2007

On 2/27/07, Jim Jewett <jimjjewett at gmail.com> wrote:
> On 2/27/07, Guido van Rossum <guido at python.org> wrote:
> > On 2/27/07, Paul Moore <p.f.moore at gmail.com> wrote:
>
> > It *may* even be useful if many of these would support non-blocking
> > I/O; we're currently considering adding a standard API for returning
> > "EWOULDBLOCK" errors (e.g. return None from read() and write()) --
> > though we won't be providing an API to turn that on (since it depends
> > on the underlying implementation, e.g. sockets vs. files).
>
> I thought the point of the IO subsystem was to abstract away those differences.

We will abstract away the differences of how you *use* a stream that's
in non-blocking (or timeout) mode, but we can't abstract away the APIs
used to *request* those modes since the API depends on the abilities
of the system object -- sockets, pipes and disk files all have
different semantics here.

> Trying to set (non-)blocking may raise an exception on some streams,
> but that still seems better than having to know the internal details
> before you can even ask.

I doubt it -- non-blocking mode is pretty specialized. I want it to be
*possible* to use the new I/O library with file descriptors that can
return EWOULDBLOCK; I don't necessarily want to make it *easy*.
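
To make the split concrete, here is a minimal sketch of "request the mode
through the OS, use the stream through one API". It assumes the io module
as it later shipped in Python 3, where a raw read() on a non-blocking
descriptor returns None when no data is available, and it uses
os.set_blocking() on a pipe, which is OS-specific. The name
read_nonblocking_demo is made up for illustration and is not part of the
draft PEP.

```python
import io
import os


def read_nonblocking_demo():
    """Return (result_when_empty, result_when_data_is_ready) for a pipe."""
    r, w = os.pipe()
    # Requesting non-blocking mode is done on the descriptor itself;
    # the I/O library offers no portable switch for it.
    os.set_blocking(r, False)
    reader = io.FileIO(r, "r")
    writer = io.FileIO(w, "w")
    try:
        empty = reader.read(16)   # nothing written yet: would block
        writer.write(b"hello")
        data = reader.read(16)    # data is now available
    finally:
        reader.close()
        writer.close()
    return empty, data


empty, data = read_nonblocking_demo()
```

Once the mode is set, the caller only has to distinguish None (no data
yet) from bytes; how the mode was requested does not leak into the read
loop.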

> > > > The BufferedReader implementation is for sequential-access read-only
> > > > objects.  It does not provide a .flush() method, since there is no
> > > > sensible circumstance where the user would want to discard the read
> > > > buffer.
>
> > > ... typeahead problems.
>
> > ... outside the scope of the PEP; you can do this by
> > (somehow) enabling non-blocking mode and then reading until you get
> > None.
>
> That does sound like a use case, and flush() is the obvious method.

No it isn't. Calling flush() for writing has no semantics at the
highest-level abstraction: you can insert flush() calls whenever you
want or omit them and the data will still be written; the only time
you care is when the abstraction is broken and you lose a buffer due
to a segfault etc. The semantics of this use case are very different;
perhaps we can add a reset() or discard() method which throws away the
buffer contents but that's as far as I want to go. The passwd-reading
example ought to be hidden in the getpass module.
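
For illustration, the "read until you get None" approach to typeahead
might look like the sketch below. discard_pending is a hypothetical helper,
not the reset() or discard() method floated above, and it assumes a raw
stream already in non-blocking mode whose read() returns None when nothing
is pending (the behaviour of io.FileIO in later Python 3 releases).

```python
import io
import os


def discard_pending(raw, chunk_size=4096):
    """Throw away whatever is currently readable; return the byte count."""
    discarded = 0
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:  # None (would block) or b"" (end of file)
            return discarded
        discarded += len(chunk)


def demo():
    r, w = os.pipe()
    os.set_blocking(r, False)  # OS-specific way to request the mode
    with io.FileIO(r, "r") as reader, io.FileIO(w, "w") as writer:
        writer.write(b"typeahead")
        dropped = discard_pending(reader)
        leftover = reader.read(1)  # nothing pending after the drain
    return dropped, leftover


dropped, leftover = demo()
```

This only drains what the OS has already delivered; it says nothing about
data buffered in a higher layer, which is where a dedicated discard
operation would come in.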

> Are you concerned that having the (rarely needed) method available may
> be an attractive nuisance or source of confusion?

Perhaps; people will latch on to a name and call it; or they will
mindlessly copy code that happens to contain it and a new voodoo
religion or superstition is easily born. Also whether this makes sense
or not depends a lot on what kind of device you are reading; I can't
imagine a socket use case for example.

> > I think for input we should always accept all three line endings so
> > you never need to specify anything; for output, we should pick ...
>
> So saving a text file can cause (whitespace) changes all over?

It would only normalize line endings, but yeah.

> That might be OK, but it should at least be called out, so that
> editors wanting minimal change will know that they have to implement
> their own Text layer.

I expect them to do that anyway. But I would not be against being able
to specify newline="\n" on input and have it mean that \r\n line
endings remain in the data where present. I'm not sure that I would
like newline="\r\n" to mean that a lone \n should not be considered a
line ending, even if some stupid Windows apps behave that way.
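
A small sketch of the difference between the two input behaviours
discussed here, using the newline argument of io.TextIOWrapper as it
exists in later Python 3 releases. Whether the draft ends up with exactly
this spelling is an assumption; read_lines and DATA are illustrative names.

```python
import io

DATA = b"one\r\ntwo\nthree\n"


def read_lines(newline):
    """Decode DATA as text with the given newline setting."""
    text = io.TextIOWrapper(io.BytesIO(DATA), encoding="ascii",
                            newline=newline)
    return text.readlines()


# Default: every recognised line ending is translated to "\n".
translated = read_lines(None)

# newline="\n": only "\n" ends a line, so "\r\n" stays in the data.
preserved = read_lines("\n")
```

With these inputs, translated holds three lines that all end in "\n",
while preserved keeps the "\r" in the first line, which is what an editor
wanting minimal change would need.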

A compromise would be to support what "U" mode currently does -- it
makes the line endings actually encountered available as an attribute
on the file.
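
As a sketch of that compromise: in later Python 3 releases the text layer
exposes a newlines attribute that is None, a single string, or a tuple of
the endings seen so far. The helper seen_newlines below is hypothetical and
just normalises those three shapes into a set.

```python
import io


def seen_newlines(data):
    """Return the set of line endings encountered while reading data."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding="ascii")
    text.read()
    seen = text.newlines
    if seen is None:           # no line ending encountered
        return set()
    if isinstance(seen, str):  # exactly one kind encountered
        return {seen}
    return set(seen)           # several kinds encountered


mixed = seen_newlines(b"a\r\nb\nc\n")
unix_only = seen_newlines(b"a\nb\n")
```

An application could consult this after reading and choose the matching
ending when it writes the file back out.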

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)