[Python-ideas] non-blocking buffered I/O

Antoine Pitrou solipsis at pitrou.net
Mon Oct 29 22:25:41 CET 2012


On Mon, 29 Oct 2012 10:03:00 -0700
Guido van Rossum <guido at python.org> wrote:
> >> Then there is a
> >> BufferedReader class that implements more traditional read() and
> >> readline() coroutines (i.e., to be invoked using yield from), the
> >> latter handy for line-oriented transports.
> >
> > Well... It would be nice if BufferedReader could re-use the actual
> > io.BufferedReader and its fast readline(), read(), readinto()
> > implementations.
> 
> Agreed, I would love that too, but the problem is, *this*
> BufferedReader defines methods you have to invoke with yield from.
> Maybe we can come up with a solution for sharing code by modifying the
> _io module though; that would be great! (I've also been thinking of
> layering TextIOWrapper on top of these.)

There is a rather infamous issue about _io.BufferedReader and
non-blocking I/O at http://bugs.python.org/issue13322
It is a bit problematic because currently a non-blocking readline()
returns b'' instead of None when no data is available, meaning EOF can't
be easily distinguished from "no data yet" :(
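
To make the ambiguity concrete, here is a small probe using a non-blocking
pipe. It is only a sketch: the function name is made up for illustration, it
assumes a POSIX-like platform where os.set_blocking() works on pipe
descriptors, and what the first readline() returns depends on whether the
interpreter you run it on still has the behaviour described in the issue.

```python
import os

def probe_nonblocking_readline():
    """Return what readline() gives for 'no data yet', a full line, and EOF."""
    r, w = os.pipe()
    os.set_blocking(r, False)          # make the read end non-blocking
    reader = open(r, 'rb')             # buffered binary reader over the pipe
    try:
        empty = reader.readline()      # nothing has been written yet
        os.write(w, b'ping\n')
        line = reader.readline()       # a complete line is now available
        os.close(w)                    # closing the write end signals EOF
        w = None
        eof = reader.readline()        # real EOF
    finally:
        reader.close()
        if w is not None:
            os.close(w)
    return empty, line, eof

empty, line, eof = probe_nonblocking_readline()
print(repr(empty), repr(line), repr(eof))
```

If the first and last values both print as b'', the caller has no way to
tell "try again later" from "the stream is finished" by looking at the
return value alone.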

Once this issue is solved, you could use _io.BufferedReader, and
work around the "partial read/readline result" issue by looping until
the result is complete (hopefully in most cases there is enough data in
the buffer to return a complete read or readline, so the C optimizations
are useful). Here is how it may work:

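import _io

class BufferedReader:
    # Coroutine-style wrapper around the C implementation. `scheduler` is
    # assumed to be provided by the surrounding framework; block_r(fd)
    # suspends the caller until fd is readable.

    def __init__(self, fd):
        self.fd = fd
        self.bufio = _io.BufferedReader(...)

    def readline(self):
        chunks = []
        while True:
            line = self.bufio.readline()
            # None means "no data available yet" (once the issue is fixed)
            if line is not None:
                chunks.append(line)
                if line == b'' or line.endswith(b'\n'):
                    # EOF or EOL
                    return b''.join(chunks)
            # Partial line or nothing at all: wait for more data
            yield from scheduler.block_r(self.fd)

    def read(self, n):
        chunks = []
        bytes_read = 0
        while True:
            # Only ask for what is still missing
            data = self.bufio.read(n - bytes_read)
            if data is not None:
                chunks.append(data)
                bytes_read += len(data)
                if data == b'' or bytes_read == n:
                    # EOF or read satisfied
                    break
            yield from scheduler.block_r(self.fd)
        return b''.join(chunks)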
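
The retry loop can be exercised without a real scheduler or file
descriptor. The sketch below is purely illustrative: ScriptedSource,
CoroutineReader, block_r and run are hypothetical names, and
ScriptedSource only imitates the *fixed* behaviour assumed above (None
when nothing is available, a partial line when data runs out mid-line,
b'' only at EOF).

```python
class ScriptedSource:
    """Hypothetical stand-in for a fixed non-blocking BufferedReader."""

    def __init__(self, arrivals):
        # Each item is a bytes chunk that "arrives", or None for
        # "would block"; running out of items means EOF.
        self._arrivals = list(arrivals)
        self._buf = b''

    def readline(self):
        while True:
            idx = self._buf.find(b'\n')
            if idx >= 0:
                line, self._buf = self._buf[:idx + 1], self._buf[idx + 1:]
                return line
            if not self._arrivals:
                # EOF: hand back whatever is left (possibly b'')
                line, self._buf = self._buf, b''
                return line
            chunk = self._arrivals.pop(0)
            if chunk is None:
                # Would block: partial line if we have one, else None
                if self._buf:
                    line, self._buf = self._buf, b''
                    return line
                return None
            self._buf += chunk


def block_r(fd):
    # Hypothetical scheduler primitive: suspend until fd is readable.
    yield ('wait_read', fd)


class CoroutineReader:
    def __init__(self, fd, bufio):
        self.fd = fd
        self.bufio = bufio

    def readline(self):
        chunks = []
        while True:
            line = self.bufio.readline()
            if line is not None:
                chunks.append(line)
                if line == b'' or line.endswith(b'\n'):
                    return b''.join(chunks)
            yield from block_r(self.fd)


def run(coro):
    """Toy driver: resume at once on every wait and count the waits."""
    waits = 0
    try:
        while True:
            next(coro)
            waits += 1
    except StopIteration as stop:
        return stop.value, waits


source = ScriptedSource([None, b'hel', None, b'lo\nwor', None, b'ld'])
reader = CoroutineReader(0, source)
print(run(reader.readline()))
print(run(reader.readline()))
print(run(reader.readline()))
```

A real scheduler would of course park the coroutine until the descriptor
is readable instead of resuming it immediately; the point here is only
that the loop reassembles partial results and stops on EOL or EOF.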


As for TextIOWrapper, AFAIR it doesn't handle non-blocking I/O at all
(but my memories are vague).

By the way I don't know how this whole approach (of mocking socket-like
or file-like objects with coroutine-y read() / readline() methods)
lends itself to plugging into Windows' IOCP. You may rely on some raw
I/O object that registers a callback when a read() is requested and
then yields a Future object that gets completed by the callback.
I'm sure Richard has some ideas about that :-)
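
For what it's worth, the callback-plus-Future idea could look roughly like
the sketch below. Everything in it is an assumption made for illustration:
FakeProactor stands in for a completion port and is not a real IOCP
binding, CompletionRawReader and run_once are made-up names, and
concurrent.futures.Future is used only because it is a readily available
Future class.

```python
from concurrent.futures import Future


class FakeProactor:
    """Hypothetical stand-in for an IOCP-style completion port."""

    def __init__(self):
        self._pending = []

    def start_read(self, n, callback):
        # Register a read request; callback fires when it completes.
        self._pending.append((n, callback))

    def complete_all(self, data):
        # Simulate the OS finishing every outstanding read with `data`.
        pending, self._pending = self._pending, []
        for n, callback in pending:
            callback(data[:n])


class CompletionRawReader:
    def __init__(self, proactor):
        self.proactor = proactor

    def read(self, n):
        future = Future()
        # The completion callback simply resolves the Future.
        self.proactor.start_read(n, future.set_result)
        yield future                 # hand the Future to the scheduler
        return future.result()       # resumed once the read has completed


def run_once(coro, proactor, incoming):
    """Toy driver: take the Future, complete the read, resume the coroutine."""
    future = next(coro)
    assert not future.done()
    proactor.complete_all(incoming)
    try:
        next(coro)
    except StopIteration as stop:
        return stop.value
    raise RuntimeError("coroutine yielded more than once")


proactor = FakeProactor()
reader = CompletionRawReader(proactor)
print(run_once(reader.read(4), proactor, b'abcdefgh'))
```

The difference from the readiness-based loop is that the read is started
first and the coroutine waits for its completion, rather than waiting for
readability and then retrying the read.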

Regards

Antoine.




