[Python-ideas] non-blocking buffered I/O
Guido van Rossum
guido at python.org
Mon Oct 29 23:08:54 CET 2012
On Mon, Oct 29, 2012 at 2:25 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Mon, 29 Oct 2012 10:03:00 -0700
> Guido van Rossum <guido at python.org> wrote:
>> >> Then there is a
>> >> BufferedReader class that implements more traditional read() and
>> >> readline() coroutines (i.e., to be invoked using yield from), the
>> >> latter handy for line-oriented transports.
>> >
>> > Well... It would be nice if BufferedReader could re-use the actual
>> > io.BufferedReader and its fast readline(), read(), readinto()
>> > implementations.
>>
>> Agreed, I would love that too, but the problem is, *this*
>> BufferedReader defines methods you have to invoke with yield from.
>> Maybe we can come up with a solution for sharing code by modifying the
>> _io module though; that would be great! (I've also been thinking of
>> layering TextIOWrapper on top of these.)
>
> There is a rather infamous issue about _io.BufferedReader and
> non-blocking I/O at http://bugs.python.org/issue13322
> It is a bit problematic because currently non-blocking readline()
> returns '' instead of None when no data is available, meaning EOF can't
> be easily detected :(
Eeew!
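
For comparison, the raw (unbuffered) layer does keep the two cases
apart: a non-blocking raw read() returns None when no data is available
and b'' only at EOF. A minimal sketch, assuming a POSIX pipe and
os.set_blocking() (Python 3.5+); the helper name probe_raw_pipe is made
up for illustration:

```python
import os


def probe_raw_pipe():
    """Read from a non-blocking pipe in three states: empty, filled, closed."""
    r, w = os.pipe()
    os.set_blocking(r, False)          # put the read end in non-blocking mode
    raw = open(r, "rb", buffering=0)   # buffering=0 gives a raw io.FileIO

    no_data = raw.read(16)             # nothing written yet
    os.write(w, b"hi\n")
    some_data = raw.read(16)           # data is available now
    os.close(w)                        # closing the writer signals EOF
    at_eof = raw.read(16)
    raw.close()
    return no_data, some_data, at_eof
```

The first value is None, the second is the bytes written, and the third
is b'', so a caller at this layer can tell "try again later" from EOF.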
> Once this issue is solved, you could use _io.BufferedReader, and
> work around the "partial read/readline result" issue by iterating
> (hopefully in most cases there is enough data in the buffer to
> return a complete read or readline, so the C optimizations are useful).
Yes, that's what I'm hoping for.
> Here is how it may work:
>
> def __init__(self, fd):
>     self.fd = fd
>     self.bufio = _io.BufferedReader(...)
>
> def readline(self):
>     chunks = []
>     while True:
>         line = self.bufio.readline()
>         if line is not None:
>             chunks.append(line)
>             if line == b'' or line.endswith(b'\n'):
>                 # EOF or EOL
>                 return b''.join(chunks)
>         yield from scheduler.block_r(self.fd)
>
> def read(self, n):
>     chunks = []
>     bytes_read = 0
>     while True:
>         data = self.bufio.read(n - bytes_read)
>         if data is not None:
>             chunks.append(data)
>             bytes_read += len(data)
>             if data == b'' or bytes_read == n:
>                 # EOF or read satisfied
>                 break
>         yield from scheduler.block_r(self.fd)
>     return b''.join(chunks)
Hm... I wonder if it would make more sense if these standard APIs were
to return specific exceptions, like the ssl module does in
non-blocking mode? Look here (I updated since posting last night):
http://code.google.com/p/tulip/source/browse/sockets.py#142
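
To illustrate the exception-based style (the ssl module raises
SSLWantReadError / SSLWantWriteError in non-blocking mode), here is a
rough sketch of how a read coroutine might look if the underlying
object raised instead of returning None. This is not the tulip code;
WantRead, NonBlockingReader and read_exactly are hypothetical names, and
a POSIX pipe or socket fd in non-blocking mode is assumed:

```python
import os


class WantRead(Exception):
    """Hypothetical: raised when a non-blocking read has no data yet."""


class NonBlockingReader:
    """Hypothetical wrapper that raises WantRead instead of returning None."""

    def __init__(self, fd):
        self.fd = fd

    def read(self, n):
        try:
            return os.read(self.fd, n)
        except BlockingIOError:
            # Translate EAGAIN into an explicit "come back later" signal.
            raise WantRead from None


def read_exactly(reader, n):
    """Generator coroutine: yields the fd whenever it must wait for data."""
    chunks = []
    remaining = n
    while remaining:
        try:
            data = reader.read(remaining)
        except WantRead:
            yield reader.fd      # ask the scheduler to wait for readability
            continue
        if not data:
            break                # b'' now unambiguously means EOF
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)
```

With this shape, an empty result can only mean EOF, and "no data yet" is
never confused with it; a scheduler would resume the generator once the
yielded fd becomes readable.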
> As for TextIOWrapper, AFAIR it doesn't handle non-blocking I/O at all
> (but my memories are vague).
Same suggestion... (I only found out about ssl's approach to async I/O
a few days ago. It felt brilliant and right to me. But maybe I'm
missing something?)
> By the way I don't know how this whole approach (of mocking socket-like
> or file-like objects with coroutine-y read() / readline() methods)
> lends itself to plugging into Windows' IOCP.
Me neither. I hope Steve Dower can tell us.
> You may rely on some raw
> I/O object that registers a callback when a read() is requested and
> then yields a Future object that gets completed by the callback.
> I'm sure Richard has some ideas about that :-)
Which Richard?
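
A toy sketch of the callback-plus-Future shape described in the quoted
paragraph, using concurrent.futures.Future purely as a convenient
container. FakeOverlappedReader and its complete() method are invented
stand-ins for a completion-based raw I/O object; nothing here talks to
real IOCP:

```python
from concurrent.futures import Future


class FakeOverlappedReader:
    """Hypothetical raw I/O object: read() returns a Future immediately."""

    def __init__(self):
        self._pending = []

    def read(self, n):
        fut = Future()
        self._pending.append((n, fut))   # remember the outstanding request
        return fut

    def complete(self, data):
        """Simulate the completion callback firing for the oldest request."""
        n, fut = self._pending.pop(0)
        fut.set_result(data[:n])


def read_some(raw, n):
    """Generator coroutine: yields the Future, resumes with its result."""
    data = yield raw.read(n)
    return data
```

A scheduler would send the Future's result back into the generator once
the completion arrives, so the coroutine never needs to know whether
readiness or completion notification is used underneath.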
--
--Guido van Rossum (python.org/~guido)