<div dir="ltr">On Thu, Oct 16, 2014 at 4:34 AM, Antoine Pitrou <span dir="ltr"><<a href="mailto:solipsis@pitrou.net" target="_blank">solipsis@pitrou.net</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span class="">On Thu, 16 Oct 2014 03:54:32 +0300<br>

Paul Sokolovsky <<a href="mailto:pmiscml@gmail.com">pmiscml@gmail.com</a>> wrote:<br>

> Hello,<br>

><br>

> io.RawIOBase.read() is well specified for behavior in case it<br>

> immediately gets a would-block condition: "If the object is in<br>

> non-blocking mode and no bytes are available, None is returned."<br>

> (<a href="https://docs.python.org/3/library/io.html#io.RawIOBase.read" target="_blank">https://docs.python.org/3/library/io.html#io.RawIOBase.read</a>).<br>

><br>

> However, nothing is said about such a condition for io.IOBase.readline(),<br>

> which is a mixin method in a base class, the default implementation of which<br>

> thus would use io.RawIOBase.read(). Looking at 3.4.0 source, iobase.c:<br>

> iobase_readline() has:<br>

><br>

>         b = _PyObject_CallMethodId(self, &PyId_read, "n", nreadahead);<br>

> [...]<br>

>         if (!PyBytes_Check(b)) {<br>

>             PyErr_Format(PyExc_IOError,<br>

>                          "read() should have returned a bytes object, "<br>

>                          "not '%.200s'", Py_TYPE(b)->tp_name);<br>

><br>

> I.e. it's not even ready to receive legitimate return value of None<br>

> from read(). I didn't try to write a testcase though, so may be missing<br>

> something.<br>

><br>

> So, how should readline() behave in this case, and can that be<br>

> specified in the Library Reference?<br>

<br>

</span>Well, the problem is that it's not obvious how to implement such methods<br>

in a non-blocking context.<br>

<br>

Let's say some data is received but there isn't a complete line.<br>

Should readline() return just that data (an incomplete line)? That<br>

breaks the API's contract. Should readline() buffer the incomplete line<br>

and keep it for the next readline() call? But then the internal buffer<br>

becomes unbounded: perhaps there is no new line in the next 4GB of<br>

incoming data...<br>

<br>

And besides, raw I/O objects *shouldn't* have an internal buffer. That's<br>

the role of the buffered I/O layer.<br clear="all"></blockquote><div> <br></div><div>Well, occasionally this occurs, and I think it's reasonable for readline() to deal with it.<br><br></div><div>The argument about a 4 GB buffer is irrelevant -- this can happen with a blocking underlying stream too.<br><br></div><div>I think that at the point where the readline() function says to itself "I need more data" it should ask the underlying stream for data. If that returns an empty string, meaning EOF, readline() is satisfied and returns whatever it has buffered (even if it's empty). If that returns some bytes containing a newline, readline() is satisfied, returns the data up to that point, and buffers the rest (if any). If the underlying stream returns None, I think it makes sense for readline() to return None too -- attempting to read more will just turn into a busy-wait loop, and that's the opposite of what should happen.<br><br></div><div>You may argue that the caller of readline() doesn't expect this. Sure. But in the end, if the stream is unbuffered and the caller isn't prepared for that, the caller will always get in trouble. Maybe it'll treat the None as EOF. 
That's fine -- it would be the same if it was calling read() on the underlying stream and it got None (the EOF signalling is the same in both cases).<br><br>At least, by being prepared for the None from the underlying read() in the readline() code, someone who knows what they are doing can use readline() on a non-blocking stream -- when they receive None they will have to ask their selector (or whatever they use) to wait for the underlying FD and then they can try again.<br><br></div><div>(Alternatively, we could raise BlockingIOError, which is what the OS-level read() raises if there's no data immediately available on a non-blocking FD; but it seems that streams have already gotten a convention of returning None instead, so I think that should be propagated up the stack.)<br><br></div><div>Oh, BTW, I tested this a little bit. Currently readline() returns an empty string (or empty bytes, depending on which level you use) when the stream is nonblocking. I think returning None makes much more sense.<br></div></div><br>-- <br>--Guido van Rossum (<a href="http://python.org/~guido">python.org/~guido</a>)

</div></div>
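<div>A minimal sketch of the readline() behaviour proposed in the reply above, in pure Python rather than in iobase.c. The names LineReader, ScriptedRaw and chunk_size are hypothetical and exist only for illustration; they are not part of the io module. One point the message does not settle is what happens to a partial line when read() returns None; this sketch assumes it stays buffered for the next call.</div>

```python
class LineReader:
    """Hypothetical wrapper illustrating the proposed readline() rules."""

    def __init__(self, raw, chunk_size=4096):
        # raw is any object whose read(n) returns bytes, b"" at EOF,
        # or None when a non-blocking read would block.
        self._raw = raw
        self._chunk_size = chunk_size
        self._pending = b""  # data received but not yet returned

    def readline(self):
        while True:
            # A complete line may already be buffered from an earlier read.
            newline_at = self._pending.find(b"\n")
            if newline_at != -1:
                line = self._pending[:newline_at + 1]
                self._pending = self._pending[newline_at + 1:]
                return line

            chunk = self._raw.read(self._chunk_size)
            if chunk is None:
                # Would block: propagate None instead of busy-waiting.
                # Assumption: the partial line is kept for the next call.
                return None
            if chunk == b"":
                # EOF: return whatever is buffered, even if it is empty.
                line, self._pending = self._pending, b""
                return line
            self._pending += chunk


class ScriptedRaw:
    """Test double: replays fixed read() results, then reports EOF.

    It ignores the requested size, which is enough for a demonstration.
    """

    def __init__(self, results):
        self._results = list(results)

    def read(self, n=-1):
        if self._results:
            return self._results.pop(0)
        return b""
```

<div>With a raw stream that yields b"hel", then None, then b"lo\nwor", then b"ld", successive readline() calls would return None, b"hello\n", b"world" and finally b"". A caller that gets None is expected to wait on its selector for the underlying FD and then call readline() again.</div>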