Oren Tirosh wrote:
Xreadlines is buffered and therefore leaves the file position in an unexpected state. If you use xreadlines explicitly you should expect that. The fact that file.__iter__ implicitly returns an xreadlines object is therefore a bit surprising.
What's the reason for using xreadlines as a file iterator? Was it performance or was it just the easiest way to implement it using an existing object?
The rationale was something like "the simplest way to iterate over the lines in a file should be the fastest". I'd agree with that, but not at the expense of the surprises mentioned in the bug. It would perhaps help if the file object cached the xreadlines iterator; that would limit the scope of the problem to the case where iteration and explicit .read() calls are mixed.
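The surprise discussed above can be reproduced with a small emulation. This is only a sketch in present-day Python, not the xreadlines module itself: buffered_lines is a hypothetical stand-in that reads ahead with readlines(sizehint), and io.StringIO stands in for a real file.

```python
import io

def buffered_lines(f, sizehint=8192):
    """Hypothetical stand-in for a buffered line iterator: reads ahead in chunks."""
    while True:
        chunk = f.readlines(sizehint)  # pulls in many lines at once
        if not chunk:
            return
        for line in chunk:
            yield line

data = "one\ntwo\nthree\n"
f = io.StringIO(data)
it = buffered_lines(f)

first = next(it)         # only one line handed out so far
position = f.tell()      # but the file position is already past all buffered lines
leftover = f.readline()  # the remaining lines sit in the iterator's buffer, not the file
```

After a single step of iteration the file position is at the end of the data, so a direct readline() on the file finds nothing, while the iterator still holds the remaining lines.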
"Files support the iterator protocol. Each iteration returns the same result as file.readline()"
This is not correct. Files support what I call the iterable protocol. Objects supporting the iterator protocol have a .next() method; files don't. While it's true that each iteration has the same result as readline, it doesn't have the same side effects.
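The iterable/iterator distinction drawn above can be checked directly. A minimal sketch, assuming present-day Python, where the .next() method is spelled __next__; a plain list is used here as a stand-in for an object that is iterable without being an iterator.

```python
lines = ["a\n", "b\n"]  # stand-in for an iterable that is not an iterator

is_iterable = hasattr(lines, "__iter__")  # iter(lines) works
is_iterator = hasattr(lines, "__next__")  # a list has no next method of its own

line_iter = iter(lines)                   # the object that actually does the stepping
iter_has_next = hasattr(line_iter, "__next__")
iter_returns_self = iter(line_iter) is line_iter
```

An iterator returns itself from __iter__ and carries the iteration state; an iterable merely hands out such an object, which is where hidden state like a read-ahead buffer can live.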
Proposal: make files really support the iterator protocol. __iter__ would return self, and next() would call readline and raise StopIteration if it returns ''. If anyone wants the xreadlines performance improvement, it should be explicit.
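The proposal could look roughly like the following. This is a sketch under assumptions, not the actual file implementation: LineIterator is a hypothetical wrapper class, it is written for present-day Python (so next() is spelled __next__), and io.StringIO stands in for a real file.

```python
import io

class LineIterator:
    """Hypothetical wrapper: each step is exactly one readline() call, no read-ahead."""

    def __init__(self, f):
        self.f = f

    def __iter__(self):
        return self  # the object is its own iterator

    def __next__(self):  # spelled next() in the proposal
        line = self.f.readline()
        if line == "":
            raise StopIteration
        return line

src = io.StringIO("one\ntwo\nthree\n")
lit = LineIterator(src)

first_line = next(lit)
pos_after_first = src.tell()    # just past the first line, nothing buffered
second_line = src.readline()    # mixing in a direct readline() stays consistent
rest = list(lit)                # iteration resumes where readline() left off
```

Because no lines are held outside the file object, iteration and explicit reads can be mixed freely, at the cost of the read-ahead speedup.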
+1 (But, since the bug is closed as "won't fix" I doubt this has a big chance of happening.) Just