iterating over the lines of a file - difference between Python 2.7 and 3?
tjreedy at udel.edu
Thu Jan 17 14:39:52 CET 2013
On 1/17/2013 7:04 AM, Peter Otten wrote:
> Wolfgang Maier wrote:
>> I just came across an unexpected behavior in Python 3.3, which has to do
>> with file iterators and their interplay with other methods of file/IO
>> class methods, like readline() and tell(): Basically, I got used to the
>> fact that it is a bad idea to mix them because the iterator would use that
>> hidden read-ahead buffer, so what you got with subsequent calls to
>> readline() or tell() was what was beyond that buffer, but not the next
>> thing after what the iterator just returned.
>> for line in in_file_object:
>>     print(line)
>>     if in_file_object.tell() > 300:
>>         # assuming that individual lines are shorter
>>         break
>> This wouldn't print anything in Python 2.7 since next(in_file_object)
>> would read ahead beyond the 300 position immediately, as evidenced by a
>> subsequent call to in_file_object.tell() (returning 8192 on my system).
>> However, I find that under Python 3.3 this same code works: it prints some
>> lines from my file and after completing in_file_object.tell() returns a
>> quite reasonable 314 as the current position in the file.
>> I couldn't find this difference anywhere in the documentation. Is the 3.3
>> behavior official, and if so, when was it introduced and how is it
>> implemented? I assume the read-ahead buffer still exists?
> You can get the Python 3 behaviour with io.open() in Python 2.7. There is an
> implementation in Python in _pyio.py:
>     def tell(self):
>         return _BufferedIOMixin.tell(self) - len(self._read_buf) +
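To see the effect of that adjustment, here is a minimal sketch, assuming Python 3 (or io.open() in 2.7) and a file opened in binary mode. The function names and the sample data are hypothetical, made up for illustration. It compares the position reported by the buffered reader with the position of the raw file underneath it, which has already been read ahead.

```python
import io
import os
import tempfile


def positions_after_partial_iteration(path, limit=300):
    """Iterate a binary file by line and stop once tell() exceeds limit.

    Returns (logical_pos, raw_pos): the position reported by the buffered
    reader and the position of the unbuffered raw file underneath it.
    """
    with io.open(path, "rb") as f:  # io.open is the builtin open() in Python 3
        for line in f:
            if f.tell() > limit:
                break
        return f.tell(), f.raw.tell()


def demo():
    # Hypothetical sample data: 2000 lines of exactly 10 bytes each.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"123456789\n" * 2000)
        return positions_after_partial_iteration(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    logical, raw = demo()
    print("buffered tell():", logical)
    print("raw tell():     ", raw)
```

With 10-byte lines the buffered position should come out as 310, the end of the 31st line. The raw position depends on the buffer size in use, but it is at least as large because the read-ahead has already happened.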
In 2.7, open() returns a file object, which is a thin wrapper around
the stdio library of the particular (proprietary) C compiler. These
stdio libraries vary because the C standard leaves some things
implementation-defined, because people interpret the standard
differently (there was no official test suite, at least not
originally), and because people make mistakes. The io module is
intended to bring more uniformity, and it has a test suite against
which other implementations can match their actual behavior.
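As a rough illustration, assuming Python 3 (or io.open() in 2.7), the object returned for a text file is a stack of io classes rather than a stdio wrapper. The helper name io_layers below is hypothetical; the attributes .buffer and .raw are the documented way to reach the lower layers.

```python
import io
import os
import tempfile


def io_layers(path):
    """Return the class names of the io objects stacked behind open()."""
    with io.open(path, "r", encoding="utf-8") as text:
        buffered = text.buffer  # buffering layer under the text wrapper
        raw = buffered.raw      # unbuffered OS-level file object
        return (type(text).__name__,
                type(buffered).__name__,
                type(raw).__name__)


def demo_layers():
    # Hypothetical one-line sample file, removed again afterwards.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(b"hello\n")
        return io_layers(path)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo_layers())
```

Under CPython 3 this should report TextIOWrapper, BufferedReader and FileIO, in that order.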
Terry Jan Reedy