mixing for x in file: and file.readline

Russell E. Owen owen at astro.washington.edu
Fri Sep 12 13:57:47 EDT 2003


In article <mailman.1063364898.12243.python-list at python.org>,
 Oren Tirosh <oren-py-l at hishome.net> wrote:

>On Thu, Sep 11, 2003 at 01:54:53PM -0700, Russell E. Owen wrote:
>> At one time, mixing for x in file and readline was dangerous. For 
>> example:
>> 
>> for line in file:
>>   # read some lines from a file, then break
>> nextline = file.readline() # bad
>> 
>> would not do what a naive user might expect because the file iterator 
>> buffered data and readline did not read from that buffer. Hence the call 
>> to readline might unexpectedly skip some lines...

(Oren points out that it's still a problem in Python 2.3 and after some 
interesting and gory detail goes on to say...)

>Really fixing it amounts to reimplementing the entire I/O layer of 
>Python with a different strategy and thoroughly testing on multiple 
>platforms. 
>
>It's possible to hide the problem in most cases by making read and 
>readline use the iteration readahead buffer if it's attached to the file
>object and stdio if it isn't. I don't think it's a good idea. It will
>require some hairy code and seems susceptible to subtle bugs and
>corner cases.

I agree that fixing read would probably be too messy to justify.

But it seems to me that a simple reimplementation of readline() would 
work fine:

def readline(self):
   try:
      return self.next()
   except StopIteration:
      return ""

That's basically the way I ended up working around the problem (but I 
didn't try to modify any classes). I do see two issues with that fix:
- existing code (if any) that mixes readline and read would be harmed
- it may not be efficient enough (even implemented in C)
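
A minimal sketch of that kind of workaround, done as a wrapper rather than by modifying the file class. It is written in current Python syntax (the iterator method is __next__ rather than next), and the name LineReader is hypothetical, purely for illustration. Both iteration and readline draw from the same iterator, so neither can skip lines the other has buffered:

```python
import io


class LineReader:
    """Hypothetical wrapper: every line read goes through one shared iterator."""

    def __init__(self, fileobj):
        self._lines = iter(fileobj)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._lines)

    def readline(self):
        # Same contract as file.readline(): an empty string signals end of file.
        try:
            return next(self._lines)
        except StopIteration:
            return ""


# io.StringIO stands in for a real file here.
reader = LineReader(io.StringIO("a\nb\nc\n"))
for line in reader:
    if line == "a\n":
        break
nextline = reader.readline()  # continues right after the line that hit break
```

Note that the wrapper deliberately offers no read method, which sidesteps the first issue above instead of solving it.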

>Another alternative is to make read and readline fail noisily after 
>iteration starts (unless cleared by seek())

If readline cannot be fixed, this might be worth doing, since I think 
it's common to want to mix readline and iteration. If read is the only 
issue, I suspect adding a warning to the documentation for the file 
method "read" would suffice.
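
The "fail noisily" alternative could look roughly like the sketch below. This is only an illustration in current Python syntax, not how any Python release implements it; the class name StrictFile, the choice of ValueError, and the error message are all assumptions:

```python
import io


class StrictFile:
    """Hypothetical wrapper: read() and readline() raise once iteration has begun."""

    def __init__(self, fileobj):
        self._file = fileobj
        self._iterating = False

    def __iter__(self):
        return self

    def __next__(self):
        self._iterating = True
        return next(self._file)

    def _check(self):
        if self._iterating:
            raise ValueError("read/readline used after iteration; call seek() first")

    def readline(self):
        self._check()
        return self._file.readline()

    def read(self, size=-1):
        self._check()
        return self._file.read(size)

    def seek(self, offset, whence=0):
        # Seeking discards any notion of a readahead position, so clear the flag.
        self._iterating = False
        return self._file.seek(offset, whence)


# io.StringIO stands in for a real file here.
f = StrictFile(io.StringIO("a\nb\n"))
first = f.readline()      # allowed: iteration has not started yet
for line in f:
    break                 # iteration has now started
try:
    f.readline()
    failed = False
except ValueError:
    failed = True         # the mixed call is rejected loudly
f.seek(0)                 # seek() re-enables read and readline
again = f.readline()
```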

I'm wondering where the problem is discussed in the manual. I'm pretty 
sure I saw it recently, but when I read about file methods I saw nothing 
about it.

-- Russell



