"for line in fd" twice

Oren Tirosh oren-py-l at hishome.net
Tue Jun 3 13:23:12 EDT 2003


On Tue, Jun 03, 2003 at 05:24:16PM +0200, Thomas Güttler wrote:
> Hi!
> 
> the following does not work:
> 
> """
> fd=open(file)
> for line in fd:
>     if line.startswith("mymark"):
>         break
> for line in fd:
>     #Read lines after "mymark"
>     ....
> fd.close()
> """
> 
> The second loop misses some lines.

"for line in fd" internally invokes iter(fd). In Python 2.2.x this returns
an xreadlines object. That object keeps an internal readahead buffer, and
the buffer is lost when the break statement ends the loop and the
xreadlines object is deallocated. The second for loop calls iter(fd) again
and gets a fresh xreadlines object, so it skips however many lines were
still sitting in the lost buffer.
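
A rough sketch of the effect, written for a current Python 3. The
ReadaheadLines class is a toy I made up for illustration, not the real
xreadlines code, and the chunk size of 3 lines is arbitrary. It only shows
why a fresh iterator loses lines while a single shared iterator does not.

```python
import io

class ReadaheadLines:
    """Toy stand-in for a buffering line iterator (hypothetical, not xreadlines)."""

    def __init__(self, fd, chunk=3):
        self.fd = fd
        self.chunk = chunk      # how many lines to read ahead at once
        self.buffer = []        # lines read from fd but not yet handed out

    def __iter__(self):
        return self

    def __next__(self):
        if not self.buffer:
            # Read ahead: pull up to `chunk` lines out of the file in one go.
            lines = [self.fd.readline() for _ in range(self.chunk)]
            self.buffer = [line for line in lines if line]
            if not self.buffer:
                raise StopIteration
        return self.buffer.pop(0)

TEXT = "a\nmymark\nb\nc\nd\ne\n"

def tail_with_new_iterator(text=TEXT):
    fd = io.StringIO(text)
    for line in ReadaheadLines(fd):      # first iterator, dropped at break
        if line.startswith("mymark"):
            break
    return list(ReadaheadLines(fd))      # second iterator starts with an empty buffer

def tail_with_shared_iterator(text=TEXT):
    it = ReadaheadLines(io.StringIO(text))
    for line in it:
        if line.startswith("mymark"):
            break
    return list(it)                      # same iterator, buffer still intact
```

With this input the first function loses the "b" line, because it was
already in the buffer of the discarded iterator. Keeping one iterator in a
variable and using it in both loops avoids that.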

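In 2.3 the readahead buffer has been moved into the file object itself,
and a file object is its own iterator (iter(fd) returns fd) instead of
handing out a separate iterator object. Breaking out of one loop and
starting another therefore continues from the same buffer.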
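
A minimal sketch of the two-loop pattern on 2.3 or later, written here for
a current Python 3. The function names, the temporary file, and its
contents are made up for the example.

```python
import os
import tempfile

def lines_after_mark(path, mark="mymark"):
    """Return the lines that follow the first line starting with `mark`."""
    with open(path) as fd:
        # The file object is its own iterator, so both loops below
        # share the same readahead state.
        assert iter(fd) is fd
        for line in fd:
            if line.startswith(mark):
                break
        # The second loop resumes right after the marked line.
        return [line.rstrip("\n") for line in fd]

def demo():
    # Write a small throwaway file to run the function against.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("header\nmymark here\nfirst\nsecond\n")
    try:
        return lines_after_mark(tmp.name)
    finally:
        os.remove(tmp.name)
```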

> It works with the old way:
> 
> """
> while 1:
>     line=fd.readline()
>     if not line:
>         break
> """

Yes, it works, and it is fully compatible with seek() and with mixing
read() and readline() calls. But it is also slower, because it does no
readahead buffering.
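
If you need that compatibility, the while loop can be written a bit more
compactly with the two-argument form of iter(). A sketch for a current
Python 3, where io.StringIO stands in for a real file and the function
names are made up:

```python
import io

def offset_after_mark(fd, mark="mymark"):
    """Return the file position just after the marked line, or None."""
    # iter(callable, sentinel) calls fd.readline() until it returns "" (EOF).
    for line in iter(fd.readline, ""):
        if line.startswith(mark):
            # No readahead is involved, so tell() is right after this line.
            return fd.tell()
    return None

def demo():
    fd = io.StringIO("header\nmymark here\nfirst\nsecond\n")
    pos = offset_after_mark(fd)
    fd.seek(0)      # wander off somewhere else in the file
    fd.read(3)
    fd.seek(pos)    # and come back to just after the mark
    return fd.read()
```

Since every line comes from readline(), the position reported by tell()
always matches what has actually been consumed.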

    Oren




