Warning about "for line in file:"
Oren Tirosh
oren-py-l at hishome.net
Mon Feb 18 05:34:28 EST 2002
On Fri, Feb 15, 2002 at 03:47:26PM -0500, Brian Kelley wrote:
> count = 0
> for line in file.xreadlines():
>     if count > 10: break
>     print line
>     count = count + 1
>
> for line in file.xreadlines():
>     print line
>
> So what is REALLY happening is that you are creating two separate
> iterators in the above examples. Writing "for line in file" instead of
> "for line in file.xreadlines()" simply hides and confuses this.
If you trace the problem to its true source you will see that file objects
are not really containers that can be iterated over - they are already iterators.
The container is the file on the disk. A file iterator object is not a
real independent object, just a different protocol to access the file object
using next() and StopIteration instead of readline() and an empty string.
The buffering problem that started this thread is just a side-effect of this
case of mistaken identity: iterators pretending to be containers.
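To make the "two protocols, one object" point concrete, here is a minimal sketch in current Python 3 (not the Python of this thread), using io.StringIO as a stand-in for a file on disk. The helper names read_with_readline and read_with_next are made up for illustration. Both loops walk the same kind of object and differ only in how end-of-file is signalled:

```python
import io


def read_with_readline(f):
    """File protocol: readline() returns an empty string at end of file."""
    lines = []
    while True:
        s = f.readline()
        if not s:
            break
        lines.append(s)
    return lines


def read_with_next(f):
    """Iterator protocol: next() raises StopIteration at end of file."""
    it = iter(f)
    lines = []
    while True:
        try:
            lines.append(next(it))
        except StopIteration:
            break
    return lines


text = "a\nb\nc\n"
via_readline = read_with_readline(io.StringIO(text))
via_next = read_with_next(io.StringIO(text))

# In Python 3 the file-like object hands itself back as its own iterator.
f = io.StringIO(text)
is_own_iterator = iter(f) is f
```

Both helpers return the same list of lines, which is the sense in which the iterator is only another way of talking to the same file object.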
There is no need for a separate object to implement another protocol. A
single object can expose both the iterator and file protocols:
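class file_(file):
    def __iter__(self):
        # The file object is its own iterator.
        return self

    def xreadlines(self):
        # Kept for compatibility; returns the same object.
        return self

    def next(self):
        # Iterator protocol layered directly on readline().
        s = self.readline()
        if s:
            return s
        else:
            raise StopIteration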
I believe it is fully backward compatible with existing sources that use
file iteration. Implementing file iterators using xreadlines objects was just
the quickest way to do it, by reusing an existing piece of code that was
written long before the iterator protocol.
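As a rough illustration of how a single object serving both protocols behaves across two loops, here is a sketch in current Python 3. Since the file builtin no longer exists there and next() is spelled __next__, it wraps a file-like object instead of subclassing; the class name SelfIteratingFile is hypothetical and io.StringIO stands in for a real file:

```python
import io


class SelfIteratingFile:
    """Hypothetical wrapper exposing the file and iterator protocols at once."""

    def __init__(self, f):
        self._f = f

    def readline(self):
        return self._f.readline()

    def __iter__(self):
        # No separate iterator object: iteration state is the file position.
        return self

    def xreadlines(self):
        return self

    def __next__(self):
        s = self.readline()
        if s:
            return s
        raise StopIteration


f = SelfIteratingFile(io.StringIO("1\n2\n3\n4\n5\n"))

first = []
for line in f:
    first.append(line)
    if len(first) == 2:
        break  # leave the loop early, as in the quoted example

middle = f.readline()  # mixing in the file protocol loses nothing

rest = [line for line in f.xreadlines()]
```

Because nothing reads ahead into a private buffer, the second loop resumes exactly where the first one and the readline() call left off.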
Oren