iterators and generators, is this the Python Way???

Wed Sep 11 22:40:13 EDT 2002

On Wed, Sep 11, 2002 at 10:17:05PM -0400, Michael Schneider wrote:
> Hello All.
> 
> I just updated my Python to 2.2 and noticed that generators and
> iterators were in there.
> 
> I often parse through many files (one at a time), and wanted the speed
> of read and the convenience of   for x in myGen:

If this beats 'for line in open("myfile"): ...' by much, there's a problem.
Can you post some benchmark numbers?  Either way, I don't think it can
outperform the simpler
	for x in file.readlines(): ...
since yours adds another layer of Python code in the middle.

I think files can be iterated over directly in 2.2, but I only have 2.3 on hand:

$ python2.3 -c 'import sys; print list(sys.stdin)' < /etc/resolv.conf 
['search unpythonic.net\n', 'search localnet\n', 'nameserver 206.222.212.218\n']

f.xreadlines() and iter(f) should both operate in the same way: they call
readlines() with a size hint, return successive lines from the resulting
list, and refill the list as necessary.  When it was introduced, this
approach benchmarked "just as fast as" slurping the whole file with a single
readlines(), but with a reasonable memory footprint even for gigabyte files.
(At least when none of the lines are insanely long.)
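
To make the refill idea concrete, here is a minimal sketch of it written as a
generator.  It is not the actual xreadlines implementation; the name lines_of
and the default sizehint are made up for illustration, and it is written to
run on a current Python rather than 2.2.

```python
import io

def lines_of(f, sizehint=64 * 1024):
    """Yield lines from f one at a time, refilling a buffer in chunks.

    Hypothetical helper: mimics the "readlines with a size hint" strategy
    described above; the 64 KiB default is an arbitrary assumption.
    """
    while True:
        # Ask for roughly sizehint bytes' worth of complete lines.
        chunk = f.readlines(sizehint)
        if not chunk:
            # An empty list means end of file.
            break
        for line in chunk:
            yield line

# Usage with an in-memory file; a tiny hint forces several refills.
sample = io.StringIO("search localnet\nnameserver 10.0.0.1\n")
result = list(lines_of(sample, sizehint=8))
```

Only one chunk of lines is held in memory at a time, which is where the
reasonable footprint on big files comes from.  A single enormous line still
has to fit in memory, hence the caveat about insanely long lines.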

For benchmarks made at the time, start near
    http://mail.python.org/pipermail/python-dev/2001-January/011269.html

Tim Peters quoted Neel Krishnaswami:
| Quick performance summary of the current solutions:
|
| Slowest: for line in fileinput.input('foo'):     # Time 100
|        : while 1: line = file.readline()         # Time 75
|        : for line in LinesOf(open('foo')):       # Time 25
| Fastest: for line in file.readlines():           # Time 10
|          while 1: lines = file.readlines(hint)   # Time 10
|          for line in xreadlines(file):           # Time 10
|
| The difference in speed between the slowest and fastest is about
| a factor of 10.
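
If you want to check numbers like these on your own machine, something along
these lines should do.  It is only a rough sketch: the function names, the
file size and the repeat count are all arbitrary choices of mine, it is
written for a current Python, and the ratios you get will not necessarily
match the 2001 figures quoted above.

```python
import os
import tempfile
import timeit

N_LINES = 10000  # arbitrary size for the throwaway test file

def count_readline(path):
    # "while 1: line = file.readline()" style
    n = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            n += 1
    return n

def count_readlines(path):
    # Slurp the whole file with a single readlines()
    n = 0
    with open(path) as f:
        for line in f.readlines():
            n += 1
    return n

def count_hinted(path, hint=64 * 1024):
    # Repeated readlines(hint), refilling until the file is exhausted
    n = 0
    with open(path) as f:
        while True:
            lines = f.readlines(hint)
            if not lines:
                break
            for line in lines:
                n += 1
    return n

def count_iter(path):
    # Plain iteration over the file object
    n = 0
    with open(path) as f:
        for line in f:
            n += 1
    return n

# Build a throwaway file to read back.
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False)
for i in range(N_LINES):
    tmp.write("line number %d\n" % i)
tmp.close()

candidates = [count_readline, count_readlines, count_hinted, count_iter]
try:
    # Sanity check first: every variant should see the same number of lines.
    counts = {fn.__name__: fn(tmp.name) for fn in candidates}
    for fn in candidates:
        seconds = timeit.timeit(lambda: fn(tmp.name), number=5)
        print("%-16s %.4f s" % (fn.__name__, seconds))
finally:
    os.remove(tmp.name)
```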

Jeff