iterators and generators, is this the Python Way???
jepler at unpythonic.net
Wed Sep 11 22:40:13 EDT 2002
On Wed, Sep 11, 2002 at 10:17:05PM -0400, Michael Schneider wrote:
> Hello All.
>
> I just updated my python to 2.2 and noticed that generators, and
> iterators were in there.
>
> I often parse through many files (one at a time), and wanted the speed
> of read and the convenience of for x in myGen:
If this beats 'for line in open("myfile"): ...' by much, there's a problem.
Can you post some benchmark numbers? In any case, I don't think it can
outperform the simpler
    for x in file.readlines(): ...
since yours adds another layer of Python code in the middle.
I think files can be iterated over directly in 2.2, but I only have 2.3 on hand:
$ python2.3 -c 'import sys; print list(sys.stdin)' < /etc/resolv.conf
['search unpythonic.net\n', 'search localnet\n', 'nameserver 206.222.212.218\n']
f.xreadlines() and iter(f) should both operate in the same way: they call
readlines with a size hint, then return successive lines from the
returned list, refilling the list as necessary. When xreadlines was
introduced, this benchmarked "just as fast as" slurping the whole file with
a single readlines(), but with a reasonable memory footprint even for
gigabyte files (at least when none of the lines are insanely long).
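The batching described above can be sketched as a generator. This is only an illustration of the idea, not the actual xreadlines source: the name lines_of and the default hint of 8192 are made up for the example, and it is written for a current Python (on 2.2 itself, yield needs "from __future__ import generators").

```python
import io

def lines_of(f, hint=8192):
    """Yield lines from f, fetching them in batches with readlines(hint).

    lines_of and the default hint are hypothetical names/values chosen
    for this sketch; they are not taken from the xreadlines module.
    """
    while True:
        # readlines(hint) returns whole lines totalling roughly `hint`
        # bytes/characters, so memory use stays bounded for large files.
        batch = f.readlines(hint)
        if not batch:          # an empty list means end of file
            break
        for line in batch:
            yield line

# Small in-memory stand-in for a real file.
sample = io.StringIO("search unpythonic.net\nsearch localnet\n")
result = list(lines_of(sample, hint=16))
print(result)
```

With a real file you would write "for line in lines_of(open(path)): ..."; the per-line Python loop inside the generator is the extra layer that a built-in iterator avoids.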
For benchmarks made at the time, start near
http://mail.python.org/pipermail/python-dev/2001-January/011269.html
Tim Peters quoted Neel Krishnaswami:
| Quick performance summary of the current solutions:
|
| Slowest: for line in fileinput.input('foo'):     # Time 100
|        : while 1: line = file.readline()         # Time 75
|        : for line in LinesOf(open('foo')):       # Time 25
| Fastest: for line in file.readlines():           # Time 10
|          while 1: lines = file.readlines(hint)   # Time 10
|          for line in xreadlines(file):           # Time 10
|
| The difference in speed between the slowest and fastest is about
| a factor of 10.
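To get comparable numbers on your own machine, something like the following rough harness may help. It is a sketch written for a current Python, not a reproduction of the 2001 measurements: the function names, the temporary file, and the repeat count are all assumptions made for the example, and the relative timings will differ between interpreter versions.

```python
import os
import tempfile
import timeit

def count_readline(path):
    # One readline() call per line: the "while 1" style from the table.
    n = 0
    with open(path) as f:
        while True:
            line = f.readline()
            if not line:
                break
            n += 1
    return n

def count_readlines(path):
    # Slurp the whole file into a list at once.
    with open(path) as f:
        return len(f.readlines())

def count_iter(path):
    # Iterate over the file object directly.
    n = 0
    with open(path) as f:
        for line in f:
            n += 1
    return n

# Build a throwaway test file (contents are arbitrary).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("nameserver 206.222.212.218\n" * 20000)
    path = tmp.name

counts = {}
timings = {}
for fn in (count_readline, count_readlines, count_iter):
    counts[fn.__name__] = fn(path)   # sanity check: same line count
    timings[fn.__name__] = timeit.timeit(lambda: fn(path), number=5)

os.remove(path)

# Report fastest first.
for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print("%-16s %.4f s" % (name, seconds))
```

Checking that every variant counts the same number of lines first guards against timing a loop that silently does less work.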
Jeff