eof
Duncan Booth
duncan.booth at invalid.invalid
Thu Nov 22 07:26:12 EST 2007
braver <deliverable at gmail.com> wrote:
> In many cases, you want to do this:
>
> for line in f:
>     <do something with the line, setup counts and things>
>     if line % 1000 == 0 or f.eof(): # eof() doesn't exist in Python yet!
>         <use the setup variables and things to process the chunk>
>
> My control logic summarizes every 1000 lines of a file. I have to
> issue the summary after each 1000 lines, or whatever incomplete tail
> chunk remains. If I do it after the for loop, I have to refactor my
> logic into a procedure to call it twice. Now I want to avoid the
> overhead of the procedure call, and generally for a script to keep it
> simple.
This sounds like a case for writing a generator. Try this one:
----- begin chunks.py -------
import itertools
def chunks(f, size):
    iterator = iter(f)
    def onechunk(line):
        yield line
        for line in itertools.islice(iterator, size-1):
            yield line
    for line in iterator:
        yield onechunk(line)

for chunk in chunks(open('chunks.py'), 3):
    for n, line in enumerate(chunk):
        print "%d:%s" % (n,line.rstrip())
    print "---------------"
print "done"
#eof
------ end chunks.py --------
The output when you run this is:
C:\Temp>chunks.py
0:import itertools
1:def chunks(f, size):
2:    iterator = iter(f)
---------------
0:    def onechunk(line):
1:        yield line
2:        for line in itertools.islice(iterator, size-1):
---------------
0:            yield line
1:    for line in iterator:
2:        yield onechunk(line)
---------------
0:
1:for chunk in chunks(open('chunks.py'), 3):
2:    for n, line in enumerate(chunk):
---------------
0:        print "%d:%s" % (n,line.rstrip())
1:    print "---------------"
2:print "done"
---------------
0:#eof
---------------
done
Or change it to do:
for chunk in chunks(enumerate(open('chunks.py')), 3):
    for n, line in chunk:
and you get all lines numbered from 0 to 15 instead of resetting the
count each chunk.
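
For anyone who wants to try the enumerate variant without a file on disk, here is a rough sketch of the same generator written with Python 3 syntax. The in-memory list `lines` and the `summaries` list are hypothetical stand-ins added for illustration (they are not part of chunks.py), and the per-chunk "summary" is just a placeholder for whatever processing the original script needs.

```python
import itertools

def chunks(iterable, size):
    """Yield successive sub-iterators of at most `size` items each."""
    iterator = iter(iterable)

    def onechunk(first):
        # Re-emit the item that started this chunk, then up to size-1 more.
        yield first
        for item in itertools.islice(iterator, size - 1):
            yield item

    for first in iterator:
        yield onechunk(first)

# Hypothetical stand-in for a file: seven lines, processed three at a time.
lines = ["line %d\n" % i for i in range(7)]

summaries = []
for chunk in chunks(enumerate(lines), 3):
    # Consume the whole chunk before the outer loop asks for the next one.
    numbers = [n for n, line in chunk]
    summaries.append((numbers[0], numbers[-1], len(numbers)))

for first, last, count in summaries:
    print("lines %d-%d: %d lines" % (first, last, count))
```

The incomplete tail (the single line numbered 6 here) comes out as its own shorter chunk, so no separate end-of-file check is needed. One thing to keep in mind: all chunks share one underlying iterator, so each chunk should be read to the end before moving on to the next, otherwise the leftover items end up in later chunks.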