Enhanced Generators - reiterating over a file
oren-py-l at hishome.net
Sun Feb 3 14:02:17 CET 2002
On Sat, Feb 02, 2002 at 10:05:11PM -0500, Kragen Sitaker wrote:
> > > That won't work when passed iterators as arguments, right?
> > Sure, but as long as iterators are kept as invisible temporary objects
> Um, sure. iter(open("hello")) doesn't work that way already.
This is a real-life example of how useful reiterable x functions can be,
and of how to solve the problem of reiterating over a file.
The following function takes a vector of samples and yields a stream of samples
with their values normalized to the range +-1:
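from __future__ import generators   # needed for yield in Python 2.2
def normalized(vector):
    max_value = 0.0
    for sample in vector:
        max_value = max(max_value, abs(sample))
    for sample in vector:
        yield sample / max_value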
Note that this must be a two-pass operation: you cannot yield the first
normalized sample before you find the maximum value.
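To make the two-pass requirement concrete, here is a small sketch of the function described above, written for current Python 3 rather than 2.2 (my assumption, so that it runs as-is). It shows what happens when the argument is a one-shot iterator instead of a container: the first loop uses it up and the second loop finds nothing.

```python
def normalized(vector):
    # Pass 1: find the largest absolute sample value.
    max_value = 0.0
    for sample in vector:
        max_value = max(max_value, abs(sample))
    # Pass 2: yield each sample scaled into the range -1..+1.
    for sample in vector:
        yield sample / max_value


# A list is reiterable, so both passes see all the samples.
from_list = list(normalized([1.0, -2.0, 4.0]))

# An iterator is exhausted by pass 1, so pass 2 yields nothing.
from_iterator = list(normalized(iter([1.0, -2.0, 4.0])))
```

The second call fails silently rather than raising an error, which is what makes this kind of bug easy to miss.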
My data is in a text file with one decimal sample per line. This code reads
it and writes the normalized samples to another file in the same format.
vector = map(float, file('samples.dat'))
outfile = file('normalized.dat','w')
for sample in normalized(vector):
    print >>outfile, sample
This works in Python 2.2. The only problem is that it reads the entire file
into memory. What if the file is too big to read into memory or I just don't
want to stress virtual memory unnecessarily?
Just add the magic x! Change map to xmap and file to xfile, and everything
works exactly the same but without any temporary lists.
xfile is a lazy file object: the object just stores the filename. Only when
iter(xfile('filename')) is called does the returned iterator object open a
temporary file descriptor to walk through the file.
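A minimal sketch of what such an xfile could look like, in current Python 3. The class body is my own guess at an implementation, not existing library code, and the temporary file is only there to make the example self-contained.

```python
import os
import tempfile


class xfile:
    """Hypothetical reiterable file: stores only the name, opens on iteration."""

    def __init__(self, filename):
        self.filename = filename

    def __iter__(self):
        # Every iter() call runs this generator afresh, so each iterator
        # owns a private handle that is closed once that pass finishes.
        with open(self.filename) as handle:
            for line in handle:
                yield line


# Usage: two passes over the same xfile object see the same lines.
with tempfile.NamedTemporaryFile("w", suffix=".dat", delete=False) as tmp:
    tmp.write("1.0\n-2.0\n4.0\n")

samples = xfile(tmp.name)
first_pass = list(samples)
second_pass = list(samples)
os.remove(tmp.name)
```

Nothing is opened when the xfile is constructed. The file is touched only while a pass is actually running.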
xmap must be the truly lazy version that returns an iterable object, not the
half-eager half-lazy version that returns an iterator. This is because the
function normalized() has to scan the source twice.
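The difference between the two kinds of lazy map can be sketched as follows, again in current Python 3 and again as a hypothetical implementation of xmap rather than anything that ships with Python. The built-in map of Python 3 happens to be the "half-lazy" kind that returns an iterator, so it serves as the contrast.

```python
class xmap:
    """Hypothetical reiterable map: stores the function and its sources."""

    def __init__(self, func, *sources):
        self.func = func
        self.sources = sources

    def __iter__(self):
        # Python 3's built-in map is lazy and calls iter() on each source,
        # so every pass gets fresh iterators from the underlying containers.
        return map(self.func, *self.sources)


lines = ["1.0\n", "-2.0\n", "4.0\n"]

lazy = xmap(float, lines)
first_pass = list(lazy)
second_pass = list(lazy)         # works: a new iterator is built per pass

one_shot = map(float, lines)     # an iterator, not a reiterable
list(one_shot)
leftover = list(one_shot)        # nothing left for a second pass
```

An xmap is only as reiterable as its sources: wrapping a one-shot iterator in it would still give a single pass.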
An xfile object really simulates a container - you can use iter() to get
multiple independent iterators of the same container. It should also appeal
to the fans of a certain TV show :-)
A Python file object is not really a container: it can be argued that a file
object already *is* a kind of iterator. It is a temporary object used to walk
through a container. The real container in this case is the actual file on
the disk. An xfile object represents a file on the disk.
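For what it's worth, this argument can be checked directly in current Python 3 (not the 2.2 discussed here, where the details differ): there, asking a file object for an iterator hands back the file object itself, and a second pass over it finds nothing. A small sketch, using a temporary file as stand-in data:

```python
import os
import tempfile

with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("1.0\n-2.0\n4.0\n")

with open(tmp.name) as f:
    same_object = iter(f) is f   # the file object is its own iterator
    first_pass = list(f)
    second_pass = list(f)        # already at end of file
os.remove(tmp.name)
```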