itertools.izip brokenness

Peter Otten __peter__ at web.de
Tue Jan 3 06:08:36 EST 2006


rurpy at yahoo.com wrote:

> The problem is that sometimes, depending on which
> file is the shorter, a line ends up missing,
> appearing neither in the izip() output nor in
> the subsequent direct file iteration.  I would
> guess that it was in izip's buffer when izip
> terminates due to the exception on the other file.
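
The effect can be reproduced without files. What follows is only a minimal
sketch under two assumptions: it is written for Python 3, where the built-in
zip() stands in for izip(), and the helper name leftover_after_zip is made up
for illustration.

```python
def leftover_after_zip(longer_first):
    """Zip a 2-item and a 3-item iterator, then drain the longer one."""
    shorter = iter(["a1", "a2"])
    longer = iter(["b1", "b2", "b3"])
    if longer_first:
        # zip() reads "b3" from `longer` before it asks `shorter` for its
        # next item; `shorter` is exhausted, so "b3" is silently dropped.
        pairs = list(zip(longer, shorter))
    else:
        # zip() asks `shorter` first, finds it exhausted, and never
        # touches "b3".
        pairs = list(zip(shorter, longer))
    return pairs, list(longer)


print(leftover_after_zip(False))  # ([('a1', 'b1'), ('a2', 'b2')], ['b3'])
print(leftover_after_zip(True))   # ([('b1', 'a1'), ('b2', 'a2')], [])
```

Whether an item goes missing depends only on the order of the arguments,
which matches the "depending on which file is the shorter" observation.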

With the current iterator protocol you cannot push an item back into an
iterator once you have read it by calling next(), yet calling next() is the
only way to find out whether the iterator is exhausted. The behaviour that
breaks your prt_files() function therefore has nothing to do with itertools:
izip() has to read from one file before it can discover that the other one
is exhausted.

I think of itertools more as a toolbox than as a set of ready-made
solutions, and came up with

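from itertools import izip, chain, repeat

def prt_files(file1, file2):
    # Pad each file with an endless supply of "" so izip() never stops early.
    file1 = chain(file1, repeat(""))
    file2 = chain(file2, repeat(""))
    # iter(callable, sentinel) stops once both files yield only padding.
    for line1, line2 in iter(izip(file1, file2).next, ("", "")):
        print line1.rstrip(), "\t", line2.rstrip()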
 
which can easily be generalized for an arbitrary number of files.
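
One way that generalization might look is sketched here. The assumptions:
it is written for Python 3 (zip() and __next__ replace izip() and .next),
the name merge_lines is invented for this example, and it yields tuples
instead of printing them so that the caller decides on the output format.

```python
from itertools import chain, repeat


def merge_lines(*files):
    """Yield one tuple of stripped lines per row until every file is exhausted."""
    # Pad every file with an endless supply of "" so zip() never stops early.
    padded = [chain(f, repeat("")) for f in files]
    # A row made up of padding only means that all files are exhausted.
    # A real blank line is "\n", not "", so it cannot trigger the sentinel.
    sentinel = ("",) * len(files)
    rows = zip(*padded)
    for lines in iter(rows.__next__, sentinel):
        yield tuple(line.rstrip() for line in lines)


for row in merge_lines(["a\n", "b\n"], ["x\n"], ["1\n", "2\n", "3\n"]):
    print("\t".join(row))
```

Later Python versions also ship a ready-made tool for this case:
itertools.izip_longest() in Python 2.6 and itertools.zip_longest() in
Python 3, both of which accept a fillvalue argument.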

Peter



