Python 3000, zip, *args and iterators

Steven Bethard steven.bethard at gmail.com
Sun Dec 26 17:02:39 EST 2004


So, as I understand it, in Python 3000, zip will basically be replaced 
with izip, meaning that instead of returning a list, it will return an 
iterator.  This is great for situations like:

     zip(*[iter1, iter2, iter3])

where I want to receive tuples of (item1, item2, item3) from the 
iterables.  But it doesn't work well for a situation like:

     zip(*tuple_iter)

where tuple_iter is an iterator over tuples of the form
(item1, item2, item3) and I want to receive three iterators, one over the 
item1s, one over the item2s and one over the item3s.  I don't think this 
is too unreasonable a desire, since the current zip, in a situation like:

     zip(*tuple_list)

where tuple_list is a list of tuples of the form (item1, item2, item3), 
returns a list of three tuples, one of the item1s, one of the item2s and 
one of the item3s.

Of course, the reason this doesn't work currently is that the fn(*itr) 
notation converts 'itr' into a tuple, exhausting the iterator:

 >>> import itertools as it
 >>> def g(x):
...     for i in xrange(x):
...         yield (i, i+1, i+2)
...     print "exhausted"
...
 >>> zip(*g(4))
exhausted
[(0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5)]
 >>> it.izip(*g(4))
exhausted
<itertools.izip object at 0x01157710>
 >>> x, y, z = it.izip(*g(4))
exhausted
 >>> x, y, z
((0, 1, 2, 3), (1, 2, 3, 4), (2, 3, 4, 5))

What I would prefer is something like:

 >>> zip(*g(4))
<iterator object at ...>
 >>> x, y, z = zip(*g(4))
 >>> x, y, z
(<iterator object at ...>, <iterator object at ...>, <iterator object at ...>)
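
Something close to this can be approximated today by teeing the iterator and 
picking one position out of each copy.  The following is only a minimal 
sketch: the names unzip and _column and the parameter n (the tuple width, 
which has to be known up front) are made up for illustration, and this is 
not necessarily how the recipe in [1] does it.

```python
import itertools

def _column(tuples, index):
    # Yield only the element at position `index` of each incoming tuple.
    for item in tuples:
        yield item[index]

def unzip(tuple_iter, n):
    # Hypothetical helper: n is the (assumed known) width of each tuple.
    # tee() hands back n independent iterators over the same stream, so
    # nothing is pulled from tuple_iter until a result is iterated.
    copies = itertools.tee(tuple_iter, n)
    return tuple([_column(copy, i) for i, copy in enumerate(copies)])

def g(x):
    for i in range(x):
        yield (i, i + 1, i + 2)

x, y, z = unzip(g(4), 3)   # g(4) has not been advanced at this point
```

Note that tee() buffers whatever one of the iterators has consumed and the 
others have not, so running x to the end before touching y or z still 
holds all the tuples in memory.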

Of course, I can write a separate function that will do what I want 
here[1] -- my question is whether Python's builtin zip will support this in 
Python 3000.  It's certainly not a trivial change -- it requires some 
pretty substantially backwards incompatible changes in how *args is 
parsed for a function call -- namely that fn(*itr) only extracts as many 
of the items in the iterable as necessary, e.g.

 >>> def h(x, y, *args):
...     print x, y, args
...     print list(it.islice(args, 4))
...
 >>> h(*it.count())
0 1 count(2)
[2, 3, 4, 5]
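
For comparison, roughly the same effect is available now if the iterator is 
passed as an ordinary argument and the leading items are sliced off by 
hand.  This is just a sketch of the workaround, not of the proposed 
syntax; h_lazy is a made-up name.

```python
import itertools

def h_lazy(iterable):
    # Hypothetical stand-in for h(x, y, *args) with lazy unpacking.
    args = iter(iterable)
    # Take exactly the two "positional" items; islice stops after two
    # without pulling a third item from the underlying iterator.
    x, y = itertools.islice(args, 2)
    # The rest of the iterator is still unconsumed here, so it can be
    # infinite; only the next four items are materialized.
    return x, y, list(itertools.islice(args, 4))

result = h_lazy(itertools.count())   # (0, 1, [2, 3, 4, 5])
```

The cost is that the caller and the function both have to agree on this 
convention, which is exactly what lazy handling of fn(*itr) would avoid.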

So I guess my real question is, should I expect Python 3000 to play 
nicely with *args and iterators?  Are there reasons (besides backwards 
incompatibility) that parsing *args this way would be bad?


Steve


[1] In fact, with the help of the folks from this list, I did:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302325


