Python 3000, zip, *args and iterators

Steven Bethard steven.bethard at gmail.com
Wed Dec 29 15:52:08 EST 2004


Raymond Hettinger wrote:
> [Steven Bethard]
>> I'm just suggesting that in a function with a
>> *args in the def, the args variable be an iterator instead of
>> a tuple.
>
> So people would lose the useful abilities to check len(args) or extract
> an argument with args[1]?

No more than you lose these abilities with any other iterators:

     def f(x, y, *args):
         args = list(args)  # or tuple(args)
         if len(args) == 3:
             print args[0], args[1], args[2]

True, if you do want to check argument counts, this is an extra step of 
work.  I personally find that most of my functions with *args parameters 
look like:

     def f(x, y, *args):
         do_something1(x)
         do_something2(y)
         for arg in args:
             do_something3(arg)

where having *args be an iterator would not be a problem.
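
To make the trade-off concrete, here is a minimal sketch that emulates the proposal in current Python by wrapping the tuple with iter(). The function names consume_only and needs_length are hypothetical, and the syntax is Python 3 rather than the 2.x used elsewhere in this message.

```python
def consume_only(x, y, *args):
    # Emulate the proposal: treat args as a one-shot iterator.
    args = iter(args)
    # Plain iteration behaves the same on an iterator as on a tuple.
    return [x, y] + [arg * 2 for arg in args]


def needs_length(x, y, *args):
    args = iter(args)
    try:
        len(args)  # iterators do not support len()
    except TypeError:
        # The extra step: materialize before counting or indexing.
        args = list(args)
    return len(args), args[1]
```

Only the second function pays for the list() call; the first never needs the arguments to be held in a sequence.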

>> So basically what I've done here is to
>> "transpose" (to use your word) the iterators, apply my function, and
>> then transpose the iterators back.
> 
> If you follow the data movements, you'll find that iterators provide no
> advantage here.  To execute transpose(map(f, transpose(iterator))), the
> whole iterator necessarily has to be read into memory so that the first
> function application will have all of its arguments present -- using
> the star operator only obscures that fact.

I'm not sure I follow you here.  Looking at my code:

     labels, feature_dicts = starzip(generator)
     for label, feature_window in izip(labels, window(feature_dicts)):
         write_instance(label, combine_dicts(feature_window))

A few points:

(1) starzip uses itertools.tee, so it is not going to read the entire 
contents of the generator into memory at once, as long as the two 
parallel iterators do not get out of sync.

(2) window does not exhaust the iterator passed to it; instead, it uses 
the items of that iterator to generate a new iterator in sync with the 
original, so izip(labels, window(feature_dicts)) will keep the labels 
and feature_dicts iterators in sync.

(3) the for loop just iterates over the izip iterator, so it should be 
consuming (label, feature_window) pairs in sync.
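
The three points above can be checked with a small sketch. The starzip and window below are hypothetical stand-ins, since the real helpers are not shown in this message: this starzip handles only 2-tuples, and this window yields one tuple of the most recent items per input item. The code is Python 3, where the built-in zip is lazy in the way izip is in 2.x. The pulled list records every item actually read from the source generator.

```python
from collections import deque
from itertools import tee


def starzip(pairs):
    """Split an iterator of 2-tuples into two iterators (hypothetical helper)."""
    # tee buffers only items that one branch has read and the other has not.
    left, right = tee(pairs)
    firsts = (a for a, _ in left)
    seconds = (b for _, b in right)
    return firsts, seconds


def window(iterable, size=2):
    """Yield, for each input item, a tuple of the last `size` items seen."""
    recent = deque(maxlen=size)
    for item in iterable:
        recent.append(item)
        yield tuple(recent)


pulled = []  # every item actually read from the source so far


def source(n):
    for i in range(n):
        pulled.append(i)
        yield "label%d" % i, {"f": i}


def run(n=5):
    del pulled[:]
    labels, feature_dicts = starzip(source(n))
    lead = []  # number of source items read at each loop step
    for label, feature_window in zip(labels, window(feature_dicts)):
        lead.append(len(pulled))
    return lead
```

Under these assumptions run(5) returns [1, 2, 3, 4, 5]: at each step of the loop exactly one more item has been read from the source, so tee never buffers more than one item. If one branch were drained first, for example with list(labels), tee would have to hold every item for the other branch, and the whole input would be in memory.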

I assume you disagree with one of these points or you wouldn't say that 
"iterators provide no advantage here".  Could you explain what doesn't 
work here?

Steve


