[Python-ideas] zip() problem.

Erik python at lucidity.plus.com
Fri Feb 12 19:52:41 EST 2016


On 13/02/16 00:09, Andrew Barnert wrote:
> On Feb 12, 2016, at 15:51, Michael Selik <mike at selik.org> wrote:
>
>> BTW, from the documentation
>> (https://docs.python.org/3/library/functions.html#zip):
>>
[snip]

> I think what's missing (from his point of view) is some statement that
> if you call it with iterators, it's not just the trailing values, but
> also the iterators' states that you shouldn't care about.

Yes. And also, as someone else pointed out to me privately, the 
docstring can be interpreted as already covering this:

"""
Return a zip object whose .__next__() method returns a tuple where
the i-th element comes from the i-th iterable argument.  The .__next__()
method continues until the shortest iterable in the argument sequence
is exhausted and then it raises StopIteration.
"""

It just depends on what "The .__next__() method continues" is supposed
to mean. For the obvious implementation it means what actually happens
(because the method is implemented in the obvious way ;)), but it
_could_ be interpreted as meaning that once the shortest iterable is
exhausted, calling __next__ does not do anything at all.
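
To make the side effect concrete, here is a minimal sketch of what I
observe with CPython's built-in zip(). The variable names are only for
illustration, and the leftover values reflect CPython's left-to-right
evaluation, which the docstring does not spell out.

```python
# Case 1: the longer iterator comes first.
long_first = iter([1, 2, 3, 4])
short_second = iter("ab")
pairs = list(zip(long_first, short_second))
# zip() pulled 3 from long_first before finding short_second empty,
# so 3 has been discarded and only 4 is left.
leftover_long_first = list(long_first)

# Case 2: the shorter iterator comes first.
short_first = iter("ab")
long_second = iter([1, 2, 3, 4])
pairs_swapped = list(zip(short_first, long_second))
# short_first ran out first, so long_second was never asked for a
# third value; both 3 and 4 are still available.
leftover_long_second = list(long_second)

print(pairs, leftover_long_first)
print(pairs_swapped, leftover_long_second)
```

The pairs are the same in both cases (apart from element order); only
the state of the longer iterator afterwards differs.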

This is in the docstring of a function that will be called by casual and 
newbie users. Are they expected to read up on what __next__ means and 
mentally imagine the mechanics of the loop that is implementing this 
function for them so they understand all the side-effects?

> I always took that as read without it needing to be stated. But maybe it
> does need stating?

There is also the issue that the CPython implementation is not
necessarily the _only_ way of implementing this. Another implementation
might construct the tuple in reverse order, for example, in which case a
different set of iterators would have the extra value consumed. I don't
think it's unreasonable to state clearly that _any_ iterator longer than
the shortest may or may not have at least one extra value extracted from
it, which will then be discarded.
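
As a sketch of that point, here is a hypothetical zip-like generator
that fills each tuple from right to left. The name zip_reversed_fill is
made up for this example and is not taken from any real implementation;
it only shows that the same documented results can come with a
different set of consumed values.

```python
def zip_reversed_fill(*iterables):
    """Hypothetical zip variant that builds each tuple right to left."""
    iterators = [iter(it) for it in iterables]
    if not iterators:
        return
    while True:
        items = []
        # Ask the LAST iterator first, the opposite of CPython's order.
        for it in reversed(iterators):
            try:
                items.append(next(it))
            except StopIteration:
                # Values already collected in this round are discarded.
                return
        yield tuple(reversed(items))


long_a = iter([1, 2, 3, 4])
short_b = iter("ab")
result_long_first = list(zip_reversed_fill(long_a, short_b))
rest_long_a = list(long_a)   # nothing extra consumed here

short_c = iter("ab")
long_d = iter([1, 2, 3, 4])
result_short_first = list(zip_reversed_fill(short_c, long_d))
rest_long_d = list(long_d)   # one extra value (3) was consumed here
```

With this variant the longer iterator loses a value when it comes
_last_, which is exactly the opposite of what the built-in does.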

Anyway, I'm over it now. I just thought I'd mention it. I've obviously 
never run into a real problem with it in the wild so perhaps it's really 
not an issue.

E.

