[Python-ideas] Re: zip(x, y, z, strict=True)

20 Apr 2020

      ...
On Apr 20, 2020, at 17:22, Steven D'Aprano <steve@pearwood.info> wrote:
On Mon, Apr 20, 2020 at 03:28:09PM -0700, Andrew Barnert via Python-ideas wrote:
...
Admittedly, such cases are almost surely not that common, but I
actually have some line-numbering code that did something like this
(simplified a bit from real code):
yield from enumerate(itertools.chain(headers, [''], body, ['']))
… but then I needed to know how many lines I yielded, and there’s no
way to get that from enumerate, so instead I had to do this:
Did you actually need to "yield from"? Unless your caller was sending 
values into the enumerate iterable, which as far as I know enumerate 
doesn't support, "yield from" isn't necessary.
True. Using yield from is more efficient, more composable, and usually (but not here) more concise and readable, but none of those are relevant to my example (or the real code). I suppose it’s just a matter of habit to reach for yield from before a loop over yield even in cases where it doesn’t matter much.
...
...
counter = itertools.count()
yield from zip(counter, itertools.chain(headers, [''], body, ['']))
lines = next(counter)
That gives you one more than the number of lines yielded.
Yeah, I screwed that up in simplifying the real code without testing the result. And your version gives one _less_ than the number yielded. (With either enumerate(xs) or zip(counter, xs), the last element will be (len(xs)-1, xs[-1]).) Your version has the additional problem that if the iterable is empty, t is not off by one but unbound (or bound to some stale old value), but that’s not possible in my example, and probably not in most similar examples.
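
To make the two off-by-one cases concrete, here is a minimal sketch. The function names (numbered_zip, numbered_fixed, drain) are hypothetical and not from the real code, and returning the count via the generator's return value is an assumption made only so the result can be inspected. It shows that zip() pulls one extra value from the counter before it notices the other input is exhausted, and one possible way to get an exact count that is also safe for empty input.

```python
import itertools


def numbered_zip(lines):
    """Hypothetical reconstruction of the zip/count approach."""
    counter = itertools.count()
    yield from zip(counter, lines)
    # zip() already took one extra value from counter before discovering
    # that lines was exhausted, so this is len(lines) + 1, not len(lines).
    return next(counter)


def numbered_fixed(lines):
    """One possible fix: count from 1 and remember the last count seen."""
    total = 0  # stays 0 for empty input, so it is never unbound
    for total, line in enumerate(lines, 1):
        yield total - 1, line
    return total


def drain(gen):
    """Collect a generator's items together with its return value."""
    items = []
    while True:
        try:
            items.append(next(gen))
        except StopIteration as stop:
            return items, stop.value
```

With three input lines, drain(numbered_zip(...)) reports 4 while drain(numbered_fixed(...)) reports 3; both yield the same (index, line) pairs.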

Both are easy to fix in practice, but both are (as we just demonstrated) even easier to get wrong the first time, like all fencepost errors. Maybe it would be better to use an undoable/peekable/tee wrapper after all, but without writing it out I’m not sure that wouldn’t be just as fencepostable…

Anyway, that’s exactly why I want to make sure the fencepost behavior is actually defined for this new proposal. Any reasonable answer is probably fine; people probably won’t run into wanting the leftovers, but if they ever do, as long as the docs say what should be there, they’ll work it out.
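
As an illustration of why the leftovers question matters, the sketch below uses the existing plain zip() (not the proposed strict mode, whose behavior is exactly what is under discussion) to show that what remains in the inputs depends on argument order. The helper name leftovers_after_zip is hypothetical.

```python
def leftovers_after_zip(first, second):
    """Zip two iterators, then report what is still unconsumed in each."""
    a, b = iter(first), iter(second)
    pairs = list(zip(a, b))
    # zip() asks its arguments for values left to right, so when the
    # *second* input runs out, one value from the first is already gone.
    return pairs, list(a), list(b)


# Longer input first: its extra item (3) is consumed and lost.
print(leftovers_after_zip([1, 2, 3], [10, 20]))
# Shorter input first: zip() stops before touching the extra item.
print(leftovers_after_zip([10, 20], [1, 2, 3]))
```

In the first call nothing is left over in either input; in the second, the 3 is still available from the longer iterator.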

That, and the implementation constraint. If everyone were convinced that the only reasonable answer is to fully consume all inputs on error, that would be a bit of a problem, so it’s worth making sure nobody is convinced of that.