
On Tue, Nov 28, 2017 at 12:55 AM, Steven D'Aprano <steve@pearwood.info> wrote:
But even if we decide on a simple rule like "iterator unpacking depends on the number of targets, all other iterables don't", I think that will be a bug magnet. It will mean that you can't rely on this special behaviour unless you surround each call with a type check:
    if isinstance(it, collections.abc.Iterator):
        # special case for iterators
        x, y = it
    else:
        # sequences keep the old behaviour
        x, y = it[:2]
Nah, far easier:
    x, y = iter(it)
since that'll be a no-op in the first case, and trigger new behaviour in the second. However, I don't like this behaviour-switch. I'd much rather have actual syntax.
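As a quick sanity check of the "no-op" claim, here is a minimal sketch using only current Python semantics (the variable names are just for illustration). It shows that iter() hands back the same object for an iterator and builds a fresh iterator for a sequence; it does not demonstrate the proposed unpacking behaviour, which doesn't exist.

```python
import collections.abc

it = iter([1, 2, 3])
# For an iterator, iter() returns the very same object.
same_object = iter(it) is it

seq = [1, 2, 3]
seq_iter = iter(seq)
# For a sequence, iter() builds a new, separate iterator object.
is_new_iterator = (
    isinstance(seq_iter, collections.abc.Iterator) and seq_iter is not seq
)

print(same_object, is_new_iterator)
```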
I don't want Python trying to *guess* whether I want to unpack the entire iterable or just two items. Whatever tiny convenience there is from when Python guesses correctly will be outweighed by the nuisance value of when it guesses wrongly.
Exactly.
There are some Pros: 1. No overhead
No overhead compared to what?
I think the point here (correct me if I'm wrong?) is that it takes work to probe the iterator to see if there's a third item, so grabbing just the first two items is simply *doing less work*. It's not doing MORE work (constructing an islice object, pumping it, then discarding it) - it's simply skipping the check that it would otherwise do.
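To make the "probe" concrete, here is a small sketch of what unpacking does today, as observed in CPython: a two-target unpack pulls a third item to check for exhaustion, and that item is lost when the unpack fails.

```python
it = iter(range(10))

unpack_failed = False
try:
    # Two targets, ten items: the unpack fetches a third item to see
    # whether the iterator is exhausted, then raises ValueError.
    x, y = it
except ValueError:
    unpack_failed = True

# Items 0, 1 and 2 have all been consumed by the failed unpack,
# so the next item the iterator yields is 3.
remaining = next(it)
print(unpack_failed, remaining)
```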
2. Readable and not so verbose code
3. Optimized case for x,y,*z = iterator
The semantics of that are already set: the first two items are assigned to x and y, with all subsequent items assigned to z as a list. How will this change optimize this case? It still needs to run through the iterator to generate the list.
Maybe 'optimized case for "x, y, *_ = iterator" where you then never use _ and it has no side effects'? But that could be worded better.
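For reference, a short sketch of the existing starred-target semantics being discussed: the starred name always receives a list, so the iterator is run to completion even if that list is never used.

```python
it = iter(range(5))

# The starred target collects everything after the first two items.
x, y, *z = it

# Nothing is left in the iterator afterwards.
leftover = list(it)
print(x, y, z, leftover)
```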
In many cases it is possible to do this right now, but in too verbose a way:
x, y = islice(gen(), 2)
I don't think that is excessively verbose.
But maybe we should consider allowing slice notation on arbitrary iterators:
x, y = it[:2]
I do think islice is verbose, but the main problem is that you have to match the second argument to the number of assignment targets. Slice notation is an improvement, but it still has that same problem. But perhaps this should be added to the list of options for the PEP.
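A rough illustration of that matching problem, assuming a stand-in gen() that just yields ten integers (it is not from the original proposal): islice does stop after the second item, but if the count and the number of targets drift apart, the unpack fails at run time.

```python
from itertools import islice


def gen():
    # Stand-in generator for illustration only.
    yield from range(10)


g = gen()
x, y = islice(g, 2)
# islice never asked g for a third item, so g resumes at 2.
next_item = next(g)

mismatch_detected = False
try:
    # Three targets but a count of two: the numbers no longer match.
    a, b, c = islice(gen(), 2)
except ValueError:
    mismatch_detected = True

print(x, y, next_item, mismatch_detected)
```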
Perhaps a better idea might be special syntax to tell the interpreter you don't want to run the right-hand side to completion. "Explicit is better than implicit" -- maybe something special like:
x, y, * = iterable
will attempt to extract exactly two items from iterable, without advancing past the second item. And it could work the same for sequences, iterators, lazy sequences like range, and any other iterable.
I don't love having yet another meaning for * but that would be better than changing the standard behaviour of iterator unpacking.
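Since "x, y, * = iterable" is not valid Python today, the closest approximation is a helper. The sketch below uses a hypothetical take_exactly() (my name, not an existing function) built on islice; it treats sequences, iterators and range alike and does not advance past the last item taken, though the count still has to be spelled out by hand.

```python
from itertools import islice


def take_exactly(iterable, n):
    """Hypothetical helper: return exactly n items as a tuple.

    Works on any iterable and never pulls more than n items.
    """
    items = tuple(islice(iterable, n))
    if len(items) != n:
        raise ValueError(f"expected {n} items, got {len(items)}")
    return items


# A lazy sequence such as range works the same way as an iterator.
x, y = take_exactly(range(10), 2)

it = iter("abcdef")
a, b = take_exactly(it, 2)
# The iterator has not been advanced past the second item.
rest = "".join(it)
print(x, y, a, b, rest)
```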
That's one of the options that I mentioned, as it's been proposed in the past. The problem is that it depends on internal whitespace to distinguish it from augmented assignment; granted, there's no way to use "*=" with multiple targets (or even in the single-target case, you can't do "x,*=it" with the comma in it), but that's still a readability problem.
ChrisA