
On Mon, Nov 27, 2017 at 12:17:31PM +0300, Kirill Balunov wrote:
Currently during assignment, when the target list is a comma-separated list of targets (*without a "starred" target*), the rule is that the object on the rhs must be an iterable with the same number of items as there are targets in the target list. That is, no check is performed on the number of targets present, and if something goes wrong a ValueError is raised.
That's a misleading description: ValueError is raised when the number of targets is different from the number of items. I consider that to be performing a check on the number of targets.
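For example, in today's Python the check fails loudly in both directions, with a message naming the expected count:

```python
# Unpacking checks the number of items against the number of targets.
try:
    x, y = [1, 2, 3]          # three items, two targets
except ValueError as e:
    err_too_many = str(e)     # "too many values to unpack (expected 2)"

try:
    a, b, c = [1, 2]          # two items, three targets
except ValueError as e:
    err_too_few = str(e)      # "not enough values to unpack (expected 3, got 2)"
```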
To show this with a simple example:
    from itertools import count, islice
    it = count()
    x, y = it
    it
    count(3)
For everyone else who was confused by this, as I was, that's not actually a copy and paste from the REPL. There should be a ValueError raised after the x, y assignment. As given, it is confusing because it looks like the assignment succeeded, when in fact it didn't.
Here the count was advanced two times but assignment did not happen.
Correct, because there was an exception raised.
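A runnable version of the example, showing both the ValueError and how far the iterator was advanced before the assignment was abandoned:

```python
from itertools import count

it = count()
try:
    x, y = it        # fetches 0 and 1, then a third item (2) to check -> ValueError
except ValueError:
    pass
advanced_to = next(it)   # 3: three items were consumed before the error
```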
I found that in some cases it is too restrictive that the rhs must have the same number of items as there are targets. It is proposed that if the rhs is a generator or an iterator (more precisely, some object that yields values on demand), the assignment should be lazy and depend on the number of targets.
I think that's problematic. How do you know which objects yield values on demand? Not all lazy iterables are iterators: there are also lazy sequences like range. But even if we decide on a simple rule like "iterator unpacking depends on the number of targets, all other iterables don't", I think that will be a bug magnet. It will mean that you can't rely on this special behaviour unless you surround each call with a type check:

    if isinstance(it, collections.abc.Iterator):
        # special case for iterators
        x, y = it
    else:
        # sequences keep the old behaviour
        x, y = it[:2]
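The distinction matters because the boundary is not where people expect: range passes for a lazy iterable but not for an iterator, while every generator is an iterator.

```python
from collections.abc import Iterable, Iterator

r = range(10)                    # a lazy sequence, not an iterator
assert isinstance(r, Iterable)
assert not isinstance(r, Iterator)

it = iter(r)                     # calling iter() produces a genuine iterator
assert isinstance(it, Iterator)

g = (n * n for n in r)           # generators are iterators too
assert isinstance(g, Iterator)
```

So under the proposal `x, y = range(10)` and `x, y = iter(range(10))` would behave differently.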
I find this feature to be very convenient for interactive use,
There are many things which would be convenient for interactive use that are a bad idea outside of the interactive environment. Errors which pass silently are one of them. Unpacking a sequence of 3 items into 2 assignment targets should be an error, unless you explicitly limit it to only two items.

Sure, sometimes it would be convenient to unpack just two items out of some arbitrarily large iterator just by writing `x, y = it`. But other times that would be an error, even in the interactive interpreter. I don't want Python trying to *guess* whether I want to unpack the entire iterable or just two items. Whatever tiny convenience there is from when Python guesses correctly will be outweighed by the nuisance value of when it guesses wrongly.
while it remains readable, expected, and expressed in a more compact code.
I don't think it is expected behaviour. It is different from the current behaviour, so it will be surprising to everyone used to the current behaviour, annoying to those who like the current behaviour, and a general inconvenience to those writing code that runs under multiple versions of Python.

Personally, I would not expect this suggested behaviour. I would be very surprised, and annoyed, if a simple instruction like:

    x, y = some_iterable

behaved differently for iterators and sequences.
There are some Pros:
1. No overhead
No overhead compared to what?
2. Readable and not so verbose code
3. Optimized case for x, y, *z = iterator
The semantics of that are already set: the first two items are assigned to x and y, with all subsequent items assigned to z as a list. How will this change optimize this case? It still needs to run through the iterator to generate the list.
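Those existing semantics are easy to demonstrate: the starred target has to exhaust the iterator.

```python
it = iter([1, 2, 3, 4, 5])
x, y, *z = it        # consumes the entire iterator; the rest goes into z as a list
```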
4. Clear way to assign values partially from infinite generators.
It isn't clear at all. If I have a non-generator lazy sequence like:

    # Toy example
    class EvenNumbers:
        def __getitem__(self, i):
            return 2*i

    it = EvenNumbers()  # A lazy, infinite sequence

then `x, y = it` will keep the current behaviour and raise an exception (since it isn't an iterator), but `x, y = iter(it)` will use the new behaviour. So in general, when I'm reading code and I see:

    x, y = some_iterable

I have very little idea of which behaviour will apply. Will it be the special iterator behaviour that stops at two items, or the current sequence behaviour that raises if there are more than two items?
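The toy class really does sit on the wrong side of the proposed dividing line in current Python: it is iterable but not an Iterator, and unpacking it fails cleanly (it doesn't hang, because unpacking stops at the first extra item).

```python
from collections.abc import Iterator

class EvenNumbers:
    """A lazy, infinite sequence using the old __getitem__ protocol."""
    def __getitem__(self, i):
        return 2 * i

it = EvenNumbers()
assert not isinstance(it, Iterator)      # not an iterator...
assert isinstance(iter(it), Iterator)    # ...but iter() turns it into one

try:
    x, y = it      # fetches 0 and 2, then 4 as the check item -> ValueError
except ValueError:
    unpack_failed = True
```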
Cons:
1. A special case of how assignment works
2. As with any implicit behavior, hard-to-find bugs
Right. Hard-to-find bugs beats any amount of convenience in the interactive interpreter. To use an analogy: "Sure, sometimes my car suddenly shifts into reverse while I'm driving at 60 kph, sometimes the engine falls out when I go around the corner, and occasionally the brakes catch fire, but gosh the cup holder makes it really convenient to drink coffee while I'm stopped at traffic lights!"
There are several cases with "undefined" behavior: 1. Because the items are assigned, from left to right, to the corresponding targets, should the rhs see side effects during assignment or not?
I don't understand what you mean by this. Surely the behaviour should be exactly the same as if you wrote:

    x, y = islice(it, 2)

What would you do differently, and why?
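For the record, the islice spelling consumes exactly two items and leaves the underlying iterator where you'd expect:

```python
from itertools import count, islice

it = count()
x, y = islice(it, 2)   # takes exactly two items: 0 and 1
rest = next(it)        # the underlying iterator is now at 2
```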
2. Should this work only for generators or for any iterators?
I don't understand why you are even considering singling out *only* generators. A generator is a particular implementation of an iterator. I can write:

    def gen():
        yield 1; yield 2; yield 3

    it = gen()

or I can write:

    it = iter([1, 2, 3])

and the behaviour of `it` should be identical.
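And indeed the two spellings are indistinguishable to the unpacking machinery today. Using a small throwaway helper (`unpack_two`, purely for illustration):

```python
def gen():
    yield 1; yield 2; yield 3

def unpack_two(it):
    """Try `x, y = it` and report what happened."""
    try:
        x, y = it
        return "ok"
    except ValueError:
        return "ValueError"

result_gen = unpack_two(gen())            # three items, two targets -> ValueError
result_iter = unpack_two(iter([1, 2, 3])) # same for a list iterator
```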
3. Is it Pythonic to distinguish what is on the rhs during assignment, or does that contradict duck typing (goose typing)?
I don't understand this question.
In many cases it is possible to do this right now, but only in a more verbose way:
x, y = islice(gen(), 2)
I don't think that is excessively verbose. But maybe we should consider allowing slice notation on arbitrary iterators:

    x, y = it[:2]

I have not thought this through in any serious detail, but it seems to me that if the only problem here is the inconvenience of using islice(), we could add slicing to iterators. I think that would be better than having iterators and other iterables behave differently.

Perhaps a better idea might be special syntax to tell the interpreter you don't want to run the right-hand side to completion. "Explicit is better than implicit" -- maybe something special like:

    x, y, * = iterable

will attempt to extract exactly two items from iterable, without advancing past the second item. And it could work the same for sequences, iterators, lazy sequences like range, and any other iterable.

I don't love having yet another meaning for * but that would be better than changing the standard behaviour of iterator unpacking.

-- Steve