Re: [Python-ideas] How assignment should work with generators?

On Wed, Nov 29, 2017 at 5:46 AM, Alon Snir <AlonSnir@hotmail.com> wrote:
ChrisA wrote:
Hmm. The trouble is that slice assignment doesn't have a fixed number of targets. If you say "x, y = spam", there's a clear indication that 'spam' needs to provide exactly two values; but "A[:] = spam" could have any number of values, and it'll expand or shrink the list accordingly.

Rhodri James wrote:
Flatly, no. It is better not to ask for things you don't want in the first place, in this case the infinite sequence. Still, don't let me discourage you from working on this. If you can define how such an assignment would work, or even the length of A[:] as an assignment target, I'm not going to dismiss it out of hand.

My answer:
The idea is to define an alternative assignment rule, that is to assign exactly as many elements as the current length of the lhs object (without expanding or shrinking it). Suppose "?=" is the operator for the alternative assignment rule; A=[None]*2; and "iterator" is any iterator (finite or infinite). In this case, the following code:
A ?= iterator
would behave like this:
A[:] = islice(iterator, 2) # where 2 is the length of A
And as suggested earlier for the case of assignment to a target list, the following code:
x, y ?= iterator
would behave like this:
x, y = islice(iterator, 2) # where 2 is the number of targets
Regarding the length issue: Is there any difficulty in finding the length of a sliced sequence? After all, the range object has a len method. Therefore, the length of A[s] (where s is a slice object) could be evaluated as follows:
len(range(*s.indices(len(A))))
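As a quick sanity check, that formula does match len(A[s]) for an ordinary slice (the names A and s below are illustrative, not from the original message):

```python
# Compute the length of the slice target A[s] without materialising A[s],
# using slice.indices() to normalise the bounds against len(A).
A = list(range(10))
s = slice(1, None, 2)                  # i.e. the slice A[1::2]
n = len(range(*s.indices(len(A))))     # length of the slice target
assert n == len(A[s]) == 5             # A[1::2] selects indices 1, 3, 5, 7, 9
```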

On 28/11/17 23:15, Alon Snir wrote:
You're still getting -1000 from me on this. You want to introduce magic syntax to make writing sloppy code easier. That's two big barriers you aren't getting over. -- Rhodri James *-* Kynesim Ltd

On 28/11/2017 23:15, Alon Snir wrote:
Just a thought but what about a syntax something along the lines of: a, b, *remainder = iterable Where remainder becomes the iterable with the first two values consumed by assigning to a & b. If the iterator has fewer than 2 values remaining (in the above case), it should error; if it has exactly 2 then remainder would become an exhausted iterable. Of course the user could also use: a, b, *iterable = iterable Others may differ but this syntax has a lot of similarity to the f(a, b, *args) syntax, possibly enough that most users could understand it. -- Steve (Gadget) Barnes Any opinions in this message are my personal opinions and do not reflect those of my employer. --- This email has been checked for viruses by AVG. http://www.avg.com

On Wed, Nov 29, 2017 at 07:33:54PM +0000, Steve Barnes wrote:
Guido's time machine strikes again. That has worked since 3.3 if not older. (Probably 3.0 or so, but I don't have that on this computer to test it.) py> a, b, *remainder = range(10) py> a 0 py> b 1 py> remainder [2, 3, 4, 5, 6, 7, 8, 9] -- Steve

On 30/11/2017 15:50, Steven D'Aprano wrote:
I had a sneaky feeling that it did, which raises the question of what the bleep this enormous thread is about, since the fundamental syntax currently exists.... -- Steve (Gadget) Barnes

On 30 November 2017 at 16:16, Steve Barnes <gadgetsteve@live.co.uk> wrote:
Essentially, it's about the fact that to build remainder you need to copy all of the remaining items out of the RHS. And for an infinite iterator like count() this is an infinite loop. There's also the point that if the RHS is an iterator, reading *any* values out of it changes its state - and 1. a, b, *remainder = rhs therefore exhausts rhs 2. a, b = rhs reads "one too many" values from rhs to check if there are extra values (which the programmer has asserted shouldn't be there by not including *remainder). Mostly corner cases, and I don't believe there have been any non-artificial examples posted in this thread. Certainly no-one has offered a real-life code example that is made significantly worse by the current semantics, and/or which couldn't be easily worked around without needing a language change. Paul
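Point 2 above can be seen in a short snippet (illustrative, not from the original message): a plain two-target unpack reads a third value from the iterator just to confirm there are no extras, so that value is lost even though the unpack fails.

```python
# Unpacking into exactly two names reads one extra item to check for extras.
it = iter([10, 20, 30, 40])
try:
    a, b = it            # reads 10, 20, then 30 to check for extra values
except ValueError:
    pass                 # "too many values to unpack (expected 2)"
print(next(it))          # 40 -- the third value (30) was silently consumed
```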

Steven D'Aprano wrote:
This is not quite the same thing -- the rest of the items are extracted and put in a new list. I think Steve Barnes is suggesting that the iterator should be left alone and bound to the * target instead. That would be an incompatible change. -- Greg

On 30/11/2017 22:26, Greg Ewing wrote:
That is what I was thinking of, but surely it would be a more efficient way to do this for generators and large iterators, since the practical difference between remainder being the (partly exhausted) iterator and it being a list, (itself an iterator), containing the results of expanding the remainder of the original iterator is slight. The only limitation would be that the syntax a, b, *remainder, c, d = iterator would be illegal. -- Steve (Gadget) Barnes

Steve Barnes wrote:
A list is *not* an iterator, it's a sequence. There's a huge difference. Items of a sequence can be accessed at random, it can be iterated over multiple times, if it's mutable its contents can be changed, etc. None of that can be done to an iterator. Maybe it would have been a defensible design decision when *-unpacking was introduced, but it's too late to change it now. Even if it weren't, there are arguments against it. It would be inconsistent with the usage of * in an argument list, where the function always receives a new tuple, never an iterator over something passed as a * argument. -- Greg
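Greg's point about consistency with argument lists can be verified with a short snippet (the function name f is illustrative): *args always arrives as a freshly built tuple, even when the caller unpacks an iterator into the call.

```python
# *args in a function definition always receives a new tuple, never the
# caller's iterator, even when the call site unpacks an iterator with *.
def f(a, b, *args):
    return args

rest = f(*iter(range(5)))    # a=0, b=1, the remainder collected into args
assert isinstance(rest, tuple)
assert rest == (2, 3, 4)
```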

2017-11-29 22:33 GMT+03:00 Steve Barnes <gadgetsteve@live.co.uk>:
Before I started this thread, not so long ago, I had already asked a question about these semantics [1 <https://mail.python.org/pipermail/python-ideas/2017-November/048027.html>]. But it appears to be very ambiguous in practice for the various rhs: ... x, *y, z = some_iter *x, y, z = some_iter x, y, *z = some_iter And only in the last case would it mean something special. In addition, it is a huge backward compatibility break. Probably, some time ago it was necessary to split this thread into two questions: 1. A philosophical question regarding sequences and iterators. In particular, should they behave differently depending on the context, or, in other words, whether to emphasize their different nature as fixed-size containers and those that lazily produce values on demand. 2. Additional syntax in the assignment statement for partial extraction of values from the iterable. 2017-11-30 22:19 GMT+03:00 Paul Moore <p.f.moore@gmail.com>:
Mostly corner cases, and I don't believe there have been any non-artificial examples posted in this thread.
Certainly no-one has offered a real-life code example that is made significantly worse by the current semantics, and/or which couldn't be easily worked around without needing a language change.
Yes, in fact, this is a good question: whether this is sufficiently useful to justify extending the syntax. But it is not about corner cases; it is a rather usual situation. Nevertheless, this is the most difficult point for the Rationale. By now, this feature does not give you new opportunities for solving problems. It's more about expressiveness and convenience. You can write: x, y, ... = iterable or, it = iter(iterable) x, y = next(it), next(it) or, from itertools import islice x, y = islice(iterable, 2) or, x, y = iterable[:2] and others; also in some cases when you have an infinite generator or iterator, you should use the 2nd or 3rd. In fact, this has already been said and probably I will not explain it better: 2017-11-28 1:40 GMT+03:00 Greg Ewing <greg.ewing@canterbury.ac.nz>:
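The working alternatives listed above can be sketched side by side (the proposed "x, y, ... =" form is omitted since it does not run today):

```python
from itertools import islice, count

it = count()                  # an infinite iterator
x, y = next(it), next(it)     # option 2: explicit next() calls
a, b = islice(count(), 2)     # option 3: islice -- safe for infinite iterators
p, q = list(range(10))[:2]    # option 4: slicing -- sequences only
assert (x, y) == (a, b) == (p, q) == (0, 1)
```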
With kind regards, -gdg

On 1 December 2017 at 09:48, Kirill Balunov <kirillbalunov@gmail.com> wrote:
That's a good summary of the two elements of the discussion here. On (1), I'd say that Python should *not* have context-dependent semantics like this. It's something Perl was famous for (list and scalar contexts) and IMO makes for pretty unreadable code. Python's Zen here is "Explicit is better than implicit". Specifically, having the semantics of the assignment statement vary depending on the type of the value being assigned seems like a very subtle distinction, and not in line with any other statement in the language. On (2), that's something that is relatively simple to debate - all of the normal rules for new syntax proposals apply - what problem does it solve, how much of an improvement over existing ways of solving the problem does the proposal give, how easy is it for beginners to understand and for people encountering it to locate the documentation, does it break backward compatibility, etc... Personally I don't think it's a significant enough benefit but I'm willing to be swayed if good enough arguments are presented (currently the "a, b, ... = value" syntax is my preferred proposal, but I don't think there's enough benefit to justify implementing it).
It's significant to me that you're still only able to offer artificial code as examples. In real code, I've certainly needed this type of behaviour, but it's never been particularly problematic to just use first_result = next(it) second_result = next(it) Or if I have an actual sequence, x, y = seq[:2] The next() approach actually has some issues if the iterator terminates early - StopIteration is typically not the exception I want, here. But all that means is that I should use islice more. The reason I don't think to is because I need to import it from itertools. But that's *not* a good argument - we could use the same argument to make everything a builtin. Importing functionality from modules is fundamental to Python, and "this is a common requirement, so it should be a builtin" is an argument that should be treated with extreme suspicion. What I *don't* have a problem with is the need to specify the number of items - that seems completely natural to me; I'm confirming that I require an iterable that has at least 2 elements at this point in my code. The above is an anecdotal explanation of my experience with real code - still not compelling, but hopefully better than an artificial example with no real-world context :-)
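The two failure modes Paul contrasts can be seen in a short sketch (illustrative): bare next() on a short iterator raises StopIteration mid-assignment, whereas unpacking an islice that comes up short raises the more conventional ValueError.

```python
from itertools import islice

# next() on an exhausted iterator raises StopIteration...
it = iter([1])
first = next(it)
try:
    second = next(it)
except StopIteration:
    second = None

# ...whereas unpacking an islice that yields too few values raises
# ValueError ("not enough values to unpack (expected 2, got 1)").
try:
    x, y = islice(iter([1]), 2)
except ValueError:
    x = y = None
```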
I'm typically suspicious of arguments based on "filling in the gaps" of existing functionality (largely because it's a fault I'm prone to myself). It's very easy to argue that way for features you'll never actually need in practice - so a "completeness" argument that's not backed up with real-world examples of use cases is weak, at least to me. And I've already commented above on my views of the "it would still feel clumsy having to use anything from itertools" argument. Paul

participants (7):
- Alon Snir
- Greg Ewing
- Kirill Balunov
- Paul Moore
- Rhodri James
- Steve Barnes
- Steven D'Aprano