Star assignment in iterator way?

Hello. Currently, star assignment creates a new list. What was the idea behind producing a new list instead of returning an iterator? It seems to me that returning an iterator is more in the spirit of Python 3. There are three cases:

1. a,b,c,*d = something_iterable
2. *a,b,c,d = something_iterable
3. a,*b,c,d = something_iterable

The first one is obvious. For the other two we always need to iterate through the entire iterable to obtain the values for the b,c,d (or c,d) bindings. But this can be done more memory-efficiently than it is now (maybe I'm wrong), and we can iterate within a window the size of the last three (or two) variables. Some rough (simplified) Python code:

    from itertools import islice, chain
    from collections import deque

    def good_star_exp(signature, seq):
        if signature.count('*') > 1:
            raise SyntaxError('two starred expressions in assignment')
        vrs = signature.split(',')
        idx_max = len(vrs) - 1
        star_pos, = (i for i, v in enumerate(vrs) if '*' in v)
        # First case: star in the last position
        if star_pos == idx_max:
            head = islice(seq, idx_max)
            tail = islice(seq, idx_max, None)
            return chain(head, (tail,))
        # Second case: star in the first position
        elif star_pos == 0:
            tail = deque(maxlen=idx_max)
            for seq_idx_max, v in enumerate(seq):
                tail.append(v)
            head = islice(seq, 0, seq_idx_max - (idx_max - 1))
            return chain([head], tail)
        # Third case: star somewhere in the middle
        else:
            head = islice(seq, star_pos)
            tail = deque(maxlen=(idx_max - star_pos))
            for seq_idx_max, v in enumerate(seq):
                tail.append(v)
            mid = islice(seq, star_pos, seq_idx_max - (idx_max - 2))
            return chain(head, [mid], tail)

    ls = range(100000)
    a, b, c, d = good_star_exp('a,b,c,*d', ls)
    a, b, c, d = good_star_exp('*a,b,c,d', ls)
    a, b, c, d = good_star_exp('a,*b,c,d', ls)

Of course this version has drawbacks (the first that come to mind):

1. Will *b see changes if the rhs is some mutable sequence?
2. Will *b be a one-way iterator or something like range?

But still it seems to me that the "iterator way" has more useful applications.

With best regards,
-gdg
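For reference, the current semantics under discussion: the starred name is always bound to a freshly built list, whatever kind of iterable appears on the right-hand side:

    >>> a, *b, c = range(10)
    >>> b
    [1, 2, 3, 4, 5, 6, 7, 8]
    >>> type(b)
    <class 'list'>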

On 21.11.17 10:54, Kirill Balunov wrote:
Your implementation iterates seq multiple times. But iterable unpacking syntax works with an arbitrary iterable, and iterates it only once. Changing the result of iterable unpacking will break existing code that depends on the result being a list. And you have already mentioned the question about mutable sequences. If these conditions and restrictions suit you, you can use your good_star_exp() in your code or share it with others. But the semantics of iterable unpacking can't be changed.
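To illustrate the single-pass point, a small sketch: unpacking works on a one-shot generator precisely because it drains it exactly once, whereas good_star_exp() above relies on re-slicing seq, so its second islice() pass would find a generator already exhausted:

    def once():
        # a one-shot iterator: it cannot be rewound or sliced twice
        yield from (1, 2, 3, 4, 5)

    a, *b, c = once()   # fine: the generator is consumed exactly once
    # a == 1, b == [2, 3, 4], c == 5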

Your implementation iterates seq multiple times. But iterable unpacking syntax works with an arbitrary iterable, and iterates it only once.
Oh sorry, I know that my implementation iterates seq multiple times; I only provided it to show the idea. It could be heavily optimized at the C level. I just want to understand whether it's worth the time and effort.
Changing the result of iterable unpacking will break existing code that depends on the result being a list.
Backward compatibility is an important issue, but at the same time it is the main brake on progress.
And how do you feel about something like this (deferred star evaluation)?

    a, ?*b, c, d = something_iterable

With kind regards,
-gdg

On 21.11.17 11:27, Kirill Balunov wrote:
You can implement the first case, but for the other cases you will need storage for saving the intermediate items. And using a list is a good option.
This would be no different from

    a, *b, c, d = something_iterable
    b = iter(b)

There is nothing deferred here. The only possible benefit could be in the case

    a, b, ?*c = something_iterable

But I have doubts that this special case deserves introducing new syntax.
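Spelling out the equivalence Serhiy describes: the names after the star cannot be bound until the whole iterable has been consumed, so every intermediate item has to be stored somewhere regardless, and wrapping that storage in an iterator afterwards changes nothing:

    # what a hypothetical "a, ?*b, c, d = ..." would amount to:
    a, *b, c, d = range(10)   # b is the list [1, 2, ..., 7], built eagerly
    b = iter(b)               # merely an iterator over that existing list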

On Tue, Nov 21, 2017 at 12:27:32PM +0300, Kirill Balunov wrote:
Backward compatibility is an important issue, but at the same time it is the main brake on progress.
"Progress just means bad things happen faster." -- Terry Pratchett, "Witches Abroad" [...]
And how do you feel about something like this (deferred star evaluation)?
a, ?*b, c, d = something_iterable
A waste of effort? How do you defer evaluating the second and subsequent items if you evaluate the final two? Given:

    import random

    def gen():
        yield 999
        for i in range(100):
            yield random.random()
        yield 999
        yield 999

then

    a, ?*b, c, d = gen()

has to evaluate all 100 random numbers in order to assign a, c, d all equal to 999. Making b an iterator instead of a list doesn't actually avoid evaluating anything, and it will still require as much storage as a list. The most likely implementation would:

- store the evaluated items in a list;
- assign iter(the list) as b.

I suppose that there could be some way of delaying the calls to random.random() by returning a thunk, but that is likely to be more expensive in both memory and time than a simple list of floats.

--
Steven
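A minimal hand-written version of the "most likely implementation" Steven outlines (the names it and buffered are illustrative):

    import random

    def gen():
        yield 999
        for i in range(100):
            yield random.random()
        yield 999
        yield 999

    # what "a, ?*b, c, d = gen()" would have to do under the hood:
    it = iter(gen())
    a = next(it)
    buffered = list(it)           # all 100 random numbers are evaluated here
    c, d = buffered[-2], buffered[-1]
    b = iter(buffered[:-2])       # an iterator over a list built anyway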

Maybe the first thing I should do is improve my English :) My main point was that in many cases (in my experience) it is a waste of memory to store an entire list for the star variable (*b) instead of some kind of iterator or deferred evaluation.

How do you defer evaluating the second and subsequent items if you evaluate the final two?
If I cannot copy at the Python level, I can 'tee' when 'star_pos' is reached. In my usual practice, the main use I encounter for assignment to a star variable is as storage which is used only if the other variables match some criterion.

With kind regards,
-gdg
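A rough sketch of that "tee at star_pos" idea (tee_at_star and its parameters are hypothetical names, not from the thread). Note that itertools.tee buffers every item that one copy has consumed and the other has not, so the memory cost of a full list comes straight back:

    from itertools import islice, tee

    def tee_at_star(iterable, star_pos, n_tail):
        it = iter(iterable)
        head = list(islice(it, star_pos))    # items before the star
        mid, scout = tee(it)                 # duplicate the stream here
        rest = list(scout)                   # draining one copy fills tee's buffer
        tail = rest[-n_tail:]                # values for the trailing names
        b = islice(mid, len(rest) - n_tail)  # replays the middle from the buffer
        return head, b, tail

    head, b, tail = tee_at_star(range(10), 1, 2)
    # head == [0], list(b) == [1, 2, 3, 4, 5, 6, 7], tail == [8, 9]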

FWIW, here's something for working with memory-efficient sequences (and generators), which should get more features in the future:

    pip install git+https://github.com/k7hoven/views

Some examples of what it does:

    py> from views import seq
    py> seq[::range(3), None, ::"abc", "Hi!"]
    <sequence view 8: [0, 1, 2, None, 'a', 'b', 'c', 'Hi!'] >
    py> seq[::range(100)]
    <sequence view 100: [0, 1, 2, 3, 4, ..., 96, 97, 98, 99] >
    py> from views import seq, gen
    py> seq.chain([1, 2, 3], [4, 5, 6])
    <sequence view 6: [1, 2, 3, 4, 5, 6] >
    py> list(gen.chain([1, 2, 3], [4, 5, 6]))
    [1, 2, 3, 4, 5, 6]
    py> from views import range
    py> range(5)
    range(0, ..., 4)
    py> range(1, 10, 3)
    range(1, ..., 7, step=3)
    py> range(1, ..., 5)
    range(1, ..., 5)
    py> range(1, 3, ..., 10)
    range(1, ..., 9, step=2)

Sequences are perhaps more interesting than the generators, which are just there because I don't want to implicitly try to convert generators/iterators into sequences. I do intend to add at least one *explicit* mechanism. Much of this is thread-safe, but the assumption in general is that one does not modify the original sequences. One problem is that there's no way to efficiently check whether the originals have been mutated; currently it just sometimes checks that the lengths match.

This approach can also be a big performance boost because it avoids copying stuff around in memory, etc. However, many possible optimizations have not been implemented yet, so there's overhead that can be significant for small sequences. For instance, itertools could be used to optimize some features.

––Koos
--
+ Koos Zevenhoven
+ http://twitter.com/k7hoven
+

On 11/21/2017 3:54 AM, Kirill Balunov wrote:
Right, and easily dealt with in current Python:

    d = iter(something_iterable)
    a, b, c = islice(d, 3)

(or 3 next(d) calls). More typical is to pull one item off the iterator with next(), as is optionally done with csv readers:

    it = iter(iterable)
    header = next(it)  # such as column names
    for item in it:
        process(item)
For the general case, there is no choice but to save the items in a list.

--
Terry Jan Reedy
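For completeness, a sketch (mine, not from the thread) of the closest one can get to a lazy middle star: the starred items can stream out with only an n_tail-sized buffer, but the values for the trailing names only become known once the stream is exhausted, which is exactly why an assignment statement has to save them in a list up front:

    from collections import deque
    from itertools import islice

    def lazy_middle_star(it, n_tail):
        # hypothetical helper: yield items lazily while always holding
        # back the most recent n_tail items in a small buffer
        buf = deque(islice(it, n_tail))
        for item in it:
            yield buf.popleft()   # release the oldest held-back item...
            buf.append(item)      # ...and hold back the newest one
        # only now does buf contain the values for the trailing names --
        # too late for "a, *b, c, d = ..." to have bound c and d already

    it = iter(range(10))
    a = next(it)                  # a == 0
    b = lazy_middle_star(it, 2)   # lazily yields 1..7; 8 and 9 stay trapped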

participants (5)
- Kirill Balunov
- Koos Zevenhoven
- Serhiy Storchaka
- Steven D'Aprano
- Terry Reedy