
So when putting together my enum implementation I came up against something I've run into a couple of times in a few different situations, and I thought warranted some discussion as its own proposal: There are times where it would be extremely convenient if there was a way when unpacking an object (using the "a, b, c = foo" notation) for that object to know how many values it needed to provide.

I would like to propose the following: When the unpack notation is used, the interpreter will automatically look for an __unpack__ method on the rvalue, and if it exists will call it as:

    rvalue.__unpack__(min_count, max_count)

... and use the returned value as the iterable to unpack into the variables instead of the original object. (If no __unpack__ method exists, it will just use the rvalue itself, as it does currently)

In the case of simple unpacking ("a, b, c = foo"), min_count and max_count will be the same value (3). In the case of extended unpacking ("a, b, c, *d = foo"), min_count would be the smallest length that will not cause an unpacking exception (3, in this case), and max_count would be None.

By effectively separating the notion of "unpackable" from "iterable", this would allow for a few useful things:

1. It would allow otherwise non-iterable objects to support the unpack notation if it actually makes sense to do so
2. It would allow for "smart" unpackable objects (such as __ in my enum example), which would "just work" no matter how many values they're required to produce.
3. It would allow (properly designed) infinite (or unknown-possibly-large-size) iterators to be unpackable into a finite sequence of variables if desired (which can only be done with extra special-case code currently)
4. It could make backwards-compatibility easier for APIs which return iterables (i.e. if you want to change a function that used to return a 3-tuple to return a 5-tuple, instead of inherently breaking all existing code that chose to unpack the return value (assuming only 3 items would ever be returned), you could return an unpackable object which will work correctly with both old and new code)

Thoughts?

--Alex
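A rough sketch of how such a hook might behave; the DefaultPadded class and the unpack() helper standing in for the interpreter machinery are invented for illustration only:

    class DefaultPadded:
        """Toy "smart unpackable": hands back its items, padding with a default
        value if the unpacking site asks for more than it actually holds."""
        def __init__(self, items, default=None):
            self._items = list(items)
            self._default = default

        def __unpack__(self, min_count, max_count):
            if max_count is None:            # extended unpacking: a, b, *rest = obj
                return iter(self._items)
            padded = self._items[:max_count]
            padded += [self._default] * (max_count - len(padded))
            return iter(padded)

    def unpack(obj, n):
        """Rough stand-in for what 'a, b, c = obj' (n targets) might compile to."""
        source = obj.__unpack__(n, n) if hasattr(obj, '__unpack__') else obj
        return tuple(source)                 # the real interpreter would also length-check

    a, b, c = unpack(DefaultPadded([1, 2]), 3)   # a == 1, b == 2, c is None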

On 2/22/2013 2:33 PM, Alex Stewart wrote:
Such statements always have a definite number of targets, so you can always arrange for the iterable to produce the required number of items -- unless it cannot. The count can be used as a multiplier, to slice or islice, or as an argument to an iterator class or function (such as generator functions).
It would be much easier, and have much the same effect, if unpacking simply requested the minimum number of items and stopped raising a ValueError if the iterator has more items. No new protocol is needed. Guido rejected this as silently masking errors. -- Terry Jan Reedy
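The slicing Terry mentions already handles the infinite-iterator case today, for example:

    from itertools import count, islice

    # Three targets, so ask the (infinite) source for exactly three items.
    a, b, c = islice(count(), 3)
    assert (a, b, c) == (0, 1, 2)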

On Sat, Feb 23, 2013 at 8:14 PM, Terry Reedy <tjreedy@udel.edu> wrote:
What if it were done explicitly, though? Currently:
Suppose the last notation were interpreted as "request values for a, b, and c, and then ignore the rest". This would support infinite iterators, and would be an explicit statement from the caller that it's not an error to have more elements. This doesn't solve the backward-compatibility issue, but it would allow a client to specifically engage a forward-compatibility mode (by putting ,* after any unpack that might introduce more elements - of course, you'd need to be confident that you won't care about those elements). It can be done with current code, of course.
ChrisA

On Sat, Feb 23, 2013 at 8:18 PM, Chris Angelico <rosuav@gmail.com> wrote:
It's highly unlikely it will ever be interpreted that way, because it contradicts the way the unnamed star is used in function headers (that is, to indicate the start of the keyword-only arguments *without* accepting an arbitrary number of positional arguments). If you want to ignore excess values in unpacking, a double-underscore is the best currently available option (even though it has the downside of not working well with infinite iterators or large sequences):
a, b, c, *__ = range(4)
However, Alex's original idea of an "unpack protocol" (distinct from, but falling back to, the ordinary iterable protocol) definitely has at least a few points in its favour.

1. Iterating over heterogeneous types doesn't really make sense, but unpacking them does. A separate protocol lets a type support unpacking assignment without supporting normal iteration.

2. The protocol could potentially be designed to allow an *iterable* to be assigned to the star target rather than requiring it to be unpacked into a tuple. This could be used not only to make unpacking assignment safe to use with infinite iterators, but also to make it cheaper with large sequences (through appropriate use of itertools.islice in a custom container's __unpack__ implementation) and with virtual sequences like range() objects.

3. As Alex noted, if a function that previously returned a 2-tuple wants to start returning a 3-tuple, that's currently a backwards incompatible change because it will break unpacking assignment. With an unpack protocol, such a function can return an object that behaves like a 3-tuple most of the time, but also transparently supports unpacking assignment to two targets rather than only supporting three.

I would suggest a different signature for "__unpack__", though, built around the fact the star target can be used at most once, but in an arbitrary location:

    def __unpack__(target_len, star_index=None):
        """Unpack values into an iterable of the specified length.

        If star_index is not None, the value at that index will be an
        iterable representing the remaining contents that were not unpacked.
        """
        ...

While I'm somewhere between +0 and +1 on the idea myself, there are some open questions/potential downsides to the idea:

- this change increases the complexity of the language, by explicitly separating the concept of heterogeneous unpacking from homogeneous iteration. Those *are* two distinct concepts though, so the increased complexity may be justifiable.
- for function parameter unpacking, would we allow "*args" to be an arbitrary iterable rather than promising that it is always a tuple? We would probably want to do so for consistency with unpacking assignment under this scheme, which may lead to some confusing error messages thrown by existing functions which expect it to always be a tuple.
- we would want to document very clearly that it remains possible to bypass custom unpacking by explicitly calling iter(obj) rather than using obj directly.

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
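One possible shape for a container implementing Nick's suggested signature, handling only the common star-last case; the LazyRecord name and behaviour are illustrative assumptions, not part of any proposal:

    from itertools import islice

    class LazyRecord:
        """Hands the star target the live iterator of leftovers instead of a tuple."""
        def __init__(self, iterable):
            self._it = iter(iterable)

        def __unpack__(self, target_len, star_index=None):
            if star_index is None:
                return list(islice(self._it, target_len))
            if star_index != target_len - 1:
                raise NotImplementedError("non-final star target not handled in this sketch")
            head = list(islice(self._it, star_index))
            return head + [self._it]        # leftovers stay in the (possibly infinite) iterator

    rec = LazyRecord(range(10))
    a, b, rest = rec.__unpack__(3, star_index=2)   # roughly what 'a, b, *rest = rec' could mean
    # a == 0, b == 1, rest is an iterator over 2..9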

On 2013-02-23, at 12:58 , Nick Coghlan wrote:
Then again, *args in (Python 3) unpacking is already completely incompatible with *args in functions. That does not strike me as much of an issue.
It would also allow preventing unpacking of iterables for which unpacking makes no sense and is most likely an error, such as sets (well there are limited use cases for unpacking a set, but they are few and far between and wrapping the set in an iter() call would allow them).
And with immutable sequences through plain sharing. In fact, though that's not supported by either existing __unpack__ proposal, it could even allow for mapping unpacking (and more generally named unpacking) by adding something which tends to be missing from the Python API: names awareness. Altering your proposal by replacing `target_len` by a sequence of the names on the LHS would allow implementing __unpack__ for objects or mappings rather than having to use the (quite unwieldy and backwards-looking) operator.itemgetter and operator.attrgetter.
I'm +1 on these grounds, while the current unpacking "works" it makes for odd limitations or corner-cases.
I don't think so, they already have a significant semantic split on the meaning of what follows an *args, and I've got the gut feeling that *args being a tuple is a much more common assumption/expectation for functions than for unpacking (especially since *args in unpacking was only introduced in Python 3)
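A purely hypothetical sketch of the name-aware variant Masklinn floats above, where the interpreter would pass the target names instead of a count; no such signature exists, and Record and the call below are invented for illustration:

    class Record(dict):
        """Mapping that could answer an unpack request by target name."""
        def __unpack__(self, names, star_index=None):
            return [self[name] for name in names]

    row = Record(host='example.org', port=8080, user='guest')
    host, port = row.__unpack__(('host', 'port'))   # roughly what 'host, port = row' might mean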

On 23.02.2013 19:19, Masklinn wrote:
Especially that for unpacking it is *a list, not a tuple*.

Cheers.
*j

PS. And would it really be possible to apply the __unpack__-idea to calls? There has never been, and still is no, real consistency between unpacking and call argument binding, e.g.

    def fun(a, b, c, *args): ...
    fun(1, 2, with_dunder_unpack)
    # ^ OK: a = 1; b = 2; c, *args = with_dunder_unpack

    # but what about:
    def fun(a, b, c, *args): ...
    fun(1, 2, 3, 4, 5, with_dunder_unpack)
    # ^ a = 1; b = 2; c = 3; args = ???

I believe these concepts (unpacking and call argument binding) are simply not really parallel in Python. Yes, `a, b, *seq` in unpacking is *somehow* similar -- on the level of a rough intuition (and that is nice) but IMHO it should not stop us from extending unpacking without worrying much about call binding which was, is and must be (in Python) different in many ways.

On Sat, Feb 23, 2013 at 4:58 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
This is the big one for me. And as masklinn pointed out, it works the other way too (keep an iterable from unpacking).
Cool, though it would be complicated when the star target isn't the last target.
+1
Agreed. Also, I expect that __unpack__ would be an opt-in API that falls back to iteration.
This should be handled/proposed separately, though it is dependent on an unpacking protocol.
An unpacking protocol would add power to the language. We really don't have a good way to separate unpacking from iteration. Structured record types like namedtuples aren't exactly sequences, and could make use of the new protocol. (I'm not advocating for the change to namedtuple itself.) -eric

On 2/23/2013 6:58 AM, Nick Coghlan wrote:
I think this is fundamentally the right idea. The 'problem' to be solved, such as it is, is that the multiple assignment unpacker, after requesting the number of values it needs, requests one more that it does not need, does not want, and will not use. If it gets it, it says 'too bad', undoes any work it has done, and raises an error. The only point in doing this is to uncover possible bugs. But some people say they want to knowingly provide extra values and have it not be considered a bug. The solution, if there is to be one, and if that extra behavior is not to be completely undone, is to be able to tell the unpacker to skip the extra check. I strongly feel that the right place to do that is on the target side. This fixes the problem in one place rather than requiring that the solution be duplicated everywhere.
a, b, c, *__ = range(4)
If ,* is not acceptable, how about ,** or ,... or ,None or <take your pick>. I rather like 'a, b, c, ... =' as it clearly implies that we are picking and naming the first three values from 3 or more; '...' clearly cannot be an assignment target.
I strongly disagree as I think the two major points are wrong.
This, repeated in this thread, is an attempt to reintroduce 'heterogeneous type' as a fundamental concept in the language, after being removed with the addition of .count and .index as tuple methods. Since I consider the pairing 'heterogeneous type' to be wrong, especially in Python, I consider this to be a serious regression.

Let me repeat some of the previous discussion. In Python 3, every object is an instance of class object. At that level, every collection is homogeneous. At other levels, and for particular purposes, any plural collection might be considered heterogeneous. That is a function of the objects or values in the collection, and not of the collection class itself. So I claim that 'heterogeneous' has not much meaning as an absolute attribute of a 'type'.

In any assignment, targets (most commonly names) are untyped, or have type 'object' -- take your pick. So iterable (of objects) is the appropriate source. In any case, 'iteration' and 'unpacking' both mean 'accessing the items of a collection one at a time, as individual items rather than as part of a collection'. I do not see any important distinction at all and no justification for complexifying the language again.
This seems to be claiming that it is sensible to change the return type of a function to a somewhat incompatible return type. (I am obviously including fixed tuple length in 'return type', as would be explicit in other languages). I believe many/most/all design and interface experts would disagree and would say it would be better to give the new function a new name. The statement "that's currently a backwards incompatible change because it will break unpacking assignment." is a gross understatement that glosses over the fact that returning a 3-tuple instead of a 2-tuple will break lots of things. Just two examples:

    def foo(): return 1,2
    def bar(): return tuple('abc')
    foobar = foo() + bar()
    oof = reversed(foo())

Changing foo to return 1,2,'x' will mess up both foobar and oof for most uses. --
The problem with *x (that it is inefficient, not applicable to infinite iterators, and that assigning the iterable directly to x when *x is in final position is more likely what one wants anyway) is a different issue from the unwanted bug check and exception. Again, I think the solution is an explicit notation on the target side. Perhaps '**x' or 'x*' or something based on _. If you do not like any of those, suggest another. -- Terry Jan Reedy

On Sun, Feb 24, 2013 at 11:53 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I like that. It avoids the confusion with augmented assignment that ,* has (yes, you can avoid the *syntactic* confusion by putting spaces around the equals sign, but it's still visually similar). The only problem might be in documentation, where it might be taken to mean "a, b, c, d, e, and as many more variables as you want", eg indicating that tuple unpacking works with any number of targets. ChrisA

I definitely like Terry's idea of using ", ..." to say "and ignore the rest". Simple, elegant and can be restricted to the last element so it works with infinite iterators and large sequences. It also allows incremental unpacking of ordinary iterators. I still like the idea of an unpacking protocol as well, but the above would cover a lot of use cases without needing a new protocol, and thus should be considered first (since a new protocol is a higher impact change than Terry's syntax suggestion). Cheers, Nick.

On Sun, Feb 24, 2013 at 12:12 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I want to expand on this a bit now I'm back on a real computer, since it wasn't immediately obvious to me how well the ", ..." proposal supports incremental unpacking, but it became clear once I thought of this simple example:

    iterargs = iter(args)
    command, ... = iterargs        # Grab the first element, leave the rest in the iterator
    commands[command](*iterargs)   # Pass the rest to the command

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

But is that really so much better than

    command = next(iterargs)  # etc.

? And to me the ... Looks too much like something that consumes the rest, rather than leaving it.

--Guido (not on a real keyboard)

On Saturday, February 23, 2013, Nick Coghlan wrote:
-- --Guido van Rossum (python.org/~guido)

On Sun, Feb 24, 2013 at 4:24 PM, Guido van Rossum <guido@python.org> wrote:
For a single item, using next() instead is OK; the problem is that itertools.islice is relatively ugly and unintuitive, and requires you to manually count the number of targets. The next + islice combination also creates a discontinuity between the way you handle getting just the first item, versus getting multiple items.

Status quo, 1 item:

    iterargs = iter(args)
    command = next(iterargs)        # Grab the first, leave the rest
    commands[command](*iterargs)    # Pass the rest to the command

Status quo, 2 items (etc):

    from itertools import islice
    iterargs = iter(args)
    command, subcommand = islice(iterargs, 2)    # Grab the first two, leave the rest
    commands[command][subcommand](*iterargs)     # Pass the rest to the subcommand

Proposal, 1 item:

    iterargs = iter(args)
    command, ... = iterargs         # Grab the first, leave the rest
    commands[command](*iterargs)    # Pass the rest to the command

Proposal, 2 items (etc):

    iterargs = iter(args)
    command, subcommand, ... = iterargs          # Grab the first two, leave the rest
    commands[command][subcommand](*iterargs)     # Pass the rest to the subcommand
And to me the ... Looks too much like something that consumes the rest, rather than leaving it.
As far as the possible interpretation being "consume and discard the rest" goes, I think it's one of those cases where once you know which it is, it won't be hard to remember. It's certainly something we could continue to live without, but I do think it's a nice way to expand the unpacking syntax to cover additional use cases far more elegantly than the current reliance on a combination of next and itertools.islice. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 02/23/2013 11:38 PM, Nick Coghlan wrote:
Or

    command, subcommand = next(iterargs), next(iterargs)

or

    command = next(iterargs)
    subcommand = next(iterargs)

These don't require the manual counting, nor do they feature the "discontinuity" you mentioned. FWIW I don't think this problem is bad enough, nor this idiom used frequently enough, to warrant new syntax. But I've been wrong before.

I was musing on this topic and came up with what is probably a terrible alternate approach. What if there was some way for the RHS to know what the LHS was asking for in a destructuring assignment? Something like

    def __next_n__(self, n):
        "Returns a tuple of the next n items from this iterator."

and if the object doesn't provide __next_n__ you fall back to the current explicit PyIter_Next() calls and failing if there aren't enough / there are too many. If we had this, we could make the second argument to islice() optional; it could infer how many items you wanted. (I don't think the "a, b, *c, d = rhs" form is relevant here as you'd always consume everything from rhs.)

I'm not sure how bad an idea this is. But complicating the iterator protocol for this marginal problem seems like a pretty bad idea.

//arry/
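Larry's idea can be mimicked today as an ordinary wrapper rather than a protocol change; the NextN class and its next_n() method below are invented for illustration:

    class NextN:
        """Iterator wrapper: next_n(n) grabs exactly n items, raising
        ValueError if the underlying iterator runs out early."""
        def __init__(self, iterable):
            self._it = iter(iterable)

        def __iter__(self):
            return self

        def __next__(self):
            return next(self._it)

        def next_n(self, n):
            items = []
            for _ in range(n):
                try:
                    items.append(next(self._it))
                except StopIteration:
                    raise ValueError("need %d values, got %d" % (n, len(items)))
            return tuple(items)

    it = NextN([10, 20, 30, 40])
    command, subcommand = it.next_n(2)   # (10, 20); 30 and 40 stay in the iterator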

On Sun, Feb 24, 2013 at 10:25 PM, Larry Hastings <larry@hastings.org> wrote:
Or
command, subcommand = next(iterargs), next(iterargs)
Err.... is there a language guarantee of the order of evaluation in a tuple, or is this just a "CPython happens to evaluate independent expressions left-to-right"? This is totally broken if the next() calls could be done in either order. ChrisA

On 24/02/13 23:59, Chris Angelico wrote:
It's a language guarantee. http://docs.python.org/2/reference/expressions.html#evaluation-order -- Steven
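A small demonstration of that guarantee, relying only on documented left-to-right evaluation of tuple displays:

    it = iter('abc')
    first, second = next(it), next(it)   # elements evaluated left to right
    assert (first, second) == ('a', 'b')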

On Mon, Feb 25, 2013 at 1:16 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Ah, so it is. My bad, sorry! In that case, sure, this works. It still violates DRY though, naming the iterable twice and relying on the reader noticing that that means "take two off this one". But that's a much weaker concern. ChrisA

Chris Angelico wrote:
command, subcommand = next(iterargs), next(iterargs)
well, about the DRY it's the same, but I find the "old stupid"

    command = next(iterargs)
    subcommand = next(iterargs)

mostly more explicit in the "take two off this one". Maybe it's... too explicit (and it's not a single expression!!! anathema!) but I find it slightly more readable

-- 
ZeD

Nick Coghlan <ncoghlan@...> writes:
When it comes to getting multiple items from an iterator, I prefer wrapping things in my own generator function:

    def x_iter(iterator, n):
        """Return n items from iterator."""
        i = [iterator] * n
        while True:
            try:
                result = [next(i[0])]
            except StopIteration:
                # iterator exhausted immediately, end the generator
                break
            for e in i[1:]:
                try:
                    result.append(next(e))
                except StopIteration:
                    # iterator exhausted after returning at least one item,
                    # but before returning n
                    raise ValueError("only %d value(s) left in iterator, expected %d"
                                     % (len(result), n))
            yield result

Compared to islice, this has the advantage of working properly in for loops:
    1 2 3 4 5 6 7 8 9 10

Maybe one could improve itertools.islice accordingly??
+1 for this. I think it's very readable. I think it should raise differently though depending on whether iterargs is exhausted right away (StopIteration) or during unpacking (ValueError). Best, Wolfgang

On Feb 24, 2013, at 1:08, Steven D'Aprano <steve@pearwood.info> wrote:
Not Haskell either. If you told me this was an expression in a language that's like Haskell but with python (or C) syntax, I'd expect it to mean something like compose_n(next, 3)(iterargs), aka next(next(next(iterargs))), not map(next, iterargs * 3). Maybe next(3, iterargs) or next(3)(iterargs), but that's just islice.

On 24 February 2013 06:24, Guido van Rossum <guido@python.org> wrote:
Calls to next() need to be wrapped in try/except StopIteration (it's a bad idea to leak these exceptions). So then it becomes:

    try:
        command = next(iterator)
    except StopIteration:
        raise ValueError('need more than 0 values to unpack')

Presumably with this proposal

    command, ... = iterator

would raise a ValueError if the iterator is empty.

Oscar

It seems to me this thread has actually split into discussing several different, fairly independent issues.

==== Unpacking protocol

First, there is the question I originally raised about separating the "unpacking protocol" from the "iterator protocol". I would like to clarify a couple of points on this:

I proposed __unpack__ the way I did specifically for a few reasons:

1. It is a fairly small change to the language (it does not require any new syntax, does not change the behavior of any current code, and can be completely ignored by programmers who don't care about the issues it addresses)

2. There are (in my opinion) already pretty good parallels with this sort of change that have already been implemented in Python (for example, separating the concept of "iterable" from the concept of "indexable"). This seemed to be a fairly natural and easy-to-explain extension of the same sort of philosophies that Python has already largely adopted in other contexts.

3. It is consistent with the general idea in Python that an object can choose to emulate whichever behaviors of other objects make sense, and not emulate other ones that do not make sense. In my opinion, "iterating", and "unpacking" are two related-but-not-the-same language concepts, which currently are unnecessarily entangled.

4. It would help significantly with several types of problems that do not currently have great solutions. People have proposed various alternatives that would help with one or two of these, but none of the proposed alternatives deal with all of them (or really even most of them).

In summary, some of the other proposals have other useful features, and might be worth considering, but many of them seem far more of a stretch than my original proposal, and I think there are other good reasons why the original proposal is still worth looking at on its own merits.

Also, regarding the method signature "(min_count, max_count)" vs "(target_len, star_index)". I specifically discarded the latter in favor of the former when I was putting together the proposal because (a) the "star_index" way does not really gain us anything useful over the "min/max" way, (b) "min/max" seemed a bit more future-proof (if, in the future, for example, somebody came up with an extended-extended unpack syntax which allowed for (for example) optional parameters, or the ability to specify an upper-bound on the *extra stuff, it would not require any change to the __unpack__ protocol to support it), (c) "min/max" is less about the particular implementation and more about conceptually what it means to unpack something, which seemed more appropriate when defining the concept of an "unpack protocol", and (d) this also means that it could conceivably be called in other cases (or directly by Python code that knew what it was doing) to produce more sophisticated results for some objects.

Ultimately, I could live with "target_len, star_index", but I still think "min_count, max_count" is a better way to do things, and I'd be curious whether anybody can come up with an argument why "target_len, star_index" would be actually more useful or preferable in any way..

==== Partial unpacking

Second, there is the question of allowing explicit (or implicit) "partial unpacking" (i.e. allowing the programmer to say "I only want to unpack the first n items, and leave any others in the iterator)

I agree with the general sentiment that doing this implicitly (i.e.
just not checking for "too many values to unpack") is silently masking errors, which is a bad idea, and anyway it would potentially cause problems with all kinds of existing code, and we just shouldn't go down that road.

I do think it might be nice to have a way to specify in the syntax to just leave any extra data un-iterated, but to be honest I haven't really felt any particular warm-fuzzies about any of the proposed syntaxes thus far. (The one I like best is Terry's "a, b, c, ... = foo", though.)

However, this is really fundamentally a completely different question, which addresses a different use-case, than the unpacking protocol question, and should not be confused as addressing the same issues (because it doesn't). Specifically, lots of folks seem to be ignoring the fact that any "unpacking protocol" (whether it's the implicit one we have now, or a more explicit alternative like __unpack__) has two sides to it (the provider and the consumer). These two sides are not always written by the same person or expecting the same things (for example, when unpacking an object which was passed into a routine or returned by some other code). There are times when the person writing the assignment statement does know "I only want these items and not the rest", but there are also times when the object itself potentially knows better "only these items make sense in this context".

I would emphasize that there are currently ways (even if they're less than ideal) for the former to be done (i.e. calling next() or using itertools), but there is currently no way to do the latter at all (because the object is currently not given any way to get any information about the context it's being used in). We can make the former case more convenient if we want to, but it really doesn't fix the other side of the problem at all.

==== Iterator end-testing

The third issue that has been brought up is the slightly-related issue of how unpacking currently attempts to consume an extra item from an iterator, just to check for length. This is, in my opinion, a valid flaw which is worth discussing, but it is actually not a flaw with unpacking, or really related in any way to unpacking except by happenstance. The real issue here is not in how unpacking is done but a limitation in the iteration protocol itself. The problem is that there is no standard way to query an iterator to find out if it has more data available without automatically consuming that next data element in the process.

In my opinion, if we want to deal with this issue properly, we should not be trying to hack around it in the single case of unpacking but instead fix the larger core problem by adding some way to check for the end of an iterator without side-effects (and then of course change unpacking to use that instead).

--Alex

On 2/24/2013 5:07 PM, Alex Stewart wrote:
To me, taking something simple and elegant and complexifying it, to little benefit I can see other than to solve a problem in the wrong place, is no small matter. Certainly, adding a new special method to some category of objects has no effect unless changes are made to the interpreter elsewhere to make automatic use of that method. Otherwise, you could just define a normal method to be used by user code.
The related but distinct concepts of sequential access and random access are basic to computation and were *always* distinct in Python. For loop statements were always the way to sequentially access objects in a collection (by default, all of them). It happens that the original *implementation* of sequential access, used by for loops, was by means of pseudo random access. This worked ok for randomly accessible sequences but was clumsy for anything else. (Sequential access requires memory, random access does not.) The addition of __iter__ along with __next__ added more benefits than __next__ alone would have.
I do not see that at all. As near as I can tell, both mean 'sequential access'. You are the one that seems to me to be entangling 'bind objects to targets one at a time' with 'provide objects one at a time'. The point of both old and new iterator protocols was and is to decouple (disentangle) consumers from providers. At one time, target-list binding *was* entangled -- with tuples -- because it did not use the iterator protocol that the for statement did. I admit I do not understand your __unpack__ proposal, since it seemed vague and incomplete to me. But until I see your conceptual distinction, the details do not matter to me.
4. It would help significantly with several types of problems that do not currently have great solutions.
I obviously do not see the same pile of problems that would justify a new special method. ...
no way to query an iterator to find out if it has more data available
See how in my separate post: "Add lookahead iterator (peeker) to itertools" -- Terry Jan Reedy
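Terry's lookahead class is not reproduced in this thread; a minimal wrapper in the same spirit, with an invented Peekable name and peek() method, might look like:

    _MISSING = object()

    class Peekable:
        """Lookahead wrapper: peek() reports the next item without consuming it
        from the caller's point of view; falls back to a default when exhausted."""
        def __init__(self, iterable):
            self._it = iter(iterable)
            self._cache = _MISSING

        def __iter__(self):
            return self

        def peek(self, default=_MISSING):
            if self._cache is _MISSING:
                try:
                    self._cache = next(self._it)
                except StopIteration:
                    return default
            return self._cache

        def __next__(self):
            if self._cache is not _MISSING:
                value, self._cache = self._cache, _MISSING
                return value
            return next(self._it)

    p = Peekable([1, 2])
    assert p.peek() == 1
    assert next(p) == 1 and next(p) == 2
    assert p.peek(None) is None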

On Sunday, February 24, 2013 7:43:58 PM UTC-8, Terry Reedy wrote:
I did not say it was a small matter. I said it was a small change to the language. From a purely practical perspective of what changes would be required to the language itself in order to support this proposal, in my opinion, it would not require that much to implement (it would probably only require a couple of lines of new code and some doc changes), and is very unlikely to affect any existing code or tools.

(Syntax changes are, in my opinion, far more disruptive changes to the language: They require a change to the parser as well as the interpreter, and potentially changes to all manner of other utilities (debuggers/code-checkers/AST-manipulators/etc) out there in the wild. They make it much harder to write Python code that supports older Python releases (with method changes, you can check for the existence of a feature in the interpreter and selectively use or not use it, but you typically can't use new syntax constructs in anything that might be run in an older Python release, because the parser will bomb out instead). They also require programmers to learn new syntax (if nothing else, to understand what it means when they come across it in somebody else's code). What's more, every syntax change potentially limits what can be done with the syntax in the future (for example, by using up special characters or constructs so they're not available for other things), and so on. Personally, I think there should be a much higher bar for syntax changes than for adding a new special method, not the other way around.)

You also seem to be defining "in the wrong place" as "in a different way than I personally have a use for".. As I pointed out (which you seem to have completely skipped over in your reply), there are use cases where the consumer side is the right (or only) place, and there are other different use cases where the producer side is the right (or only) place. The two are not mutually-exclusive, and improving one does not really address the other issue at all.

I want to emphasize that I am not saying that we shouldn't extend unpacking notation to support partial-unpacking; in fact I think it might well be a good idea. What I am saying, however, is that if we do, I do think it's a much bigger change to the language than what I proposed, and it still doesn't actually address most of the issues that __unpack__ was intended to deal with, so it's really not an either-or sort of thing anyway.

Certainly, adding a new special method to
My proposal does not include adding a new special method to any existing objects. It adds the _ability_ for people to add a special method to objects *in the future* and have the interpreter take advantage of it. It therefore should have no effect for any existing code.
Oh really? Then how, prior to the development of the iterator protocol, did one define an object which was accessible sequentially but not randomly in the Python language? If you couldn't do that, you can't claim that the concepts were really distinct in Python, in my opinion. The truth is that the concept of an iterable object vs. an indexable object may have been distinct *in some people's minds*, but it was *not* originally distinct in the Python language itself. People decided that that should change, so the language was extended to make that happen. It should also be noted that the two concepts were not distinct in *everyone's* minds (I remember straightening out confusion on this point in numerous people's understanding (and still need to occasionally), particularly those who did not come from a classical computer-science background). However, just because there were some people who didn't see them as distinct did not mean that it was an invalid way for other people to view them, or that they shouldn't ultimately be distinct in the language.. With all respect (and I do actually mean that), your primary argument against it seems to be one of intellectual complacency: "I've never considered that they might be different, so they shouldn't ever be separated, because then I don't have to change the way I think about things". The addition of __iter__ along with __next__
added more benefits than __next__ alone would have.
You seem to be making my point for me. In essence, __unpack__ is really just the equivalent for the unpacking operation to what __iter__ is for "for" loops. Is it strictly required in order to support the basic behavior? No (you could do for loops over simple iterables with just a variant of __next__, if you really wanted to), but it allows some more sophisticated objects to provide more useful behaviors by giving them more context about how they are being invoked (in the case of __iter__, by telling them ahead of time that they are going to be used in an iteration context, and allowing them to create or alter the initial object that the for loop interacts with). This provides "more benefits than __next__ alone would have", so it was added to the language. Logically, by the same argument, and in almost exactly the same way, __unpack__ provides more benefits than __iter__ alone would have (by telling them that they are going to be used in a particular unpacking context, and allowing them to alter the object used for that type of iteration), and therefore should arguably be part of the language as well.
Not true. Iteration means "sequential access". Unpacking means "sequential access with inherent length constraints", which is the bit you seem to be ignoring. The most notable consequence of this (but definitely not the only one) is that a for loop (for example) will never raise ValueError, but the unpacking operation will, if it gets what it considers "wrong" results back from the unpacked object. In this way the two operations do not interpret the data that the object in question provides in the same way, and it does not really have the same meaning. Yes, from a purely mechanical perspective, they perform similar operations, but programming interfaces are not solely defined by mechanics. The difference here is one of implied contracts. In simple iteration, there is explicitly no minimum or maximum number of values that the iterable is required to produce, and in fact the entire protocol is based on the assumed perspective that returning a value is always the "success" condition and not being able to return one is the "failure" case, and in that context it makes sense to design iterators so that they should always return more data if they're able to do so. In unpacking, however, when it gets to the end of the variable list, the conditions are reversed, and actually returning a value becomes a failure condition. This change of meaning, however, is not communicated in any way to the iterator (this is just one example: there are similar contract issues inherent in extended unpacking as well). This is, in my opinion, a flaw in the interface, because the language is misrepresenting (bounded) unpacking as (unbounded) iteration when calling the iterable object. I admit I do not understand your __unpack__ proposal, since it seemed
vague and incomplete to me. But until I see your conceptual distinction, the details do not matter to me.
I'm not exactly sure why you consider it vague or incomplete. I provided a specific description of exactly what changes to the language I was proposing, with method signatures, a description of the new interpreter behavior and examples of data values in typical scenarios, and accompanied it with a list of several different use-cases which I believed it would provide significant utility. I'm not really sure how I could have gotten more specific without providing a diff against the CPython sources (which seems a bit much for an initial proposal)..
Obviously, although I'm not really sure why not because I did explicitly state a bunch of them in my original post. You seem to have just ignored all of them except for the issue of partial unpacking, which frankly was not even the most important or interesting use case I presented, in my opinion. I would recommend you might want to go back and read the last part of my initial post where I presented several potential use cases for all this. I do get the distinct impression, however, that you've already made up your mind and, right or wrong, there's nothing I could possibly say which would ever change it, so at this point I'm not sure whether it really matters (sigh)...
no way to query an iterator to find out if it has more data available
See how in my separate post:
"Add lookahead iterator (peeker) to itertools"
I'm not necessarily opposed to that proposal, but I should point out that it does not actually address the problem I mentioned in this thread at all, so it's really kinda irrelevant to that discussion. (Even if that is added to itertools, unpacking still won't use it, nor does it provide any way to safely query iterables which produce side-effects, even if there might be a way for them to provide that info without the side-effects, nor does it provide a way to test for end conditions without changing the underlying iterable (which might be used elsewhere in scopes outside the lookahead-wrapper), etc..) --Alex

On 2/25/2013 5:25 PM, Alex Stewart wrote:

Negative putdowns are off-topic and not persuasive. I started with 1.3 in March 1997 and first posted a month later. While cautious about changes and additions, I have mostly been in favor of those that have been released. I started using 3.0 in beta and started using 3.3.0 with the first alpha to get the new stuff.
Yes really!
As I said, by using the original fake-getitem iterator protocol, which still works, instead of the newer iter-next iterator protocol. Take any python-coded iterator class, such as my lookahead class. Remove or comment out the 'def __iter__(self): return self' statement. Change the header line 'def __next__(self):' to 'def __getitem__(self, n):'. Instances of the revised class will *today* work with for statements. Doing this with lookahead (3.3):
    for item in lookahead('abc'): print(item)

    a
    b
    c
See, no __iter__, no __next__, but a __getitem__ that ignores the index passed. Here is where it gets a bit crazy.
Still, I think enabling generators was more of a motivation than discouraging nonsense like the above (which, obviously, is still possible! but also which people did not intentionally do except as a demonstration). A practical difference between sequential and random access is that sequential access usually requires history ('state'), which random access does not. The __iter__ method and iter() function allows the separation of iterators, with iteration state, from an underlying concrete collections (if there is one), which usually does not need the iteration state. (Files, which *are* iterators and which do not have a separate iterator class, are unusual among builtins.) This enables multiple stateful iterators and nested for loops like this:
('a', 'a') ('a', 'b') ('a', 'c') ('b', 'a') ('b', 'b') ('b', 'c') ('c', 'a') ('c', 'b') ('c', 'c')
If you couldn't do that, you can't claim that the concepts were really distinct in Python, in my opinion.
But I and anyone could and still can, so I can and will make the claim and reject arguments based on the false counter-claim.

---

Back to unpack and informing the source as to the number n of items expected to be requested.

Case 1. The desired response of the source to n is generic.

Example a: we want the source to simply refuse to produce more than the number specified at the start. The current generic solutions are to either make the exact number of explicit next() calls needed or to wrap the iterator in islice(iterator, n) (which in turn will make the number of next() calls needed).

Example b: if the source does yield n items, we want it to then yield the residual iterator. Again, a simple generic wrapper does the job.

    import itertools

    def slice_it(iterable, n):
        "Yield up to n items from iterable and if successful, the residual iterator:"
        it = iter(iterable)
        for i in range(n):
            yield next(it)
        else:
            yield it

    a, b, c, rest = slice_it(itertools.count(), 3)
    print(a, b, c, rest)
    d, e = itertools.islice(rest, 2)
    print(d, e)
    0 1 2 count(3)
    3 4
Conclusion: generic iterator behavior modification should be done by generic wrappers. This is the philosophy and practice of comprehensions, built-in wrappers (enumerate, filter, map, reversed, and zip), and itertools.

Case 2. The desired response is specific to a class or even each instance.

Example: object has attributes a, b, c, d (of lesser importance), and can calculate e. The pieces and structures it should yield depend on n as in the following table.

    1   ((a,b),c)
    2   (a,b), c
    3   a, b, c
    4   a, b, c, d
    5   a, b, c, d, e

Solution: write a method, call it .unpack(n), that returns an iterator that will produce the objects specified in the table. This can be done today with no change to Python. It can be done whether or not there is a .__iter__ method to produce a generic default iterator for the object. And, of course, xxx.unpack can have whatever signature is appropriate to xxx. It seems to me that this procedure can handle any special collection or structure breakup need.

Comment 1: if people can pass explicit counts to islice (or slice_it), they can pass explicit counts to .unpack.

Comment 2: we are not disagreeing that people might want to do custom count-dependent disassembly or that they should be able to do so. It can already be done.

Areas of disagreement:

1. consumer-source interdependency: you seem to think there is something special about the consumer assigning items to multiple targets in one statement, as opposed to doing anything else, including doing the multiple assignments in multiple statements. I do not. Moreover, I so far consider introducing such dependency in the core to be a regression.

2. general usefulness: you want .unpack to be standardized and made a special method. I think it is inherently variable enough and the need rare enough to not justify that.

I retrieved your original post and plan to look at it sometime to see if I missed anything not covered by the above. But enough for today.

-- Terry Jan Reedy
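Terry's Case 2 table, written out as such an ordinary .unpack(n) method; the attribute values and the stand-in calculation for e are invented for illustration:

    class Thing:
        """No new protocol: an ordinary method whose result depends on how
        many targets the caller says it has."""
        def __init__(self, a, b, c, d):
            self.a, self.b, self.c, self.d = a, b, c, d

        def _e(self):
            return self.a + self.b          # stand-in for the calculated attribute

        def unpack(self, n):
            a, b, c, d = self.a, self.b, self.c, self.d
            table = {
                1: (((a, b), c),),
                2: ((a, b), c),
                3: (a, b, c),
                4: (a, b, c, d),
                5: (a, b, c, d, self._e()),
            }
            return iter(table[n])

    t = Thing(1, 2, 3, 4)
    x, y = t.unpack(2)            # x == (1, 2), y == 3
    p, q, r, s, e = t.unpack(5)   # five pieces, e calculated on demand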

On Tue, Feb 26, 2013 at 4:44 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 2/25/2013 5:25 PM, Alex Stewart wrote:
Negative putdowns are off-topic and not persuasive.
None of my statements were intended to be put-downs, and if it came across that way I apologize. It is possible that my frustration showed through on occasion and my language was harsher than it should have been..
I starting with 1.3 in March 1997 and first posted a month later.
For what it's worth (not that I think it's terribly important, really), while I have not generally participated much in online discussions over the years, I actually started using Python about the same time you did, and have watched it evolve over a similar timespan..
[large snip] Ok, yes, it was technically possible with ugly hacks to do it, but I don't consider that to be really the same as "supported by the language". The only way you could do it was by making an object "fake" a random-access interface that actually likely did the wrong thing if accessed randomly. That is not a separation of concepts, that is a twisting of one concept in an incompatible way into the semblance of another as a workaround to the fact that the two concepts are not actually distinct in the language design. Once again, you're confusing "how it's thought of in the particular programmer's mind" with "how the language is actually designed", which are not the same thing. Back to unpack and informing the source as to the number n of items
expected to be requested.
Case 1. The desired response of the source to n is generic.
If I'm reading you right, I believe this is basically equivalent to "the consumer decides what behavior is most useful". In this case, yes, the correct place to do this is on the consumer side, which is not what __unpack__ is intended to be for anyway (fairly obviously, since it is not a consumer-side change). (For what it's worth, you seem to do a decent job here of arguing against your own propositions to change the unpack syntax, though.. Not sure if that was intentional..)
Case 2. The desired response is specific to a class or even each instance.
[...] Solution: write a method, call it .unpack(n), that returns an iterator that
This solution works fine, in the very restrictive case where the programmer knows exactly what type of object they've been given to work with, is aware that it provides this interface, and knows that it's important in that particular case to use it for that particular type of object. If the consumer wants to be able to use both this type of object and other ones that don't provide that interface, then their code suddenly becomes a lot more complicated (having to check the type, or methods of the object, and conditionally do different things). Even if they decide to do that, since there is no established standard (or even convention) for this sort of thing, they then run the risk of being given an object which has an "unpack" method with a completely different signature, or worse yet an object which defines "unpack" that does some completely different operation, unrelated to this ad-hoc protocol. Any time somebody produces a new "smart unpackable" object that doesn't work quite the same as the others (either deliberately, or just because the programmer didn't know that other people were already doing the same thing in a different way), it is quite likely that all of the existing consumer code everywhere will have to be rewritten to support it, or (much more likely) it will be supported haphazardly some places and not others, leading to inconsistent behavior or outright bugs. Even ignoring all of this, it still isn't possible to write an object which has advanced unpacking behavior that works with any existing unpacking consumers, such as libraries or other code not under the object-designer's control. In short, yes, it solves things, in a very limited, incompatible, painful, and largely useless way. Comment 2: we are not disagreeing that people might want to do custom
count-dependent disassembly or that they should be able to do so. It can already be done.
I disagree that it can already be done in the manner I'm describing, as I've explained. There is, frankly, no existing mechanism which allows an unpacking-producer to do this sort of thing in a standard, consistent, and interchangeable way, and there is also no way at all to do it in a way that is compatible with existing consumer code.
This is not a matter of opinion. It is a *fact* that there is a difference in this case. The difference is, quite simply, that through the use of the unpacking construct, the programmer has given the Python interpreter additional information (which the interpreter is not providing to the producing object). The disagreement appears to be that you believe for some reason it's a good thing to silently discard this information so that nobody can make use of it, whereas I believe it would be beneficial to make it available for those who want to use it. Again, I feel compelled to point out that as far as I can tell, your entire objection on this count boils down to "unpacking must always be the same thing as iteration because that's the way I've always thought about it". Unpacking *currently* uses iteration under the covers because it is a convenient interface that already exists in the language, but there is absolutely no reason why unpacking *must* inherently be defined as the same thing as iteration. You talk about it as if this is a foregone conclusion, but as far as I can tell it's only foregone because you've already arbitrarily decided it to be one way and just won't listen to anybody suggesting anything else. Alternately, if you just can't manage to get past this "unpacking must mean iteration" thing, then don't look at this as a change to unpacking. Look at it instead as an extension to the iterator protocol, which allows an iteration consumer to tell the iteration producer more information about the number of items they are wanting to obtain from it. Heck, I actually wouldn't be opposed to making this a general feature of iter(), if it could be done in a backwards-compatible way.. 2. general usefulness: you want .unpack to be standardized and made a
special method. I think it is inherently variable enough and and the need rare enough to not justify that.
I think I already spoke to this above.. Put simply, if it is not standardized and utilized by the corresponding language constructs, it is essentially useless as a general-purpose solution and only works in very limited cases. Your alternative solutions just aren't solutions to the more general problems. --Alex

On 2/24/2013 1:24 AM, Guido van Rossum wrote:
I considered $ for $top, but I was sure you would not want to waste $ in this. In my original suggestion, I also suggested None as in

    a, b, c, None = source

as another possibility to signal "I want source iteration to stop even if there are more items, and I don't want an exception raised even if there are, because I think I know what I am doing and I do not want it to be considered a bug." Any other short spelling/signal would be fine, too. -- Terry Jan Reedy

24.02.2013 07:24, Guido van Rossum wrote: [...]
And to me the ... Looks too much like something that consumes the rest, rather than leaving it.
To me either. And we already have the syntax for consuming the rest:

    a, b, *seq = iterable

But maybe it could be extended to include the following variant:

    a, b, *() = iterable

-- expressing the "leave the rest untouched" behaviour? Doesn't such an empty-tuple-literal-like syntax suggest strongly enough that no items are consumed?

Cheers.
*j

On Mon, Feb 25, 2013 at 7:45 AM, Jan Kaliszewski <zuo@chopin.edu.pl> wrote:
I would've interpreted it as unpacking the rest of the iterable into () -- i.e., I'd assume it has the current behavior of failing if the rest of the iterable has anything at all. Of course, you can't unpack anything into (), because Python never had that syntax, but you get the idea. -- Devin

Devin Jeanpierre wrote:
-1, this is just as arbitrary as ... or a lone *. I prefer ... myself, because visually it has a low profile and doesn't draw undue attention to something that you're saying you don't care about. Maybe ... could be blessed as an official "don't care" assignment target, like _ in Haskell. That would make this usage less of a special case (although it would still be partially special, since it suppresses unpacking of the rest of the iterator). -- Greg

On 26/02/13 10:30, Greg Ewing wrote:
Please no. _ is already overloaded too much. In i18n contexts, _ is conventionally used as the function for getting display text. In the interactive interpreter, _ is used for the last result. And in source code, _ is often used by convention as a "don't care" variable. Turning _ into a magic "stop unpacking" symbol would break code that uses it as a "don't care" target when unpacking:

    spam, _, _, ham, eggs = stuff

and would be ambiguous when there is an underscore as the right-most target. Would it mean, unpack but I don't care about it, or don't unpack?

I'm still not seeing why this is important enough to add magic syntax. If you want to unpack the first five values from an iterator, without exhausting it, we already have some good solutions:

    spam, ham, eggs, cheese, tomato = (next(it) for i in range(5))

or

    spam, ham, eggs, cheese, tomato = itertools.islice(it, 5)

Neither of these are so confusing to read or onerous to use as to justify new magic syntax:

    spam, ham, eggs, cheese, tomato, $$$$ = it  # for some value of $$$

-- Steven
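A quick check that the islice form really does leave the remainder sitting in the iterator:

    import itertools

    it = iter(range(10))
    spam, ham, eggs, cheese, tomato = itertools.islice(it, 5)
    assert (spam, ham, eggs, cheese, tomato) == (0, 1, 2, 3, 4)
    assert list(it) == [5, 6, 7, 8, 9]   # the rest is still available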

On 26/02/13 11:19, MRAB wrote:
/facepalm And so he did. Sorry about that Greg! But still, I think this does help demonstrate that symbols are not always the easiest to read at a glance. It's too easy for the eyes to slide right off them, unless they are as familiar as basic arithmetic. -- Steven

But maybe it could be extended to include the following variant:
a, b, *() = iterable
Python already supports this odd syntax

    a, b, *[] = iterable

because it interprets the [] not as an empty list, but as an empty "list of identifiers". Maybe it could be used for something useful.

BTW, the del syntax has the same "problem"

    del a, b, (c,), [d], []

João Bernardo wrote:
No, because it already has a meaning: there must be no more values left in the sequence.
BTW, the del syntax has the same "problem"
del a, b, (c,), [d], []
Or just

    [] = iterable

The surprising thing is that a special case seems to be made for ():
It's surprising because () and [] are otherwise completely interchangeable for unpacking purposes. -- Greg

26.02.2013 00:37, Greg Ewing wrote:
Indeed, I didn't know that. :-|
Not entirely...

    >>> a, b, [c, d] = 1, 2, (3, 4)
    >>> a, b, (c, d) = 1, 2, (3, 4)

OK.

    >>> a, b, *[c, d] = 1, 2, 3, 4
    >>> a, b, *(c, d) = 1, 2, 3, 4

OK as well -- *but*:

    >>> a, b, [] = 1, 2, []  # it's ok
    >>> a, b, () = 1, 2, ()  # but it's not
      File "<stdin>", line 1
    SyntaxError: can't assign to ()

...and:

    >>> a, b, *[] = 1, 2  # it's ok
    >>> a, b, *() = 1, 2  # but it's not
      File "<stdin>", line 1
    SyntaxError: can't assign to ()

Strange... (inb4: http://www.youtube.com/watch?v=5IgB8XKR23c#t=118s ).

*j

2013/2/25 Greg Ewing <greg.ewing@canterbury.ac.nz>
Why have two things with the same meaning?

    a, b = iterable
    a, b, *[] = iterable

Both are the same... The *[] thing is about 100% useless right now. And the empty "list" syntax is so not used it was possible to segfault pypy <https://bugs.pypy.org/issue1364> by putting that in a for loop.
BTW, the del syntax has the same "problem"
del a, b, (c,), [d], []
Or just
That's what I'm saying... [] and () during unpacking aren't the same thing when they normally should be... The same happens with the "del" syntax!

One can think about this to see if an iterable is empty and raise an error if it is not:

    [] = iterable

But

    *[], = iterable

has the exact same meaning and is very cryptic.

BTW: This is the code golf winner to make a list of chars:

    *s, = 'string'
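Both tricks are easy to verify interactively:

    *s, = 'string'
    assert s == ['s', 't', 'r', 'i', 'n', 'g']

    [] = iter('')        # fine: zero targets, zero items
    try:
        [] = iter('x')   # ValueError: too many values to unpack (expected 0)
    except ValueError:
        pass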

On Sunday, February 24, 2013 7:09:28 PM UTC-8, Random832 wrote:
Well, primarily for two reasons:

1. It does not actually do anything to support smart unpackables (i.e. objects that want to be able to produce different values depending on how many items they're asked to produce). This might be useful for the "infinite iterator" case (and I think somebody proposed it for that earlier), but that's a completely different issue than smart-unpacking..

2. More importantly, it would almost certainly break some existing code that relies on the fact that the contents of the extended-unpacking-term will always be a list, which would be bad.

--Alex

On Sat, Feb 23, 2013 at 8:18 PM, Chris Angelico <rosuav@gmail.com> wrote:
It highly unlikely it will ever be interpreted that way, because it contradicts the way the unnamed star is used in function headers (that is, to indicate the start of the keyword only arguments *without* accepting an arbitrary number of positional arguments). If you want to ignore excess values in unpacking, a double-underscore is the best currently available option (even though it has the downside of not working well with infinite iterators or large sequences):
a, b, c, *__ = range(4)
However, Alex's original idea of an "unpack protocol" (distinct from, but falling back to, the ordinary iterable protocol) definitely has at least a few points in its favour.

1. Iterating over heterogeneous types doesn't really make sense, but unpacking them does. A separate protocol lets a type support unpacking assignment without supporting normal iteration.

2. The protocol could potentially be designed to allow an *iterable* to be assigned to the star target rather than requiring it to be unpacked into a tuple. This could be used not only to make unpacking assignment safe to use with infinite iterators, but also to make it cheaper with large sequences (through appropriate use of itertools.islice in a custom container's __unpack__ implementation) and with virtual sequences like range() objects.

3. As Alex noted, if a function that previously returned a 2-tuple wants to start returning a 3-tuple, that's currently a backwards incompatible change because it will break unpacking assignment. With an unpack protocol, such a function can return an object that behaves like a 3-tuple most of the time, but also transparently supports unpacking assignment to two targets rather than only supporting three.

I would suggest a different signature for "__unpack__", though, built around the fact that the star target can be used at most once, but in an arbitrary location:

    def __unpack__(target_len, star_index=None):
        """Unpack values into an iterable of the specified length.

        If star_index is not None, the value at that index will be an
        iterable representing the remaining contents that were not unpacked.
        """
        ...

While I'm somewhere between +0 and +1 on the idea myself, there are some open questions/potential downsides to the idea:

- this change increases the complexity of the language, by explicitly separating the concept of heterogeneous unpacking from homogeneous iteration. Those *are* two distinct concepts though, so the increased complexity may be justifiable.

- for function parameter unpacking, would we allow "*args" to be an arbitrary iterable rather than promising that it is always a tuple? We would probably want to do so for consistency with unpacking assignment under this scheme, which may lead to some confusing error messages thrown by existing functions which expect it to always be a tuple.

- we would want to document very clearly that it remains possible to bypass custom unpacking by explicitly calling iter(obj) rather than using obj directly.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
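[Purely to make the suggested signature concrete, here is a minimal sketch of a container opting in to such a protocol. The class, the explicit method calls, and the decision to handle only a trailing star target are illustrative assumptions, not part of the proposal.]

    class Record:
        """Hypothetical container supporting the sketched __unpack__ protocol."""
        def __init__(self, *values):
            self._values = values

        def __unpack__(self, target_len, star_index=None):
            if star_index is None:
                # plain unpacking: hand back exactly target_len items
                return iter(self._values[:target_len])
            # extended unpacking with a trailing star target (the common case):
            # the starred slot gets a lazy iterator instead of a new list
            head = self._values[:star_index]
            rest = iter(self._values[star_index:])
            return iter((*head, rest))

    r = Record(1, 2, 3, 4, 5)
    a, b = r.__unpack__(2)                      # what "a, b = r" could do under the protocol
    x, y, rest = r.__unpack__(3, star_index=2)  # what "x, y, *rest = r" could do
    print(a, b, x, y, list(rest))               # 1 2 1 2 [3, 4, 5]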

On 2013-02-23, at 12:58 , Nick Coghlan wrote:
Then again, *args in (Python 3) unpacking is already completely incompatible with *args in functions. That does not strike me as much of an issue.
It would also allow preventing unpacking of iterables for which unpacking makes no sense and is most likely an error, such as sets (well there are limited use cases for unpacking a set, but they are few and far between and wrapping the set in an iter() call would allow them).
And with immutable sequences through plain sharing. In fact, though that's not supported by either existing __unpack__ proposal, it could even allow for mapping unpacking (and more generally named unpacking) by adding something which tends to be missing from the Python API: name awareness. Altering your proposal by replacing `target_len` with a sequence of the names on the LHS would allow implementing __unpack__ for objects or mappings rather than having to use the (quite unwieldy and backwards-looking) operator.itemgetter and operator.attrgetter.
I'm +1 on these grounds; while the current unpacking "works", it makes for odd limitations and corner-cases.
I don't think so; they already have a significant semantic split on the meaning of what follows an *args, and I've got the gut feeling that *args being a tuple is a much more common assumption/expectation for functions than for unpacking (especially since *args in unpacking was only introduced in Python 3).
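[A quick sketch of the name-aware variant floated earlier in this message, assuming the LHS identifiers were passed in place of target_len. The class, the tuple of names, and the explicit call are all hypothetical, since no such hook actually exists.]

    class Config(dict):
        """Hypothetical mapping that unpacks by target name instead of position."""
        def __unpack__(self, names, star_index=None):
            # 'names' would be the identifiers on the left-hand side,
            # e.g. ('host', 'port') for:  host, port = cfg
            return (self[name] for name in names)

    cfg = Config(host="localhost", port=8080, debug=True)
    host, port = cfg.__unpack__(("host", "port"))  # explicit call; syntax support is imagined
    print(host, port)  # localhost 8080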

On 23.02.2013 19:19, Masklinn wrote:
Especially that for unpacking it is *a list, not a tuple*.

Cheers.
*j

PS. And would it really be possible to apply the __unpack__ idea to calls? There has never been, and still is no, real consistency between unpacking and call argument binding, e.g.

    def fun(a, b, c, *args): ...
    fun(1, 2, with_dunder_unpack)
    # ^ OK: a = 1; b = 2; c, *args = with_dunder_unpack

    # but what about:
    def fun(a, b, c, *args): ...
    fun(1, 2, 3, 4, 5, with_dunder_unpack)
    # ^ a = 1; b = 2; c = 3; args = ???

I believe these concepts (unpacking and call argument binding) are simply not really parallel in Python. Yes, `a, b, *seq` in unpacking is *somehow* similar -- on the level of a rough intuition (and that is nice) but IMHO it should not stop us from extending unpacking without worrying much about call binding, which was, is and must be (in Python) different in many ways.

On Sat, Feb 23, 2013 at 4:58 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
This is the big one for me. And as masklinn pointed out, it works the other way too (keep an iterable from unpacking).
Cool, though it would be complicated when the star target isn't the last target.
+1
Agreed. Also, I expect that __unpack__ would be an opt-in API that falls back to iteration.
This should be handled/proposed separately, though it is dependent on an unpacking protocol.
An unpacking protocol would add power to the language. We really don't have a good way to separate unpacking from iteration. Structured record types like namedtuples aren't exactly sequences, and could make use of the new protocol. (I'm not advocating for the change to namedtuple itself.) -eric

On 2/23/2013 6:58 AM, Nick Coghlan wrote:
I think this is fundamentally the right idea. The 'problem' to be solved, such as it is, is that the multiple assignment unpacker, after requesting the number of values it needs, requests one more that it does not need, does not want, and will not use. If it gets it, it says 'too bad', undoes any work it has done, and raises an error. The only point in doing this is to uncover possible bugs. But some people say they want to knowingly provide extra values and have it not be considered a bug. The solution, if there is to be one, and if that extra behavior is not to be completely undone, is to be able to tell the unpacker to skip the extra check. I strongly feel that the right place to do that is on the target side. This fixes the problem in one place rather than requiring that the solution be duplicated everywhere.
a, b, c, *__ = range(4)
If ,* is not acceptable, how about ,** or ,... or ,None or <take your pick>. I rather like 'a, b, c, ... =' as it clearly implies that we are picking and naming the first three values from 3 or more; '...' clearly cannot be an assignment target.
I strongly disagree as I think the two major points are wrong.
This is a repeated attempt to reintroduce 'heterogeneous type' as a fundamental concept in the language, after it was removed with the addition of .count and .index as tuple methods. Since I consider the pairing 'heterogeneous type' to be wrong, especially in Python, I consider this to be a serious regression. Let me repeat some of the previous discussion. In Python 3, every object is an instance of class object. At that level, every collection is homogeneous. At other levels, and for particular purposes, any plural collection might be considered heterogeneous. That is a function of the objects or values in the collection, and not of the collection class itself. So I claim that 'heterogeneous' has not much meaning as an absolute attribute of a 'type'. In any assignment, targets (most commonly names) are untyped, or have type 'object' -- take your pick. So an iterable (of objects) is the appropriate source. In any case, 'iteration' and 'unpacking' both mean 'accessing the items of a collection one at a time, as individual items rather than as part of a collection'. I do not see any important distinction at all and no justification for complexifying the language again.
This seems to be claiming that it is sensible to change the return type of a function to a somewhat incompatible return type. (I am obviously including fixed tuple length in 'return type', as would be explicit in other languages.) I believe many/most/all design and interface experts would disagree and would say it would be better to give the new function a new name. The statement "that's currently a backwards incompatible change because it will break unpacking assignment." is a gross understatement that glosses over the fact that returning a 3-tuple instead of a 2-tuple will break lots of things. Just two examples:

    def foo(): return 1,2
    def bar(): return tuple('abc')
    foobar = foo() + bar()
    oof = reversed(foo())

Changing foo to return 1,2,'x' will mess up both foobar and oof for most uses.

--
The problem with *x, that it is inefficient, not applicable to infinite iterators, and that assigning the iterable directly to x when *x is in final position is more likely what one wants anyway, is a different issue from the unwanted bug check and exception. Again, I think the solution is an explicit notation on the target side. Perhaps '**x' or 'x*' or something based on _. If you do not like any of those, suggest another. -- Terry Jan Reedy

On Sun, Feb 24, 2013 at 11:53 AM, Terry Reedy <tjreedy@udel.edu> wrote:
I like that. It avoids the confusion with augmented assignment that ,* has (yes, you can avoid the *syntactic* confusion by putting spaces around the equals sign, but it's still visually similar). The only problem might be in documentation, where it might be taken to mean "a, b, c, d, e, and as many more variables as you want", eg indicating that tuple unpacking works with any number of targets. ChrisA

I definitely like Terry's idea of using ", ..." to say "and ignore the rest". Simple, elegant and can be restricted to the last element so it works with infinite iterators and large sequences. It also allows incremental unpacking of ordinary iterators. I still like the idea of an unpacking protocol as well, but the above would cover a lot of use cases without needing a new protocol, and thus should be considered first (since a new protocol is a higher impact change than Terry's syntax suggestion). Cheers, Nick.

On Sun, Feb 24, 2013 at 12:12 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
I want to expand on this a bit now I'm back on a real computer, since it wasn't immediately obvious to me how well the ", ..." proposal supports incremental unpacking, but it became clear once I thought of this simple example:

    iterargs = iter(args)
    command, ... = iterargs       # Grab the first element, leave the rest in the iterator
    commands[command](*iterargs)  # Pass the rest to the command

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

But is that really so much better than

    command = next(iterargs)
    # etc.

? And to me the ... looks too much like something that consumes the rest, rather than leaving it.

--Guido (not on a real keyboard)
-- --Guido van Rossum (python.org/~guido)

On Sun, Feb 24, 2013 at 4:24 PM, Guido van Rossum <guido@python.org> wrote:
For a single item, using next() instead is OK; the problem is that itertools.islice is relatively ugly and unintuitive, and requires you to manually count the number of targets. The next + islice combination also creates a discontinuity between the way you handle getting just the first item, versus getting multiple items.

Status quo, 1 item:

    iterargs = iter(args)
    command = next(iterargs)      # Grab the first, leave the rest
    commands[command](*iterargs)  # Pass the rest to the command

Status quo, 2 items (etc):

    from itertools import islice
    iterargs = iter(args)
    command, subcommand = islice(iterargs, 2)  # Grab the first two, leave the rest
    commands[command][subcommand](*iterargs)   # Pass the rest to the subcommand

Proposal, 1 item:

    iterargs = iter(args)
    command, ... = iterargs       # Grab the first, leave the rest
    commands[command](*iterargs)  # Pass the rest to the command

Proposal, 2 items (etc):

    iterargs = iter(args)
    command, subcommand, ... = iterargs       # Grab the first two, leave the rest
    commands[command][subcommand](*iterargs)  # Pass the rest to the subcommand
And to me the ... Looks too much like something that consumes the rest, rather than leaving it.
As far as the possible interpretation being "consume and discard the rest" goes, I think it's one of those cases where once you know which it is, it won't be hard to remember. It's certainly something we could continue to live without, but I do think it's a nice way to expand the unpacking syntax to cover additional use cases far more elegantly than the current reliance on a combination of next and itertools.islice. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 02/23/2013 11:38 PM, Nick Coghlan wrote:
Or

    command, subcommand = next(iterargs), next(iterargs)

or

    command = next(iterargs)
    subcommand = next(iterargs)

These don't require the manual counting, nor do they feature the "discontinuity" you mentioned. FWIW I don't think this problem is bad enough, nor this idiom used frequently enough, to warrant new syntax. But I've been wrong before.

I was musing on this topic and came up with what is probably a terrible alternate approach. What if there was some way for the RHS to know what the LHS was asking for in a destructuring assignment? Something like

    def __next_n__(self, n):
        "Returns a tuple of the next n items from this iterator."

and if the object doesn't provide __next_n__ you fall back to the current explicit PyIter_Next() calls and failing if there aren't enough / there are too many. If we had this, we could make the second argument to islice() optional; it could infer how many items you wanted. (I don't think the "a, b, *c, d = rhs" form is relevant here as you'd always consume everything from rhs.)

I'm not sure how bad an idea this is. But complicating the iterator protocol for this marginal problem seems like a pretty bad idea.

//arry/
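[A consumer-side sketch of the fallback described here; only the __next_n__ name comes from the message above, while the helper and its behaviour are assumptions.]

    def take(iterator, n):
        """Fetch n items, using a hypothetical __next_n__ if the iterator offers one."""
        next_n = getattr(type(iterator), "__next_n__", None)
        if next_n is not None:
            return next_n(iterator, n)
        # fallback: today's behaviour, one next() call per requested item
        # (no special handling of early exhaustion here)
        return tuple(next(iterator) for _ in range(n))

    it = iter(range(10))
    command, subcommand = take(it, 2)
    print(command, subcommand, list(it))  # 0 1 [2, 3, 4, 5, 6, 7, 8, 9]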

On Sun, Feb 24, 2013 at 10:25 PM, Larry Hastings <larry@hastings.org> wrote:
Or
command, subcommand = next(iterargs), next(iterargs)
Err.... is there a language guarantee of the order of evaluation in a tuple, or is this just a "CPython happens to evaluate independent expressions left-to-right"? This is totally broken if the next() calls could be done in either order. ChrisA

On 24/02/13 23:59, Chris Angelico wrote:
It's a language guarantee. http://docs.python.org/2/reference/expressions.html#evaluation-order -- Steven

On Mon, Feb 25, 2013 at 1:16 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Ah, so it is. My bad, sorry! In that case, sure, this works. It still violates DRY though, naming the iterable twice and relying on the reader noticing that that means "take two off this one". But that's a much weaker concern. ChrisA

Chris Angelico wrote:
command, subcommand = next(iterargs), next(iterargs)
Well, about the DRY it's the same, but I find the "old stupid"

    command = next(iterargs)
    subcommand = next(iterargs)

somewhat more explicit about the "take two off this one". Maybe it's too explicit (and it's not a single expression!!! anathema!) but I find it slightly more readable.

-- ZeD

Nick Coghlan <ncoghlan@...> writes:
When it comes to getting multiple items from an iterator, I prefer wrapping things in my own generator function:

    def x_iter(iterator, n):
        """Return n items from iterator."""
        i = [iterator]*n
        while True:
            try:
                result = [next(i[0])]
            except StopIteration:
                # iterator exhausted immediately, end the generator
                break
            for e in i[1:]:
                try:
                    result.append(next(e))
                except StopIteration:
                    # iterator exhausted after returning at least one item,
                    # but before returning n
                    raise ValueError("only %d value(s) left in iterator, expected %d"
                                     % (len(result), n))
            yield result

Compared to islice, this has the advantage of working properly in for loops:
    1 2 3 4 5 6 7 8 9 10

Maybe one could improve itertools.islice accordingly??
+1 for this. I think it's very readable. I think it should raise differently though depending on whether iterargs is exhausted right away (StopIteration) or during unpacking (ValueError). Best, Wolfgang

On Feb 24, 2013, at 1:08, Steven D'Aprano <steve@pearwood.info> wrote:
Not Haskell either. If you told me this was an expression in a language that's like Haskell but with python (or C) syntax, I'd expect it to mean something like compose_n(next, 3)(iterargs), aka next(next(next(iterargs))), not map(next, iterargs * 3). Maybe next(3, iterargs) or next(3)(iterargs), but that's just islice.

On 24 February 2013 06:24, Guido van Rossum <guido@python.org> wrote:
Calls to next() need to be wrapped in try/except StopIteration (it's a bad idea to leak these exceptions). So then it becomes:

    try:
        command = next(iterator)
    except StopIteration:
        raise ValueError('need more than 0 values to unpack')

Presumably with this proposal

    command, ... = iterator

would raise a ValueError if the iterator is empty.

Oscar
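[If one wanted to avoid repeating that try/except at every call site, a small helper along these lines would do; the helper name and behaviour are assumptions, not anything proposed in the thread.]

    def first(iterator):
        """Return the next item, raising ValueError (not StopIteration) if empty."""
        try:
            return next(iterator)
        except StopIteration:
            raise ValueError('need more than 0 values to unpack') from None

    it = iter([10, 20])
    command = first(it)
    print(command, list(it))  # 10 [20]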

It seems to me this thread has actually split into discussing several different, fairly independent issues.

==== Unpacking protocol

First, there is the question I originally raised about separating the "unpacking protocol" from the "iterator protocol". I would like to clarify a couple of points on this: I proposed __unpack__ the way I did specifically for a few reasons:

1. It is a fairly small change to the language (it does not require any new syntax, does not change the behavior of any current code, and can be completely ignored by programmers who don't care about the issues it addresses).

2. There are (in my opinion) already pretty good parallels with this sort of change that have already been implemented in Python (for example, separating the concept of "iterable" from the concept of "indexable"). This seemed to be a fairly natural and easy-to-explain extension of the same sort of philosophies that Python has already largely adopted in other contexts.

3. It is consistent with the general idea in Python that an object can choose to emulate whichever behaviors of other objects make sense, and not emulate other ones that do not make sense. In my opinion, "iterating" and "unpacking" are two related-but-not-the-same language concepts, which currently are unnecessarily entangled.

4. It would help significantly with several types of problems that do not currently have great solutions. People have proposed various alternatives that would help with one or two of these, but none of the proposed alternatives deal with all of them (or really even most of them).

In summary, some of the other proposals have other useful features, and might be worth considering, but many of them seem far more of a stretch than my original proposal, and I think there are other good reasons why the original proposal is still worth looking at on its own merits.

Also, regarding the method signature "(min_count, max_count)" vs "(target_len, star_index)". I specifically discarded the latter in favor of the former when I was putting together the proposal because (a) the "star_index" way does not really gain us anything useful over the "min/max" way, (b) "min/max" seemed a bit more future-proof (if, in the future, somebody came up with an extended-extended unpack syntax which allowed for, for example, optional parameters, or the ability to specify an upper bound on the *extra stuff, it would not require any change to the __unpack__ protocol to support it), (c) "min/max" is less about the particular implementation and more about conceptually what it means to unpack something, which seemed more appropriate when defining the concept of an "unpack protocol", and (d) this also means that it could conceivably be called in other cases (or directly by Python code that knew what it was doing) to produce more sophisticated results for some objects.

Ultimately, I could live with "target_len, star_index", but I still think "min_count, max_count" is a better way to do things, and I'd be curious whether anybody can come up with an argument why "target_len, star_index" would actually be more useful or preferable in any way.

==== Partial unpacking

Second, there is the question of allowing explicit (or implicit) "partial unpacking" (i.e. allowing the programmer to say "I only want to unpack the first n items, and leave any others in the iterator").

I agree with the general sentiment that doing this implicitly (i.e. just not checking for "too many values to unpack") is silently masking errors, which is a bad idea, and anyway it would potentially cause problems with all kinds of existing code, and we just shouldn't go down that road. I do think it might be nice to have a way to specify in the syntax to just leave any extra data un-iterated, but to be honest I haven't really felt any particular warm-fuzzies about any of the proposed syntaxes thus far. (The one I like best is Terry's "a, b, c, ... = foo", though.)

However, this is really fundamentally a completely different question, which addresses a different use-case, than the unpacking protocol question, and should not be confused as addressing the same issues (because it doesn't). Specifically, lots of folks seem to be ignoring the fact that any "unpacking protocol" (whether it's the implicit one we have now, or a more explicit alternative like __unpack__) has two sides to it (the provider and the consumer). These two sides are not always written by the same person or expecting the same things (for example, when unpacking an object which was passed into a routine or returned by some other code). There are times when the person writing the assignment statement does know "I only want these items and not the rest", but there are also times when the object itself potentially knows better "only these items make sense in this context". I would emphasize that there are currently ways (even if they're less than ideal) for the former to be done (i.e. calling next() or using itertools), but there is currently no way to do the latter at all (because the object is currently not given any way to get any information about the context it's being used in). We can make the former case more convenient if we want to, but it really doesn't fix the other side of the problem at all.

==== Iterator end-testing

The third issue that has been brought up is the slightly-related issue of how unpacking currently attempts to consume an extra item from an iterator, just to check for length. This is, in my opinion, a valid flaw which is worth discussing, but it is actually not a flaw with unpacking, or really related in any way to unpacking except by happenstance. The real issue here is not in how unpacking is done but a limitation in the iteration protocol itself. The problem is that there is no standard way to query an iterator to find out if it has more data available without automatically consuming that next data element in the process. In my opinion, if we want to deal with this issue properly, we should not be trying to hack around it in the single case of unpacking but instead fix the larger core problem by adding some way to check for the end of an iterator without side-effects (and then of course change unpacking to use that instead).

--Alex
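[To make the contrast with the (target_len, star_index) form concrete, here is a minimal sketch of a "smart unpackable" written against the (min_count, max_count) signature described above; the class, its padding behaviour, and the explicit calls are invented for illustration.]

    class Padded:
        """Hypothetical object that stretches or shrinks to fit the target list."""
        def __init__(self, *values, fill=None):
            self._values = values
            self._fill = fill

        def __unpack__(self, min_count, max_count):
            # produce at least min_count items (padding with a fill value) and,
            # when max_count is not None, at most max_count items
            values = list(self._values)
            if len(values) < min_count:
                values += [self._fill] * (min_count - len(values))
            if max_count is not None:
                values = values[:max_count]
            return iter(values)

    p = Padded(1, 2)
    a, b, c = p.__unpack__(3, 3)         # what "a, b, c = p" would mean under the proposal
    x, y, *rest = p.__unpack__(2, None)  # what "x, y, *rest = p" would mean
    print(a, b, c, x, y, rest)           # 1 2 None 1 2 []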

On 2/24/2013 5:07 PM, Alex Stewart wrote:
To me, taking something simple and elegant and complexifying it, to little benefit I can see other than to solve a problem in the wrong place, is no small matter. Certainly, adding a new special method to some category of objects has no effect unless changes are made to the interpreter elsewhere to make automatic use of that method. Otherwise, you could just define a normal method to be used by user code.
The related but distinct concepts of sequential access and random access are basic to computation and were *always* distinct in Python. For loop statements were always the way to sequentially access objects in a collection (by default, all of them). It happens that the original *implementation* of sequential access, used by for loops, was by means of pseudo random access. This worked ok for randomly accessible sequences but was clumsy for anything else. (Sequential access requires memory, random access does not.) The addition of __iter__ along with __next__ added more benefits than __next__ alone would have.
I do not see that at all. As near as I can tell, both mean 'sequential access'. You are the one who seems to me to be entangling 'bind objects to targets one at a time' with 'provide objects one at a time'. The point of both old and new iterator protocols was and is to decouple (disentangle) consumers from providers. At one time, target-list binding *was* entangled -- with tuples -- because it did not use the iterator protocol that the for statement did. I admit I do not understand your __unpack__ proposal, since it seemed vague and incomplete to me. But until I see your conceptual distinction, the details do not matter to me.
4. It would help significantly with several types of problems that do not currently have great solutions.
I obviously do not see the same pile of problems that would justify a new special method. ...
no way to query an iterator to find out if it has more data available
See how in my separate post: "Add lookahead iterator (peeker) to itertools" -- Terry Jan Reedy

On Sunday, February 24, 2013 7:43:58 PM UTC-8, Terry Reedy wrote:
I did not say it was a small matter. I said it was a small change to the language. From a purely practical perspective of what changes would be required to the language itself in order to support this proposal, in my opinion, it would not require that much to implement (it would probably only require a couple of lines of new code and some doc changes), and is very unlikely to affect any existing code or tools.

(Syntax changes are, in my opinion, far more disruptive changes to the language: They require a change to the parser as well as the interpreter, and potentially changes to all manner of other utilities (debuggers/code-checkers/AST-manipulators/etc) out there in the wild. They make it much harder to write Python code that supports older Python releases (with method changes, you can check for the existence of a feature in the interpreter and selectively use or not use it, but you typically can't use new syntax constructs in anything that might be run in an older Python release, because the parser will bomb out instead). They also require programmers to learn new syntax (if nothing else, to understand what it means when they come across it in somebody else's code). What's more, every syntax change potentially limits what can be done with the syntax in the future (for example, by using up special characters or constructs so they're not available for other things), and so on. Personally, I think there should be a much higher bar for syntax changes than for adding a new special method, not the other way around.)

You also seem to be defining "in the wrong place" as "in a different way than I personally have a use for".. As I pointed out (which you seem to have completely skipped over in your reply), there are use cases where the consumer side is the right (or only) place, and there are other different use cases where the producer side is the right (or only) place. The two are not mutually-exclusive, and improving one does not really address the other issue at all.

I want to emphasize that I am not saying that we shouldn't extend unpacking notation to support partial-unpacking; in fact I think it might well be a good idea. What I am saying, however, is that if we do, I do think it's a much bigger change to the language than what I proposed, and it still doesn't actually address most of the issues that __unpack__ was intended to deal with, so it's really not an either-or sort of thing anyway.

Certainly, adding a new special method to some category of objects has no effect unless changes are made to the interpreter elsewhere to make automatic use of that method.
My proposal does not include adding a new special method to any existing objects. It adds the _ability_ for people to add a special method to objects *in the future* and have the interpreter take advantage of it. It therefore should have no effect for any existing code.
Oh really? Then how, prior to the development of the iterator protocol, did one define an object which was accessible sequentially but not randomly in the Python language? If you couldn't do that, you can't claim that the concepts were really distinct in Python, in my opinion. The truth is that the concept of an iterable object vs. an indexable object may have been distinct *in some people's minds*, but it was *not* originally distinct in the Python language itself. People decided that that should change, so the language was extended to make that happen. It should also be noted that the two concepts were not distinct in *everyone's* minds (I remember straightening out confusion on this point in numerous people's understanding (and still need to occasionally), particularly those who did not come from a classical computer-science background). However, just because there were some people who didn't see them as distinct did not mean that it was an invalid way for other people to view them, or that they shouldn't ultimately be distinct in the language.. With all respect (and I do actually mean that), your primary argument against it seems to be one of intellectual complacency: "I've never considered that they might be different, so they shouldn't ever be separated, because then I don't have to change the way I think about things". The addition of __iter__ along with __next__
added more benefits than __next__ alone would have.
You seem to be making my point for me. In essence, __unpack__ is really just the equivalent for the unpacking operation to what __iter__ is for "for" loops. Is it strictly required in order to support the basic behavior? No (you could do for loops over simple iterables with just a variant of __next__, if you really wanted to), but it allows some more sophisticated objects to provide more useful behaviors by giving them more context about how they are being invoked (in the case of __iter__, by telling them ahead of time that they are going to be used in an iteration context, and allowing them to create or alter the initial object that the for loop interacts with). This provides "more benefits than __next__ alone would have", so it was added to the language. Logically, by the same argument, and in almost exactly the same way, __unpack__ provides more benefits than __iter__ alone would have (by telling them that they are going to be used in a particular unpacking context, and allowing them to alter the object used for that type of iteration), and therefore should arguably be part of the language as well.
Not true. Iteration means "sequential access". Unpacking means "sequential access with inherent length constraints", which is the bit you seem to be ignoring. The most notable consequence of this (but definitely not the only one) is that a for loop (for example) will never raise ValueError, but the unpacking operation will, if it gets what it considers "wrong" results back from the unpacked object. In this way the two operations do not interpret the data that the object in question provides in the same way, and it does not really have the same meaning. Yes, from a purely mechanical perspective, they perform similar operations, but programming interfaces are not solely defined by mechanics. The difference here is one of implied contracts. In simple iteration, there is explicitly no minimum or maximum number of values that the iterable is required to produce, and in fact the entire protocol is based on the assumed perspective that returning a value is always the "success" condition and not being able to return one is the "failure" case, and in that context it makes sense to design iterators so that they should always return more data if they're able to do so. In unpacking, however, when it gets to the end of the variable list, the conditions are reversed, and actually returning a value becomes a failure condition. This change of meaning, however, is not communicated in any way to the iterator (this is just one example: there are similar contract issues inherent in extended unpacking as well). This is, in my opinion, a flaw in the interface, because the language is misrepresenting (bounded) unpacking as (unbounded) iteration when calling the iterable object. I admit I do not understand your __unpack__ proposal, since it seemed
vague and incomplete to me. But until I see your conceptual distinction, the details do not matter to me.
I'm not exactly sure why you consider it vague or incomplete. I provided a specific description of exactly what changes to the language I was proposing, with method signatures, a description of the new interpreter behavior, and examples of data values in typical scenarios, and accompanied it with a list of several different use-cases for which I believed it would provide significant utility. I'm not really sure how I could have gotten more specific without providing a diff against the CPython sources (which seems a bit much for an initial proposal)..
Obviously, although I'm not really sure why not, because I did explicitly state a bunch of them in my original post. You seem to have just ignored all of them except for the issue of partial unpacking, which frankly was not even the most important or interesting use case I presented, in my opinion. I would recommend you might want to go back and read the last part of my initial post where I presented several potential use cases for all this. I do get the distinct impression, however, that you've already made up your mind and, right or wrong, there's nothing I could possibly say which would ever change it, so at this point I'm not sure whether it really matters (sigh)...
no way to query an iterator to find out if it has more data available
See how in my separate post:
"Add lookahead iterator (peeker) to itertools"
I'm not necessarily opposed to that proposal, but I should point out that it does not actually address the problem I mentioned in this thread at all, so it's really kinda irrelevant to that discussion. (Even if that is added to itertools, unpacking still won't use it, nor does it provide any way to safely query iterables which produce side-effects, even if there might be a way for them to provide that info without the side-effects, nor does it provide a way to test for end conditions without changing the underlying iterable (which might be used elsewhere in scopes outside the lookahead-wrapper), etc..) --Alex

On 2/25/2013 5:25 PM, Alex Stewart wrote:

Negative putdowns are off-topic and not persuasive. I started with 1.3 in March 1997 and first posted a month later. While cautious about changes and additions, I have mostly been in favor of those that have been released. I started using 3.0 in beta and started using 3.3.0 with the first alpha to get the new stuff.
Yes really!
As I said, by using the original fake-getitem iterator protocol, which still works, instead of the newer iter-next iterator protocol. Take any python-coded iterator class, such as my lookahead class. Remove or comment out the 'def __iter__(self): return self' statement. Change the header line 'def __next__(self):' to 'def __getitem__(self, n):'. Instances of the revised class will *today* work with for statements. Doing this with lookahead (3.3):
    for item in lookahead('abc'): print(item)

    a
    b
    c
See, no __iter__, no __next__, but a __getitem__ that ignores the index passed. Here is where it gets a bit crazy.
Still, I think enabling generators was more of a motivation than discouraging nonsense like the above (which, obviously, is still possible! but also which people did not intentionally do except as a demonstration). A practical difference between sequential and random access is that sequential access usually requires history ('state'), which random access does not. The __iter__ method and iter() function allows the separation of iterators, with iteration state, from an underlying concrete collections (if there is one), which usually does not need the iteration state. (Files, which *are* iterators and which do not have a separate iterator class, are unusual among builtins.) This enables multiple stateful iterators and nested for loops like this:
    ('a', 'a')
    ('a', 'b')
    ('a', 'c')
    ('b', 'a')
    ('b', 'b')
    ('b', 'c')
    ('c', 'a')
    ('c', 'b')
    ('c', 'c')
If you couldn't do that, you can't claim that the concepts were really distinct in Python, in my opinion.
But I and anyone could and still can, so I can and will make the claim and reject arguments based on the false counter-claim.

---

Back to unpack and informing the source as to the number n of items expected to be requested.

Case 1. The desired response of the source to n is generic.

Example a: we want the source to simply refuse to produce more than the number specified at the start. The current generic solutions are to either make the exact number of explicit next() calls needed or to wrap the iterator in islice(iterator, n) (which in turn will make the number of next() calls needed).

Example b: if the source does yield n items, we want it to then yield the residual iterator. Again, a simple generic wrapper does the job.

    import itertools

    def slice_it(iterable, n):
        "Yield up to n items from iterable and if successful, the residual iterator:"
        it = iter(iterable)
        for i in range(n):
            yield next(it)
        else:
            yield it

    a, b, c, rest = slice_it(itertools.count(), 3)
    print(a, b, c, rest)
    d, e = itertools.islice(rest, 2)
    print(d, e)
    0 1 2 count(3)
    3 4
Conclusion: generic iterator behavior modification should be done by generic wrappers. This is the philosophy and practice of comprehensions, built-in wrappers (enumerate, filter, map, reversed, and zip), and itertools.

Case 2. The desired response is specific to a class or even each instance.

Example: object has attributes a, b, c, d (of lesser importance), and can calculate e. The pieces and structures it should yield depend on n as in the following table.

    1   ((a,b),c)
    2   (a,b), c
    3   a, b, c
    4   a, b, c, d
    5   a, b, c, d, e

Solution: write a method, call it .unpack(n), that returns an iterator that will produce the objects specified in the table. This can be done today with no change to Python. It can be done whether or not there is a .__iter__ method to produce a generic default iterator for the object. And, of course, xxx.unpack can have whatever signature is appropriate to xxx. It seems to me that this procedure can handle any special collection or structure breakup need.

Comment 1: if people can pass explicit counts to islice (or slice_it), they can pass explicit counts to .unpack.

Comment 2: we are not disagreeing that people might want to do custom count-dependent disassembly or that they should be able to do so. It can already be done.

Areas of disagreement:

1. consumer-source interdependency: you seem to think there is something special about the consumer assigning items to multiple targets in one statement, as opposed to doing anything else, including doing the multiple assignments in multiple statements. I do not. Moreover, I so far consider introducing such dependency in the core to be a regression.

2. general usefulness: you want .unpack to be standardized and made a special method. I think it is inherently variable enough and the need rare enough to not justify that.

I retrieved your original post and plan to look at it sometime to see if I missed anything not covered by the above. But enough for today.

-- Terry Jan Reedy
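[A sketch of the kind of per-class unpack(n) method described above; the class, its attribute values, and how e is calculated are made-up details.]

    class Thing:
        """Hypothetical object whose disassembly depends on how many items are wanted."""
        def __init__(self, a, b, c, d):
            self.a, self.b, self.c, self.d = a, b, c, d

        def unpack(self, n):
            a, b, c, d = self.a, self.b, self.c, self.d
            e = a + b  # stand-in for "can calculate e"
            table = {
                1: (((a, b), c),),
                2: ((a, b), c),
                3: (a, b, c),
                4: (a, b, c, d),
                5: (a, b, c, d, e),
            }
            return iter(table[n])

    t = Thing(1, 2, 3, 4)
    x, y, z = t.unpack(3)     # a, b, c
    pair, last = t.unpack(2)  # (a, b), c
    print(x, y, z, pair, last)  # 1 2 3 (1, 2) 3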

On Tue, Feb 26, 2013 at 4:44 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 2/25/2013 5:25 PM, Alex Stewart wrote:
Negative putdowns are off-topic and not persuasive.
None of my statements were intended to be put-downs, and if it came across that way I apologize. It is possible that my frustration showed through on occasion and my language was harsher than it should have been..
I starting with 1.3 in March 1997 and first posted a month later.
For what it's worth (not that I think it's terribly important, really), while I have not generally participated much in online discussions over the years, I actually started using Python about the same time you did, and have watched it evolve over a similar timespan..
[large snip] Ok, yes, it was technically possible with ugly hacks to do it, but I don't consider that to be really the same as "supported by the language". The only way you could do it was by making an object "fake" a random-access interface that actually likely did the wrong thing if accessed randomly. That is not a separation of concepts, that is a twisting of one concept in an incompatible way into the semblance of another as a workaround to the fact that the two concepts are not actually distinct in the language design. Once again, you're confusing "how it's thought of in the particular programmer's mind" with "how the language is actually designed", which are not the same thing. Back to unpack and informing the source as to the number n of items
expected to be requested.
Case 1. The desired response of the source to n is generic.
If I'm reading you right, I believe this is basically equivalent to "the consumer decides what behavior is most useful". In this case, yes, the correct place to do this is on the consumer side, which is not what __unpack__ is intended to be for anyway (fairly obviously, since it is not a consumer-side change). (For what it's worth, you seem to do a decent job here of arguing against your own propositions to change the unpack syntax, though.. Not sure if that was intentional..)
Case 2. The desired response is specific to a class or even each instance.
[...] Solution: write a method, call it .unpack(n), that returns an iterator that
This solution works fine, in the very restrictive case where the programmer knows exactly what type of object they've been given to work with, is aware that it provides this interface, and knows that it's important in that particular case to use it for that particular type of object. If the consumer wants to be able to use both this type of object and other ones that don't provide that interface, then their code suddenly becomes a lot more complicated (having to check the type, or methods of the object, and conditionally do different things). Even if they decide to do that, since there is no established standard (or even convention) for this sort of thing, they then run the risk of being given an object which has an "unpack" method with a completely different signature, or worse yet an object which defines "unpack" that does some completely different operation, unrelated to this ad-hoc protocol. Any time somebody produces a new "smart unpackable" object that doesn't work quite the same as the others (either deliberately, or just because the programmer didn't know that other people were already doing the same thing in a different way), it is quite likely that all of the existing consumer code everywhere will have to be rewritten to support it, or (much more likely) it will be supported haphazardly some places and not others, leading to inconsistent behavior or outright bugs. Even ignoring all of this, it still isn't possible to write an object which has advanced unpacking behavior that works with any existing unpacking consumers, such as libraries or other code not under the object-designer's control. In short, yes, it solves things, in a very limited, incompatible, painful, and largely useless way. Comment 2: we are not disagreeing that people might want to do custom
count-dependent disassembly or that they should be able to do so. It can already be done.
I disagree that it can already be done in the manner I'm describing, as I've explained. There is, frankly, no existing mechanism which allows an unpacking-producer to do this sort of thing in a standard, consistent, and interchangeable way, and there is also no way at all to do it in a way that is compatible with existing consumer code.
This is not a matter of opinion. It is a *fact* that there is a difference in this case. The difference is, quite simply, that through the use of the unpacking construct, the programmer has given the Python interpreter additional information (which the interpreter is not providing to the producing object). The disagreement appears to be that you believe for some reason it's a good thing to silently discard this information so that nobody can make use of it, whereas I believe it would be beneficial to make it available for those who want to use it. Again, I feel compelled to point out that as far as I can tell, your entire objection on this count boils down to "unpacking must always be the same thing as iteration because that's the way I've always thought about it". Unpacking *currently* uses iteration under the covers because it is a convenient interface that already exists in the language, but there is absolutely no reason why unpacking *must* inherently be defined as the same thing as iteration. You talk about it as if this is a foregone conclusion, but as far as I can tell it's only foregone because you've already arbitrarily decided it to be one way and just won't listen to anybody suggesting anything else. Alternately, if you just can't manage to get past this "unpacking must mean iteration" thing, then don't look at this as a change to unpacking. Look at it instead as an extension to the iterator protocol, which allows an iteration consumer to tell the iteration producer more information about the number of items they are wanting to obtain from it. Heck, I actually wouldn't be opposed to making this a general feature of iter(), if it could be done in a backwards-compatible way.. 2. general usefulness: you want .unpack to be standardized and made a
special method. I think it is inherently variable enough and and the need rare enough to not justify that.
I think I already spoke to this above.. Put simply, if it is not standardized and utilized by the corresponding language constructs, it is essentially useless as a general-purpose solution and only works in very limited cases. Your alternative solutions just aren't solutions to the more general problems. --Alex

On 2/24/2013 1:24 AM, Guido van Rossum wrote:
I considered $ for $top, but I was sure you would not want to waste $ in this. In my original suggestion, I also suggested None, as in

    a, b, c, None = source

as another possibility to signal "I want source iteration to stop even if there are more items, and I don't want an exception raised even if there are, because I think I know what I am doing and I do not want it to be considered a bug." Any other short spelling/signal would be fine, too.

-- Terry Jan Reedy

24.02.2013 07:24, Guido van Rossum wrote: [...]
And to me the ... Looks too much like something that consumes the rest, rather than leaving it.
To me too. And we already have the syntax for consuming the rest:

    a, b, *seq = iterable

But maybe it could be extended to include the following variant:

    a, b, *() = iterable

-- expressing the "leave the rest untouched" behaviour? Doesn't such an empty-tuple-literal-like syntax suggest strongly enough that no items are consumed?

Cheers.
*j

On Mon, Feb 25, 2013 at 7:45 AM, Jan Kaliszewski <zuo@chopin.edu.pl> wrote:
I would've interpreted it as unpacking the rest of the iterable into () -- i.e., I'd assume it has the current behavior of failing if the rest of the iterable has anything at all. Of course, you can't unpack anything into (), because Python never had that syntax, but you get the idea. -- Devin

Devin Jeanpierre wrote:
-1, this is just as arbitrary as ... or a lone *. I prefer ... myself, because visually it has a low profile and doesn't draw undue attention to something that you're saying you don't care about. Maybe ... could be blessed as an official "don't care" assignment target, like _ in Haskell. That would make this usage less of a special case (although it would still be partially special, since it suppresses unpacking of the rest of the iterator). -- Greg

On 26/02/13 10:30, Greg Ewing wrote:
Please no. _ is already overloaded too much. In i18n contexts, _ is conventionally used as the function for getting display text. In the interactive interpreter, _ is used for the last result. And in source code, _ is often used by convention as a "don't care" variable. Turning _ into a magic "stop unpacking" symbol would break code that uses it as a "don't care" target when unpacking:

    spam, _, _, ham, eggs = stuff

and would be ambiguous when there is an underscore as the right-most target. Would it mean, unpack but I don't care about it, or don't unpack?

I'm still not seeing why this is important enough to add magic syntax. If you want to unpack the first five values from an iterator, without exhausting it, we already have some good solutions:

    spam, ham, eggs, cheese, tomato = (next(it) for i in range(5))

or

    spam, ham, eggs, cheese, tomato = itertools.islice(it, 5)

Neither of these are so confusing to read or onerous to use as to justify new magic syntax:

    spam, ham, eggs, cheese, tomato, $$$$ = it   # for some value of $$$

-- Steven

On 26/02/13 11:19, MRAB wrote:
/facepalm And so he did. Sorry about that Greg! But still, I think this does help demonstrate that symbols are not always the easiest to read at a glance. It's too easy for the eyes to slide right off them, unless they are as familiar as basic arithmetic. -- Steven

But maybe it could be extended to include the following variant:
a, b, *() = iterable
Python already supports this odd syntax

    a, b, *[] = iterable

because it interprets the [] not as an empty list, but as an empty "list of identifiers". Maybe it could be used for something useful.

BTW, the del syntax has the same "problem"

    del a, b, (c,), [d], []

João Bernardo wrote:
No, because it already has a meaning: there must be no more values left in the sequence.
BTW, the del syntax has the same "problem"
del a, b, (c,), [d], []
Or just

    [] = iterable

The surprising thing is that a special case seems to be made for ():
It's surprising because () and [] are otherwise completely interchangeable for unpacking purposes. -- Greg

26.02.2013 00:37, Greg Ewing wrote:
Indeed, I didn't know that. :-|
Not entirely...

    >>> a, b, [c, d] = 1, 2, (3, 4)
    >>> a, b, (c, d) = 1, 2, (3, 4)

OK.

    >>> a, b, *[c, d] = 1, 2, 3, 4
    >>> a, b, *(c, d) = 1, 2, 3, 4

OK as well -- *but*:

    >>> a, b, [] = 1, 2, []  # it's ok
    >>> a, b, () = 1, 2, ()  # but it's not
      File "<stdin>", line 1
    SyntaxError: can't assign to ()

...and:

    >>> a, b, *[] = 1, 2  # it's ok
    >>> a, b, *() = 1, 2  # but it's not
      File "<stdin>", line 1
    SyntaxError: can't assign to ()

Strange... (inb4: http://www.youtube.com/watch?v=5IgB8XKR23c#t=118s ).

*j

2013/2/25 Greg Ewing <greg.ewing@canterbury.ac.nz>
Why have two things with the same meaning?

    a, b = iterable
    a, b, *[] = iterable

Both are the same... The *[] thing is about 100% useless right now. And the empty "list" syntax is so little used that it was possible to segfault PyPy (https://bugs.pypy.org/issue1364) by putting that in a for loop.
BTW, the del syntax has the same "problem"
del a, b, (c,), [d], []
Or just
That's what I'm saying... [] and () during unpacking aren't the same thing when they normally should be... The same happens with the "del" syntax!

One can think about using this to check whether an iterable is empty, raising an error if it is not:

    [] = iterable

But

    *[], = iterable

has the exact same meaning and is very cryptic.

BTW: This is the code golf winner to make a list of chars:

    *s, = 'string'

On Sunday, February 24, 2013 7:09:28 PM UTC-8, Random832 wrote:
Well, primarily for two reasons:

1. It does not actually do anything to support smart unpackables (i.e. objects that want to be able to produce different values depending on how many items they're asked to produce). This might be useful for the "infinite iterator" case (and I think somebody proposed it for that earlier), but that's a completely different issue than smart-unpacking..

2. More importantly, it would almost certainly break some existing code that relies on the fact that the contents of the extended-unpacking-term will always be a list, which would be bad.

--Alex
participants (20)
- Alex Stewart
- Andrew Barnert
- Bruce Leban
- Chris Angelico
- Devin Jeanpierre
- Eric Snow
- Greg Ewing
- Guido van Rossum
- Jan Kaliszewski
- João Bernardo
- Larry Hastings
- Masklinn
- MRAB
- Nick Coghlan
- Oscar Benjamin
- Random832
- Steven D'Aprano
- Terry Reedy
- Vito De Tullio
- Wolfgang Maier