syntactic shortcut - unpack to variably sized list

Hi,
As far as I can tell from the archive, this has not been discussed before. This is the second time in less than a week that I have stumbled over the rather clumsy syntax for extracting some elements of a sequence and at the same time removing them from the sequence:
    L = 'a b 1 2 3'.split(' ')
    a,b,L = L[0], L[1], L[2:]
I think it would be nice if the following was legal:
Today, if the number of variables on the two sides of the equals sign doesn't match, an exception is raised (for Google reference):
    ValueError: unpack list of wrong size
This new syntax is very similar to the special parameter in function definitions that catches all excess arguments, as in
    def func(p, --> *args <--, **kw):
and so the semantics should be what everyone expects.
Then, if we leave this limiting analogy with the *args parameter and allow the catch-the-rest variable to be anywhere in the left-hand side, we arrive at a splice syntax that reminds me a little of how Prolog deals with lists:
I believe this would make a nice addition to the language.
(Please cc me in response as I'm not subscribed to py-dev.)
...johahn
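For readers following along on Python 3 (an assumption; the thread itself concerns Python 2.4/2.5, where this is a syntax error), starred assignment targets were later accepted into the language, so the semantics described above can be tried directly. The sketch below only illustrates the catch-the-rest behaviour; note that it binds a new list and does not modify the original sequence.

```python
# Assumes Python 3, where starred assignment targets are legal syntax.
L = 'a b 1 2 3'.split(' ')

# Catch-the-rest target at the end, mirroring *args in a function signature.
a, b, *rest = L

# The starred target may also appear elsewhere on the left-hand side.
first, *middle, last = L
```

The starred name always receives a list, even when the right-hand side is a tuple or another iterable.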

On Nov 11, 2004, at 4:50 PM, Johan Hahn wrote:
Oh, this has been discussed before. It's come up quite a number of times, in fact. But it's been rejected every time.
FWIW [which is very little ;)], I agree: this would be nice. I'm strongly for consistency and orthogonality: there should be few distinct concepts in the language, each of which works in many places. Allowing * only in a function call but not in other tuple-constructing situations violates this concept, IMO.
It would have been great if this had been added at the same time as * in function calls. Adding it now may be more trouble than it's worth, though.
Two threads on the topic, with rejections by Guido:
http://mail.python.org/pipermail/python-dev/2002-November/030349.html
http://mail.python.org/pipermail/python-dev/2004-August/046684.html
Guido's rejection from http://mail.python.org/pipermail/python-dev/2004-August/046794.html:
James

Johan Hahn wrote:
As James says, this has been discussed before. Typically, you don't need the rest list, so you write
    a,b = 'a b 1 2 3'.split(' ')[:2]
If you do need the rest list, it might be best to write
    L = 'a b 1 2 3'.split(' ')
    a = L.pop(0)
    b = L.pop(0)
Whether this is more efficient than creating a new list depends on the size of the list; so you might want to write
    L = 'a b 1 2 3'.split(' ')
    a = L[0]
    b = L[1]
    del L[:2]
This is entirely different from function calls, which you simply cannot spread over several statements.
Regards, Martin
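The two in-place alternatives above can be put side by side as small helpers. This is only a sketch: the function names take_two_by_pop and take_two_by_del are made up for illustration and do not come from the thread.

```python
def take_two_by_pop(words):
    """Remove and return the first two items with two pop(0) calls."""
    a = words.pop(0)
    b = words.pop(0)
    return a, b


def take_two_by_del(words):
    """Read the first two items, then delete them with one slice deletion."""
    a, b = words[0], words[1]
    del words[:2]
    return a, b


L1 = 'a b 1 2 3'.split(' ')
L2 = 'a b 1 2 3'.split(' ')
pair1 = take_two_by_pop(L1)
pair2 = take_two_by_del(L2)
```

Both variants leave the remaining items in the same list object, which is the property the slicing idiom a,b,L = L[0], L[1], L[2:] lacks.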

Johan Hahn wrote:
Hmm - I just had a thought about this. Is it worth adding a "len" argument to list.pop? (The idea was inspired by Martin's use of list.pop to handle the above case.)
With that change, the above example would become:
    a, b = L.pop(0, 2)
At the moment, list.pop is described as equivalent to:
    x = L[i]; del L[i]; return x
With this change, it would be:
    x = L[i:i+n]; del L[i:i+n]; return x
By default, n = 1, so the standard behaviour of list.pop is preserved.
Cheers, Nick.
--
Nick Coghlan | Brisbane, Australia
Email: ncoghlan@email.com | Mobile: +61 409 573 268

Nick Coghlan wrote:
x = L[i:i+n]; del L[i:i+n]; return x
By default, n = 1, so the standard behaviour of list.pop is preserved.
This default would actually change the standard behaviour: whereas it now returns a single element, it would then return a list containing the single element. Regards, Martin

Martin v. Löwis wrote:
Ah, good point. I'd see two possible fixes to that:
a) have n=0 be the default, and mean "give me the element, not a list with 1 element" (L.pop(0,1) would then mean "give me a list containing only the first element"). That's a little magical for my taste, though.
b) have a new method called 'extract' or 'poplist' or 'popslice' or similar (with the behaviour given above)
Actually, if we went with the b) option, and the name 'popslice', I would suggest the following signature:
    list.popslice(start=0, end, step=1)
i.e. L.popslice(a, b, c) is to L[a:b:c] as L.pop(a) is to L[a]
And, returning once again to the OP's example, we would have:
    a, b = L.popslice(2)
Cheers, Nick.

Nick Coghlan wrote:
Heh - turns out there is an actual slice() constructor with that start/end/step signature (I've never used it, so I didn't even know it was there).
Anyway, *if* we do anything about this (cleaner unpacking & removal of part of a list), I think 'popslice' or something similar is the right way to address it (rather than monkeying with syntax, which would only address slices at the start of the list anyway).
Any of the "a, b, *c = L" proponents want to come up with a candidate 'list.popslice' patch for 2.5? The implementation shouldn't be too hard - the "x = L[a:b:c]; del L[a:b:c]; return x" description should suffice for an implementation, too (using the C API, rather than actual Python, though).
Cheers, Nick.
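The "x = L[a:b:c]; del L[a:b:c]; return x" description can be tried out in pure Python before anyone writes C. The sketch below is a free function rather than a list method, and its argument handling (explicit start, stop, step, all defaulting to None) is an assumption, not the signature proposed in the thread.

```python
def popslice(seq, start=None, stop=None, step=None):
    """Remove and return seq[start:stop:step] (hypothetical helper)."""
    index = slice(start, stop, step)
    removed = seq[index]   # read the slice first
    del seq[index]         # then delete the same elements in place
    return removed


L = 'a b 1 2 3'.split(' ')
a, b = popslice(L, 0, 2)

# Extended slices work too, because list deletion accepts a step.
nums = list(range(10))
evens = popslice(nums, 0, None, 2)
```

After these calls L holds only the tail and nums holds only the odd numbers, so the helper covers removal from anywhere in the list, not just the front.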

Hello Nick, On Sat, Nov 13, 2004 at 10:05:23PM +1000, Nick Coghlan wrote:
At the moment, list.pop is described as equivalent to:
x = L[i]; del L[i]; return x
Then the documentation is not exact: if it were really equivalent to the above definition, then we could already do today:
    a, b = L.pop(slice(0,2))
because starting from Python 2.3, L[slice(0,2)] is equivalent to L[0:2].
So who's wrong: the documentation of list.pop() or its implementation? It would be nicely regular if we could pop slices.
Armin
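The mismatch is easy to reproduce. In the sketch below, documented_pop is a made-up name for the three-statement equivalence quoted from the docs; the built-in list.pop is then called with the same slice. The TypeError is what CPython raises for a non-integer index as far as I know; the exact message varies between versions, so it is not shown.

```python
def documented_pop(seq, i):
    """The documented equivalence: x = L[i]; del L[i]; return x."""
    x = seq[i]
    del seq[i]
    return x


# The documented recipe accepts a slice object without complaint.
L = 'a b 1 2 3'.split(' ')
a, b = documented_pop(L, slice(0, 2))

# The real method does not: it insists on an integer index.
M = 'a b 1 2 3'.split(' ')
try:
    M.pop(slice(0, 2))
except TypeError as exc:
    builtin_error = exc
else:
    builtin_error = None
```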

Armin Rigo wrote:
So who's wrong: the documentation of list.pop() or its implementation? It would be nicely regular if we could pop slices.
That's an interesting observation. I was somewhat concerned that this kind of overloading might be confusing, but given that it is even already documented (with nobody noticing), I'm now very much in favour of allowing slices to be popped.
Of course, whoever implements the change should make sure that slices are applicable consistently everywhere the documentation says they should be.
Regards, Martin

Martin v. Löwis wrote:
Random thought: the two uses of "[" and "]" in the sequence docs can get a little confusing. . . or maybe I'm just tired ;)
Anyway, the sequence and mutable sequence sections of the documentation don't reveal anything other than list.pop(). It seems to be the only normal method that accepts an index as an argument. Everything else looks to be dealt with via standard subscripts and the associated magic methods:
    Retrieval: x = L[i]   | x = L.__getitem__(i)
    Setting:   L[i] = x   | L.__setitem__(i, x)
    Deletion:  del L[i]   | L.__delitem__(i)
So it looks like list.pop simply got left out because it wasn't a magic method, and the idea of permitting extended slicing for it just didn't come up (well, until now).**
Anyway, if this is implemented, array.pop and UserList.pop should allow slices, too, as they are described as working like list.pop. The other container classes I've checked are either immutable, or don't allow indexed access beyond the basics (string, unicode, set, deque).
Cheers, Nick.
** Some items of possible historical interest:
http://www.python.org/doc/2.3.4/whatsnew/section-slices.html
http://sourceforge.net/tracker/?group_id=5470&atid=305470&aid=400998&func=detail
http://mail.python.org/pipermail/python-dev/2002-May/023874.html

Nick Coghlan wrote:
This isn't really true:
    s.index(x[, i[, j]])   return smallest k such that s[k] == x and i <= k < j
    s.insert(i, x)         same as s[i:i] = [x]
However, I don't think this naturally extends to slices: for index, there is no <= relationship for slices (at least not a natural one), and for insert, you cannot use slices as the start and end index of a slice.
Anyway, if this is implemented, array.pop and UserList.pop should allow slices, too, as they are described as working like list.pop.
Right. For UserList, it probably falls out naturally. Regards, Martin

Martin v. Löwis wrote:
Yes. Those two were on my list initially, but then I tried to figure out how using a slice would actually *work* for them. At which point, I took them back off the list - slice arguments just didn't make any sense. So I think we're down to two things to implement - list.pop and array.pop. As you say, UserList.pop should just be a different way of spelling list.pop. Cheers, Nick.

On Thu, 11 Nov 2004 22:50:21 +0100, Johan Hahn <johahn@home.se> wrote:
I am really late on this thread, but anyway, I've come up with another approach to solve the problem using iterators. It uses an iterator that is guaranteed to always return a fixed number of elements, regardless of the size of the sequence; when it finishes, it returns the tail of the sequence as the last argument. This is a simple-minded proof of concept, and it's surely highly optimizable in at least a hundred different ways :-)
    def iunpack(seq, times, defaultitem=None):
        for i in range(times):
            if i < len(seq):
                yield seq[i]
            else:
                yield defaultitem
        if i < len(seq):
            yield seq[i+1:]
        else:
            yield ()
Usage is as follows:
As it is, it fails if the requested number of elements is zero, but this is not a real use case for it anyway. But the best part is yet to come. Because of the way Python implicitly packs & unpacks tuples, you can use it *without* calling tuple():
The only catch is that, if you have only one parameter, then all you will get is the generator itself.
But that's a corner case, and not the intended use anyway. Besides that, there is an issue regarding the 'times' parameter; whether it should return 'times' items plus the tail part, or 'times-1' items and the tail part. I think that it's fine the way it is. -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com
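A small illustration of the behaviour described in this message, using the proof-of-concept generator exactly as posted. The sample values are invented for the example and are not the ones from the original post.

```python
def iunpack(seq, times, defaultitem=None):
    # Proof-of-concept from the message above, reproduced unchanged.
    for i in range(times):
        if i < len(seq):
            yield seq[i]
        else:
            yield defaultitem
    if i < len(seq):
        yield seq[i+1:]
    else:
        yield ()


# Tuple unpacking consumes the generator directly; no tuple() call needed.
a, b, rest = iunpack('a b 1 2 3'.split(' '), 2)

# A sequence that is too short is padded with defaultitem,
# and the tail comes back as an empty tuple.
x, y, z, tail = iunpack((1, 2), 3)

# The acknowledged limitation: times == 0 fails because the
# loop variable i is never bound.
try:
    list(iunpack([1, 2], 0))
except NameError as exc:  # UnboundLocalError is a subclass of NameError
    zero_error = exc
```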

Carlos Ribeiro wrote:
So the original example becomes:
    a, b, L = itertools.iunpack(L, 2)
(and the 2 is an element count, not an index, so, as you say, there's no need to subtract 1)
That's certainly quite tidy, but it has the downside that it still copies the entire list as happens in the OP's code (the copying is hidden, but it still happens - the original list isn't destroyed until the assignment of the third value returned by the iterator).
It's also likely to require a trip to the docs to find out how iunpack works, and it doesn't fare well in the 'discovery' category (that is, the solution to what should be a fairly basic list operation is tucked away in itertools).
If list.pop gets updated to handle slice objects, then it can modify the list in place, and avoid any copying of list elements. "a, b = L.pop(slice(2))" should be able to give even better performance than "a = L.pop(0); b = L.pop(0)" (which is, I believe, the only current approach that avoids copying the entire list). And the list operation stays where it belongs - in a method of list.
Cheers, Nick.

On Thu, 18 Nov 2004 21:07:34 +1000, Nick Coghlan <ncoghlan@iinet.net.au> wrote:
list.pop doesn't solve the case for when the data is stored in a tuple (which is immutable). Also, a C implementation (of the type that would be done for itertools inclusion) would certainly solve some of the performance issues. As far as the documentation being tucked away in itertools, I don't see that as a problem; in my opinion, itertools already holds a good deal of utility functions which deserve better study by novices (on a near par with builtins). And finally, I believe that the name iunpack is already quite obvious about what it means.
BTW, just because map, zip, enumerate, filter, etc. are in builtins doesn't mean that they are inherently more useful, or pythonic, than the functions in itertools. It's just that itertools is a relatively late addition to the pack; I hope it becomes more prominently adopted as people get used to it.
-- Carlos Ribeiro

Carlos Ribeiro wrote:
list.pop doesn't solve the case for when the data is stored in a tuple (which is immutable).
For immutable objects, you *have* to make a copy, so I don't see any real downside to just using:
    a, b, T = T[0], T[1], T[2:]
While I think iunpack is kind of neat, I don't think it solves anything which is currently a major problem, as it is really just a different way of spelling the above slicing. The major portion (sans some index checking) of iunpack(T, 2) can be written on one line:
    a, b, T = (T[i] for i in (range(2) + [slice(2, None)]))
When the number of elements to be unpacked is known at compile time (as it has to be to use tuple unpacking assignment), there seems little benefit in moving things inside a generator instead of spelling them out as a tuple of slices.
However, the OP's original question related to a list, where the goal was to both read the first couple of elements *and* remove them from the list (i.e. 'pop'ing them). This is similar to the motivating use cases which came up the last time the "a, b, *c = L" idea was raised.
At the moment, the main choices are to use the syntax above (and make an unnecessary copy), make multiple calls to list.pop (with associated function overhead), or call del directly after reading the slice you want (i.e. doing a list.pop of a slice in Python code).
According to the current docs, L.pop(i) is defined as being equivalent to:
    x = L[i]; del L[i]; return x
And Armin pointed out that setting "i = slice(<whatever>)" actually causes that stated equivalence to break:
So, we could either fix the docs to explicitly exclude slices from list.pop, or just fix the method so slices work as the argument. The latter is a little more work, but it does provide a nice way of spelling "read and remove" for a range of elements. Which is why I'm hoping one of the proponents of "a, b, *c = L" will be willing to provide an implementation.
Cheers, Nick.
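The one-line spelling of iunpack(T, 2) quoted in this message relies on range() returning a list, which holds for the Python 2 of the thread. A sketch of the same idea that also runs on Python 3 (an assumption about the reader's interpreter) wraps range() in list() first; the name indices is introduced only for readability.

```python
T = ('a', 'b', 1, 2, 3)

# Two integer indices followed by one slice object covering the tail.
indices = list(range(2)) + [slice(2, None)]

# Subscripting with each entry yields the two items, then the rest.
a, b, rest = (T[i] for i in indices)
```

Because T is a tuple, rest is a new tuple and T itself is untouched, which is the copying behaviour the message describes as unavoidable for immutable objects.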

On Fri, 19 Nov 2004 20:29:45 +1000, Nick Coghlan <ncoghlan@iinet.net.au> wrote:
(First of all, let me say that I am not at all against making list.pop accept slices. It's nice that it seems to be agreed that this is a worthy addition. And I don't intend to keep pushing for iunpack(); but I feel that it is important to make a few clarifications.)
If I want to do:
    a,b,X = T[0], T[1], T[2:]  (?)
The lesson here is that it can't be assumed for unpacking purposes that the programmer wanted to modify the original list. Of course, if he does want that, then list.pop comes in handy. (Also, list.pop(slice) can be used to remove elements from anywhere in the list, which is a big plus.)
I would rather prefer not to have to use this idiom -- it's neither obvious nor convenient. Either list.pop or unpack would be cleaner for this purpose.
For more than a few arguments, it seems silly to require the user to write it as:
    a,b,c,d,e = t[0],t[1],t[2],t[3],t[4:]
[lots of remarks about list.pop snipped]
I'm not against list.pop; it's just that iunpack provides a different approach to the problem. It's fairly generic: it works for both mutable and immutable sequences. The implementation provided is a proof of concept. The fact that it does not modify the original arguments *is* a design feature (although not what the OP wanted for his particular case). After all, as shown in the example above, conventional tuple unpacking on assignment doesn't change the right-side arguments either.
One possible improvement for iunpack() is to accept any iterable as an argument. On return, instead of a tuple to represent the remaining elements, it would return the generator itself. This would allow code that consumes the generator in "chunks" to be written rather simply:
    a,b,c,g = iunpack(g, 3)
In this case, to solve the OP's problem, instead of having all the values to be processed stored in a list, he could rely on a generator to produce values on the fly. Depending on the scenario this would come in handy (less memory consumption, more responsiveness).
-- Carlos Ribeiro

"Carlos Ribeiro" <carribeiro@gmail.com> wrote in message news:864d37090411190724440dfe38@mail.gmail.com...
and also impossible.
but this does require that return values be captured in a temporary first. This issue has come up enough that a PEP would be useful (if there isn't one already). Terry J. Reedy

Terry Reedy wrote:
Hmm. . .
    >>> t = range(10)
    >>> a, b, c, d, e = t[:4] + [t[4:]]
    >>> a, b, c, d, e
    (0, 1, 2, 3, [4, 5, 6, 7, 8, 9])
So this actually can scale (sort of) to longer tuples.
Anyway, I don't have any real objection to iunpack. If it was designed to work on any iterator (return the first few elements, then return the partially consumed iterator), I'd be +1 (since, as Carlos pointed out, slicing doesn't work for arbitrary iterators).
Something like:
    >>> def iunpack(itr, numitems, defaultitem=None):
    ...     for i in range(numitems):
    ...         try:
    ...             yield itr.next()
    ...         except StopIteration:
    ...             yield defaultitem
    ...     yield itr
    ...
    >>> g = (x for x in range(10))
    >>> a, b, c, d, e = iunpack(g, 4)
    >>> a, b, c, d, e
    (0, 1, 2, 3, <generator object at 0xa0cd02c>)
    >>> a, b, c, d, e = iunpack(g, 4)
    >>> a, b, c, d, e
    (4, 5, 6, 7, <generator object at 0xa0cd02c>)
    >>> a, b, c, d, e = iunpack(g, 4)
    >>> a, b, c, d, e
    (8, 9, None, None, <generator object at 0xa0cd02c>)
Cheers, Nick.
I think we're now to the stage of violently agreeing with each other. . .
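The session above uses the Python 2 spelling itr.next(). A sketch of the same generator for Python 3 (an assumption about the reader's interpreter) only needs the built-in next(); the chunk-collecting loop and the names chunks and last_rest are added here for illustration.

```python
def iunpack(itr, numitems, defaultitem=None):
    """Yield numitems values from itr (padded with defaultitem), then itr itself."""
    for _ in range(numitems):
        try:
            yield next(itr)          # Python 3 spelling of itr.next()
        except StopIteration:
            yield defaultitem        # iterator ran dry: pad the result
    yield itr                        # hand back the partially consumed iterator


g = iter(range(10))
chunks = []
for _ in range(3):
    # Each pass consumes four more items from the same underlying iterator.
    a, b, c, d, last_rest = iunpack(g, 4)
    chunks.append((a, b, c, d))
```

The final element is the very iterator that was passed in, so repeated calls pick up where the previous one stopped.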

Terry Reedy wrote:
This issue has come up enough that a PEP would be useful (if there isn't one already).
Nope, I had a look. I've got a draft one attached, though - it combines rejecting the syntax change with adding the other changes Carlos and I have suggested.
If people think it's worth pursuing, I'll fire it off to the PEP editors.
Cheers, Nick.

PEP: XXX
Title: Unpacking sequence elements using tuple assignment
Version: $Revision: 1.4 $
Last-Modified: $Date: 2003/09/22 04:51:50 $
Author: Nick Coghlan <ncoghlan@email.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Nov-2004
Python-Version: 2.5
Post-History: 20-Nov-2004

Abstract
========

This PEP addresses the repeated proposal that "a, b, \*c = some_sequence"
be permissible Python syntax. The PEP does NOT propose that this syntax be
allowed. In fact, acceptance of this PEP will indicate agreement that the
above syntax will NOT become part of Python.

However, this PEP does suggest two additions that will hopefully reduce
the demand for the above syntax:

- modify ``list.pop`` and ``array.pop`` to also accept slice objects as
  the index argument

- add an ``itertools.iunpack`` function that accepts an arbitrary
  iterator, yields a specified number of elements from it, and then
  yields the partially consumed iterator.

Rationale
=========

The proposal to extend the tuple unpacking assignment syntax has now been
posted to python-dev on 3 separate occasions, and rejected on all 3
occasions [1]_.

The subsequent discussion on the most recent occasion yielded the two
suggestions documented in this PEP (modifying ``list.pop`` was suggested
by the PEP author, the ``itertools.iunpack`` function was suggested by
Carlos Ribeiro).

Modifying ``list.pop`` to accept slice objects as well as integers brings
it in line with the standard sequence subscripting methods
(``__getitem__``, ``__setitem__``, ``__delitem__``). It also makes
``list.pop`` consistent with its current documentation (the Python code
given as equivalent to ``list.pop`` accepts slice objects, but
``list.pop`` does not).

The modification of ``array.pop`` is mainly for consistency with the new
behaviour of ``list.pop``.

The ``itertools.iunpack`` function has the advantage of working for
arbitrary iterators, not just lists or arrays. However, it always yields
copies of the elements, while the ``pop`` methods are able to avoid
unnecessary copying.

Proposed Semantics
==================

``list.pop`` would be updated to conform to its documentation as being
equivalent to::

    x = L[i]
    del L[i]
    return x

In Python 2.4, the above equivalence does not hold if ``i`` is a slice
object. ``array.pop`` would be updated similarly.

``itertools.iunpack`` would be equivalent to the following::

    def iunpack(itr, numitems, defaultitem=None):
        for i in range(numitems):
            try:
                yield itr.next()
            except StopIteration:
                yield defaultitem
        yield itr

Reference Implementation
========================

As yet, no reference implementation is available for either part of the
proposal.

Open Issues
===========

- Should ``itertools.iunpack`` call ``iter()`` on its first argument?

References
==========

.. [1] python-dev archives of tuple unpacking discussions
   (http://mail.python.org/pipermail/python-dev/2002-November/030349.html)
   (http://mail.python.org/pipermail/python-dev/2004-August/046684.html)
   (http://mail.python.org/pipermail/python-dev/2004-November/049895.html)

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
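The open issue about calling ``iter()`` can be explored with a small variant of the draft's equivalent code. This is only a sketch of one possible answer, written for Python 3 (an assumption; the draft targets Python 2.5): it calls iter() on the first argument so that plain sequences are accepted, at the cost that the last value yielded is a fresh iterator rather than the object passed in. The parameter name iterable is changed from the draft's itr to reflect that.

```python
def iunpack(iterable, numitems, defaultitem=None):
    # Assumption: resolve the open issue in favour of calling iter().
    # iter() on an existing iterator returns that same iterator,
    # so callers that already pass iterators see no difference.
    itr = iter(iterable)
    for _ in range(numitems):
        try:
            yield next(itr)
        except StopIteration:
            yield defaultitem
    yield itr


# A plain list now works, which it would not without the iter() call.
a, b, rest = iunpack(['a', 'b', '1', '2', '3'], 2)
remaining = list(rest)

# An iterator passed in is still handed back as the same object.
g = iter(range(5))
x, same = iunpack(g, 1)
```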

On Sat, 20 Nov 2004 13:41:19 +1000, Nick Coghlan <ncoghlan@iinet.net.au> wrote:
I may be wrong (and if that's the case, I would like to be politely educated on why), but isn't it already at a PEP-worthy point? IOW, if people are not interested, then the PEP will simply be rejected, which is a good thing, because it will at least document the case. I also believe that the pre-PEP should be posted to the main Python list, where it may be beaten to death & flamed by a larger audience. I am willing to do it myself, but I assume it is more polite on my part if I ask you to do it :-) <PEP boilerplate & rationale snipped>
I thought the iunpack() code in the previous section would be acceptable as a 'reference implementation'.
+1
-- Carlos Ribeiro

Carlos Ribeiro wrote:
Good point - I've sent it to the PEP editors to request a number.
I'd wait to see what the PEP editors think, myself. I'm not entirely sure the idea is focused well enough to make a good PEP (I could argue either way, and I wrote the thing!). However, if you want to post it over there for discussion, feel free.
Unfortunately, itertools is a C module :P
We'll go with that then. . . y'know, I could have made that change before sending the PEP draft in. Ah well, we can change it later - I doubt the PEP will survive c.l.p. (or even py-dev) unscathed, anyway. Cheers, Nick.

On Nov 11, 2004, at 4:50 PM, Johan Hahn wrote:
Oh, this has been discussed before. It's come up quite a number of times, in fact. But, it's been rejected every time. FWIW [which is very little ;)], I agree: this would be nice. I'm strongly for consistency and orthogonality: there should be few distinct concepts in the language each of which works in many places. Allowing * only in an function call but not in other tuple-constructing situations violates this concept, IMO. It would have been great if this had been added at the same time as * in function calls. Adding it now may be more trouble than it's worth, though. Two threads on the topic, with rejections by Guido: http://mail.python.org/pipermail/python-dev/2002-November/030349.html http://mail.python.org/pipermail/python-dev/2004-August/046684.html Guido's rejection from http://mail.python.org/pipermail/python-dev/2004-August/046794.html:
James

Johan Hahn wrote:
As James says, this has been discussed before. Typically, you don't need the rest list, so you write a,b = 'a b 1 2 3'.split(' ')[:2] If you do need the rest list, it might be best to write L = 'a b 1 2 3'.split(' ') a = L.pop(0) b = L.pop(0) Whether this is more efficient than creating a new list depends on the size of the list; so you might want to write L = 'a b 1 2 3'.split(' ') a = L[0] b = L[1] del L[:2] This is entirely different from functions calls, which you simply cannot spread over several statements. Regards, Martin

Johan Hahn wrote:
Hmm - I just had a thought about this. Is it worth adding a "len" argument to list.pop? (The idea was inspired by Martin's use of list.pop to handle the above case). With that change, the above example would become: a, b = L.pop(0, 2) At the moment, list.pop is described as equivalent to: x = L[i]; del L[i]; return x with this change, it would be: x = L[i:i+n]; del L[i:i+n]; return x By default, n = 1, so the standard behaviour of list.pop is preserved. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268

Nick Coghlan wrote:
x = L[i:i+n]; del L[i:i+n]; return x
By default, n = 1, so the standard behaviour of list.pop is preserved.
This default would actually change the standard behaviour: whereas it now returns a single element, it would then return a list containing the single element. Regards, Martin

Martin v. Löwis wrote:
Ah, good point. I'd see two possible fixes to that: a) have n=0 be the default, and mean 'give me the element, not a list with 1 element" (L.pop(0,1) would then mean "give me a list containing only the first element"). That's a little magical for my taste, though. b) have a new method called 'extract' or 'poplist' or 'popslice' or similar (with the behaviour given above) Actually, if we went with the b) option, and the name 'popslice', I would suggest the following signature: list.popslice(start=0, end, step=1) i.e. L.popslice(a, b, c) is to L[a:b:c] as L.pop(a) is to L[a] And, returning once again to the OP's example, we would have: a, b = list.popslice(2) Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268

Nick Coghlan wrote:
Heh - turns out there is an actual slice() constructor with that start/end/step signature (I've never used it, so I didn't even know it was there). Anyway, *if* we do anything about this (cleaner unpacking & removal of part of a list), I think 'popslice' or something similar is the right way to address it (rather than monkeying with syntax, which would only address slices at the start of the list anyway). Any of the "a, b, *c = L" proponents want to come up with a candidate 'list.popslice' patch for 2.5? The implementation shouldn't be too hard - the "x = L[a:b:c]; del L[a:b:c]; return x" description should suffice for an implementation, too (using the C API, rather than actual Python, though). Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268

Hello Nick, On Sat, Nov 13, 2004 at 10:05:23PM +1000, Nick Coghlan wrote:
At the moment, list.pop is described as equivalent to:
x = L[i]; del L[i]; return x
Then the documentation is not exact: if it were really equivalent to the above definition, then we could already do today: a, b = L.pop(slice(0,2)) because starting from Python 2.3, L[slice(0,2)] is equivalent to L[0:2] So who's wrong: the documentation of list.pop() or its implementation? It would be nicely regular if we could pop slices. Armin

Armin Rigo wrote:
So who's wrong: the documentation of list.pop() or its implementation? It would be nicely regular if we could pop slices.
That's an interesting observation. I was somewhat concerned that this kind of overloading might be confusion, but given that it is even already documented (with nobody noticing), I'm now very much in favour of allowing to pop slices. Of course, whoever implements the change should make sure that slices are applicable consistently everwhere the documentation says they should be. Regards, Martin

Martin v. Löwis wrote:
Random thought: the two uses of "[" and "]" in the sequence docs can get a little confusing. . . or maybe I'm just tired ;) Anyway, the sequence and mutable sequence sections of the documentation don't reveal anything other than list.pop(). It seems to be the only normal method that accepts an index as an argument. Everything else looks to be dealt with via standard subscripts and the associated magic methods: Retrieval: x = L[i] | x = L.__getitem__(i) Setting: L[i] = x | L.__setitem__(i, x) Deletion: del L[i] | L.__delitem__(i, x) So it looks like list.pop simply got left out because it wasn't a magic method, and the idea of permitting extended slicing for it just didn't come up (well, until now).** Anyway, if this is implemented, array.pop and UserList.pop should allow slices, too, as they are described as working like list.pop. The other container classes I've checked are either immutable, or don't allow indexed access beyond the basics (string, unicode, set, deque). Cheers, Nick. ** Some items of possible historical interest: http://www.python.org/doc/2.3.4/whatsnew/section-slices.html http://sourceforge.net/tracker/?group_id=5470&atid=305470&aid=400998&func=detail http://mail.python.org/pipermail/python-dev/2002-May/023874.html -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268

Nick Coghlan wrote:
This isn't really true: s.index(x[, i[, j]]) return smallest k such that s[k] == x and i <= k < j s.insert(i, x) same as s[i:i] = [x] However, I don't think this naturally extends to slices: for index, there is no <= relationship for slices (atleast not a natural one), and for insert, you can use slices as start- and end-index of a slice.
Anyway, if this is implemented, array.pop and UserList.pop should allow slices, too, as they are described as working like list.pop.
Right. For UserList, it probably falls out naturally. Regards, Martin

Martin v. Löwis wrote:
Yes. Those two were on my list initially, but then I tried to figure out how using a slice would actually *work* for them. At which point, I took them back off the list - slice arguments just didn't make any sense. So I think we're down to two things to implement - list.pop and array.pop. As you say, UserList.pop should just be a different way of spelling list.pop. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268

On Thu, 11 Nov 2004 22:50:21 +0100, Johan Hahn <johahn@home.se> wrote:
I am really late on this thread, but anyway, I've come up with another approach to solve the problem using iterators. It uses iterator that is guaranteed to always return a fixed number of elements, regardless of the size of the sequence; when it finishes, it returns the tail of the sequence as the last argument. This is a simple-minded proof of concept, and it's surely highly optimizable in at least a hundred different ways :-) def iunpack(seq, times, defaultitem=None): for i in range(times): if i < len(seq): yield seq[i] else: yield defaultitem if i < len(seq): yield seq[i+1:] else: yield () Usage is as follows:
As it is, it fails if the requested number of elements is zero, but this is not a real use case for it anyway. But the best part is yet to come. Because of the way Python implicitly packs & unpacks tuples, you can use it *without* calling tuple():
The only catch is that, if you have only one parameter, then all you will get is the generator itself.
But that's a corner case, and not the intended use anyway. Besides that, there is an issue regarding the 'times' parameter; whether it should return 'times' items plus the tail part, or 'times-1' items and the tail part. I think that it's fine the way it is. -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com

Carlos Ribeiro wrote:
So the original example becomes:

    a, b, L = itertools.iunpack(L, 2)

(and the 2 is an element count, not an index, so, as you say, there's no need to subtract 1)

That's certainly quite tidy, but it has the downside that it still copies the entire list as happens in the OP's code (the copying is hidden, but it still happens - the original list isn't destroyed until the assignment of the third value returned by the iterator). It's also likely to require a trip to the docs to find out how iunpack works, and it doesn't fare well in the 'discovery' category (that is, the solution to what should be a fairly basic list operation is tucked away in itertools).

If list.pop gets updated to handle slice objects, then it can modify the list in place, and avoid any copying of list elements. "a, b = L.pop(slice(2))" should be able to give even better performance than "a = L.pop(0); b = L.pop(0)" (which is, I believe, the only current approach that avoids copying the entire list). And the list operation stays where it belongs - in a method of list.

Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268

On Thu, 18 Nov 2004 21:07:34 +1000, Nick Coghlan <ncoghlan@iinet.net.au> wrote:
list.pop doesn't solve the case for when the data is stored in a tuple (which is immutable). Also, a C implementation (of the type that would be done for itertools inclusion) would certainly solve some of the performance issues. As far as the documentation being tucked away into itertools, I don't see that as a problem; in my opinion, itertools already holds a good deal of utility functions which deserve better study by novices (on a near par with builtins). And finally, I believe that the name iunpack already makes it quite obvious what it means. BTW, just because map, zip, enumerate, filter, etc. are in builtins doesn't mean that they are inherently more useful, or pythonic, than the functions in itertools. It's just that itertools is a relatively late addition to the pack; I hope it becomes more prominently adopted as people get used to it. -- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com

Carlos Ribeiro wrote:
list.pop doesn't solve the case for when the data is stored in a tuple (which is immutable).
For immutable objects, you *have* to make a copy, so I don't see any real downside to just using:

    a, b, T = T[0], T[1], T[2:]

While I think iunpack is kind of neat, I don't think it solves anything which is currently a major problem, as it is really just a different way of spelling the above slicing. The major portion (sans some index checking) of iunpack(T, 2) can be written on one line:

    a, b, T = (T[i] for i in (range(2) + [slice(2, None)]))

When the number of elements to be unpacked is known at compile time (as it has to be to use tuple unpacking assignment), there seems little benefit in moving things inside a generator instead of spelling them out as a tuple of slices.

However, the OP's original question related to a list, where the goal was to both read the first couple of elements *and* remove them from the list (i.e. 'pop'ing them). This is similar to the motivating use cases which came up the last time the "a, b, *c = L" idea was raised. At the moment, the main choices are to use the syntax above (and make an unnecessary copy), make multiple calls to list.pop (with associated function overhead), or call del directly after reading the slice you want (i.e. doing a list.pop of a slice in Python code).

According to the current docs, L.pop(i) is defined as being equivalent to:

    x = L[i]; del L[i]; return x

And Armin pointed out that setting "i = slice(<whatever>)" actually causes that stated equivalence to break:
So, we could either fix the docs to explicitly exclude slices from list.pop, or just fix the method so slices work as the argument. The latter is a little more work, but it does provide a nice way of spelling "read and remove" for a range of elements. Which is why I'm hoping one of the proponents of "a, b, *c = L" will be willing to provide an implementation. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268
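The gap between the documented equivalence and the built-in method can be checked with a short sketch. The helper names `documented_pop` and `builtin_pop_accepts_slices` are hypothetical, introduced here only for illustration; the first is the three-statement equivalence quoted from the docs wrapped in a function, and the second simply probes whether `list.pop` on the running interpreter accepts a slice.

```python
def documented_pop(lst, i):
    """The equivalence quoted from the docs: x = L[i]; del L[i]; return x."""
    x = lst[i]
    del lst[i]      # removes in place; works for ints and slices alike
    return x


def builtin_pop_accepts_slices():
    """Return True if list.pop takes a slice on this interpreter."""
    try:
        [0, 1, 2].pop(slice(2))
    except TypeError:
        return False
    return True


L = 'a b 1 2 3'.split(' ')
a, b = documented_pop(L, slice(2))   # read and remove the first two items
```

After the last line, `L` holds only the remaining three strings, which is the "read and remove" spelling the proposed change would make available as `L.pop(slice(2))`.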

On Fri, 19 Nov 2004 20:29:45 +1000, Nick Coghlan <ncoghlan@iinet.net.au> wrote:
(First of all, let me say that I am not at all against making list.pop accept slices. It's nice that it seems to be agreed that this is a worthy addition. And I don't intend to keep pushing for iunpack(); but I feel that it is important to make a few clarifications.) If I want to do: a,b,X = T[0], T[1], T[2:] (?) The lesson here is that it can't be assumed for unpacking purposes that the programmer wanted to modify the original list. Of course, if he does want it, then list.pop comes in handy. (Also, list.pop(slice) can be used to remove elements from anywhere in the list, which is a big plus.)
I would rather prefer not to have to use this idiom -- it's neither obvious nor convenient. Either list.pop or unpack would be cleaner for this purpose.
For more than a few arguments, it seems silly to require the user to write it as:

    a,b,c,d,e = t[0],t[1],t[2],t[3],t[4:]

[lots of remarks about list.pop snipped]

I'm not against list.pop; it's just that iunpack provides a different approach to the problem. It's fairly generic: it works for both mutable and immutable sequences. The implementation provided is a proof of concept. The fact that it does not modify the original arguments *is* a design feature (although not what the OP wanted for his particular case). After all, as shown in the example above, conventional tuple unpacking on assignment doesn't change the right-side arguments either.

One possible improvement for iunpack() is to accept any iterable as an argument. On return, instead of a tuple to represent the remaining elements, it would return the generator itself. This would allow code that consumes the generator in "chunks" to be written rather simply:

    a,b,c,g = iunpack(g, 3)

In this case, to solve the OP's problem, instead of having all the values to be processed stored in a list, he could rely on a generator to produce values on the fly. Depending on the scenario this would come in handy (less memory consumption, more responsiveness).

-- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com

"Carlos Ribeiro" <carribeiro@gmail.com> wrote in message news:864d37090411190724440dfe38@mail.gmail.com...
and also impossible.
but this does require that return values be captured in a temporary first. This issue has come up enough that a PEP would be useful (if there isn't one already). Terry J. Reedy

Terry Reedy wrote:
Hmm. . .

>>> t = range(10)
>>> a, b, c, d, e = t[:4] + [t[4:]]
>>> a, b, c, d, e
(0, 1, 2, 3, [4, 5, 6, 7, 8, 9])

So this actually can scale (sort of) to longer tuples. Anyway, I don't have any real objection to iunpack. If it was designed to work on any iterator (return the first few elements, then return the partially consumed iterator), I'd be +1 (since, as Carlos pointed out, slicing doesn't work for arbitrary iterators). Something like:

>>> def iunpack(itr, numitems, defaultitem=None):
...     for i in range(numitems):
...         try:
...             yield itr.next()
...         except StopIteration:
...             yield defaultitem
...     yield itr
...
>>> g = (x for x in range(10))
>>> a, b, c, d, e = iunpack(g, 4)
>>> a, b, c, d, e
(0, 1, 2, 3, <generator object at 0xa0cd02c>)
>>> a, b, c, d, e = iunpack(g, 4)
>>> a, b, c, d, e
(4, 5, 6, 7, <generator object at 0xa0cd02c>)
>>> a, b, c, d, e = iunpack(g, 4)
>>> a, b, c, d, e
(8, 9, None, None, <generator object at 0xa0cd02c>)

Cheers, Nick. I think we're now to the stage of violently agreeing with each other. . . -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268
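The session above is Python 2 (`itr.next()`, and `range` returning a list). A near-literal port to Python 3 might look like the sketch below; the only changes are the built-in `next()` call and the throwaway loop variable, and the variable name `rest` is chosen here for illustration.

```python
def iunpack(itr, numitems, defaultitem=None):
    """Yield numitems values from an iterator, then the iterator itself."""
    for _ in range(numitems):
        try:
            yield next(itr)          # Python 3 spelling of itr.next()
        except StopIteration:
            yield defaultitem        # pad once the iterator is exhausted
    yield itr                        # hand back the partially consumed iterator


g = (x for x in range(10))
a, b, c, d, rest = iunpack(g, 4)     # rest is g, now positioned at 4
```

Because the last value yielded is the same iterator object, repeated calls walk through the data in chunks of four, as in the transcript.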

Terry Reedy wrote:
This issue has come up enough that a PEP would be useful (if there isn't one already).
Nope, I had a look. I've got a draft one attached, though - it combines rejecting the syntax change with adding the other changes Carlos and I have suggested. If people think it's worth pursuing, I'll fire it off to the PEP editors.

Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268

PEP: XXX
Title: Unpacking sequence elements using tuple assignment
Version: $Revision: 1.4 $
Last-Modified: $Date: 2003/09/22 04:51:50 $
Author: Nick Coghlan <ncoghlan@email.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-Nov-2004
Python-Version: 2.5
Post-History: 20-Nov-2004


Abstract
========

This PEP addresses the repeated proposal that "a, b, *c = some_sequence" be permissible Python syntax.

The PEP does NOT propose that this syntax be allowed. In fact, acceptance of this PEP will indicate agreement that the above syntax will NOT become part of Python.

However, this PEP does suggest two additions that will hopefully reduce the demand for the above syntax:

- modify ``list.pop`` and ``array.pop`` to also accept slice objects as the index argument

- add an ``itertools.iunpack`` function that accepts an arbitrary iterator, yields a specified number of elements from it, and then yields the partially consumed iterator.


Rationale
=========

The proposal to extend the tuple unpacking assignment syntax has now been posted to python-dev on 3 separate occasions, and rejected on all 3 occasions.[1]_

The subsequent discussion on the most recent occasion yielded the two suggestions documented in this PEP (modifying ``list.pop`` was suggested by the PEP author, the ``itertools.iunpack`` function was suggested by Carlos Ribeiro).

Modifying ``list.pop`` to accept slice objects as well as integers brings it in line with the standard sequence subscripting methods (``__getitem__``, ``__setitem__``, ``__delitem__``). It also makes ``list.pop`` consistent with its current documentation (the Python code given as equivalent to ``list.pop`` accepts slice objects, but ``list.pop`` does not).

The modification of ``array.pop`` is mainly for consistency with the new behaviour of ``list.pop``.

The ``itertools.iunpack`` function has the advantage of working for arbitrary iterators, not just lists or arrays. However, it always yields copies of the elements, while the ``pop`` methods are able to avoid unnecessary copying.


Proposed Semantics
==================

``list.pop`` would be updated to conform to its documentation as being equivalent to::

    x = L[i]
    del L[i]
    return x

In Python 2.4, the above equivalence does not hold if ``i`` is a slice object.

``array.pop`` would be updated similarly.

``itertools.iunpack`` would be equivalent to the following::

    def iunpack(itr, numitems, defaultitem=None):
        for i in range(numitems):
            try:
                yield itr.next()
            except StopIteration:
                yield defaultitem
        yield itr


Reference Implementation
========================

As yet, no reference implementation is available for either part of the proposal.


Open Issues
===========

- Should ``itertools.iunpack`` call ``iter()`` on its first argument?


References
==========

.. [1] python-dev archives of tuple unpacking discussions
   (http://mail.python.org/pipermail/python-dev/2002-November/030349.html)
   (http://mail.python.org/pipermail/python-dev/2004-August/046684.html)
   (http://mail.python.org/pipermail/python-dev/2004-November/049895.html)


Copyright
=========

This document has been placed in the public domain.


..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   End:
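The open issue in the draft (whether to call ``iter()`` on the first argument) decides whether plain sequences can be passed directly. The sketch below, in Python 3, shows one way the ``iter()``-calling variant could look. The name ``iunpack_iter`` is hypothetical, and the use of the two-argument form of ``next()`` in place of the try/except is a choice made for this sketch, not something the draft specifies.

```python
def iunpack_iter(iterable, numitems, defaultitem=None):
    """Variant of the draft's iunpack that accepts any iterable."""
    # iter() returns an iterator unchanged and wraps lists, tuples, strings, etc.
    itr = iter(iterable)
    for _ in range(numitems):
        # next(itr, default) returns the default instead of raising StopIteration.
        yield next(itr, defaultitem)
    yield itr


first, second, rest = iunpack_iter([1, 2, 3, 4], 2)
remaining = list(rest)   # drain what is left of the list iterator
```

Without the ``iter()`` call, passing a list would fail on the first ``next()``, since a list is iterable but is not itself an iterator. With it, the final value is a fresh iterator over the list rather than the list, so callers that want the tail as a list have to materialise it as in the last line.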

On Sat, 20 Nov 2004 13:41:19 +1000, Nick Coghlan <ncoghlan@iinet.net.au> wrote:
I may be wrong (and if that's the case, I would like to be politely educated on why), but isn't it already at a PEP-worthy point? IOW, if people are not interested, then the PEP will simply be rejected, which is a good thing, because it will at least document the case. I also believe that the pre-PEP should be posted to the main Python list, where it may be beaten to death & flamed by a larger audience. I am willing to do it myself, but I assume that it is more polite on my part if I ask you to do it :-) <PEP boilerplate & rationale snipped>
I thought the iunpack() code in the previous section would be acceptable as a 'reference implementation'.
+1
-- Carlos Ribeiro Consultoria em Projetos blog: http://rascunhosrotos.blogspot.com blog: http://pythonnotes.blogspot.com mail: carribeiro@gmail.com mail: carribeiro@yahoo.com

Carlos Ribeiro wrote:
Good point - I've sent it to the PEP editors to request a number.
I'd wait to see what the PEP editors think, myself. I'm not entirely sure the idea is focused well enough to make a good PEP (I could argue either way, and I wrote the thing!). However, if you want to post it over there for discussion, feel free.
Unfortunately, itertools is a C module :P
We'll go with that then. . . y'know, I could have made that change before sending the PEP draft in. Ah well, we can change it later - I doubt the PEP will survive c.l.p. (or even py-dev) unscathed, anyway. Cheers, Nick. -- Nick Coghlan | Brisbane, Australia Email: ncoghlan@email.com | Mobile: +61 409 573 268