Re: [Python-ideas] [Python-Dev] Proposal: new list function: pack

Yes, it's true that you can easily do the pack part with zip(*[iter(l)]*size), and you can do the sliding part with zip(*[l[i:len(l)-(slice-1-i)] for i in range(slice)]). You could also do both at once, but you get something more complicated. It's also true that with izip you could get an iterator. I use this pack function many times in my code, and it's more readable than the zip version. The remaining question is whether people really use this kind of function on lists, or whether it's just me (which is entirely possible).
paul bedaride
On Fri, Mar 20, 2009 at 3:07 PM, Isaac Morland <ijmorlan@uwaterloo.ca> wrote:
On Fri, 20 Mar 2009, paul bedaride wrote:
I propose a new list function for packing the values of a list and sliding over them:
then we can do things like this:
    for i, j, k in pack(range(10), 3, partialend=False):
        print i, j, k
I propose this because I often need pack and slide functions over lists, and this one combines the two as a generator.
See the Python documentation for zip():
http://docs.python.org/library/functions.html#zip
And this article in which somebody independently rediscovers the idea:
http://drj11.wordpress.com/2009/01/28/my-python-dream-about-groups/
Summary: except for the "partialend" parameter, this can already be done in a single line. It is not for me to say whether this nevertheless would be useful as a library routine (if only perhaps to make it easy to specify "partialend" explicitly).
It seems to me that sometimes one would want izip instead of zip. And I think you could get the effect of partialend=True in 2.6 by using izip_longest (except with an iterator result rather than a list).
    def pack(l, size=2, slide=2, partialend=True):
        length = len(l)
        for p in range(0, length-size, slide):
            def packet():
                for i in range(size):
                    yield l[p+i]
            yield packet()
        p = p + slide
        if partialend or length-p == size:
            def packet():
                for i in range(length-p):
                    yield l[p+i]
            yield packet()
Isaac Morland CSCF Web Guru DC 2554C, x36650 WWW Software Specialist
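The one-line idioms discussed in this message can be sketched as follows. This is a minimal illustration written for Python 3, where zip() is already lazy (the role izip played in 2.x) and izip_longest is spelled zip_longest; the names data, size, groups, windows and padded are illustrative and do not come from the thread. The sliding form relies on zip() stopping at the shortest input, so the explicit end index used in the thread is not needed.

```python
from itertools import zip_longest  # named izip_longest in Python 2.6

data = list(range(10))
size = 3

# Non-overlapping groups: the same iterator is repeated `size` times, so each
# output tuple pulls `size` consecutive items. A trailing partial group is dropped.
groups = list(zip(*[iter(data)] * size))

# Overlapping windows that slide one element at a time: zip the list against
# shifted copies of itself; zip() stops when the shortest slice runs out.
windows = list(zip(*[data[i:] for i in range(size)]))

# Roughly the effect of partialend=True: keep the last partial group,
# padded with None.
padded = list(zip_longest(*[iter(data)] * size))
```

The sliding form needs a sequence that supports slicing, whereas the grouping forms work on any iterable.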

This has been discussed before and rejected.
There were several considerations. The itertools recipes already include simple patterns for grouper() and pairwise() that are easy to use as primitives in your code or to serve as models for variants. The design of pack() itself is questionable. It attempts to be a Swiss Army Knife by parameterizing all possible variations (length of window, length to slide, and how to handle end-cases). This design makes the tool harder to learn and use, and it makes the implementation more complex. That complexity isn't necessary. Use cases would typically fall into grouper cases, where the window length equals the slide length, or into cases that slide one element at a time. You don't win anything by combining the two cases, except for making the tool harder to learn and use.
The pairwise() recipe could be generalized to larger windows, but that seemed like less of a good idea after closely examining potential use cases. For cases that used a larger window, there always seemed to be a better solution than extending pairwise(). For instance, a twenty-day moving average is better implemented with a deque(maxlen=20) and a running sum than with an iterator returning tuples of length twenty -- that approach does a lot of unnecessary work shifting elements in the tuple, turning an O(n) process into an O(m*n) process.
For short windows, like pairwise() itself, the issue is not one of total running time; instead, the problem is that almost every proposed use case was better coded as a simple Python loop, saving previous values with a step like: oldvalue = value. Having pairwise() or tripletwise() tended to be a distraction away from better solutions. Also, the pure Python approach was more general, as it allowed accumulations: total += value. While your proposed function has been re-invented a number of times, that doesn't mean it's a good idea. It is more an exercise in what can be done, not in what should be done.
Raymond
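The two alternatives described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from the thread: the function names moving_average and deltas are hypothetical, and the moving average assumes numeric input and reports a value only once the window is full.

```python
from collections import deque

def moving_average(values, window=20):
    """Yield the mean of the last `window` values, using a running sum."""
    recent = deque(maxlen=window)
    total = 0.0
    for value in values:
        if len(recent) == window:
            total -= recent[0]  # the oldest value is about to be evicted
        recent.append(value)    # deque(maxlen=...) drops the oldest item itself
        total += value
        if len(recent) == window:
            yield total / window

def deltas(values):
    """Yield differences between consecutive values with a plain loop."""
    it = iter(values)
    try:
        oldvalue = next(it)
    except StopIteration:
        return                  # empty input: nothing to compare
    for value in it:
        yield value - oldvalue
        oldvalue = value        # remember the previous value for the next step
```

Each input element is touched a constant number of times in moving_average, so the work stays O(n) regardless of the window length, which is the point made about the deque approach.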

Now that I have discovered itertools, I think you are right, but maybe the pack function could be renamed iwinslice (in the end, that's its real name) and added to itertools?
paul bedaride
On Fri, Mar 20, 2009 at 6:01 PM, Raymond Hettinger <python@rcn.com> wrote:

iwinslice() is just as bad a name as any of the others.
I have seen the equivalent of window(iterator, size=2, step=1), which works as you would expect (in both its output and its implementation), with size and step both limited to 5 (because if you are doing things with more than 5 items at a time, you probably really want something else, and in certain cases you can use multiple window calls to compose larger groups).
I'd be a -0 on the feature because, as Raymond says, it's trivial to implement with a deque. And as I've said before, not all x-line functions should be built-in.
- Josiah
On Fri, Mar 20, 2009 at 10:32 AM, paul bedaride <paul.bedaride@gmail.com> wrote:

Josiah Carlson wrote:
iwinslice() is just as bad of a name as any of the others.
I have seen the equivalent of window(iterator, size=2, step=1), which works as you would expect (both as the output, as well as the implementation), with size and step both limited to 5 (because if you are doing things with more than 5 items at a time...you probably really want something else, and in certain cases, you can use multiple window calls to compose larger groups).
Oops, I didn't realise this thread had moved over here, so I just repeated what yourself and Raymond said over on python-dev. Oh well...
I'd be a -0 on the feature, because as Raymond says, it's trivial to implement with a deque. And as I've said before, not all x line functions should be built-in.
That does raise the possibility of adding "iterator windowing done right" by including a deque-based implementation in itertools (or at least in the itertools recipes page). For example, the following continuously yields the same deque, but each time the contents represent a new window onto the underlying data:

    from collections import deque
    from itertools import islice

    def window(iterable, size=2, step=1, overlap=0):
        itr = iter(iterable)
        new_per_window = size - overlap
        contents = deque(islice(itr, 0, size*step, step), size)
        while True:
            yield contents
            new_data = list(islice(itr, 0, new_per_window*step, step))
            if len(new_data) < new_per_window:
                break
            contents.extend(new_data)

(There are other ways of doing it that involve less data copying, but the above way seems to be the most straightforward.)
Cheers,
Nick.
--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
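Because the generator above yields the same deque object each time, collecting its output with list() gives repeated references to one deque holding only the final window. A possible variant that yields an independent tuple snapshot per window is sketched below. It is an assumption-laden illustration, not something proposed in the thread: the name sliding_tuples is hypothetical, the step is fixed at one element, and inputs shorter than the window produce no output.

```python
from collections import deque
from itertools import islice

def sliding_tuples(iterable, size=2):
    """Yield overlapping windows of `size` items as tuples, sliding by one."""
    it = iter(iterable)
    win = deque(islice(it, size), maxlen=size)  # prime the first window
    if len(win) == size:
        yield tuple(win)
    for item in it:
        win.append(item)   # maxlen makes the deque drop its oldest item
        yield tuple(win)   # copy, so callers can safely keep each window
```

Copying into a tuple costs O(size) per window, which is the trade-off against yielding the shared deque.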