[Python-Dev] Proposal for a new itertools function: iwindow
Raymond Hettinger
raymond.hettinger at verizon.net
Sun May 28 05:04:11 CEST 2006
From: "Nick Coghlan" <ncoghlan at gmail.com>
> A python-dev Google search for "itertools window" found me your original
> suggestion to include Jack Diederich's itertools.window in Python 2.3 (which
> was only deferred because 2.3 was already past beta 1 at that point).
>
> I couldn't find any discussion of the idea after that (aside from your
> pointing out that you'd never found a real-life use case for the pairwise()
> recipe in the docs, which is a basic form of windowing).
>
> One option would be to add a windowing function to the recipes in the
> itertools docs. Something like:
>
>
> def window(iterable, window_len=2, window_step=1):
>     iterators = tee(iterable, window_len)
>     for skip_steps, itr in enumerate(iterators):
>         for ignored in islice(itr, skip_steps):
>             pass
>     window_itr = izip(*iterators)
>     if window_step != 1:
>         # islice() takes no keyword arguments; pass start/stop positionally
>         window_itr = islice(window_itr, None, None, window_step)
>     return window_itr
No thanks. The resolution of this one was that windowing iterables is not a
good idea. It is the natural province of sequences, not iterables. With
sequences, it is a matter of using an index and offset. With iterables, there
is a great deal of data shifting. Also note that some of the uses are subsumed
by collections.deque(). In implementing a draft of itertools window, we found
that the data shifting was unnatural and unavoidable (unless you output some
sort of buffer instead of a real tuple). Also, we looked at use cases and found
that most had solutions that were dominated by some other approach. The
addition of a windowing tool would tend to steer people away from better
solutions. In short, after much deliberation and experimenting with a sample
implementation, the idea was rejected. Hopefully, it will stay dead and no one
will start a crusade for it simply because it can be done and because it seems
cute.
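To illustrate the contrast drawn above, here is a minimal sketch (not code from the thread or the sample implementation; the helper names sequence_windows and running_window are hypothetical, and it is written for Python 3). With a sequence, a window is just an index plus an offset. With an arbitrary iterable, each item has to be pushed through a buffer, here a bounded collections.deque, and copied out to produce a real tuple:

```python
from collections import deque


def sequence_windows(seq, size=2, step=1):
    """Yield windows of a sequence using only an index and an offset."""
    for start in range(0, len(seq) - size + 1, step):
        yield seq[start:start + size]


def running_window(iterable, size=2):
    """Yield windows over any iterable by shifting data through a deque."""
    buf = deque(maxlen=size)  # the oldest item falls off as new ones arrive
    for item in iterable:
        buf.append(item)
        if len(buf) == size:
            # Copying the buffer into a tuple is the per-window shifting cost.
            yield tuple(buf)
```

The first helper never moves data around beyond the slice itself; the second has to rebuild a tuple for every window, which is the kind of shifting described above.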
The thought process was documented in a series of newsgroup postings:
http://groups.google.com/group/comp.lang.python/msg/026da8f9eec4becf
The SF history is less informative because most of the discussions were held by
private email:
http://www.python.org/sf/756253
If someone aspires to code some new itertools, I have approved two new ones,
imerge() and izip_longest(). The pure python code for imerge() is at:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/491285
The impetus behind izip_longest() was expressed in a newsgroup thread,
http://groups.google.com/group/comp.lang.python/browse_thread/thread/f424f63bfdb77c4/38c31991133757f7 .
The discussion elicited neither broad support nor condemnation. Also, the use
cases were sparse. Ultimately, I was convinced by encountering a couple of
natural use cases. Another thought is that while other solutions are usually
available for any given use case, there is a natural tendency to reach for a
zip-type tool whenever presented with lock-step iteration issues, even when the
inputs are of uneven length. Note that the signature for izip_longest() needs
to include an optional pad value (defaulting to None) -- there are plenty of use
cases where an empty string, zero, or a null object would be a preferred pad
value.
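As a rough illustration of the pad value, the sketch below uses the Python 3 spelling, itertools.zip_longest(), where the pad is passed as the fillvalue keyword (the tool shipped as izip_longest() in Python 2.6). The sample data is made up for the example:

```python
from itertools import zip_longest

# Hypothetical inputs of uneven length.
names = ["ann", "bob", "cy"]
scores = [90, 85]

# Default pad value is None.
padded_none = list(zip_longest(names, scores))

# A zero is a more natural pad when the short input holds numbers.
padded_zero = list(zip_longest(names, scores, fillvalue=0))

# An empty string suits text columns.
padded_text = list(zip_longest(["a", "b"], ["x"], fillvalue=""))
```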
Cheers,
Raymond