itertools comments [Was: Re: RELEASED: Python 2.3a2]

Thu Feb 20 10:53:39 EST 2003

Guido van Rossum <guido at python.org> writes:

> 
> - new module itertools: efficient looping constructs
> 

I've got one general comment on the new itertools (which I think really is a
great addition to the standard library) and a few suggestions for additions.

Wouldn't `xfilter`, `xmap` etc. be a better naming convention than
`ifilter`, `imap` etc.? I'd propose this change for two reasons:

1) the 'x' prefix is already used for `lazy` constructs (such as xrange and
   xreadlines) which are in pretty much every way equivalent to generators.

2) `i` is currently used to signify `in-place`, i.e. a destructive operation
   (e.g. __iadd__ etc.). It is often useful to have both a destructive and a
   nondestructive variant of a function (sort would be a prime example),
   and I can't think of another reasonable convention in Python than to prefix
   the destructive variant with i (for in-place).

   Admittedly, the most "pythonic" approach is to have just a single,
   destructive function that returns `None`, but this is often prohibitively
   inconvenient, so that countless destructive `sort` functions for lists
   have been written, leading to ambiguity. A convention that distinguishes
   destructive and nondestructive functions at least avoids that, e.g.:

    def isort(l, cmpfunc=None):
        if cmpfunc:
            l.sort(cmpfunc)
        else:
            l.sort()
        return l

    def sort(seq, cmpfunc=None):
        return isort(list(seq), cmpfunc=cmpfunc)
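
A minimal sketch of how the two variants behave, assuming a present-day
Python 3 interpreter. There `list.sort` no longer accepts a comparison
function, so the `functools.cmp_to_key` adapter below is an addition for
illustration and not part of the proposal itself:

```python
from functools import cmp_to_key


def isort(l, cmpfunc=None):
    """Sort `l` in place and return the same list object."""
    if cmpfunc:
        # Assumption: adapt the old-style comparison function for Python 3.
        l.sort(key=cmp_to_key(cmpfunc))
    else:
        l.sort()
    return l


def sort(seq, cmpfunc=None):
    """Return a new sorted list and leave `seq` untouched."""
    return isort(list(seq), cmpfunc=cmpfunc)


data = [3, 1, 2]
copy = sort(data)                  # nondestructive: works on a copy
untouched = (data == [3, 1, 2])    # the input is still in its original order
same = isort(data)                 # destructive: sorts `data` itself
descending = sort(data, lambda a, b: (a < b) - (a > b))
```

With the prefix convention the call site alone shows whether the argument is
modified.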

Finally, among the tools I've cooked up myself over time that aren't
already covered by what's in the new module, I find the following two
(especially the first) to be the most useful and worthy of addition:

def xgroup(iter, n=2):
    r"""Iterate `n`-wise (default pairwise) over `iter`.

    Examples:
    >>> list(xgroup(range(9), 3))
    [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
    """
    last = []
    for elt in iter:
        last.append(elt)
        if len(last) == n:
            yield tuple(last)
            last = []
    # Trailing items that do not fill a complete group are dropped.

def xwindow(iter, n=2, s=1):
    r"""Move an `n`-item (default 2) window `s` steps (default 1) at a time
    over `iter`.

    Examples:
    >>> list(xwindow(range(6), 2))
    [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
    """
    assert n >= s  # the step may not exceed the window size
    last = []
    for elt in iter:
        last.append(elt)
        if len(last) == n:
            yield tuple(last)
            last = last[s:]  # keep the n - s items shared with the next window
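
A few edge cases of the two generators, sketched for a present-day Python 3
interpreter. The functions are restated so the block runs on its own; the
parameter name `iterable` is a renaming for this sketch, chosen to avoid
shadowing the builtin `iter`:

```python
def xgroup(iterable, n=2):
    """Yield consecutive, non-overlapping n-tuples from `iterable`."""
    last = []
    for elt in iterable:
        last.append(elt)
        if len(last) == n:
            yield tuple(last)
            last = []


def xwindow(iterable, n=2, s=1):
    """Yield n-tuples from `iterable`, advancing `s` items per step."""
    assert n >= s
    last = []
    for elt in iterable:
        last.append(elt)
        if len(last) == n:
            yield tuple(last)
            last = last[s:]


# Seven items do not divide evenly into threes: the leftover 6 is dropped.
groups = list(xgroup(range(7), 3))

# A 3-item window moved 2 steps at a time overlaps by one item.
windows = list(xwindow(range(7), 3, 2))

# Both generators consume plain iterators, not only sequences.
pairs = list(xwindow(iter("abcd")))

# With s == n the window no longer overlaps and behaves like xgroup.
same_as_group = list(xwindow(range(9), 3, 3)) == list(xgroup(range(9), 3))
```

Whether an incomplete trailing group should be dropped, padded, or yielded
short is a design choice; the versions here drop it.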

One last thing: I think a `cycle` generator would be a really useful idiom to
have, as it provides a much more readable and less error-prone way to achieve
many of the things one normally has to use an awkward modulus construct
for. If I understand the rationale for not providing one correctly, it is
because the `straightforward` (and non-storage-allocating) implementation:

    def xcycle(seq):
        while 1:
            for item in seq: yield item

doesn't work as one might expect if `seq` is e.g. an iterator, which is
exhausted after the first pass (which BTW sort of seems to contradict the other
reason for not providing it, namely that "the tool is readily constructible
using pure Python" -- it is easy to overlook this subtlety). Wouldn't the
following work, mostly without surprising overheads?

    import operator

    def xcycle(seq):
        # buffer anything that is not a sequence so it can be replayed
        if not operator.isSequenceType(seq):
            seq = tuple(seq) # or [x for x in seq]
        while 1:
            for item in seq: yield item

Since I'd guess that the most common usage scenario would be something like
``cycle([1,2,3])`` (for which there is no surprising storage allocation), I
don't think that many expectations would be violated after all.
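
A sketch of the buffering variant for a present-day Python 3 interpreter.
Two things here are assumptions rather than part of the proposal:
`isinstance(seq, collections.abc.Sequence)` stands in for
`operator.isSequenceType`, which Python 3 does not provide, and the guard
against empty input is added because the loop would otherwise spin forever
without yielding anything:

```python
from collections.abc import Sequence
from itertools import islice


def xcycle(seq):
    """Repeat the items of `seq` endlessly, buffering non-sequences first."""
    # Assumption: Sequence check replaces operator.isSequenceType.
    if not isinstance(seq, Sequence):
        seq = tuple(seq)  # one-off storage so an iterator can be replayed
    # Guard added for this sketch: nothing to cycle over, so stop at once.
    if not seq:
        return
    while True:
        for item in seq:
            yield item


# A list is cycled without any copying.
from_list = list(islice(xcycle([1, 2, 3]), 7))

# An iterator is buffered once and then cycles the same way.
from_iter = list(islice(xcycle(iter([1, 2, 3])), 7))

# Empty input terminates instead of hanging.
from_empty = list(xcycle(iter([])))
```

Without the buffering step, the iterator case would deliver 1, 2, 3 and then
hang on the request for a fourth item, since the exhausted iterator makes the
inner loop a no-op inside an endless outer loop.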

alex