[Python-ideas] Adding some standalone iterator/sequence functions as methods of the iterator objects

Wed Aug 12 20:02:22 CEST 2009

Hello.

There are some functions in the standard library that take an
iterator/sequence as parameter and return an iterator. Most of them
are in the itertools module and some are built-in functions. I think
they should be added as methods of the iterator objects as well. For
example:

itertools.takewhile(pred, seq) --> seq.takewhile(pred)

sorted(seq, key=keyfun, reverse=reverse) --> seq.sorted(keyfun, reverse)

Rationale:
=======

First, I know the rationale behind standalone functions like len as
opposed to methods, but I think some iterator functions are special
cases. I believe it is a common pattern to arrange these kinds of
functions in a pipe-filter system to perform complex queries over
collections. The current system of standalone functions produces code
that is difficult to read because of the nested parentheses:

...fun4(param, fun3(param, fun2(param, fun1(param, seq))))...

It is very hard to see the pipe-filter flow in this code. It gets
worse because the position of the sequence argument relative to the
other parameters varies between functions. For example: sorted takes
the sequence first and then the key and reverse parameters, while
itertools.takewhile takes the predicate first and then the sequence.
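
To make the inconsistency concrete, here is a small sketch. The sample
list and the variable names are made up for illustration only:

```python
import itertools

numbers = [1, 2, 3, 10, 4, 5]  # made-up sample data

# sorted: the sequence comes first, then the key/reverse parameters.
descending = sorted(numbers, reverse=True)

# itertools.takewhile: the predicate comes first, then the sequence.
small_prefix = list(itertools.takewhile(lambda n: n < 10, numbers))

# Nested, the data flows from the inside out: sort first, then filter.
nested = list(itertools.takewhile(lambda n: n > 3,
                                  sorted(numbers, reverse=True)))
```

In the nested call the step that runs first (sorted) is written last,
which is the readability problem I am describing.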

A few months ago, Donald 'Paddy' McCarthy suggested a pipe function
[0] in the itertools module. But I believe using methods creates a
better workflow, for example:

seq.fun1(param).fun2(param).fun3(param).fun4(param)

[0] http://mail.python.org/pipermail/python-ideas/2009-May/004877.html
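
The method style can already be approximated today with a small
wrapper class. The following is only a rough sketch: the class name
Pipe and its method set are hypothetical, not an existing standard
library API, and a real proposal would put the methods on the iterator
objects themselves:

```python
import itertools


class Pipe:
    """Hypothetical wrapper that exposes iterator functions as methods."""

    def __init__(self, iterable):
        self._it = iterable

    def __iter__(self):
        return iter(self._it)

    def takewhile(self, pred):
        return Pipe(itertools.takewhile(pred, self._it))

    def sorted(self, key=None, reverse=False):
        # Inside a method body, the name "sorted" still refers to the built-in.
        return Pipe(sorted(self._it, key=key, reverse=reverse))

    def groupby(self, key=None):
        return Pipe(itertools.groupby(self._it, key))

    def slice(self, *args):
        # Same arguments as itertools.islice: stop, or start, stop[, step].
        return Pipe(itertools.islice(self._it, *args))


# Reads left to right: sort, keep the leading values > 2, take the first three.
result = list(
    Pipe([5, 3, 8, 1, 9, 2])
    .sorted(reverse=True)
    .takewhile(lambda n: n > 2)
    .slice(None, 3)
)
```

Each method returns a new Pipe, so the calls chain in the same order
in which the data flows.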

Examples:
========

Example 1. I want the two groups of employees with the two highest salaries:

Using current functions:

groups = itertools.islice(itertools.groupby(sorted(employees,
key=lambda e: e.salary, reverse=True), lambda e: e.salary), None, 2)

Using methods:

groups = employees.sorted(lambda e: e.salary,
reverse=True).groupby(lambda e: e.salary).slice(None, 2)
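
For anyone who wants to try the "current functions" version, here is
a self-contained sketch. The Employee type and the sample data are
assumptions made up for this illustration:

```python
import itertools
from collections import namedtuple

# Hypothetical record type and sample data, for illustration only.
Employee = namedtuple("Employee", ["name", "salary"])

employees = [
    Employee("Ann", 300),
    Employee("Bob", 200),
    Employee("Cid", 300),
    Employee("Dee", 100),
    Employee("Eve", 200),
]

# groupby only merges adjacent items, so the input must be sorted first.
by_salary = sorted(employees, key=lambda e: e.salary, reverse=True)
groups = itertools.islice(
    itertools.groupby(by_salary, lambda e: e.salary), None, 2)

# Each group is a lazy iterator that becomes invalid once groupby
# advances, so it is materialized while iterating.
top_two = [(salary, [e.name for e in members]) for salary, members in groups]
```

Note that the groups have to be consumed one at a time; this is true
for both the function style and the method style.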

Example 2. I want pairs of programmers assigned to tasks:

Using current functions:

pairs_tasks = itertools.izip(itertools.cycle(itertools.combinations(programmers,
2)), tasks)

Using methods:

pairs_tasks = programmers.combinations(2).cycle().izip(tasks)

It is probably better to keep izip as a standalone function:

pairs_tasks = itertools.izip(programmers.combinations(2).cycle(), tasks)
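
Here is a runnable sketch of the "current functions" version of this
example. The names and tasks are made-up sample data. itertools.izip
exists only in Python 2, so the sketch uses the built-in zip, which is
its lazy equivalent in Python 3:

```python
import itertools

# Made-up sample data, for illustration only.
programmers = ["Ada", "Bo", "Cy"]
tasks = ["t1", "t2", "t3", "t4"]

# All pairs of programmers, repeated endlessly; zip stops when the
# tasks run out.
pairs = itertools.combinations(programmers, 2)
pairs_tasks = list(zip(itertools.cycle(pairs), tasks))
```

With three programmers there are three pairs, so the fourth task goes
back to the first pair.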

Precedent:
========

There is another case where the pipe-filter pattern is seen in Python:
strings. There are a lot of functions in the string module that take
a string as an argument and return a string. Those functions could be
arranged in a pipe-filter system. Python has a history of adding
functions from the string module to the string objects as methods. I
think the same could be done with iterator functions.

Example:

We can use:

parts = text.lower().strip().split()

As opposed to:

parts = string.split(string.strip(string.lower(text)))

That's all for now. If you think this is a good idea we could
elaborate on which methods should be added.

Hope to see your comments.

Manuel Cerón.