[Python-Dev] Re: [Python-checkins] python/dist/src/Python bltinmodule.c,,

Alex Martelli aleaxit at yahoo.com
Sun Oct 26 12:48:30 EST 2003

On Sunday 26 October 2003 05:21 pm, Paul Moore wrote:
> I *think* I see what you're getting at here, but I'm struggling to
> follow in the absence of concrete use cases. As we're talking about

Assuming the simplest definition, equivalent to:

def loop(bound_method, it):
    for item in it: bound_method(item)

typical simple use cases might be, e.g.:

Merge a stream of dictionaries into one dict:

   merged_dict = {}
   loop(merged_dict.update, stream_of_dictionaries)

rather than:

merged_dict = {}
for d in stream_of_dictionaries:
    merged_dict.update(d)

Add a bunch of sequences into one list:

    all_the_seqs = []
    loop(all_the_seqs.extend, bunch_of_seqs)

rather than:

all_the_seqs = []
for s in bunch_of_seqs:
    all_the_seqs.extend(s)

Add two copies of each of a bunch of sequences ditto:

    all_the_seqs = []
    loop(all_the_seqs.extend, (s+s for s in bunch_of_seqs))

ditto but only for sequences which have 23 somewhere in them:

    seqs_with_23 = []
    loop(seqs_with_23.extend, (s for s in bunch_of_seqs if 23 in s))

and so on.  There are no doubt more elegant ways, e.g.

import itertools

def init_and_loop(initvalue, unboundmethod, it, *its):
    for items in itertools.izip(it, *its):
        unboundmethod(initvalue, *items)
    return initvalue

which would allow, e.g.,

merged_dict = init_and_loop({}, dict.update, stream_of_dictionaries)

or other variants yet, but the use cases are roughly the same.
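
As a self-contained sketch of the two helpers above in present-day
Python 3 (an assumption: the built-in zip is lazy there, so it stands
in for itertools.izip), with sample data invented purely for
illustration:

```python
def loop(bound_method, it):
    # Feed every item to a mutating method, ignoring its return value.
    for item in it:
        bound_method(item)

def init_and_loop(initvalue, unboundmethod, it, *its):
    # zip() is lazy in Python 3, so it plays the role of itertools.izip.
    for items in zip(it, *its):
        unboundmethod(initvalue, *items)
    return initvalue

# Hypothetical sample data, not taken from the original post.
stream_of_dictionaries = iter([{"a": 1}, {"b": 2}, {"a": 3}])
bunch_of_seqs = [[1, 23], [4, 5], [23]]

# Later dictionaries overwrite earlier keys, as with any dict.update.
merged_dict = {}
loop(merged_dict.update, stream_of_dictionaries)

# Only the sequences containing 23 are accumulated.
seqs_with_23 = []
loop(seqs_with_23.extend, (s for s in bunch_of_seqs if 23 in s))

# The init_and_loop variant returns the accumulator it was handed.
merged_again = init_and_loop({}, dict.update, [{"a": 1}, {"b": 2}])
```

Note that a generator expression passed alongside another argument
needs its own parentheses in the syntax that was eventually adopted.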

The gain of such tiny "accumulator functions" (consuming one or
more iterators by just passing their items to some mutating-method
and ignoring the latter's results) is essentially conceptual -- it's
not a matter of saving a couple of lines at the point of use, nor of
saving some "bananoseconds" if the accumulator functions are
implemented in C, when compared to the coded-out loops.

Rather, such functions would allow "more declarative style"
presentation (of underlying functionality that remains imperative):
expressions feel more "declarative", stylistically, to some, while
spelling a few steps out feels more "imperative".  We've had this
preference explained recently on this group, and others over in
c.l.py are breaking out the champagne at the news of list.sorted
for essentially the same motivation.

> library functions, I'd suggest that your suggested "accumulator
> functions" start their life as an external module - maybe even in
> Python, although I take your point about the speed advantages of C.

Absolutely.  It's not _my_ suggestion to have more accumulator
functions -- it came up repeatedly on the threads started by Peter
Norvig's original proposal about accumulation, and Guido mentioned
them in the 'product' thread I believe (where we also discussed
'any', 'all' etc, if I recall correctly).  I don't think anybody's ever
thought of making these built-ins.  But if that external module[s] (one
or more) is/are not part of the Python 2.4 library, if 2.4 does not
come with a selection of accumulation functions [not necessarily
including that 'loop' &c above mentioned, though I think something
like that might help], I don't think we can have the "accumulation
functionality" -- we only have great ways to make and express
iterators but not many great ways to _consume_ them (and most
particularly if sum, one of the few "good iterator consumers" we
have, is practically unusable for iterators whose items are lists).
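
To illustrate the remark about sum, here is a small sketch; it assumes
the complaint is about cost rather than correctness. sum does accept
lists when given a list as its start value, but it builds a fresh list
on every addition, so the total work grows quadratically with the
combined length, while an in-place extend loop stays linear. The
helper name extend_all and the sample data are hypothetical.

```python
def extend_all(seqs):
    # Hypothetical accumulator: grows one list in place via list.extend.
    result = []
    for s in seqs:
        result.extend(s)
    return result

# Hypothetical sample data.
bunch_of_seqs = [[1, 2], [3], [4, 5]]

# Works, but each step computes result + item, copying the whole
# accumulated list into a new one.
via_sum = sum(iter(bunch_of_seqs), [])

# Same result, with each item copied only once.
via_extend = extend_all(iter(bunch_of_seqs))
```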

> With a bit of "real life" use, migration into the standard library
> might be more of an obvious step.

You could have said the same of itertools before 2.3, but I think
it was a great decision to accept them into the standard library
instead; 2.3 would be substantially poorer without them.  With an
even richer complement of iterator tools in itertools, and the new
"generator expressions" to give us even more great ways to make
iterators, I think a module of "iterator consumers", also known as
accumulation functions, would be a good idea.  Look at Peter
Norvig's original ideas for some suggestions, for example.

Which reminds me of an issue with Top(10), but, this is a long
enough post, so I think I should write a separate one about that.

