[Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence
Steven D'Aprano
steve at pearwood.info
Tue Oct 26 12:10:59 CEST 2010
Guido van Rossum wrote:
> This should not require threads.
>
> Here's a bare-bones sketch using generators:
>
> def reduce_collector(func):
>     outcome = None
>     while True:
>         try:
>             val = yield
>         except GeneratorExit:
>             break
>         if outcome is None:
>             outcome = val
>         else:
>             outcome = func(outcome, val)
>     raise StopIteration(outcome)
>
> def parallel_reduce(iterable, funcs):
>     collectors = [reduce_collector(func) for func in funcs]
>     values = [None for _ in collectors]
>     for i, coll in enumerate(collectors):
>         try:
>             next(coll)
>         except StopIteration as err:
>             values[i] = err.args[0]
>             collectors[i] = None
>     for val in iterable:
>         for i, coll in enumerate(collectors):
>             if coll is not None:
>                 try:
>                     coll.send(val)
>                 except StopIteration as err:
>                     values[i] = err.args[0]
>                     collectors[i] = None
>     for i, coll in enumerate(collectors):
>         if coll is not None:
>             try:
>                 coll.throw(GeneratorExit)
>             except StopIteration as err:
>                 values[i] = err.args[0]
>     return values
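(Illustrative usage, not part of the original message: assuming the two
quoted definitions above, passing funcs = [min, max] computes the minimum
and maximum in a single pass. This is 2010-era code; on Python 3.7+ the
explicit raise StopIteration inside a generator is converted to
RuntimeError by PEP 479, so it would need reworking there.)

    data = [3, 1, 4, 1, 5, 9, 2, 6]
    print(parallel_reduce(data, [min, max]))  # expected: [1, 9]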
Perhaps I'm missing something, but to my mind, that's an awfully
complicated solution for such a simple problem. Here's my attempt:
def multi_reduce(iterable, funcs):
    it = iter(iterable)
    # Seed every collector with the first item of the iterable.
    collectors = [next(it)]*len(funcs)
    # Fold each remaining item into every collector in turn.
    for x in it:
        for i, f in enumerate(funcs):
            collectors[i] = f(collectors[i], x)
    return collectors
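For example (an illustrative call, not part of the original message),
computing the (minimum, maximum) of a sequence in a single pass:

    data = [3, 1, 4, 1, 5, 9, 2, 6]
    print(multi_reduce(data, [min, max]))  # -> [1, 9]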
I've called it multi_reduce rather than parallel_reduce, because it
doesn't execute the functions in parallel. By my testing on Python
3.1.1, multi_reduce is consistently ~30% faster than the generator-based
solution for lists of 1,000 to 10,000,000 items.
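(A rough sketch of how such a comparison might be timed; the actual
benchmark harness isn't shown in the post, so the details below are
assumptions:)

    from timeit import timeit

    data = list(range(1000000))
    for fn in (multi_reduce, parallel_reduce):
        t = timeit(lambda: fn(data, [min, max]), number=3)
        print(fn.__name__, t)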
So what am I missing? What does your parallel_reduce give us that
multi_reduce doesn't?
--
Steven