[Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence
Guido van Rossum
guido at python.org
Mon Oct 25 04:37:32 CEST 2010
On Sun, Oct 24, 2010 at 2:59 PM, Lie Ryan <lie.1296 at gmail.com> wrote:
> There is a way, by using threading, and injecting a thread-safe tee into
> max/min/otherFuncs (over half of the code is just for implementing
> thread-safe tee). Using this, there is no need to make any modification
> to min/max. I suppose it might be possible to convert this to using the
> new coroutine proposal (though I haven't been following the proposal
> close enough).
>
> The code is quite slow (and ugly), but it can handle large generators
> (or infinite generators). The memory shouldn't grow if the functions in
> *funcs takes more or less similar amount of time (which is true in case
> of max and min); if *funcs need to take both very fast and very slow
> codes at the same time, some more code can be added for load-balancing
> by stalling faster threads' request for more items, until the slower
> threads finishes.
>
> Pros:
> - no modification to max/min
>
> Cons:
> - slow, since itertools.tee() is reimplemented in pure-python
> - thread is uninterruptible
[snip]
This should not require threads.
Here's a bare-bones sketch using generators:
def reduce_collector(func):
outcome = None
while True:
try:
val = yield
except GeneratorExit:
break
if outcome is None:
outcome = val
else:
outcome = func(outcome, val)
raise StopIteration(outcome)
def parallel_reduce(iterable, funcs):
collectors = [reduce_collector(func) for func in funcs]
values = [None for _ in collectors]
for i, coll in enumerate(collectors):
try:
next(coll)
except StopIteration as err:
values[i] = err.args[0]
collectors[i] = None
for val in iterable:
for i, coll in enumerate(collectors):
if coll is not None:
try:
coll.send(val)
except StopIteration as err:
values[i] = err.args[0]
collectors[i] = None
for i, coll in enumerate(collectors):
if coll is not None:
try:
coll.throw(GeneratorExit)
except StopIteration as err:
values[i] = err.args[0]
return values
def main():
it = range(100)
print(parallel_reduce(it, [min, max]))
if __name__ == '__main__':
main()
--
--Guido van Rossum (python.org/~guido)
More information about the Python-ideas
mailing list