[Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence

Guido van Rossum guido at python.org
Mon Oct 25 04:37:32 CEST 2010


On Sun, Oct 24, 2010 at 2:59 PM, Lie Ryan <lie.1296 at gmail.com> wrote:
> There is a way, by using threading, and injecting a thread-safe tee into
> max/min/otherFuncs (over half of the code is just for implementing
> thread-safe tee). Using this, there is no need to make any modification
> to min/max. I suppose it might be possible to convert this to using the
> new coroutine proposal (though I haven't been following the proposal
> close enough).
>
> The code is quite slow (and ugly), but it can handle large generators
> (or infinite generators). The memory shouldn't grow if the functions in
> *funcs takes more or less similar amount of time (which is true in case
> of max and min); if *funcs need to take both very fast and very slow
> codes at the same time, some more code can be added for load-balancing
> by stalling faster threads' request for more items, until the slower
> threads finishes.
>
> Pros:
> - no modification to max/min
>
> Cons:
> - slow, since itertools.tee() is reimplemented in pure-python
> - thread is uninterruptible
[snip]

This should not require threads.

Here's a bare-bones sketch using generators:

def reduce_collector(func):
    outcome = None
    while True:
        try:
            val = yield
        except GeneratorExit:
            break
        if outcome is None:
            outcome = val
        else:
            outcome = func(outcome, val)
    raise StopIteration(outcome)

def parallel_reduce(iterable, funcs):
    collectors = [reduce_collector(func) for func in funcs]
    values = [None for _ in collectors]
    for i, coll in enumerate(collectors):
        try:
            next(coll)
        except StopIteration as err:
            values[i] = err.args[0]
            collectors[i] = None
    for val in iterable:
        for i, coll in enumerate(collectors):
            if coll is not None:
                try:
                    coll.send(val)
                except StopIteration as err:
                    values[i] = err.args[0]
                    collectors[i] = None
    for i, coll in enumerate(collectors):
        if coll is not None:
            try:
                coll.throw(GeneratorExit)
            except StopIteration as err:
                values[i] = err.args[0]
    return values

def main():
    it = range(100)
    print(parallel_reduce(it, [min, max]))

if __name__ == '__main__':
    main()

-- 
--Guido van Rossum (python.org/~guido)



More information about the Python-ideas mailing list