On 10/26/2010 11:43 AM, Guido van Rossum wrote:
You're right, the point I wanted to prove was that generators are better than threads, but the code was based on emulating reduce(). The generalization that I was aiming for was that it is convenient to write a generator that does some kind of computation over a sequence of items and returns a result at the end, and then have a driver that feeds a single sequence to a bunch of such generators. This is more powerful once you try to use reduce to compute e.g. the average of the numbers fed to it -- of course you can do it using a function of (state, value) but it is so much easier to write as a loop! (At least for me -- if you do nothing but write Haskell all day I'm sure it comes naturally. :-)
def avg():
    total = 0
    count = 0
    try:
        while True:
            value = yield
            total += value
            count += 1
    except GeneratorExit:
        raise StopIteration(total / count)
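As a side note for anyone running this today: PEP 479 (Python 3.7+) turns a StopIteration raised inside a generator into a RuntimeError, so the modern equivalent of the last line is a plain `return`, whose value arrives wrapped in the StopIteration that the generator machinery raises for you. A minimal sketch of that version, with a hypothetical `finish()` helper to retrieve the result (gen.close() discards the return value on Pythons before 3.13, so the helper delivers GeneratorExit via throw() instead):

```python
def avg():
    # Modern rewrite: `return` replaces `raise StopIteration(...)`,
    # which PEP 479 disallows inside generators.
    total = 0
    count = 0
    try:
        while True:
            value = yield
            total += value
            count += 1
    except GeneratorExit:
        return total / count

def finish(gen):
    # Hypothetical helper: deliver GeneratorExit ourselves and catch
    # the StopIteration that carries the generator's return value.
    try:
        gen.throw(GeneratorExit)
    except StopIteration as exc:
        return exc.value
    raise RuntimeError("generator ignored GeneratorExit")

g = avg()
next(g)                      # prime the generator to the first yield
for value in [1, 2, 3, 4]:
    g.send(value)
print(finish(g))             # 2.5
```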
The more traditional pull-or-grab (rather than push-or-receive) version is

def avg(it):
    total = 0
    count = 0
    for value in it:
        total += value
        count += 1
    return total / count
The essential boilerplate here is
try:
    while True:
        value = yield
        <use value>
except GeneratorExit:
    raise StopIteration(<compute result>)
with corresponding boilerplate. I can see that the receiving-generator version would be handy when you do not really want to package the producer as an iterator (perhaps because the items are also needed for other purposes) and want to send items to the averager as they are produced, from the point of production.
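That "send items from the point of production" idea is what makes the driver interesting: one pass over the data can feed several consumers at once. A sketch of such a driver, using hypothetical `summer`/`counter` consumers and the same protocol (accumulate until GeneratorExit, then return the result), adapted to current Python where the final `raise StopIteration(...)` is written as `return`:

```python
def summer():
    # Consumer: accumulate a running total until GeneratorExit.
    total = 0
    try:
        while True:
            value = yield
            total += value
    except GeneratorExit:
        return total

def counter():
    # Consumer: count the items pushed in.
    n = 0
    try:
        while True:
            yield
            n += 1
    except GeneratorExit:
        return n

def broadcast(iterable, *gens):
    # Hypothetical driver: prime every consumer, push each item to
    # all of them, then collect each consumer's return value.
    for g in gens:
        next(g)
    for item in iterable:
        for g in gens:
            g.send(item)
    results = []
    for g in gens:
        try:
            g.throw(GeneratorExit)
        except StopIteration as exc:
            results.append(exc.value)
    return results

print(broadcast([1, 2, 3], summer(), counter()))  # [6, 3]
```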
No doubt functional aficionados will snub this, but in Python, this should run much faster than the same thing written as a reduce-ready function, due to function-call overhead (which was not a problem in the min/max example, since those are built-ins).
BTW, this episode led me to better understand my objection to reduce() as the universal hammer: my problem with writing avg() using reduce is that the function one feeds into reduce is asymmetric -- its first argument must be some state, e.g. a tuple (total, count), and the second argument must be the next value.
Not hard:

def update(pair, item):
    # pair is the running state (total, count)
    return pair[0] + item, pair[1] + 1
This is the moment that my head reliably explodes -- even though it has no problem visualizing reduce() using a *symmetric* function like +, min or max.
Also note that the reduce()-based solution would need a separate function to extract the desired result (total / count) from the state (total, count), and for multi_reduce() you would have to supply a separate list of such functions or use some other hacky approach.
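Put together, the full reduce()-based avg() needs both pieces: the asymmetric update function and a separate extraction step. A sketch, with a hypothetical `finish()` name for the extractor:

```python
from functools import reduce

def update(pair, item):
    # Asymmetric step function: state in the first argument,
    # next value in the second. State is (total, count).
    total, count = pair
    return total + item, count + 1

def finish(pair):
    # Hypothetical extraction step: turn the final state
    # into the desired result.
    total, count = pair
    return total / count

data = [1, 2, 3, 4]
print(finish(reduce(update, data, (0, 0))))  # 2.5
```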
Reduce is extremely important as a concept: any function of a sequence (or of an arbitrarily ordered collection) can be written as a post-processed reduction. In practice, at least in Python, it is better thought of as a widespread template pattern, such as the boilerplate above, than as a function. This is partly because Python does not have general function expressions (and should not!) and partly because Python has high function-call overhead (because of its signature flexibility). -- Terry Jan Reedy
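To illustrate the "post-processed reduction" claim with something beyond avg(): population variance fits the same shape, with a three-element state and a closing formula. The names `step` and `variance` are just for this sketch:

```python
from functools import reduce

def step(state, x):
    # state = (count, total, total_of_squares)
    n, s, ss = state
    return n + 1, s + x, ss + x * x

def variance(xs):
    # Post-processed reduction: reduce builds the state, then a
    # final expression extracts the answer (population variance).
    n, s, ss = reduce(step, xs, (0, 0.0, 0.0))
    return ss / n - (s / n) ** 2

print(variance([1, 2, 3, 4]))  # 1.25
```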