[Python-ideas] [Python-Dev] minmax() function returning (minimum, maximum) tuple of a sequence
Terry Reedy
tjreedy at udel.edu
Tue Oct 26 19:55:30 CEST 2010
On 10/26/2010 11:43 AM, Guido van Rossum wrote:
> You're right, the point I wanted to prove was that generators are
> better than threads, but the code was based on emulating reduce(). The
> generalization that I was aiming for was that it is convenient to
> write a generator that does some kind of computation over a sequence
> of items and returns a result at the end, and then have a driver that
> feeds a single sequence to a bunch such generators. This is more
> powerful once you try to use reduce to compute e.g. the average of the
> numbers fed to it -- of course you can do it using a function of
> (state, value) but it is so much easier to write as a loop! (At least
> for me -- if you do nothing but write Haskell all day I'm sure it
> comes naturally. :-)
>
> def avg():
> total = 0
> count = 0
> try:
> while True:
> value = yield
> total += value
> count += 1
> except GeneratorExit:
> raise StopIteration(total / count)
The more traditional pull or grab (rather than push receive) version is
def avg(it):
total = 0
count = 0
for value in it:
total += value
count += 1
return total/count
> The essential boilerplate here is
>
> try:
> while True:
> value = yield
> <use value>
> except GeneratorExit:
> raise StopIteration(<compute result>)
with corresponding boilersplate.
I can see that the receiving generator version would be handy when you
do not really want to package the producer into an iterator (perhaps
because items are needed for other purposes also) and want to send items
to the averager as they are produced, from the point of production.
> No doubt functional aficionados will snub this, but in Python, this
> should run much faster than the same thing written as a reduce-ready
> function, due to the function overhead (which wasn't a problem in the
> min/max example since those are built-ins).
>
> BTW This episode led me to better understand my objection against
> reduce() as the universal hammer: my problem with writing avg() using
> reduce is that the function one feeds into reduce is asymmetric -- its
> first argument must be some state, e.g. a tuple (total, count), and
> the second argument must be the next value.
Not hard:
def update(pair, item):
return pair[0]+1, pair[1]+item
> This is the moment that my
> head reliably explodes -- even though it has no problem visualizing
> reduce() using a *symmetric* function like +, min or max.
>
> Also note that the reduce() based solution would have to have a
> separate function to extract the desired result (total / count) from
> the state (total, count), and for multi_reduce() you would have to
> supply a separate list of functions for these or some other hacky
> approach.
Reduce is extremely important as concept: any function of a sequence (or
collection arbitrarily ordered) can be written as a post-processed
reduction. In practice, at least for Python, it is better thought of as
wide-spread template pattern, such as the boilerplate above, than just
as a function. This is partly because Python does not have general
function expressions (and should not!) and also because Python does have
high function call overhead (because of signature flexibility).
--
Terry Jan Reedy
More information about the Python-ideas
mailing list