
[Guido]
I have to stand on my head to understand what it does. This is even the case for examples like
reduce(lambda x, y: x + y.foo, seq)
[Greg]
It occurs to me that, with generator expressions, such cases could be rewritten as
reduce(lambda x, y: x + y, (z.foo for z in seq))
i.e. any part of the computation that only depends on the right argument can be factored out into the generator. So I might have to take back some of what I said earlier about generator comprehensions being independent of reduce.
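A minimal sketch of the rewrite above, for illustration only. The class Item and the sample data are hypothetical. The explicit initial value 0 in the first form is an assumption added here, because without it the first element would be used whole rather than through its .foo attribute. On current Python, reduce() is imported from functools.

```python
from functools import reduce
from operator import add


class Item:
    """Hypothetical stand-in for the elements of seq."""

    def __init__(self, foo):
        self.foo = foo


seq = [Item(1), Item(2), Item(3)]

# Original spelling: the lambda both extracts .foo and accumulates.
# The initial value 0 is an assumption added for this sketch so that
# the first element is also reduced through its .foo attribute.
combined = reduce(lambda x, y: x + y.foo, seq, 0)

# Factored spelling: the generator extracts .foo, the lambda only adds.
factored = reduce(lambda x, y: x + y, (z.foo for z in seq))

# With the extraction factored out, a ready-made binary function suffices.
ready_made = reduce(add, (z.foo for z in seq))
```

Once the right-argument work moves into the generator, what is left for the lambda is plain addition, which is why an existing function such as operator.add can take its place.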
But if I understand you correctly, what you're saying is that the interesting cases are the ones where there isn't a ready-made binary function that does what you want, in which case you're going to have to spell everything out explicitly anyway one way or another.
(And then spelling it out so that it works with reduce() reduces clarity.)
In that case, the most you could gain from a reduce syntax would be that it's an expression rather than a sequence of statements.
But the same could be said of list comprehensions -- and *was* said quite loudly by many people in the early days, if I recall correctly. What's the point, people asked, when writing out a set of nested loops is just about as easy?
Some people still hate LC's for this reason.
Somehow we came to the conclusion that being able to write a list comprehension as an expression was a valuable thing to have, even if it wasn't significantly shorter or clearer. What about reductions? Do we feel differently? If so, why?
IMO LC's *are* significantly clearer because the notation lets you focus on what goes into the list (e.g. the expression "x**2") and under what conditions (e.g. the condition "x%2 == 1") rather than how you get it there (i.e. the initializer "result = []" and the call "result.append(...)"). This is an incredibly common idiom in the use of loops; for experienced programmers the boilerplate disappears when they read the code, but for less experienced readers it takes more time to recognize the idiom. I think this is at least in part due to the fact that there are more details that can be written differently, e.g. the name of the result variable, and exactly at which point it is initialized.

I think that for reductions the gains are less clear. The initializer for the result variable and the call that updates it are no longer boilerplate, because they vary for each use; plus the name of the result variable should be chosen carefully because it indicates what kind of result it is (e.g. a sum or product).

So, leaving out the condition for now, the pattern or idiom is:

    <result> = <initializer>
    for <variable> in <iterable>:
        <result> = <expression>

(Where <expression> uses <result> and <variable>.)

If we think of this as a template with parameters, there are five parameters! (A LC without a condition only has 3: <expression>, <variable> and <iterable>.) No matter how hard you try, a macro with 5 parameters will have a hard time conveying the meaning of each without being at least as verbose as the full template.

We could reduce the number of template parameters to 4 by leaving <result> anonymous; we could then refer to it by e.g. "_" in <expression>, which is more concise and perhaps acceptable, but makes certain uses more strained (e.g. mean() below).
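To make the template concrete, here is a small sketch under stated assumptions. The sample list S and the function f are hypothetical. The anonymous-result form is approximated with functools.reduce and a lambda whose first parameter happens to be named "_"; this is one possible reading of the four-parameter idea, not proposed syntax.

```python
from functools import reduce

S = [1, 2, 3, 4]  # hypothetical sample data


def f(x):
    """Hypothetical per-element function, used only for this illustration."""
    return 2 * x


# The template written out, with all five parameters visible.
sum_of_squares = 0                            # <result> = <initializer>
for x in S:                                   # for <variable> in <iterable>:
    sum_of_squares = sum_of_squares + x**2    # <result> = <expression>

# Four-parameter form: <result> stays anonymous and is referred to as "_".
# functools.reduce takes (function, iterable, initializer).
anonymous = reduce(lambda _, x: _ + x**2, S, 0)

# A more strained use: a mean needs a tuple-valued result plus
# postprocessing once the reduction is done.
total, n = reduce(lambda _, x: (_[0] + f(x), _[1] + 1), S, (0, 0))
mean = total / n
```

In the mean case the anonymous result has to be indexed as _[0] and _[1], which hides what each component stands for; the spelled-out loop gets to use descriptive names instead.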
Just for fun, let me try to propose a macro syntax:

    reduction(<initializer>, <expression>, <variable>, <iterable>)

(I think it's better to have <initializer> as the first parameter, but you can )

For example:

    reduction(0, _+x**2, x, S)

Lavishly sprinkle syntactic sugar, and perhaps it can become this ('reduction' would have to be a reserved word):

    reduction(0, _+x**2 for x in S)

A few more examples using this notation:

    # product(S), if Raymond's product() builtin is accepted
    reduction(1, _*x for x in S)

    # mean of f(x); uses result tuple and needs result postprocessing
    total, n = reduction((0, 0), (_[0]+f(x), _[1]+1) for x in S)
    mean = total/n

    # horner(S, x): evaluate a polynomial over x:
    # [6, 3, 4] => 6*x**2 + 3*x + 4
    reduction(0, _*x + c for c in S)

In each of these cases I have the same gut response as to writing these using reduce(): the notation is too "concentrated", I have to think so hard before I understand what it does that I wouldn't mind having it spread over three lines. Compare the above four examples to:

    sum = 0
    for x in S:
        sum += x**2

    product = 1
    for x in S:
        product *= x

    total, n = 0, 0
    for x in S:
        total += f(x)
        n += 1
    mean = total/n

    horner = 0
    for c in S:
        horner = horner*x + c

I find that these cause much less strain on the eyes. (BTW the horner example shows that insisting on augmented assignment would reduce the power.)

Concluding, I think the reduce() pattern is doomed -- the template is too complex to capture in special syntax.

--Guido van Rossum (home page: http://www.python.org/~guido/)