[Python-ideas] Rehabilating reduce (as "fold")

Fri Jul 12 10:03:52 CEST 2013

From: Nick Coghlan <ncoghlan at gmail.com>
Sent: Thursday, July 11, 2013 11:01 PM

>So, what if we instead added a new alternative API based on Haskell's "fold" [1] where the initial value is *mandatory*:
>
>    def fold(op, start, iterable):
>        ...

Note that Haskell, and many other functional languages, actually have both functions:

    def fold(op, start, iterable):

    def fold1(op, iterable):

And really, they're only separate functions because a language with strict types and automatic currying can't handle variable arguments.

Meanwhile, there are an awful lot of people who just don't like reduce/fold in any situation. The quote "Inside every reduce is a loop trying to get out" appears quite frequently, on this list and elsewhere. And I don't think it's because it's easy to get the fold/fold1 distinction wrong, but because they consider any use of reduce unreadable. I think the idea is that folding only makes immediate sense if you're thinking of your data structures recursively instead of iteratively, which you usually aren't in Python. But I'm probably not the best one to characterize the objection, since I don't share it. (Of course there are cases where reduce _is_ unreadable, and the only reason people use it is because in Haskell or OCaml or Scheme the explicit loop would be _more_ unreadable, even though that isn't even remotely true in Python… but there are also cases where it makes sense to me.)

One more thing: The name "fold" to me really implies there's going to be "foldr" function as well, in a way that "reduce" doesn't. But I could probably get over that—after all, right-folding isn't nearly as important for code with arrays or iterators the same way it is for recursive code with cons lists or lazy lists.

>I'd personally be in favour of the notion of also allowing strings as the first argument, so you could instead write:

>
>    data = fold("+=", [], iterables)

I like this idea, but only if it's added to other functions in the stdlib where it makes sense, and easy to add to new functions of your own.

>This could also be introduced as an alternative API in functools.
>
>(Independent of this idea, it would actually be nice if the operator module had a dictionary mapping from op symbols to names, like operator.by_symbol["+="] giving operator.iadd)

And that answers the "easy to add to new functions" bit! 

Except a helper function might be nice, something like operator.get_op (but with a better name):

    def get_op(func):
        if callable(func):
            return func
        else:
            return by_symbol[func]