[Python-ideas] Function composition (was no subject)

Andrew Barnert abarnert at yahoo.com
Sun May 10 10:54:48 CEST 2015


On May 9, 2015, at 21:58, Douglas La Rocca <larocca at abiresearch.com> wrote:
> 
> (Newcomer here.)
> 
> I use function composition pretty extensively. I've found it to be incredibly powerful, though it can lead to bad practices, and it has other drawbacks like unreadable tracebacks. But in many cases there are real benefits, and for data pipelines where you want to avoid state and mutation it works well.
> 
> The fn and pymonad modules implement infix composition functions through overloading but I've found this to be unworkable.
> 
> For me, the ideal infix operator would simply be a space, with the composition wrapped in parentheses. So e.g.
> 
>>>> (list str sorted)(range(10))
> [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ',', ',', ',', ',', ',', ',', ',', ',', ',', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '[', ']']
> 
> I might be overlooking something, but it seems to me this would work with existing syntax and semantics, and wouldn't conflict with anything else the way operator overloading would. The only other place where spaces (outside indentation) are significant is around keywords, which can't be reassigned. So e.g. (yield from gen()) wouldn't be parsed as three functions, and (def func) would raise SyntaxError.
> 
> Here's the composition function I'm working with, stripped of the little debugging helpers:
> 
> ```
> def compose(*fns):
>    def compose_(*x):
>        fn, *fns = fns
>        value = fn(*x)
>        if fns:
>            return compose(*fns)(value)
>        else:
>            return value
>    return compose_
> 
> O=compose
> ```
> 
> I haven't had any issues with the recursion. The `O` alias rubs me the wrong way but seemed to make sense at the time. The thought was that it should look like an operator because it acts like one.
> 
> So the use looks like
> 
>>>> O(fn1, fn2, fn3, ...)('string to be piped')
> 
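A side note on that compose: it rebuilds the closure chain on every call, so a deep pipeline pays a roughly quadratic cost re-splatting the remaining functions at each level. A flat loop does the same job (a sketch, minus your debugging helpers):

```python
def compose(*fns):
    """Left-to-right composition: compose(f, g)(x) == g(f(x))."""
    def compose_(*args):
        first, *rest = fns
        value = first(*args)
        for fn in rest:  # iterate instead of recursing
            value = fn(value)
        return value
    return compose_
```

Called the same way, e.g. compose(list, str, sorted)(range(10)) gives the output from your first example.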
> The core problem for composition is argument passing, specifically the convenience of *args and **kwargs.
> 
> The way to make composition work predictably is to curry the functions yourself, wrapping the arguments you expect to get with nested closures, then repairing the __name__ etc with functools.wraps or update_wrapper in the usual way. This looks much nicer and almost natural when you write it with lambdas, e.g.
> 
>>>> getitem = lambda item: lambda container: container[item]
> 
> (Apologies for having named that lambda there...)

I understand why you named it; I don't understand why you didn't just use def if you were going to name it (and declare it in a statement instead of the middle of an expression). Anyway, this is already in operator, as itemgetter, and it's definitely useful to functional code, especially itertools-style generator-driven functional code. And it feels like the pattern ought to be generalizable... but other than attrgetter, it's hard to think of another example where you want the same thing. After all, Python only has a couple of syntactic forms that you'd want to wrap up as functions at all, so it only has a couple of syntactic forms that you'd want to wrap up as curried functions.
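To be concrete, here's the existing coverage in operator (and methodcaller is arguably the third member of the family):

```python
from operator import attrgetter, itemgetter, methodcaller

# itemgetter(k) is effectively the curried container[k] above
second = itemgetter(1)
assert second("abc") == "b"

# attrgetter(name) is the curried getattr(obj, name)
name_of = attrgetter("__name__")
assert name_of(len) == "len"

# methodcaller(name, *args) curries a method call on its operand
shout = methodcaller("upper")
assert shout("abc") == "ABC"
```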

> The other way to manage passing values from one function to the next is to define a function like
> 
> def star(fn):
>    return lambda x: fn(*x)
> 
> Then if you get a list at one point in the pipeline and your function takes *args, you can decorate the function and call it like 
> 
>>>> star(getattr)((getattr, '__name__'))
> 'getattr'
> 
> I've run into problems using the @curried decorators from the fn and pymonad modules because they don't know how to handle *args, i.e. when to stop collecting arguments and finally make the function call.
> 
> If you want to have the composition order reversed you could decorate the definition with
> 
> ```
> def flip(f):
>    def flip_(*x):
>        return f(*reversed(x))
>    return flip_
> ```
> 
> Once we have composition we can write partials for `map`, `filter`, and `reduce`, but with a small twist: make them variadic in the first argument and pass the arguments to compose:
> 
> def fmap(*fn):
>    def fmap_(x):
>        return list(map(compose(*fn),x))
>    return fmap_

I don't understand why this is called fmap. I see below that you're not implying anything like Haskell's fmap (which confused me...), but then what _does_ the f mean? It seems like this is just a manually curried map, that returns a list instead of an iterator, and only takes one iterable instead of one or more. None of those things say "f" to me, but maybe I'm still hung up on expecting it to mean "functor" and I'll feel like an idiot once you clear it up. :)

Also, why _is_ it calling list? Do your notions of composition and currying not play well with iterators? If so, that seems like a pretty major thing to give up. And why isn't it variadic in the iterables? You can trivially change that by just having the wrapped function take and pass *x, but I assume there's some reason you didn't?
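For contrast, here's the variant I'd expect: lazy, and variadic in the iterables, while keeping your compose-the-functions twist (a sketch, with the compose from your post reproduced so it runs standalone):

```python
def compose(*fns):
    # left-to-right composition, as in the original post
    def compose_(*args):
        first, *rest = fns
        value = first(*args)
        for fn in rest:
            value = fn(value)
        return value
    return compose_

def fmap(*fns):
    # lazy: returns the map iterator; variadic: accepts several iterables,
    # in which case the first composed function gets one element from each
    def fmap_(*iterables):
        return map(compose(*fns), *iterables)
    return fmap_
```

So fmap(add(1), mul(2)) still works as in your pipelines, and fmap(lambda a, b: a + b)([1, 2], [10, 20]) zips the way map does.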

> def ffilter(fn):
>    def ffilter_(xs):
>        return list(filter(fn, xs))
>    return ffilter_
> 
> from functools import reduce  # Python 3: reduce lives in functools
> 
> def freduce(fn):
>    def freduce_(xs):
>        return reduce(fn, xs)
>    return freduce_

These two aren't variadic in fn like fmap was. Is that just a typo, or is there a reason not to be?

> def Fmap(*fns):
>    def Fmap_(x):
>        return list(map(lambda fn:fn(x), fns))
>    return Fmap_
> 
> The `Fmap` function seemed like some sort of "conjugate" to `fmap` so I tried to give it name suggesting this (again, at the expense of abusing naming conventions).
> 
> Instead of mapping a function over an iterable like `fmap`, `Fmap` applies each given function to a single value. So
> 
>>>> Fmap(add(1), sub(1))(1)
> [2, 0]
> 
> I've called them `fmap`, `ffilter`, and `freduce` but don't much like these names as they imply they might be the same as Haskell's `fmap`, and they're not. And there's no way to make them anything like Haskell as far as I can tell and they shouldn't be. If these implement a "paradigm" it's not purely functional but tacit/concatenative.
> 
> It made sense to compose the passed arguments because there's no reason to pass anything else to `fmap` in the first call. So sequential calls to (the return value of) `fmap` inside a pipeline, like
> 
>>>> O(mul(10),
> ...   fmap(add(1)), 
> ...   fmap(mul(2))
> ...  )([1])
> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]                                          
> 
> can instead be written like
> 
>>>> O(mul(10),
> ...   fmap(add(1), 
> ...        mul(2))
> ...  )([1])
> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
> 
> It also makes it easier to work at different levels inside nested structures. In these heavily nested cases the composition pipeline even begins to resemble the data structure passing through, which makes sense.
> 
> As another example, following is part of a pipeline that takes strings of bullet-separated strings of "key:value" pairs and converts each one to a dictionary, then folds the result together:
> 
>>>> d = [' foo00 : bar00 • foo01 : bar01 ',
> ...      ' foo10 : bar10 • foo11 : bar11 ', 
> ...      ' foo20 : bar10 • foo21 : bar21 ',]
> 
>>>> dict_foldl = freduce(lambda d1, d2: dict(d1, **d2))
>>>> strip = lambda x: lambda s: s.strip(x)
>>>> split = lambda x: lambda s: s.split(x)
> 
>>>> f = O(fmap(strip(' '),
> ...            split('•'), 
> ...            fmap(split(':'),
> ...                 fmap(strip(' ')),
> ...                 tuple), 
> ...            tuple,
> ...            dict),
> ...       dict_foldl)

Now that we have a concrete example... This looks like a nifty translation of what you might write in Haskell, but it doesn't look at all like Python to me. 

And compare:

    def f(d):
        pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
        strippedpairs = ((part.strip(' ') for part in pair) for pair in pairs)
        return dict(strippedpairs)

Or, even better:

    def f(d):
        pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
        return {k.strip(' '): v.strip(' ') for k, v in pairs}

Of course I skipped a lot of steps--turning the inner iterables into tuples, then into dicts, then turning the outer iterable into a list, then merging all the dicts, and of course wrapping various subsets of the process up into functions and calling them--but that's because those steps are unnecessary. We have comprehensions, we have iterators, why try to write for Python 2.2?

And notice that any chain of iterator transformations like this _could_ be written as a single expression. But the fact that it doesn't _have_ to be--that you can take any step you want and name the intermediate iterable without having to change anything (and with negligible performance cost), and you can make your code vertical and play into Python indentation instead of writing it horizontally and faking indentation with paren-continuation--is what makes generator expressions and map and filter so nice.

Well, that, and the fact that in a comprehension I can just write an expression and it means that expression. I don't have to wrap the expression in a function, or try to come up with a higher-order expression that will effect that first-order expression when evaluated.

>>>> f(d)
> {'foo00': 'bar00',
> 'foo01': 'bar01',
> 'foo10': 'bar10',
> 'foo11': 'bar11',
> 'foo20': 'bar10',
> 'foo21': 'bar21'}
> 
> The combination of `compose`, `fmap`, and `Fmap` can be amazingly powerful for doing lots of work in a neat way while keeping the focus on the pipeline itself and not the individual values passing through.

But often, the individual values have useful names that make it easier to keep track of them. Like calling the keys and values k and v instead of having them be elements 0 and 1 of an implicit *args.

> The other thing is that this opens the door to a full "algebra" of maps which is kind of insane:
> 
> def mapeach(*fns):
>    def mapeach_(*xs): 
>        return list(map(lambda fn, *x: fn(*x), fns, *xs))
>    return mapeach_
> 
> def product_map(fns):
>    return lambda xs: list(map(lambda x: map(lambda fn: fn(x), fns), xs))
> 
> def smap(*fns):
>    "star map"
>    return lambda xs: list(map(O(*fns),*xs))
> 
> def pmap(*fns):
>    return lambda *xs: list(map(lambda *x:list(map(lambda fn:fn(*x),fns)),*xs))
> 
> def matrix_map(*_fns):
>    def matrix_map_(*_xs):
>        return list(map(lambda fns, xs: list(map(lambda fn, x: fmap(fn)(x), fns, xs)), _fns, _xs))
>    return matrix_map_
> 
> def mapcat(*fn):
>    "clojure-inspired?"
>    return compose(fmap(*fn), freduce(list.__add__))
> 
> def filtercat(*fn):
>    return compose(ffilter(*fn), freduce(list.__add__))
> 
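Of this batch, mapcat at least has a well-established reading (clojure's mapcat, Haskell's concatMap). A self-contained sketch, simplified here to single-function arguments:

```python
from functools import reduce

def fmap(fn):
    # list-returning curried map, as in the earlier definitions
    return lambda xs: list(map(fn, xs))

def freduce(fn):
    return lambda xs: reduce(fn, xs)

def mapcat(fn):
    # map fn over xs (each result a list), then concatenate the results
    mapped = fmap(fn)
    cat = freduce(list.__add__)
    return lambda xs: cat(mapped(xs))
```

For example, mapcat(lambda x: [x, x])([1, 2]) returns [1, 1, 2, 2].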
> I rarely use any of these. They grew out of an attempt to tease out some hidden structure behind the combination of `map` and star packing/unpacking.
> 
> I do think there's something there but the names get in the way--it would be better to find a way to define a function that takes a specification of the structures of functions and values and knows what to do, e.g. something like
> 
>>>> from types import FunctionType
>>>> fn = FunctionType
>>>> # then the desired/imaginary version of map...
>>>> _map(fn, [int])(add(1))(range(5)) # sort of like `fmap`
> [1,2,3,4,5]
>>>> _map([fn], [int])((add(x) for x in range(5)))(range(5)) # sort of like `mapeach`
> [0,2,4,6,8]
>>>> _map([[fn]], [[int]])(((add(x) for x in range(5))*10))((list(range(5)))*10) # sort of like `matrix_map`
> [[[0, 1, 2, 3, 4],
>  [1, 2, 3, 4, 5],
>  [2, 3, 4, 5, 6],
>  [3, 4, 5, 6, 7],
>  [4, 5, 6, 7, 8],
>  [0, 1, 2, 3, 4],
>  [1, 2, 3, 4, 5],
>  [2, 3, 4, 5, 6],
>  [3, 4, 5, 6, 7],
>  [4, 5, 6, 7, 8]]]
> 
> In most cases the first argument would just be `fn`, but it would be *really* nice to be able to do something like
> 
>>>> map(fn, [[int], [[int],[[str],[str]]]])
> 
> where all you need to do is give the schema and indicate which values to apply the function to. Giving the type would be an added measure, but passing `type` in the schema for unknowns should work just as well.
> ________________________________________
> From: Python-ideas <python-ideas-bounces+larocca=abiresearch.com at python.org> on behalf of Steven D'Aprano <steve at pearwood.info>
> Sent: Saturday, May 09, 2015 11:20 PM
> To: python-ideas at python.org
> Subject: Re: [Python-ideas] Function composition (was no subject)
> 
>> On Sun, May 10, 2015 at 03:51:38AM +0300, Koos Zevenhoven wrote:
>> 
>> Another way to deal with elementwise operations on iterables would be to
>> make a small, mostly backwards compatible change in map:
>> 
>> When map is called with just one argument, for instance map(square), it
>> would return a function that takes iterables and maps them element-wise.
>> 
>> Now it would be easier to use map in pipelines, for example:
>> 
>> rms = sqrt @ mean @ map(square)
> 
> Or just use a tiny helper function:
> 
> def vectorise(func):
>    return partial(map, func)
> 
> rms = sqrt @ mean @ vectorise(square)
> 
> 
> --
> Steve
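Worth noting that Steven's helper works today as a plain call chain; only the `@` composition operator on functions is hypothetical. A runnable sketch:

```python
from functools import partial
from math import sqrt

def vectorise(func):
    # partial(map, func) is a callable that maps func over its iterable argument
    return partial(map, func)

square = lambda x: x * x
assert list(vectorise(square)([1, 2, 3])) == [1, 4, 9]

# the rms pipeline spelled as calls rather than with a composition operator
def mean(xs):
    xs = list(xs)
    return sum(xs) / len(xs)

rms = lambda xs: sqrt(mean(vectorise(square)(xs)))
assert rms([3, 4]) == sqrt(12.5)
```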
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
