Re: [Python-ideas] Function composition (was no subject)
On 10.5.2015 2:28, Ivan Levkivskyi wrote:
functions. In other words I agree with Andrew that "elementwise" is a good match with compose, and what we really need is to "pipe" things that take a vector (or just an iterable) and return a vector (iterable).
So probably a good place (in a potential future) for compose would be not functools but itertools. But indeed a good place to test this would be NumPy.
Another way to deal with elementwise operations on iterables would be to make a small, mostly backwards compatible change in map: when map is called with just one argument, for instance map(square), it would return a function that takes iterables and maps them element-wise.

Now it would be easier to use map in pipelines, for example:

rms = sqrt @ mean @ map(square)

or

values->map(square)->mean->sqrt()

Or if the change in map is not popular, there could be something like functools.mapper(func) that does that. Or even something more crazy, like square.map(seq), so that square.map could be used in pipelines.

-- Koos
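(A rough sketch of what the `square.map(seq)` spelling could look like with today's Python; the decorator name `with_map` is made up and not part of the proposal.)

```
from functools import partial

def with_map(fn):
    # Attach an element-wise variant of fn as an attribute on fn itself.
    fn.map = partial(map, fn)
    return fn

@with_map
def square(x):
    return x * x

print(list(square.map(range(4))))   # [0, 1, 4, 9]
```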
On Sun, May 10, 2015 at 03:51:38AM +0300, Koos Zevenhoven wrote:
Another way to deal with elementwise operations on iterables would be to make a small, mostly backwards compatible change in map:
When map is called with just one argument, for instance map(square), it would return a function that takes iterables and maps them element-wise.
Now it would be easier to use map in pipelines, for example:
rms = sqrt @ mean @ map(square)
Or just use a tiny helper function:

def vectorise(func):
    return partial(map, func)

rms = sqrt @ mean @ vectorise(square)

-- Steve
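(The `rms` pipelines above assume functions support `@`, which plain Python functions don't today. A rough sketch of how the same spelling could be emulated with a small wrapper class defining `__matmul__`; the wrapper name `C` is made up and not part of any proposal.)

```
from functools import partial
from math import sqrt
from statistics import mean

class C:
    """Wrap a callable so that `f @ g` builds the composition f(g(...))."""
    def __init__(self, fn):
        self.fn = fn
    def __call__(self, *args, **kwargs):
        return self.fn(*args, **kwargs)
    def __matmul__(self, other):
        return C(lambda *a, **kw: self.fn(other(*a, **kw)))

def vectorise(func):
    return partial(map, func)

def square(x):
    return x * x

rms = C(sqrt) @ C(mean) @ vectorise(square)
print(rms([3, 4]))   # sqrt(mean([9, 16])) == 3.5355...
```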
(Newcomer here.)

I use function composition pretty extensively. I've found it to be incredibly powerful, but can lead to bad practices. Certain other drawbacks are there as well, like unreadable tracebacks. But in many cases there are real benefits. And for data pipelines where you want to avoid state and mutation it works well.

The fn and pymonad modules implement infix composition functions through overloading but I've found this to be unworkable.

For me, the ideal infix operator would simply be a space, with the composition wrapped in parentheses. So e.g.
(list str sorted)(range(10)) [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ',', ',', ',', ',', ',', ',', ',', ',', ',', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '[', ']']
I might be overlooking something, but it seems to me this would work with existing syntax and semantics and wouldn't conflict with anything else like operator overloading would. The only other place non-indentation level spaces are significant is with keywords which can't be re-assigned. So e.g. (yield from gen()) wouldn't be parsed as 3 functions, and (def func) would raise SyntaxError.

Here's the composition function I'm working with, stripped of the little debugging helpers:

```
def compose(*fns):
    def compose_(*x):
        first, *rest = fns
        value = first(*x)
        if rest:
            return compose(*rest)(value)
        else:
            return value
    return compose_

O = compose
```

I haven't had any issues with the recursion. The `O` alias rubs me the wrong way but seemed to make sense at the time. The thought was that it should look like an operator because it acts like one.

So the use looks like
O(fn1, fn2, fn3, ...)('string to be piped')
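(Not Douglas's code, but for comparison: the same left-to-right composition can be sketched iteratively with `functools.reduce`, which avoids the recursion entirely.)

```
from functools import reduce

def compose_iter(*fns):
    def composed(*args):
        first, *rest = fns
        # Feed the first result through the remaining functions in order.
        return reduce(lambda value, fn: fn(value), rest, first(*args))
    return composed

print(compose_iter(str.strip, str.upper)('  piped  '))   # 'PIPED'
```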
The problem for composition is essentially argument passing and has to do with the convenience of *args, **kwargs. The way to make composition work predictably is to curry the functions yourself, wrapping the arguments you expect to get with nested closures, then repairing the __name__ etc with functools.wraps or update_wrapper in the usual way. This looks much nicer and almost natural when you write it with lambdas, e.g.
getitem = lambda item: lambda container: container[item]
(Apologies for having named that lambda there...)

The other way to manage passing values from one function to the next is to define a function like

def star(fn):
    return lambda x: fn(*x)

Then if you get a list at one point in the pipeline and your function takes *args, you can decorate the function and call it like
star(getattr)((getattr, '__name__')) 'getattr'
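(A minimal sketch of the hand-currying plus `functools.wraps` repair described above; the `curried` helper and the `pair` example are hypothetical names, not from the post.)

```
import functools

def curried(fn):
    """Turn a two-argument function into fn(a)(b), keeping fn's metadata."""
    @functools.wraps(fn)
    def outer(a):
        def inner(b):
            return fn(a, b)
        return inner
    return outer

@curried
def pair(a, b):
    return (a, b)

print(pair.__name__)   # 'pair', repaired by functools.wraps
print(pair(1)(2))      # (1, 2)
```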
I've run into problems using the @curried decorators from the fn and pymonad modules because they don't know how to handle *args, i.e. when to stop collecting arguments and finally make the function call.

If you want to have the composition order reversed you could decorate the definition with

```
def flip(f):
    def flip_(*x):
        return f(*reversed(x))
    return flip_
```

Once we have composition we can write partials for `map`, `filter`, and `reduce`, but with a small twist: make them variadic in the first argument and pass the arguments to compose:

def fmap(*fn):
    def fmap_(x):
        return list(map(compose(*fn), x))
    return fmap_

def ffilter(fn):
    def ffilter_(xs):
        return list(filter(fn, xs))
    return ffilter_

def freduce(fn):
    def _freduce(xs):
        return reduce(fn, xs)
    return _freduce

def Fmap(*fns):
    def Fmap_(x):
        return list(map(lambda fn: fn(x), fns))
    return Fmap_

The `Fmap` function seemed like some sort of "conjugate" to `fmap` so I tried to give it a name suggesting this (again, at the expense of abusing naming conventions). Instead of mapping a function over an iterable like `fmap`, `Fmap` applies each given function to a value. So
Fmap(add(1), sub(1))(1) [2, 0]
I've called them `fmap`, `ffilter`, and `freduce` but don't much like these names as they imply they might be the same as Haskell's `fmap`, and they're not. And there's no way to make them anything like Haskell as far as I can tell and they shouldn't be. If these implement a "paradigm" it's not purely functional but tacit/concatenative.

It made sense to compose the passed arguments because there's no reason to pass anything else to `fmap` in the first call. So sequential calls to (the return value of) `fmap` inside a pipeline, like
O(mul(10),
  fmap(add(1)),
  fmap(mul(2))
  )([1])
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
can instead be written like
O(mul(10),
  fmap(add(1),
       mul(2))
  )([1])
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
It also makes it easier to work at different levels inside nested structures. In these heavily nested cases the composition pipeline even begins to resemble the data structure passing through, which makes sense. As another example, the following is part of a pipeline that takes strings of bullet-separated strings of "key:value" pairs and converts each one to a dictionary, then folds the result together:
d = [' foo00 : bar00 • foo01 : bar01 ',
     ' foo10 : bar10 • foo11 : bar11 ',
     ' foo20 : bar10 • foo21 : bar21 ',]
dict_foldl = freduce(lambda d1, d2: dict(d1, **d2))
strip = lambda x: lambda s: s.strip(x)
split = lambda x: lambda s: s.split(x)
f = O(fmap(strip(' '),
           split('•'),
           fmap(split(':'),
                strip(' '),
                tuple),
           tuple,
           dict),
      dict_foldl)
f(d) {'foo00': 'bar00', 'foo01': 'bar01', 'foo10': 'bar10', 'foo11': 'bar11', 'foo20': 'bar10', 'foo21': 'bar21'}
The combination of `compose`, `fmap`, and `Fmap` can be amazingly powerful for doing lots of work in a neat way while keeping the focus on the pipeline itself and not the individual values passing through.

The other thing is that this opens the door to a full "algebra" of maps which is kind of insane:

def mapeach(*fns):
    def mapeach_(*xs):
        return list(map(lambda fn, *x: fn(*x), fns, *xs))
    return mapeach_

def product_map(fns):
    return lambda xs: list(map(lambda x: map(lambda fn: fn(x), fns), xs))

def smap(*fns):
    "star map"
    return lambda xs: list(map(O(*fns), *xs))

def pmap(*fns):
    return lambda *xs: list(map(lambda *x: list(map(lambda fn: fn(*x), fns)), *xs))

def matrix_map(*_fns):
    def matrix_map_(*_xs):
        return list(map(lambda fns, xs: list(map(lambda fn, x: fmap(fn)(x), fns, xs)), _fns, _xs))
    return matrix_map_

def mapcat(*fn):
    "clojure-inspired?"
    return compose(fmap(*fn), freduce(list.__add__))

def filtercat(*fn):
    return compose(ffilter(*fn), freduce(list.__add__))

I rarely use any of these. They grew out of an attempt to tease out some hidden structure behind the combination of `map` and star packing/unpacking. I do think there's something there but the names get in the way--it would be better to find a way to define a function that takes a specification of the structures of functions and values and knows what to do, e.g. something like
from types import FunctionType
fn = FunctionType

# then the desired/imaginary version of map...

_map(fn, [int])(add(1))(range(5))  # sort of like `fmap`
[1,2,3,4,5]

_map([fn], [int])((add(x) for x in range(5)))(range(5))  # sort of like `mapeach`
[0,2,4,6,8]

_map([[fn]], [[int]])(((add(x) for x in range(5))*10))((list(range(5)))*10)  # sort of like `matrix_map`
[[[0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], [4, 5, 6, 7, 8], [0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], [4, 5, 6, 7, 8]]]
In most cases the first argument would just be `fn`, but it would be *really* nice to be able to do something like
map(fn, [[int], [[int],[[str],[str]]]])
where all you need to do is give the schema and indicate which values to apply the function to. Giving the type would be an added measure, but passing `type` in the schema for unknowns should work just as well.
On Sun, May 10, 2015 at 04:58:29AM +0000, Douglas La Rocca wrote:
(Newcomer here.)
I use function composition pretty extensively. I've found it to be incredibly powerful, but can lead to bad practices. Certain other drawbacks are there as well, like unreadable tracebacks. But in many cases there are real benefits. And for data pipelines where you want to avoid state and mutation it works well.
Thanks for the well-thought out and very detailed post! The concrete experience you bring to this discussion is a welcome change from all the theoretical "wouldn't it be nice (or awful) if ..." from many of us, and I include myself. The fact that you have extensive experience with using function composition in practice, and can point out the benefits and disadvantages, is great. -- Steve
Thanks! Not sure what took me so long to get on the python lists, but I finally did and to my excitement you were talking about my favorite topic!

---

For replacing the need to write `lambda x: x...` inside compositions *in a limited set of cases*, you could use a sort of "doppelganger" type/metaclass:

class ThisType(type):
    def __getattr__(cls, attr):
        def attribute(*args, **kwargs):
            def method(this):
                this_attr = getattr(this, attr)
                if callable(this_attr):
                    return this_attr(*args, **kwargs)
                else:
                    return this_attr
            return method
        return attribute

    def __call__(cls, *args, **kwargs):
        def decorator(fn):
            return fn(*args, **kwargs)
        return decorator

    def __getitem__(cls, item):
        return lambda x: x[item]

class this(metaclass=ThisType):
    pass

Basically, it records whatever is done to it, then returns a function that takes a value and does those things to the value. So any call, __getattr__ with arguments, and __getitem__ you'd want to do with a value mid-pipe would be staged or set up by doing them to `this`.

So rather than writing
compose(lambda s: s.strip('<>'), lambda s: s.lower())('<HTML>')
you can write
compose(this.strip('<>'), this.lower())('<HTML>') 'html'
or
compose(float, this.__str__)('1') '1.0'
But there are two caveats. The first is that property attributes would need to be *called*, which feels weird when you already know an API well, so e.g.
from lxml import html

html.fromstring('<b>bold text</b>').text
'bold text'

compose(html.fromstring, this.text())('<b>bold text</b>')
'bold text'
It's also a bit weird because attributes that return functions/methods/callables *aren't* called (like above with `this.__str__`: `__str__` is a method of `float`). The second caveat is that nothing past the __getitem__ and __getattr__ will work, so e.g.
from pandas import DataFrame

df = DataFrame([1]*2, columns=['A','B'])
   A  B
0  1  1
1  1  1

compose(this.applymap(str), this['A'])(df)
0    1
1    1
Name: A, dtype: object

compose(this.applymap(str), this['A'], this.shape())(df)
(2,)
...but...
compose(this.applymap(str), this['A'].shape)(df)
AttributeError: 'function' object has no attribute 'shape'
On May 9, 2015, at 21:58, Douglas La Rocca <larocca@abiresearch.com> wrote:
(Newcomer here.)
I use function composition pretty extensively. I've found it to be incredibly powerful, but can lead to bad practices. Certain other drawbacks are there as well, like unreadable tracebacks. But in many cases there are real benefits. And for data pipelines where you want to avoid state and mutation it works well.
The fn and pymonad modules implement infix composition functions through overloading but I've found this to be unworkable.
For me, the ideal infix operator would simply be a space, with the composition wrapped in parentheses. So e.g.
(list str sorted)(range(10)) [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ',', ',', ',', ',', ',', ',', ',', ',', ',', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '[', ']']
I might be overlooking something, but it seems to me this would work with existing syntax and semantics and wouldn't conflict with anything else like operator overloading would. The only other place non-indentation level spaces are significant is with keywords which can't be re-assigned. So e.g. (yield from gen()) wouldn't be parsed as 3 functions, and (def func) would raise SyntaxError.
Here's the composition function I'm working with, stripped of the little debugging helpers:
```
def compose(*fns):
    def compose_(*x):
        first, *rest = fns
        value = first(*x)
        if rest:
            return compose(*rest)(value)
        else:
            return value
    return compose_

O = compose
```
I haven't had any issues with the recursion. The `O` alias rubs me the wrong way but seemed to make sense at the time. The thought was that it should look like an operator because it acts like one.
So the use looks like
O(fn1, fn2, fn3, ...)('string to be piped')
The problem for composition is essentially argument passing and has to do with the convenience of *args, **kwargs.
The way to make composition work predictably is to curry the functions yourself, wrapping the arguments you expect to get with nested closures, then repairing the __name__ etc with functools.wraps or update_wrapper in the usual way. This looks much nicer and almost natural when you write it with lambdas, e.g.
getitem = lambda item: lambda container: container[item]
(Apologies for having named that lambda there...)
I understand why you named it; I don't understand why you didn't just use def if you were going to name it (and declare it in a statement instead of the middle of an expression). Anyway, this is already in operator, as itemgetter, and it's definitely useful to functional code, especially itertools-style generator-driven functional code. And it feels like the pattern ought to be generalizable... but other than attrgetter, it's hard to think of another example where you want the same thing. After all, Python only has a couple of syntactic forms that you'd want to wrap up as functions at all, so it only has a couple of syntactic forms that you'd want to wrap up as curried functions.
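(For reference, the stdlib helpers mentioned here look like this in use; nothing below is new API, just `operator.itemgetter` and `operator.attrgetter`.)

```
from operator import itemgetter, attrgetter

get_name = itemgetter('name')       # roughly: lambda obj: obj['name']
print(get_name({'name': 'spam'}))   # 'spam'

get_imag = attrgetter('imag')       # roughly: lambda obj: obj.imag
print(get_imag(3 + 4j))             # 4.0
```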
The other way to manage passing values from one function to the next is to define a function like
def star(fn): return lambda x: fn(*x)
Then if you get a list at one point in the pipeline and your function takes *args, you can decorate the function and call it like
star(getattr)((getattr, '__name__')) 'getattr'
I've run into problems using the @curried decorators from the fn and pymonad modules because they don't know how to handle *args, i.e. when to stop collecting arguments and finally make the function call.
If you want to have the composition order reversed you could decorate the definition with
```
def flip(f):
    def flip_(*x):
        return f(*reversed(x))
    return flip_
```
Once we have composition we can write partials for `map`, `filter`, and `reduce`, but with a small twist: make them variadic in the first argument and pass the arguments to compose:
def fmap(*fn):
    def fmap_(x):
        return list(map(compose(*fn), x))
    return fmap_
I don't understand why this is called fmap. I see below that you're not implying anything like Haskell's fmap (which confused me...), but then what _does_ the f mean? It seems like this is just a manually curried map, that returns a list instead of an iterator, and only takes one iterable instead of one or more. None of those things say "f" to me, but maybe I'm still hung up on expecting it to mean "functor" and I'll feel like an idiot once you clear it up. :) Also, why _is_ it calling list? Do your notions of composition and currying not play well with iterators? If so, that seems like a pretty major thing to give up. And why isn't it variadic in the iterables? You can trivially change that by just having the wrapped function take and pass *x, but I assume there's some reason you didn't?
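(A tiny sketch of the variadic change being suggested, with a throwaway `compose` included so it runs on its own; the name `fmap_v` is hypothetical, not from the thread.)

```
def compose(*fns):
    def composed(*args):
        first, *rest = fns
        value = first(*args)
        for fn in rest:
            value = fn(value)
        return value
    return composed

def fmap_v(*fns):
    def fmap_(*iterables):
        # Take and pass *iterables, so fmap_v accepts one or more
        # iterables just like the builtin map.
        return list(map(compose(*fns), *iterables))
    return fmap_

print(fmap_v(lambda a, b: a + b)([1, 2], [10, 20]))   # [11, 22]
```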
def ffilter(fn):
    def ffilter_(xs):
        return list(filter(fn, xs))
    return ffilter_
def freduce(fn):
    def _freduce(xs):
        return reduce(fn, xs)
    return _freduce
These two aren't variadic in fn like fmap was. Is that just a typo, or is there a reason not to be?
def Fmap(*fns):
    def Fmap_(x):
        return list(map(lambda fn: fn(x), fns))
    return Fmap_
The `Fmap` function seemed like some sort of "conjugate" to `fmap` so I tried to give it a name suggesting this (again, at the expense of abusing naming conventions).
Instead of mapping a function over an iterable like `fmap`, `Fmap` applies each given function to a value. So
Fmap(add(1), sub(1))(1) [2, 0]
I've called them `fmap`, `ffilter`, and `freduce` but don't much like these names as they imply they might be the same as Haskell's `fmap`, and they're not. And there's no way to make them anything like Haskell as far as I can tell and they shouldn't be. If these implement a "paradigm" it's not purely functional but tacit/concatenative.
It made sense to compose the passed arguments because there's no reason to pass anything else to `fmap` in the first call. So sequential calls to (the return value of) `fmap` inside a pipeline, like
O(mul(10),
  fmap(add(1)),
  fmap(mul(2))
  )([1])
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
can instead be written like
O(mul(10),
  fmap(add(1),
       mul(2))
  )([1])
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
It also makes it easier to work at different levels inside nested structures. In these heavily nested cases the composition pipeline even begins to resemble the data structure passing through, which makes sense.
As another example, the following is part of a pipeline that takes strings of bullet-separated strings of "key:value" pairs and converts each one to a dictionary, then folds the result together:
d = [' foo00 : bar00 • foo01 : bar01 ',
     ' foo10 : bar10 • foo11 : bar11 ',
     ' foo20 : bar10 • foo21 : bar21 ',]
dict_foldl = freduce(lambda d1, d2: dict(d1, **d2))
strip = lambda x: lambda s: s.strip(x)
split = lambda x: lambda s: s.split(x)
f = O(fmap(strip(' '),
           split('•'),
           fmap(split(':'),
                strip(' '),
                tuple),
           tuple,
           dict),
      dict_foldl)
Now that we have a concrete example... This looks like a nifty translation of what you might write in Haskell, but it doesn't look at all like Python to me. And compare:

def f(d):
    pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
    strippedpairs = ((part.strip(' ') for part in pair) for pair in pairs)
    return dict(strippedpairs)

Or, even better:

def f(d):
    pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
    return {k.strip(' '): v.strip(' ') for k, v in pairs}

Of course I skipped a lot of steps--turning the inner iterables into tuples, then into dicts, then turning the outer iterable into a list, then merging all the dicts, and of course wrapping various subsets of the process up into functions and calling them--but that's because those steps are unnecessary. We have comprehensions, we have iterators, why try to write for Python 2.2?

And notice that any chain of iterator transformations like this _could_ be written as a single expression. But the fact that it doesn't _have_ to be--that you can take any step you want and name the intermediate iterable without having to change anything (and with negligible performance cost), and you can make your code vertical and play into Python indentation instead of writing it horizontally and faking indentation with paren-continuation--is what makes generator expressions and map and filter so nice.

Well, that, and the fact that in a comprehension I can just write an expression and it means that expression. I don't have to wrap the expression in a function, or try to come up with a higher-order expression that will effect that first-order expression when evaluated.
f(d) {'foo00': 'bar00', 'foo01': 'bar01', 'foo10': 'bar10', 'foo11': 'bar11', 'foo20': 'bar10', 'foo21': 'bar21'}
The combination of `compose`, `fmap`, and `Fmap` can be amazingly powerful for doing lots of work in a neat way while keeping the focus on the pipeline itself and not the individual values passing through.
But often, the individual values have useful names that make it easier to keep track of them. Like calling the keys and values k and v instead of having them be elements 0 and 1 of an implicit *args.
The other thing is that this opens the door to a full "algebra" of maps which is kind of insane:
def mapeach(*fns):
    def mapeach_(*xs):
        return list(map(lambda fn, *x: fn(*x), fns, *xs))
    return mapeach_

def product_map(fns):
    return lambda xs: list(map(lambda x: map(lambda fn: fn(x), fns), xs))

def smap(*fns):
    "star map"
    return lambda xs: list(map(O(*fns), *xs))

def pmap(*fns):
    return lambda *xs: list(map(lambda *x: list(map(lambda fn: fn(*x), fns)), *xs))

def matrix_map(*_fns):
    def matrix_map_(*_xs):
        return list(map(lambda fns, xs: list(map(lambda fn, x: fmap(fn)(x), fns, xs)), _fns, _xs))
    return matrix_map_

def mapcat(*fn):
    "clojure-inspired?"
    return compose(fmap(*fn), freduce(list.__add__))

def filtercat(*fn):
    return compose(ffilter(*fn), freduce(list.__add__))
I rarely use any of these. They grew out of an attempt to tease out some hidden structure behind the combination of `map` and star packing/unpacking.
I do think there's something there but the names get in the way--it would be better to find a way to define a function that takes a specification of the structures of functions and values and knows what to do, e.g. something like
from types import FunctionType
fn = FunctionType

# then the desired/imaginary version of map...

_map(fn, [int])(add(1))(range(5))  # sort of like `fmap`
[1,2,3,4,5]

_map([fn], [int])((add(x) for x in range(5)))(range(5))  # sort of like `mapeach`
[0,2,4,6,8]

_map([[fn]], [[int]])(((add(x) for x in range(5))*10))((list(range(5)))*10)  # sort of like `matrix_map`
[[[0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], [4, 5, 6, 7, 8], [0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6], [3, 4, 5, 6, 7], [4, 5, 6, 7, 8]]]
In most cases the first argument would just be `fn`, but it would be *really* nice to be able to do something like
map(fn, [[int], [[int],[[str],[str]]]])
where all you need to do is give the schema and indicate which values to apply the function to. Giving the type would be an added measure, but passing `type` in the schema for unknowns should work just as well.
I understand why you named it; I don't understand why you didn't just use def if you were going to name it (and declare it in a statement instead of the middle of an expression). Anyway, this is already in operator, as itemgetter, and it's definitely useful to functional code, especially itertools-style generator-driven functional code. And it feels like the pattern ought to be generalizable... but other than attrgetter, it's hard to think of another example where you want the same thing. After all, Python only has a couple of syntactic forms that you'd want to wrap up as functions at all, so it only has a couple of syntactic forms that you'd want to wrap up as curried functions.
Sorry for the confusion here--I was trying to say that it's correct to use def in order to properly set __name__, give space for doc strings, etc. The downside is that the nesting can strain readability. I was only showing the "formal" equivalent in lambda-style to point out how currying arguments isn't really confusing at all, considering

lambda x: lambda y: lambda z: <some expression of x, y, z>

begins to resemble syntactically

def anon(x, y, z):
    <some expression of x, y, z>

(obviously semantically these are different).

Regarding the `getitem` example, this wasn't intended as a use-case. It's true Python has few syntactic forms you'd want to wrap (isinstance, hasattr, etc.). I mostly had external module apis in mind here.
I don't understand why this is called fmap. I see below that you're not implying anything like Haskell's fmap (which confused me...), but then what _does_ the f mean? It seems like this is just a manually curried map, that returns a list instead of an iterator, and only takes one iterable instead of one or more. None of those things say "f" to me, but maybe I'm still hung up on expecting it to mean "functor" and I'll feel like an idiot once you clear it up. :)
Also, why _is_ it calling list? Do your notions of composition and currying not play well with iterators? If so, that seems like a pretty major thing to give up. And why isn't it variadic in the iterables? You can trivially change that by just having the wrapped function take and pass *x, but I assume there's some reason you didn't?
It was only called fmap to leave the builtin map in the namespace, the 'f' just meant 'function'.

Taking a single iterable as the first item rather than varargs avoids the use of the `star` shim in the composition. I do use a wrapper `s` for this but find it ugly to use. It's basically a conventional decision that's forced by the difference between passing a single value to a "monadic" (in the APL not Haskell sense) function and a variadic function. In my own util library this also shows up as two versions of the identity function:

def identity(x):
    return x

def identity_star(*x):
    return x

It will seem these are useless but their purpose becomes felt when you're in the middle of a composition.

For data structures where you want to map over lists of lists of lists etc., you can either define a higher map or do something like

fmap(fmap(fmap(function_to_apply)))(iterable)

which would incidentally be the same as the uglier

compose(*(fmap,)*3)(function_to_apply)(iterable)

though the latter makes it possible to parametrize the iteration depth (see the sketch below).

As for wrapping in `list`--in some cases (I can't immediately recall them all) the list actually needed to be built in order for the composition to work. A simple case would be

compose(mul(10), fmap(len), len)([[1]*10]*10)

which would raise a TypeError if fmap returned a lazy map object, since the final len() can't be taken of a map iterator. I should look again to see if there's a better way to fix it. But I reverted the default back to the 2.x behaviour because I made full use of generators before moving to 3.x and decided I didn't need map to be lazy. To be honest, the preference for everything to be lazy seems somewhat fashionable at the moment... you can get along just as well knowing where things shouldn't be fully loaded into memory (i.e. when to use a generator).
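(A rough sketch of the depth-parametrized mapping mentioned above, using a simplified one-function `fmap`; the helper name `nested_fmap` is made up, not from the thread.)

```
def fmap(fn):
    def fmap_(xs):
        return list(map(fn, xs))
    return fmap_

def nested_fmap(fn, depth):
    """Apply fn to the leaves of a list nested `depth` levels deep."""
    for _ in range(depth):
        fn = fmap(fn)
    return fn

data = [[[1, 2], [3]], [[4]]]
print(nested_fmap(lambda x: x + 1, 3)(data))   # [[[2, 3], [4]], [[5]]]
```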
These two aren't variadic in fn like fmap was. Is that just a typo, or is there a reason not to be?
Yes just a typo!
Now that we have a concrete example... This looks like a nifty translation of what you might write in Haskell, but it doesn't look at all like Python to me.
And compare:
def f(d):
    pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
    strippedpairs = ((part.strip(' ') for part in pair) for pair in pairs)
    return dict(strippedpairs)
Or, even better:
def f(d):
    pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
    return {k.strip(' '): v.strip(' ') for k, v in pairs}
Of course I skipped a lot of steps--turning the inner iterables into tuples, then into dicts, then turning the outer iterable into a list, then merging all the dicts, and of course wrapping various subsets of the process up into functions and calling them--but that's because those steps are unnecessary. We have comprehensions, we have iterators, why try to write for Python 2.2?
I agree these work just as well.
And notice that any chain of iterator transformations like this _could_ be written as a single expression. But the fact that it doesn't _have_ to be--that you can take any step you want and name the intermediate iterable without having to change anything (and with negligible performance cost), and you can make your code vertical and play into Python indentation instead of writing it horizontally and faking indentation with paren-continuation--is what makes generator expressions and map and filter so nice.
Well, that, and the fact that in a comprehension I can just write an expression and it means that expression. I don't have to wrap the expression in a function, or try to come up with a higher-order expression that will effect that first-order expression when evaluated.
But often, the individual values have useful names that make it easier to keep track of them. Like calling the keys and values k and v instead of having them be elements 0 and 1 of an implicit *args.
I agree for the most part, but there are cases where you're really deep into some structure, manipulating the values in a generic way, and the names *do* get in the way. The temptation for me in those cases is to use x, y, z, s, t, etc. At this point the readability really suffers. The alternative is to modularize more, breaking the functions apart, but this only helps so much... In a certain way I find `(pair.strip(' ').split(':') for pair in d.split('•'))` to be less readable than the first steps in the composition--with the generator I'm reading back and forth in order to find out what's happening whereas the composition + map outlines the steps in a tree-like structure.
On Sun, May 10, 2015 at 2:58 PM, Douglas La Rocca <larocca@abiresearch.com> wrote:
(Newcomer here.)
Welcome to Bikeshed Central! Here, we take a plausible idea and fiddle around with all the little detaily bits :)
For me, the ideal infix operator would simply be a space, with the composition wrapped in parentheses. So e.g.
(list str sorted)(range(10)) [' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ',', ',', ',', ',', ',', ',', ',', ',', ',', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '[', ']']
I might be overlooking something, but it seems to me this would work with existing syntax and semantics and wouldn't conflict with anything else like operator overloading would.
One of the problems with using mere whitespace is that it's very easy to do accidentally. There's already places where this can happen, for instance:

strings = [
    "String one",
    "String two is a bit longer"
    "String three",
    "String four"
]

How many strings are there in my list? Clearly the programmer's intention is to have four, but that's not what ends up happening. (Now imagine there are actually hundreds of strings, and one gets selected at random every time you do something. Have fun figuring out why, just occasionally, it prints out two messages instead of one. For bonus points, figure that out when there are two or three such bugs in the list, so it's not always the exact same pair that come out together.)

At the moment, we can safely build up a list of functions like this:

funcs = [
    list,
    str,
    sorted,
]

because omitting a comma will produce an instant SyntaxError.

Python currently is pretty good at detecting problems in source code. (Not all languages are, as you'll know as soon as you run into one of those "oops I left out a semicolon and my JavaScript function does something slightly different" bugs.) Part of that comes from having a fairly simple set of rules governing syntax, such that any deviation results in a simple and quick error *at or very near to* the place where the error occurs. You won't, for instance, get an error at the bottom of a file saying "Unmatched '{' or missing '}'", leaving you to dig through your code to figure out exactly where the problem was. At worst, you get an error on the immediately-following line of code:

def func1():
    value = x * (y + z    # oops, forgot the close parens
    print(value)          # boom, SyntaxError on this line

But if "function function" meant composition, this would actually be legal, and you'd get an error rather further down. If you're lucky, this is the end of this function, and the "def" keyword trips the error; but otherwise, this would be validly parsed as "compose z and print into a function, then call that with value", and we're still looking for a close parens.

So I would strongly suggest having some sort of operator in between.

Okay. Can I just say something crazy? (Hans: I love crazy!) How about using a comma?
(fn1, fn2, fn3, ...)('string to be piped')
Currently, this produces a runtime TypeError: 'tuple' object is not callable, but I could easily define my own callable subclass of tuple.
class functuple(tuple):
    def __call__(self, arg):
        for func in self:
            arg = func(arg)
        return arg

f = functuple((fn1, fn2))
f("this is a test")
(Use whatever semantics you like for handling multiple arguments. I'm not getting into that part of the debate, as I have no idea how function composition ought to work in the face of *args and **kwargs.)

The syntax is reasonably clean, and it actually doesn't require many changes - just making tuples callable in some logical fashion. No new syntax needed, and it's an already-known light-weight way to pack up a bunch of things into one object. Does it make sense to do this?

ChrisA
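(A runnable version of the sketch above, with concrete functions substituted for the placeholder fn1/fn2.)

```
class functuple(tuple):
    def __call__(self, arg):
        for func in self:
            arg = func(arg)
        return arg

f = functuple((str.strip, str.upper))
print(f('  this is a test  '))   # 'THIS IS A TEST'

# It is still an ordinary tuple, so iteration and slicing keep working:
print(list(f))   # [<method 'strip' of 'str' objects>, <method 'upper' of 'str' objects>]
```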
On Sun, May 10, 2015 at 07:25:31PM +1000, Chris Angelico wrote:
strings = [ "String one", "String two is a bit longer" "String three", "String four" ]
How many strings are there in my list? Clearly the programmer's intention is to have four, but that's not what ends up happening. (Now imagine there are actually hundreds of strings, and one gets selected at random every time you do something.
If you are embedding hundreds of strings in the source, instead of reading them from a file, you deserve whatever horribleness you get :-)
Have fun figuring out why, just occasionally, it prints out two messages instead of one.
That would actually be pretty easy to solve. When you get the unexpected "String two is a bit longerString three" message, just grep through the file for the first few words, and lo and behold, you are missing a comma. But your point about syntactically meaningful whitespace is otherwise a good one. Python doesn't give whitespace in expressions any particular meaning, except as a separator. I'd be very dubious about making function composition an exception.
So I would strongly suggest having some sort of operator in between. Okay. Can I just say something crazy? (Hans: I love crazy!) How about using a comma?
(fn1, fn2, fn3, ...)('string to be piped')
Currently, this produces a runtime TypeError: 'tuple' object is not callable, but I could easily define my own callable subclass of tuple.
There's lots of code that assumes that a tuple of functions is a sequence:

for f in (len, str, ord, chr, repr):
    test(f)

so we would need to keep that. But we don't want a composed function to be a sequence, any more than we want a partial or a regular function to be sequences. If I pass you a Composed object, and you try slicing it, that should be an error.

-- Steve
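(A minimal sketch of the distinction being drawn here; only the name `Composed` comes from the post, the body is made up: callable, but deliberately not a sequence, so indexing and slicing fail.)

```
class Composed:
    def __init__(self, *fns):
        self._fns = fns
    def __call__(self, arg):
        for fn in self._fns:
            arg = fn(arg)
        return arg

f = Composed(str.strip, str.upper)
print(f('  hi  '))   # 'HI'
# f[0] or f[1:] raise TypeError: 'Composed' object is not subscriptable
```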
On Sun, May 10, 2015 at 7:57 PM, Steven D'Aprano <steve@pearwood.info> wrote:
So I would strongly suggest having some sort of operator in between. Okay. Can I just say something crazy? (Hans: I love crazy!) How about using a comma?
(fn1, fn2, fn3, ...)('string to be piped')
Currently, this produces a runtime TypeError: 'tuple' object is not callable, but I could easily define my own callable subclass of tuple.
There's lots of code that assumes that a tuple of functions is a sequence:
for f in (len, str, ord, chr, repr):
    test(f)
so we would need to keep that. But we don't want a composed function to be a sequence, any more than we want a partial or a regular function to be sequences. If I pass you a Composed object, and you try slicing it, that should be an error.
Well, I told you it was crazy :) But the significance here is that there would be no Composed object, just a tuple. You could slice it, iterate over it, etc; and if you call it, it calls each of its arguments. I'm not sure that it's a fundamental problem for a composed function to be sliceable, any more than it's a problem for any other available operation that you aren't using. Tuples already have several related uses (they can be used as "record" types, or as frozen lists for hashability, etc), and this would simply mean that a tuple of callables is callable. ChrisA
On 05/10/2015 05:57 AM, Steven D'Aprano wrote:
There's lots of code that assumes that a tuple of functions is a sequence:
for f in (len, str, ord, chr, repr):
    test(f)
so we would need to keep that. But we don't want a composed function to be a sequence, any more than we want a partial or a regular function to be sequences. If I pass you a Composed object, and you try slicing it, that should be an error.
It seems to me a linked list of composed objects works (rather than a sequence). It's easier to understand what is going on in it.

from functools import partial
from operator import *
from statistics import mean

def root(x):
    return x ** .5

def square(x):
    return x ** 2

class CF:
    def __init__(self, f, *rest):
        if isinstance(f, tuple):
            self.f = partial(*f)
        else:
            self.f = f
        if rest:
            self.child = CF(*rest)
        else:
            self.child = None

    def __call__(self, data):
        if self.child == None:
            return self.f(data)
        return self.f(self.child(data))

    def __repr__(self):
        if self.child != None:
            s = repr(self.child)
        else:
            s = "CS()"
        return s[:3] + ("%s, " % repr(self.f)) + s[3:]

CF(print, root, mean, (map, square)) ([4, 9, 16])

Prints: 10.847426730181986
participants (6)

- Andrew Barnert
- Chris Angelico
- Douglas La Rocca
- Koos Zevenhoven
- Ron Adam
- Steven D'Aprano