[Python-ideas] Elixir inspired pipe to apply a series of functions

Mon Jun 17 11:39:08 CEST 2013

On Sat, Jun 15, 2013 at 12:00 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
> From: Jan Wrobel <wrr at mixedbit.org>
>
> Sent: Thursday, June 13, 2013 11:06 AM
>
>
>> I've recently stumbled upon a blog post by Joe Armstrong (of Erlang)
>> that praises the Elixir pipe operator:
>>
>> http://joearms.github.io/2013/05/31/a-week-with-elixir.html
>>
>> The operator makes it possible to nicely structure code that applies a
>> series of functions to transform an input value into some output.
>>
>> I often end up writing code like:
>>
>> pkcs7_unpad(
>>   reduce(lambda result, block: result.append(block),
>>     map(decrypt_block,
>>       pairwise([iv] + secret_blocks))))
>>
>> Which is dense, and needs to be read backwards (last operation is
>> written first), but as Joe notes, the alternative is also not very
>> compelling:
>>
>>   decrypted_blocks = map(decrypt_block, pairwise([iv] + secret_blocks))
>>   combined_blocks = reduce(lambda result, block: result.append(block), decrypted_blocks)
>>   return pkcs7_unpad(combined_blocks)
>
> I don't see why some people think naming intermediate results makes things less readable. But, if you do, you can always give them short throwaway names like _ or x.
>
> Also, if you're concerned with readability, throwing in unnecessary lambdas doesn't exactly help. If you know the type of result, just use the unbound method; if you need it to be generic, you probably need it more than once, so write a named appender function. Also, it's very weird (and definitely not in the functional spirit you're going for) to call reduce on a function that mutates an argument and returns None, and I can't figure out what exactly you're trying to accomplish, but I'll ignore that.

This was just an example to illustrate a pattern ('result' is not a
list but an object of a custom class, for which append returns 'this'
to allow chaining, but this is not important).

>
> So:
>
> _ = map(decrypt_block, pairwise([iv] + secret_blocks))
> _ = reduce(list.append, _)
> return pkcs7_unpad(_)
>
> Is that really hard to understand?
>
> If you just want everything to be an expression… well, that's silly (the "return" shows that this is clearly already a function, and the function call will already be an expression no matter how you implement the internals)—but, more importantly, you're using the wrong language. Many of Python's readability strengths derive from the expression-statement divide and the corresponding clean statement syntax; if you spend all your time fighting that, you might be happier using a language that doesn't fight back.

I came to Python from C/C++, so initially my Python code was very
C-like. Gradually, I've learned to use more functional constructs,
which sometimes require forcing and fighting, but most of the time the
outcome is positive. For anyone with a C background, writing two nested
loops is often the most natural and fastest solution. It is often
a pain to break such code into separate steps/functions and combine
the results with some higher-order function, but it is worth the trouble.
I don't have the impression that Python fights back or promotes an
imperative style.

> But Python does actually have a way to write things like this in terms of expressions. Just use a comprehension or generator expression instead of calling map and friends. When you're mapping a pre-existing function over an iterator with no filtering or anything else going on, map is great; when you want to map an expression that's hard to describe as a function, use a comprehension. (And when you want to iterate mutating code, don't use either.) That's the same rule of thumb people use in Haskell, so it would be hard to argue that it's not "functional" enough.
>
> Meanwhile, most of what you want is just a reverse-compose operator and a partial operator, so you can write in reverse point-free style. Let's see it without operators first:
>
>     from functools import partial, reduce, wraps
>
>     def compose(f1, f2):
>         @wraps(f1)
>         def composed(arg):
>             return f1(f2(arg))
>         return composed
>
>     def rcompose(f1, f2):
>         return compose(f2, f1)
>
>     def rapply(arg, f):
>         return f(arg)
>
>     return rapply([iv] + secret_blocks,
>                    rcompose(partial(map, decrypt_block),
>                              rcompose(partial(reduce, list.append),
>                                        pkcs7_unpad)))
>
> Now call compose, rcompose, partial, and rapply, say, FunctionType.__lshift__, __rshift__, __getitem__, and __rmod__:
>
>     return ([iv] + secret_blocks) % (map[decrypt_block] >> reduce[list.append] >> pkcs7_unpad)
>
> This looks nothing at all like Python, and it's far less readable than the three-liner version. It saves a grand total of 12/103 keystrokes. And of course it can't be implemented without significant changes to the function and builtin-function implementations.

If altering FunctionType were possible, probably the best option would
be to just define FunctionType.__rrshift__ and use partial explicitly:

return (([iv] + secret_blocks) >> partial(map, decrypt_block) >>
        partial(reduce, list.append) >> pkcs7_unpad)
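
Since FunctionType cannot be altered from Python code, the same spelling
can only be approximated with a thin wrapper. A minimal sketch, assuming a
hypothetical `step` subclass of functools.partial (not part of the
standard library) whose __rrshift__ applies the partial to the value on
its left; `double` is a stand-in for something like decrypt_block:

```python
from functools import partial


class step(partial):
    """Hypothetical helper: a partial that can sit on the right of `>>`."""

    def __rrshift__(self, value):
        # Python calls this for `value >> step(...)` when `value`
        # does not handle `>>` itself.
        return self(value)


def double(x):
    # Stand-in for a per-block transformation such as decrypt_block.
    return 2 * x


# Each stage receives the result of the previous one as its last argument.
result = [1, 2, 3] >> step(map, double) >> step(list) >> step(sum)
```

A left operand that defines its own `>>` for arbitrary objects would take
precedence over the wrapper, which is one reason this stays a sketch.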

>> I'm not sure introducing pipes like this at the Python level would be
>
>> a good idea. Is there already a library level support for such
>> constructs?
>
> partial is in functools. compose is not, because it was considered so trivial that it wasn't worth adding ("anyone who wants this can build it faster than he can look it up"). rcompose is just as trivial. And a reverse-apply wrapper is almost as simple.
>
>> If not, what would be a good way to express them? I've
>> experimented a bit and came up with the following API
>> (https://gist.github.com/wrr/5775808):
>>
>>        Pipe([iv] + secret_blocks)\
>>         (pairwise)\
>>         (map, decrypt_block)\
>>         (reduce, lambda result, block: result.append(block))\
>>         (pkcs7_unpad)\
>>         ()
>
> The biggest problem here is that the model isn't clear without thinking about it. If you're going to use classes, think about it in OO terms: what object in your mental model does a Pipe represent? It's sort of an applicator with partial currying. Is there a simpler model that you could use? Sure: functions.

But a purely function-based API (without language-level support)
requires nesting. As in your example:

rapply([iv] + secret_blocks,
     rcompose(partial(map, decrypt_block),
         rcompose(partial(reduce, list.append),
                         pkcs7_unpad)))

The order is right, but the last operation is nested in all previous
operations. A Unix-like pipe:

... | decrypt_blocks | pkcs7_unpad

has a clean, flat structure. Each operation takes input from a single
previous operation, so it shouldn't be nested in all previous
operations.
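
Short of an operator, a flat, left-to-right structure can be approximated
with a variadic helper. A minimal sketch; `pipeline` is a hypothetical
name, not a standard-library function, and the stages are toy stand-ins
for the decryption steps:

```python
from functools import partial, reduce


def pipeline(value, *funcs):
    """Hypothetical helper: feed `value` through `funcs` from left to right."""
    return reduce(lambda acc, func: func(acc), funcs, value)


total = pipeline(
    ["a", "bb", "ccc"],
    partial(map, len),                 # length of each item
    partial(filter, lambda n: n > 1),  # keep lengths above 1
    sum,                               # add up what is left
)
```

Each stage is listed at the same nesting level, but partial still has to
be spelled out for every stage that takes extra arguments.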

> In a functional language, I think people would either write this in normal point-free style:
>
>     map decrypt_block . reduce append . pkcs7_unpad $ [iv] + secret_blocks
>
>
> … or as an explicit chain of reverse-applies:
>
>     [iv] + secret_blocks :- pkcs7_unpad :- (reduce append) :- (map decrypt_block)

I was actually trying to achieve this. Calling a Pipe with foo applies
foo to the pipe's current result and returns `this`; it doesn't first
compose all the functions and then apply them to an input value.
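
The eager behaviour described here can be sketched as follows. This is a
minimal illustration, assuming that extra arguments are passed before the
piped value and that an empty call ends the chain; it is not the code
from the gist:

```python
class Pipe:
    """Sketch of an eagerly applied pipe (an assumption, not the gist's code)."""

    def __init__(self, value):
        self.value = value

    def __call__(self, func=None, *args):
        if func is None:
            # An empty call ends the chain and returns the current result.
            return self.value
        # Extra arguments come first and the piped value last,
        # so (map, f) means map(f, value).
        self.value = func(*args, self.value)
        return self


lengths = Pipe(["a", "bb", "ccc"])(map, len)(list)()
```

Every call runs its function immediately, so nothing is composed up front
and the intermediate result is always held in the Pipe object.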

> … rather than the reverse point-free you're going for:
>
>     import Control.Arrow
>     [iv] + secret_blocks :- (pkcs7_unpad >>> (reduce append) >>> (map decrypt_block))
>
>
> And part of the reason for that is that the normal point-free version is blatantly obviously just defining a normal function and then calling it. In fact, the language—whether Haskell or Python—can even see that at the syntactic level. Instead of this (sorry for the hybrid syntax):
>
>     def decrypt(secret_blocks):
>         return map decrypt_block . reduce append . pkcs7_unpad $ [iv] + secret_blocks
>
> You can just do this:
>
>     decrypt = map decrypt_block . reduce append . pkcs7_unpad
>
> Also, the way you're hiding partialization makes it unclear what's going on at first read. Normally, people don't think of (map, decrypt_block) as meaning to call map with decrypt_block. That makes sense in Lisp (where that's what function calling already looks like) or in Haskell (where currying means partialization is always implicit), but not so much in Python, where it looks completely different from calling map with decrypt_block.

This is true. Elixir's syntax lets you write [iv] + secret_blocks |>
pairwise() |> map(decrypt_block), which would also be natural for
Python, but is not doable at the library level.

> Second, your code is significantly longer than the obvious Pythonic three-liner—even after replacing your unnecessary lambda, it's twice as many lines, more extraneous symbols, and more keystrokes.
>
> And it's clearly going to be harder to debug. If something goes wrong anywhere in the chain, it's going to be hard to tell where. Compare the traceback you'd get through a chain of Pipe.__call__ methods to what you'd get in the explicitly-sequenced version, where it goes right to the single-line statement where something went wrong.
>
> It also just looks ugly—backslash continuations, what look like (but aren't) unnecessary parens, etc.
>
>> The API is more verbose than the language level operator. I initially
>> tried to overload `>>`, but it doesn't allow for additional arguments.
>
> If you got rid of the implicit partials, you could use it:
>
> (Pipe([iv] + secret_blocks) >>
>      pairwise >>
>      partial(map, decrypt_block) >>
>      partial(reduce, list.append) >>
>      pkcs7_unpad)()
>
> It's a lot less ugly this way. But I definitely wouldn't use it.
>
> And if you used it, and I had to read your code, I'd have to either reason it through, or translate it in my head to Haskell (where I could reason it through more quickly and figure out what you're really up to), rather than just reading it and understanding it.

Thank you for the very informative response. I was not aware of
partial(), which is very useful.

Readability is a tricky concept, because familiar constructs will
always be more readable than anything new. I must agree, though, that
for an established language like Python, introducing a new construct
for such a basic thing could be confusing, and the overall effect on
readability could be negative.

I still like the pipe operator (the Elixir version, not the results of
our attempts to reproduce it on top of the existing Python API), but it
is something that needs to be introduced to a language early, so that
it is familiar to everyone.

Thanks,
Jan