[Python-ideas] Infix functions

Andrew Barnert abarnert at yahoo.com
Fri Feb 21 23:05:04 CET 2014


While we're discussing crazy ideas inspired by a combination of a long-abandoned PEP and Haskell idioms (see the implicit lambda thread), here's another: arbitrary infix operators:

    a `foo` b == foo(a, b)

I'm not sure there's any use for this, I just have a nagging feeling there _might_ be, based on thinking about how Haskell uses them to avoid the need for special syntax in a lot of cases where Python can't.
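Since Python lacks the syntax, the effect can be approximated today with the well-known operator-overloading recipe. This is only a sketch: it piggybacks on `|`, so it inherits `|`'s precedence rather than getting its own, and the `cross` function here is a hypothetical stand-in:

```python
class Infix:
    """Wrap a two-argument function so that `a |fn| b` calls fn(a, b).

    Works because `a |fn| b` parses as `(a | fn) | b`: the first `|`
    triggers __ror__ (the left operand doesn't know how to | an Infix),
    the second triggers __or__ on the partially-applied result.
    """
    def __init__(self, fn):
        self.fn = fn

    def __ror__(self, left):            # handles `a | fn`
        return Infix(lambda right: self.fn(left, right))

    def __or__(self, right):            # handles `... | b`
        return self.fn(right)

# Hypothetical cross product on 3-tuples, to exercise the wrapper.
cross = Infix(lambda a, b: (a[1] * b[2] - a[2] * b[1],
                            a[2] * b[0] - a[0] * b[2],
                            a[0] * b[1] - a[1] * b[0]))

result = (1, 0, 0) |cross| (0, 1, 0)   # → (0, 0, 1)
```

The borrowed precedence is exactly the weakness the real proposal would fix: `a |cross| b + c` groups as `a |cross| (b + c)`, which is rarely what you want.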

This isn't a new idea; it came up a lot in the early days of Numeric. PEP 225 (http://legacy.python.org/dev/peps/pep-0225/) has a side discussion on "Impact on named operators" that starts off with:

    The discussions made it generally clear that infix operators is a
    scarce resource in Python, not only in numerical computation, but
    in other fields as well.  Several proposals and ideas were put
    forward that would allow infix operators be introduced in ways
    similar to named functions.  We show here that the current
    extension does not negatively impact on future extensions in this
    regard.

The future extension was never written as a PEP because 225 and its competitors were all deferred/abandoned. Also, most of the anticipated use cases for it back then were solved in other ways. The question is whether there are _other_ use cases that make the idea worth reviving.

The preferred syntax at that time was @opname. There are other alternatives in that PEP, but they all look a lot worse. Nobody proposed backticks (`opname`) because in 2.x they meant repr(opname), but in 3.x that isn't a problem, so I'm going to use that instead, because…

In Haskell, you can turn any prefix function into an infix operator by enclosing it in backticks, and any infix operator into a prefix function by enclosing it in parentheses. (Ignore the second half of that: Python has a small, fixed set of operators, and they all have short, readable names in the operator module.) And in both the exception-expression discussion and the while-clause discussion, I noticed that this feature is essential to how Haskell deals with both of those features without requiring lambdas all over the place.

The Numeric community wanted this as a way of defining new mathematical operators. For example, there are three different ways you can "multiply" two vectors—element-wise, dot-product, or cross-product—and you can't spell all of them as a * b. So, how do you spell the others? There were proposals to add a new @ operator, or to double the set of operators by adding a ~-prefixed version of each, or to allow custom operators made up of any string of symbols (which Haskell allows), but none of those are remotely plausible extensions to Python. (There's a reason those PEPs were all deferred/abandoned.) However, you could solve the problem easily with infix functions:

    m `cross` n
    m `dot` n

In Haskell, it's used for all kinds of things beyond that, from type constructors:

    a `Pair` b
    a `Tree` (b `Tree` c `Tree` d) `Tree` e

… to higher-order functions. The motivating example here is that exception handling is done with the catch function, and instead of this:

    catch func (\e -> 0)

… you can write:

    func `catch` \e -> 0

Or, in Python terms, instead of this:

    catch(func, lambda e: 0)

… it's:

    func `catch` lambda e: 0

… which isn't miles away from:


    func() except Exception as e: 0

… and that's (part of) why Haskell doesn't have or need custom exception expression syntax.
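The catch function itself is easy to write in today's Python; here's a minimal sketch of such a helper (hypothetical, not stdlib — the Haskell original dispatches on exception types, which this elides):

```python
def catch(thunk, handler):
    """Call a zero-argument callable; on exception, call handler(e) instead.

    A hypothetical helper mirroring Haskell's `catch`: the infix form
    `func `catch` handler` would compile to catch(func, handler).
    """
    try:
        return thunk()
    except Exception as e:
        return handler(e)

catch(lambda: 1 // 0, lambda e: 0)   # → 0
catch(lambda: 42, lambda e: 0)       # → 42
```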

PEP 225 assumed that infix functions would be defined in terms of special methods. The PEP implicitly assumed they were going to convince Guido to rename m.__add__(n) to m."+"(n), so m @cross n would obviously be m."@cross"(n). But failing that, there are other obvious possibilities, like m.__@cross__(n), m.__cross__(n), m.__infix__('cross')(n), etc.

But really, there's no reason for special methods at all—especially with PEP 443 generic functions. Instead, it could just mean this:


    cross(m, n)

… just as in Haskell.
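Concretely, a PEP 443 single-dispatch function could provide the cross that the infix form would call; this is a sketch, with the tuple-as-3-vector registration purely hypothetical:

```python
from functools import singledispatch

@singledispatch
def cross(a, b):
    # Generic fallback: no implementation registered for this type.
    raise TypeError("no cross product defined for %s" % type(a).__name__)

@cross.register(tuple)
def _(a, b):
    # Hypothetical case: 3-vectors represented as plain tuples.
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
```

Third parties could then register their own vector types with `cross.register` without touching the function, which is exactly the extensibility that special methods would otherwise provide.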

In fact, in cases where infix functions map to methods, there's really no reason not to just _write_ them as methods. That's how NumPy solves the vector-multiplication problem; the dot product of m and n is just:

    m.dot(n)

(The other meanings are disambiguated in a different way: on NumPy arrays, m*n is element-wise multiplication, and the cross product gets its own spelling.)

But this can still get ugly for long expressions. Compare:

    a `cross` b + c `cross` (d `dot` e)
    a.cross(b).add(c.cross(d.dot(e)))
    add(cross(a, b), cross(c, dot(d, e)))

The difference between the first and second isn't as stark as between the second and third, but it's still pretty clear.
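To make the method spelling concrete, here's a toy 3-vector class (hypothetical, standing in for NumPy's arrays) with method-style dot and cross:

```python
class Vec:
    """Minimal 3-vector, for illustration only."""
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def dot(self, o):
        return self.x * o.x + self.y * o.y + self.z * o.z

    def cross(self, o):
        return Vec(self.y * o.z - self.z * o.y,
                   self.z * o.x - self.x * o.z,
                   self.x * o.y - self.y * o.x)

Vec(1, 2, 3).dot(Vec(4, 5, 6))   # → 32
```

Note that the chained spelling only works while every intermediate result is itself a Vec; once dot returns a scalar, the chain breaks, which is part of why long method chains read worse than infix operators.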

And consider the catch example again:

    func.catch(lambda e: 0)

Unless catch is a method on all callables, this makes no sense, which means method syntax isn't exactly extensible.

There are obviously lots of questions raised. The biggest one is: are there actually real-life use cases (especially given that NumPy has for the most part satisfactorily solved this problem for most numeric Python users)?

Beyond that:

What can go inside the backticks? In Haskell, it's an identifier, but the Haskell wiki (http://www.haskell.org/haskellwiki/Infix_expressions) notes that "In ABC the stuff between backquotes is not limited to an identifier, but any expression may occur here" (presumably not Python's ancestor ABC, which I'm pretty sure used backticks for repr, but some other language with the same name), and goes on to show how you can build that in Haskell if you really want to. I think that's an even worse idea for Python than for Haskell. Maybe attribute references would be OK, but anything beyond that, even subscription (to get functions out of a table), looks terrible:

    a `self.foo` b
    a `funcs['foo']` b

Python 2.x's repr backticks allowed spaces inside the ticks. For operator syntax this would look terrible… but it does make parsing easier, and there's no reason to actually _ban_ it, just strongly discourage it.

What should the precedence and associativity be? In Haskell, it's customizable—which is impossible in Python, where functions are defined at runtime but expressions are parsed at compile time—but defaults to left-associative and highest-precedence. In Python, I think it would be more readable coming between comparisons and bitwise ops. In grammar terms:

    infix_expr ::= or_expr | or_expr "`" identifier "`" infix_expr
    comparison ::= infix_expr ( comparison_operator infix_expr ) *

That grammar could easily be directly evaluated into a Call node in the AST, or it could have a different node (mainly because I'm curious whether MacroPy could do something interesting with it…), like:

    InfixCall(left=Num(n=1), op=Name(id='foo', ctx=Load()), right=Num(n=2))

Either way, it ultimately just compiles to the normal function-call bytecode, so there's no need for any change to the interpreter. (That does mean that "1 `abs` 2" would raise the normal "TypeError: abs() takes exactly one argument (2 given)" instead of a more specific "TypeError: abs cannot be used as an infix operator", but I don't think that's a problem.)
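The proposed desugaring can be demonstrated today by building the Call node by hand with the ast module (a sketch; modern ast spells literals as Constant rather than the Num used above):

```python
import ast

# Build the AST that `1 `foo` 2` would compile to: foo(1, 2).
call = ast.Call(func=ast.Name(id='foo', ctx=ast.Load()),
                args=[ast.Constant(1), ast.Constant(2)],
                keywords=[])
tree = ast.fix_missing_locations(ast.Expression(body=call))
code = compile(tree, '<infix>', 'eval')

# Evaluate with a hypothetical foo bound in the namespace.
result = eval(code, {'foo': lambda a, b: (a, b)})   # → (1, 2)
```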

This theoretically could be expanded from operators to augmented assignment… but it shouldn't be; anyone who wants to write this needs to be kicked in the head:

    a `func`= b
