[Python-ideas] Pattern Matching Syntax (reprise)

Tobias Kohn kohnt at tobiaskohn.ch
Tue Sep 18 07:29:35 EDT 2018


Hello Everyone,

Please excuse my being late for properly responding to the last thread  
on "Pattern Matching Syntax" [1].  As Robert Roskam has already  
pointed out at the beginning of that thread, there has been much  
previous discussion about adding pattern matching to Python, and  
several proposals exist.  It is therefore not my intention to propose  
yet another syntax choice for pattern matching, but more to share my  
experience with implementing it, in the hope of making a worthwhile  
contribution to the overall discussion.

This summer, I basically ended up requiring pattern matching in Python  
for a research project I am working on.  Some initial hacks have then  
grown into a library for pattern matching in Python [2].  On the one  
hand, my design is certainly heavily influence by Scala, with which I  
also work on a regular basis.  On the other hand, I ran into various  
difficulties, challanges, and it has been particularly important to me  
to find a design that blends well with Python, and harnesses what  
Python already offers.

I have written down my experience in the form of a discussion on  
several options concerning syntax [3], and implementation [4],  
respectively.  As the articles have turned out longer than I  
originally intended, it might take up too much time for those who have  
little interest in this matter in the first place.  However,  
considering that the subject of pattern matching has been coming up  
rather regularly, my experience might help to contribute something to  
the discussion.  Let me provide a brief summary here:

1. PATTERN MATCHING IS NOT SWITCH/CASE
--------------------------------------
When I am talking about pattern matching, my goal is to do a deep  
structural comparison, and extract information from an object.   
Consider, for instance, the problem of optimising the AST of a Python  
program, and eliminate patterns of the form `x + 0`, and `x - 0`.   
What pattern matching should be offering here is a kind of comparison  
along the lines of:
`if node == BinOp(?, (Add() or Sub()), Num(0)): ...`
Ideally, it should also return the value of what is marked by the  
question mark `?` here, and assign it to a variable `left`, say.  The  
above comparison is often written as something like, e. g.:
`case BinOp(left, Add()|Sub(), Num(0)): ...`
This use of `case`, however, is not the same as a switch-statement.

2. ORTHOGONALITY
----------------
Getting the syntax and semantics of nested blocks right is hard.   
Every block/suite in Python allows any kind of statements to occur,  
which allows for things like nested functions definitions, or having  
more than just methods in a class.  If we use a two-levelled  
block-structure, we run into the problem of finding good semantics for  
what the following means (note the variable `x` here):
```
match node:
    x = 0
    case BinOp(left, Add(), Num(0)):
        ...
    x += 1
    case BinOp(left, Mul(), Num(1)):
        ...
```
In the case of a "switch block", such additional statements like the  
`x=0`, and `x+=1` can become quite problematic.  On the other hand,  
requiring all statements inside the block to be case-statements  
violates the orthogonality found otherwise in Python.

I feel that this dilemma is one of the core issues why the syntax of  
switch statements, or pattern matching seems so exceptionally hard.   
In the end, it might therefore, indeed, make more sense to find a  
structure that is more in line with if/elif/else-chains.  This would  
lead to a form of pattern matching with little support for  
switch-statement, though.

3. IMPLEMENTATION
-----------------
For the implementation of pattern matching, my package compiles the  
patterns into context-manager-classes, adds these classes in the  
background to the code, and then uses `with`-statements to express the  
`case`-statement.  If have found a neat way to make the execution of a  
`with`-statement's body conditional.

Two things have been curcially important in the overall design: first,  
the "recompilation" cannot change the structure of the original code,  
or add any line.  This way, all error messages, and tooling should  
work as expected; the invasive procedure should be as minimal as  
possible.  Second, it is paramount that the semantics of how Python  
works is being preserved.  Even though the actual matching of patterns  
is done in external classes, all names must be resolved "locally"  
where the original `case`-statement lives.  Similarly, the variables  
defined by the pattern are to local variables, which are assigned if,  
and only if, the pattern actually matches.

Should pattern matching ever be added to Python properly, then there  
will be no need to use `with`-statements and context managers, of  
course.  But the implementation must make sure, nonetheless, that the  
usual semantics with name resolving of Python is being respected.

4. SYNTAX OF PATTERNS
---------------------
The syntax of patterns themselves has been guided by two principles  
(again).  First, if there is a way to already express the same thing  
in Python, use that.  This applies, in particular, to sequence  
unpacking.  Second, patterns specify a possible way how the object in  
question could have been created, or constructed in the first place.   
Hence, it is no accident that the pattern `BinOp(left, Add(), Num(0))`  
above looks almost like a constructor.

Other implementation are happy to redefine the meanings of operators  
like `in`, or `is` (as in, e. g., `case x is str:`).  While this might  
look very convenient at first, it will most probably lead to many  
subsequent bugs where people write `if x is str:` to mean the same  
thing (namely test the type of `x`).  Even though expressing patterns  
requires reusing some operators in a new way, we have to extremely  
careful to choose widely, and minimise confusion, and surprises.

Pattern matching could certainly make a great addition to Python, and  
various current implementations act as proof of concepts.  However,  
choosing an appropriate syntax for pattern matching is hard, and we  
should work hard to make sure that any such addition feels natural in  
Python, even at the expense of having to write more, and being not as  
terse as other languages.

I hope my thoughts in this matter can make a worthwhile constribution  
to the discussion.  And I would like to emphasise once more, that my  
goal is not to propose a new syntax for pattern matching, but to  
report on my experience while implementing it.

Kind regards,
Tobias Kohn

[1] https://groups.google.com/d/topic/python-ideas/nqW2_-kKrNg/discussion
[2] https://github.com/Tobias-Kohn/pyPMatch
[3]  
https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python/
[4] https://tobiaskohn.ch/index.php/2018/09/12/implementing-pattern-matching/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180918/af6ea3b2/attachment.html>


More information about the Python-ideas mailing list