[Python-ideas] Pattern Matching Syntax (reprise)
Tobias Kohn
kohnt at tobiaskohn.ch
Tue Sep 18 07:29:35 EDT 2018
Hello Everyone,
Please excuse my being late for properly responding to the last thread
on "Pattern Matching Syntax" [1]. As Robert Roskam has already
pointed out at the beginning of that thread, there has been much
previous discussion about adding pattern matching to Python, and
several proposals exist. It is therefore not my intention to propose
yet another syntax choice for pattern matching, but more to share my
experience with implementing it, in the hope of making a worthwhile
contribution to the overall discussion.
This summer, I basically ended up requiring pattern matching in Python
for a research project I am working on. Some initial hacks have then
grown into a library for pattern matching in Python [2]. On the one
hand, my design is certainly heavily influence by Scala, with which I
also work on a regular basis. On the other hand, I ran into various
difficulties, challanges, and it has been particularly important to me
to find a design that blends well with Python, and harnesses what
Python already offers.
I have written down my experience in the form of a discussion on
several options concerning syntax [3], and implementation [4],
respectively. As the articles have turned out longer than I
originally intended, it might take up too much time for those who have
little interest in this matter in the first place. However,
considering that the subject of pattern matching has been coming up
rather regularly, my experience might help to contribute something to
the discussion. Let me provide a brief summary here:
1. PATTERN MATCHING IS NOT SWITCH/CASE
--------------------------------------
When I am talking about pattern matching, my goal is to do a deep
structural comparison, and extract information from an object.
Consider, for instance, the problem of optimising the AST of a Python
program, and eliminate patterns of the form `x + 0`, and `x - 0`.
What pattern matching should be offering here is a kind of comparison
along the lines of:
`if node == BinOp(?, (Add() or Sub()), Num(0)): ...`
Ideally, it should also return the value of what is marked by the
question mark `?` here, and assign it to a variable `left`, say. The
above comparison is often written as something like, e. g.:
`case BinOp(left, Add()|Sub(), Num(0)): ...`
This use of `case`, however, is not the same as a switch-statement.
2. ORTHOGONALITY
----------------
Getting the syntax and semantics of nested blocks right is hard.
Every block/suite in Python allows any kind of statements to occur,
which allows for things like nested functions definitions, or having
more than just methods in a class. If we use a two-levelled
block-structure, we run into the problem of finding good semantics for
what the following means (note the variable `x` here):
```
match node:
x = 0
case BinOp(left, Add(), Num(0)):
...
x += 1
case BinOp(left, Mul(), Num(1)):
...
```
In the case of a "switch block", such additional statements like the
`x=0`, and `x+=1` can become quite problematic. On the other hand,
requiring all statements inside the block to be case-statements
violates the orthogonality found otherwise in Python.
I feel that this dilemma is one of the core issues why the syntax of
switch statements, or pattern matching seems so exceptionally hard.
In the end, it might therefore, indeed, make more sense to find a
structure that is more in line with if/elif/else-chains. This would
lead to a form of pattern matching with little support for
switch-statement, though.
3. IMPLEMENTATION
-----------------
For the implementation of pattern matching, my package compiles the
patterns into context-manager-classes, adds these classes in the
background to the code, and then uses `with`-statements to express the
`case`-statement. If have found a neat way to make the execution of a
`with`-statement's body conditional.
Two things have been curcially important in the overall design: first,
the "recompilation" cannot change the structure of the original code,
or add any line. This way, all error messages, and tooling should
work as expected; the invasive procedure should be as minimal as
possible. Second, it is paramount that the semantics of how Python
works is being preserved. Even though the actual matching of patterns
is done in external classes, all names must be resolved "locally"
where the original `case`-statement lives. Similarly, the variables
defined by the pattern are to local variables, which are assigned if,
and only if, the pattern actually matches.
Should pattern matching ever be added to Python properly, then there
will be no need to use `with`-statements and context managers, of
course. But the implementation must make sure, nonetheless, that the
usual semantics with name resolving of Python is being respected.
4. SYNTAX OF PATTERNS
---------------------
The syntax of patterns themselves has been guided by two principles
(again). First, if there is a way to already express the same thing
in Python, use that. This applies, in particular, to sequence
unpacking. Second, patterns specify a possible way how the object in
question could have been created, or constructed in the first place.
Hence, it is no accident that the pattern `BinOp(left, Add(), Num(0))`
above looks almost like a constructor.
Other implementation are happy to redefine the meanings of operators
like `in`, or `is` (as in, e. g., `case x is str:`). While this might
look very convenient at first, it will most probably lead to many
subsequent bugs where people write `if x is str:` to mean the same
thing (namely test the type of `x`). Even though expressing patterns
requires reusing some operators in a new way, we have to extremely
careful to choose widely, and minimise confusion, and surprises.
Pattern matching could certainly make a great addition to Python, and
various current implementations act as proof of concepts. However,
choosing an appropriate syntax for pattern matching is hard, and we
should work hard to make sure that any such addition feels natural in
Python, even at the expense of having to write more, and being not as
terse as other languages.
I hope my thoughts in this matter can make a worthwhile constribution
to the discussion. And I would like to emphasise once more, that my
goal is not to propose a new syntax for pattern matching, but to
report on my experience while implementing it.
Kind regards,
Tobias Kohn
[1] https://groups.google.com/d/topic/python-ideas/nqW2_-kKrNg/discussion
[2] https://github.com/Tobias-Kohn/pyPMatch
[3]
https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python/
[4] https://tobiaskohn.ch/index.php/2018/09/12/implementing-pattern-matching/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180918/af6ea3b2/attachment.html>
More information about the Python-ideas
mailing list