With what I write below, I have aimed at a design that meets needs resembling those that PEP 622 deals with, although with some differences, especially in emphasis.
I'm not writing a full introduction here; the intended audience of this email is people somewhat familiar with PEP 622 and its discussions. This email doesn't have much structure or anything, but I hope it's sufficiently clear.
Things that this design aims to address:
The most difficult thing is perhaps to understand what names mean: what is being bound to, and what is a value defined elsewhere and so on. For example, if a (somewhat unrealistic) pattern looks like
Point3D(x=3.14, y=6, z=_)
, there are four names that refer to something: Point3D, x, y, and z. To understand this, it is useful to think of the pattern as corresponding to an expression, although it is not treated as quite the same in the end. (So, here x, y, z refer to internals/arguments of Point3D.)
The situation becomes more difficult when the values to compare with are in variables, and/or if one wishes to extract a value from the structure.
Point3D(x=pi, y=SIX, z=value)
Now there is no way to tell, from this, which names refer to existing objects and which should be bound to by the operation, except by guessing. Here, value is supposed to be a binding target.
Python already has destructuring assignment (a, b, *rest = values), which is similar to what happens in function calls:
def func(a, b, *rest):
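The parallel between destructuring assignment and argument binding can be checked in current Python. This is only a small illustration; the body of func and the sample values are made up for the purpose.

```python
values = [1, 2, 3, 4]

# Destructuring assignment: names are bound positionally, *rest takes the remainder.
a, b, *rest = values

def func(a, b, *rest):
    # Parameters are bound the same way when the sequence is unpacked into the call.
    return a, b, rest

called = func(*values)

# One difference: the starred name is a list in an assignment
# and a tuple in a function call.
print(a, b, rest)
print(called)
```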
However, it is more useful to see patterns as working backwards compared to this. For example, the semantics of the names in
lambda value: Point3D(x=pi, y=SIX, z=value)
are exactly as desired, that is, when pattern matching is interpreted as "does this object look like what this function would produce, and if so, what would be the arguments of the function in that case?". Matching the pattern would bind to value.
So, with this, a previous understanding of function definitions already gives you a mental model for how names work in patterns and for what the pattern is supposed to do.
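To make the "backwards" reading concrete, here is a minimal sketch in today's Python. The Point3D dataclass, the constant SIX, and the helper names point_pattern and match_point_pattern are all assumptions made for illustration; nothing here is proposed syntax or machinery.

```python
import math
from dataclasses import dataclass

@dataclass
class Point3D:
    x: float
    y: float
    z: float

SIX = 6  # stands in for a value defined elsewhere

def point_pattern(value):
    # Forwards: build the object that the pattern describes.
    return Point3D(x=math.pi, y=SIX, z=value)

def match_point_pattern(obj):
    # Backwards: could obj have been produced by point_pattern(value)?
    # If so, return the argument it would have been called with.
    if not isinstance(obj, Point3D):
        return None
    if obj.x != math.pi or obj.y != SIX:
        return None
    return {"value": obj.z}
```

Here match_point_pattern(point_pattern(42)) recovers the argument 42, while an object with a different x or y yields None.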
In theory, the lambda expression could BE the syntax for a pattern. However, many have wished for a different syntax even for lambdas. A slightly nicer form would be to omit the keyword lambda, but then one would still have to repeat the names to be bound. To avoid that, we need a way to explicitly mark "not-yet-bound names":
Point3D(x=pi, y=SIX, z=value?)
Before going any further, how would one invoke the matching machinery? It could be
<expression> matches <pattern>
and that would evaluate to a boolean-like value.
With this syntax, the is_tuple example from PEP 622 would look something like:
def is_tuple(node: Node) -> bool:
    if node matches Node(children=[LParen(), RParen()]):
        return True
    elif node matches Node(children=[Leaf(value="("), Node(), Leaf(value=")")]):
        return True
    return False
Or something like:
def is_tuple(node: Node) -> bool:
    if node matches tuple_pattern:
        return True
    return False
tuple_pattern would be a pattern pre-defined with a function-like
syntax. Also, if there is a
not needed at all.
Note that in PEP 622, is_tuple uses a match statement with three cases. So, in effect, the full tuple_pattern had been split into subpatterns. This was only possible because it was an OR pattern. In general, splitting a longer pattern into cases like that is not possible. Longer function-like patterns, on the other hand, can be expressed using more pattern functions – just like regular functions can use helper functions.
Here, tuple_pattern isn't passed any arguments. It also doesn't have any parameters. However, if it did, those would be considered wildcards, as I believe we'd want both the programmer and the compiler/optimizers to explicitly see, from the match expression, which names will/would be bound. If something other than wildcard behavior is desired, that should be explicitly specified in a "pattern function call".
Let's take another example pattern:
pdef main_diagonal_point(pos):
    return Point3D(pos, pos, pos)
main_diagonal_point(0) would refer to the origin, and main_diagonal_point would be true for every point with x == y == z.
point matches main_diagonal_point(pos?) should only match when x == y == z, and then bind that value to pos. However, one would expect to be able to write the same thing inline as
point matches Point3D(pos?, pos?, pos?)
, so based on that, multiple occurrences of the same binding target should ensure equality.
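The rule that repeated binding targets must agree can be sketched in plain Python. The Target class and the match_fields helper are hypothetical stand-ins for the pos? marker and the matching machinery, and Point3D is again an assumed dataclass.

```python
from dataclasses import dataclass, fields

@dataclass
class Point3D:
    x: float
    y: float
    z: float

class Target:
    """Stand-in for a not-yet-bound name such as pos?."""
    def __init__(self, name):
        self.name = name

def match_fields(obj, cls, *subpatterns):
    # Return a dict of bindings on success, or None on failure.
    if not isinstance(obj, cls):
        return None
    bindings = {}
    for field, sub in zip(fields(cls), subpatterns):
        actual = getattr(obj, field.name)
        if isinstance(sub, Target):
            # A repeated target must see an equal value every time.
            if sub.name in bindings and bindings[sub.name] != actual:
                return None
            bindings[sub.name] = actual
        elif sub != actual:
            # Anything else is compared as an existing value.
            return None
    return bindings

pos = Target("pos")
on_diagonal = match_fields(Point3D(2, 2, 2), Point3D, pos, pos, pos)
off_diagonal = match_fields(Point3D(1, 2, 2), Point3D, pos, pos, pos)
```

A successful match with no targets returns an empty dict, so callers of this sketch should test the result against None rather than rely on truthiness.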
What about wildcards? If ? were the wildcard, that would mean that
point matches main_diagonal_point(?)
should NOT mean the same as
point matches Point3D(?,?,?)
So each occurrence of ? would have to be treated as a "new" wildcard, but
when it might be passed as an argument to a sub-pattern, it will be
equivalent to effectively something like
1? and so on)
the three occurrences (in the former case) would have to be the same. This is not the only worry about using ? for wildcards. I don't find _ as a wildcard really that problematic, although it is a bit more problematic here than in PEP 622.
Another option would be to use Any as wildcard. However, that would sound like "any type of object is fine", although (hopefully) it is most often clear what the type should be, and that it is the value that can be anything. (And if the type is clear, no isinstance is necessary.)
Then there is the pattern for isinstance(obj, Class). Quite similarly to using a "pattern function" without arguments above, this could be simply Class. This means that matching with an instance of type (inside a pattern) would default to an instance check, while other objects might perhaps default to == if nothing else is specified.
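As a rough sketch of that default, assuming a hypothetical helper named matches_default (the real dispatch would presumably go through dunder methods, which are not covered here):

```python
def matches_default(obj, pattern):
    # A class used as a pattern defaults to an instance check.
    if isinstance(pattern, type):
        return isinstance(obj, pattern)
    # Any other object defaults to an equality comparison.
    return obj == pattern
```

With this, matches_default(3, int) and matches_default(3, 3) both succeed, while matches_default("3", int) does not.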
Walrus patterns? While I think people probably understood PEP 622 walrus patterns quite well, I think walrus patterns harm the understanding of patterns in general, because they break the mental model of going "backwards" in some sense. The walrus pattern in PEP 622 would be perfectly described by an AND pattern instead. This is because a binding target in PEP 622 is considered a "capture pattern". The django example in PEP 622 would then be
match value:
    case [*v, label & (Promise() | str())] if v:
        value = tuple(v)
    case _:
        label = key.replace('_', ' ').title()
However, I prefer not to think of a binding target as a "pattern", just like I don't think patterns are assignment targets. Instead, here, one might "annotate" a "not-yet-bound name" with a (sub-)pattern:
*v?, label?(Promise | str)
This would bind to label as well as check that the thing to be bound to label matches Promise | str.
The django example could be written like this:
if value matches *v?, label?(Promise | str) and v:
    value = tuple(v)
else:
    label = key.replace('_', ' ').title()
However, it would be quite possible to not add this possibility now, and instead use:
if value matches *v?, label? and v and label matches Promise | str:
    value = tuple(v)
else:
    label = key.replace('_', ' ').title()
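For comparison, the intent of that last form can be written out by hand in current Python. This is only an approximation under stated assumptions: Promise is a placeholder class standing in for the one in the Django example, split_label is a made-up wrapper, and the sequence check is limited to lists and tuples.

```python
class Promise:
    """Placeholder for the Promise class from the Django example."""

def split_label(value, key):
    # Roughly: value matches *v?, label? and v and label matches Promise | str
    if (
        isinstance(value, (list, tuple))
        and len(value) >= 2  # at least one item for v, plus the label
        and isinstance(value[-1], (Promise, str))
    ):
        *v, label = value
        return tuple(v), label
    # Otherwise derive the label from the key, as in the else branch.
    return value, key.replace('_', ' ').title()
```

The hand-written version has to spell out the length and type checks separately, which is the bookkeeping the matches expression is meant to absorb.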
I think I'll stop here for now – the internal workings and dunder methods are a whole different story.
I hope this was somewhat understandable. I know I didn't explicitly explain all the semantics – I tried to hit the main points and avoid distractions. If something was unclear, I'll be happy to answer any questions or concerns :).