Mailman 3 My thoughts on Pattern Matching. - Python-Dev

6 Nov 2020

Apologies that this is a long email, but I want to make sure I get my
points across and it's difficult to do it in a short email. I touched on
some of these things in a blogpost I wrote (
https://pyrandom.blogspot.com/2020/11/my-view-of-python-steering-council.htm...)
but I wanted to make the main points in detail in a way that made it easy
for people to provide feedback if they wanted, so it's an email instead of
a blog post.

We haven't broadly announced it, but the current SC has decided not to make
the final decision
<https://github.com/python/steering-council/issues/39#issuecomment-720677712>
on Structural Pattern Matching (PEPs 634/635/636) and the
additional/competing PEPs (640 and 642), because it's so close to the
election of the next SC. Instead, we're going to make our decision and then
leave it as a strong recommendation for the 2021 SC, as they will actually
be in charge for the release of the next version of Python. The 2021 SC
will then have to make the final decision. I understand this may be
disappointing to some, and I'm sorry, but even after the SC election
there's more than five months before the final decision has to be made for
the feature to make it into Python 3.10. I hope the next SC doesn't take
that long, but there's no rush to get it in before they are seated.

I also imagine this means the Structural Pattern Matching proposal may be a
voting issue for some, and since I will be running for the next SC, I want
to make it clear where I stand and why.

Note that this is *my* point of view, not the SC's. I don't think any of
the SC have made up their mind yet, and we all approach this differently.
We're continuing the discussion between us to come to our
decision-cum-recommendation. I would also like to hear (in private or
public, I don't mind) whether people even remotely agree with me, because
my mind can still be changed on some of these issues -- especially if it
turns out I'm alone in my concerns.

First of all, I am honestly excited about pattern matching. The impact on
existing code or the examples in the PEPs may not feel impressive, but I
can see how it would change how we think about certain problems and how we
design APIs around them, much as decorators and the `with` statement have
in the past. Like both of those changes, the pattern matching
design gives us convenient inversion of control. I don't usually write code
using isinstance(), but I realise switching on types is something people
want to do and sometimes might even be the best thing to do. The Structural
Pattern Matching proposal solves this by allowing the types to decide what
they match against in a way that's easier to use and *much* more
purpose-driven than the existing hammers of `__instancecheck__` and
`__subclasscheck__`.

My concerns about the proposal all stem from the differences with the rest
of the language. Some are very minor -- for example, I wish there was a
better way to deal with the indentation of 'match' and 'case', but all of
the proposed alternatives are clearly worse to me, so the double
indentation is fine. I'm satisfied all my minor concerns have been
addressed in the Structural Pattern Matching proposal, one way or another.

I also have some small but significant concerns that keep nagging at me,
but I'm not sure how much of a big deal they will be in practice. I don't
know whether these will be a problem, but I'm worried they _might turn out
to be_, and that in hindsight we should've made a different choice.
Ideally, we make decisions that allow us to correct the mistake (like
removing u'' strings in Python 3) rather than make it hard to ever change
(like indexing bytes producing integers). Those concerns are (although I
may have forgotten one or two minor ones):

 - Mapping patterns ignoring extra items. In the proposal, unpacking "into"
a dict automatically ignores extra items. There is syntax readily available
(and explicitly disallowed) for ignoring extra items. There are also ways
to check for extra keys, like a guard on the case, but if the programmer
doesn't think about adding it, extra items are just silently ignored. I'm
just not sure that the don't-care-about-extra-items use-case is common
enough to warrant silently doing the wrong thing, especially since it's so
very easy to write the code for that use-case.

 - The mixture of assignment, evaluation and matching in case clauses. I
keep worrying that it will be difficult to see what's being assigned to,
what's being called, and what's being matched against. I think the current
proposal has enough safe-guards to make it hard to *write* the wrong thing,
but readability still counts. The competing proposals for solving this
issue do not seem like improvements to me, however, for reasons I'll get to
below.

 - The use of `|` instead of `or`, which falls in the same boat as the
previous point: to my eyes `or` makes it much easier to parse the more
complex cases in the examples. For the simple case of 'case 0|1:' versus
'case 0 or 1:' it's not as big a deal, but it's not like `|` is somehow
better, either.

My primary hesitation, however, is much wider than those issues. It's the
incongruity between pattern matching and the rest of the language. And it's
not just because it makes it harder to teach, or harder to read or
maintain, or requires some kind of mental mode switch as you go through the
code -- I *am* worried about those things, but to me they are not the main
problem. The primary reason I care about the integration with the rest of
Python is that a lack of it limits the future expansion of the language. Pattern
matching can be (and frequently is) seen to be an expansion of iterable
unpacking -- or, depending on your point of view, iterable unpacking is a
basic, limited form of pattern matching. Making the two _unnecessarily_
different makes it harder to unify them, and makes it harder to unify
_future_ changes to either of them.

For example, if pattern matching indeed takes off and we start seeing APIs
and conventions designed around it, we may want to consider (or reconsider)
adding mapping-unpacking to iterable unpacking as well: `{'spam': spam_var,
'ham': ham_var} = ...`. Or we may want simplified typed unpacking if we
find it's very common to write short match statements, and instead write
`Viking(spam, ham) = ...`. We won't know which of those -- or of things we
can't imagine right now -- will make sense until after pattern matching is
accepted and has seen plenty of use. We've seen that with `with`, with
decorators, with generators, with generator-style coroutines, etc. And it
also goes the other way: there may be changes to other parts of Python that
we want to make work with pattern matching, even the bits of pattern
matching that aren't exactly like the rest of Python -- we may end up
expanding the assignment syntax to add some feature, and then find we want
that feature in match cases, which aren't exactly like other assignment
targets. To me, that's the big reason to make sure any new proposal lines
up well with the rest of Python.

In the Structural Pattern Matching proposal, as I see it, the main
incongruities are the big expansion of assignment (which is part and parcel
of pattern matching) with its subtle differences between stores and loads,
and the special semantics for `_`. It doesn't matter that we can live with
those differences; what matters is what's going to happen in the long run
with those differences. If we want to copy some of the semantics and syntax
from pattern matching to other parts of Python, can we? We can do that with
the expanded assignment targets to some extent, perhaps even to the fullest
sensible extent. We can't really do that with the special semantics of `_`.
It would mean some significant confusion around the use of `_` for i18n, even
if we figured out a way of handling the conflict.

This is why I proposed PEP 640. By using '?' -- or whatever else that isn't
a valid identifier -- we avoid an insurmountable problem in the future, and
by applying it to iterable unpacking we're already closing the gap. The
entire cost of this proposal is using something other than what other
languages use, which is hardly unique in Python. But failing that, I want
to see _some_ way towards closing the gap between pattern matching and
existing assignment semantics. If `_` becomes special in pattern matching,
we should have a long-term plan for making it special everywhere, or we're
left with a gap we can only ever widen, not close.

It's also why I'm not in favour of PEP 642 and other proposals for solving
some of the problems in the Structural Pattern Matching proposal (sigils,
etc): they widen the gap instead of closing it. They create more special
syntax that will be harder to apply to other parts of Python. I think we
need something that fits well with the rest of Python, not a separate
sub-language.

Like I said, I haven't made up my mind either way, and I don't want to turn
this into the single issue on which to vote _for_ me -- the SC has a lot of
other, important stuff on its plate, and you can disagree with me on any of
them! -- but I want to be entirely clear on where I stand on this issue,
lest people _regret_ voting for me because of it.

-- 
Thomas Wouters <thomas@python.org>

Hi! I'm an email virus! Think twice before sending your email to help me
spread!
