[Python-Dev] Re: PEP 622: Structural Pattern Matching

24 Jun 2020

On 23/06/2020 17:01, Guido van Rossum wrote:
> ...
> I'm happy to present a new PEP for the python-dev community to review. This
> is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.

Can I just say thanks for all this work.  I really like the concept, but 
like everyone else I have opinions on the details :-)  I didn't have 
time to read the PEP until about midnight last night, and responding 
then seemed like a bad idea, so apologies for being late to this game.
My basic problem can be summed up as: the more I read, the more it
seemed like exceptions were breeding.

Running through the basics, I'm very happy with the indentation of

   match expression:
       case pattern1:
           suite1()
       case pattern2:
           suite2()

since it fits exactly how I indent switch statements in C :-)

The patterns are where things start coming unstuck.

Literal patterns: fine, no problem, everything works as you would 
expect.  Not being able to use expressions is a disappointment I'm used to.

Name patterns: um.  OK, I can cope with that.  If you squint, it looks 
like an assignment and/or unpacking, _except_ for "_".  Which, what? 
Just because I use "_" as a throwaway doesn't mean I'll never want to 
use it as an actual name.  Do we actually need a non-binding wildcard 
anyway?  It may make some matches a little faster, I suppose, but we're 
always being told names are cheap.
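
To make the point about "_" concrete, here is a tiny sketch in ordinary Python as it stands today, outside any match statement.  The function name split_record is made up for illustration; the only thing it shows is that unpacking binds "_" like any other name, which is exactly the behaviour the PEP's wildcard would not share.

```python
def split_record(record):
    # Plain tuple unpacking: "_" is an ordinary name and really is bound here.
    _, payload = record
    # Because it is bound, it can be read back like any other variable.
    return _, payload


print(split_record(("header", "payload")))
```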

And yes, I think "case _:" is a rubbish way of spelling "else:".  I'd 
honestly be more likely to write "case everything_else:" or even "case 
dummy:" than "case _:" just to be more readable.

Constant value patterns: now I'm getting really uneasy.  A leading dot 
is all but invisible to the reader, and we are compounding the 
specialness of "_".  I'm having to squint harder to see name patterns as 
assignments.  The exceptions to how I would normally read Python code 
are breeding and getting more complicated.

Sequence patterns: and we have more exceptions.  I guess the current 
syntax would need some cooperation from the string classes, but I'd 
quite like to be able to take byte protocols apart and do something like

   match msg:
     case bytes((len, 0, cmd, *rest)):
       process_command[cmd](len, rest)
     case bytes((len1, len2, 0, cmd, *rest)) if len1 >= 0x80:
       process_command[cmd]((len1 & 0x7f) | (len2 << 7), rest)
     else:
       handle_comms_error(msg)

(the protocol I'm currently working on is that horrid :-)
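
For comparison, this is roughly how the same dispatch has to be spelled without pattern matching.  It is only a sketch under assumptions: decode is a hypothetical helper, it returns the pieces instead of calling process_command, and the framing (a zero byte after one or two length bytes) is read straight off the wished-for patterns above rather than from any real protocol spec.

```python
def decode(msg: bytes):
    """Return (length, cmd, rest) for a message, or raise ValueError."""
    # Short form: one length byte, a zero byte, the command, then the payload.
    if len(msg) >= 3 and msg[1] == 0:
        length, _zero, cmd, *rest = msg
        return length, cmd, bytes(rest)
    # Long form: two length bytes, flagged by the top bit of the first one.
    if len(msg) >= 4 and msg[2] == 0 and msg[0] >= 0x80:
        len1, len2, _zero, cmd, *rest = msg
        return (len1 & 0x7F) | (len2 << 7), cmd, bytes(rest)
    # Stand-in for handle_comms_error(msg) in the example above.
    raise ValueError(f"malformed message: {msg!r}")


print(decode(bytes([2, 0, 7, 0xAA, 0xBB])))
print(decode(bytes([0x81, 0x01, 0, 7, 0x01])))
```

The length checks and index tests are exactly the bookkeeping that a sequence pattern over bytes would take care of.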

Mapping patterns: makes sense, pace my uneasiness about names.

Class patterns: this crosses the line from uneasy to outright "no".  I'm 
fairly confident I will never read "case Point(x,y):" without thinking 
first there's an instantiation happening.  It gets even worse when you 
add named subpatterns, because in

   case name := Class(x, y)

the "Class(x,y)" part both is and is not an instantiation.  I honestly 
stared at the example in the PEP for a good ten minutes before I grokked 
it properly.
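
To show why the call-like spelling is misleading, here is roughly what such a class pattern has to do when written out by hand in today's Python.  This is a sketch only: Point and describe are made-up names, and the PEP's actual matching protocol is more involved than a bare isinstance() check.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def describe(obj):
    # Rough hand-written equivalent of "case Point(x, y):".
    # It is a type check followed by attribute extraction;
    # Point(...) is never called, so nothing is instantiated.
    if isinstance(obj, Point):
        x, y = obj.x, obj.y
        return f"point at ({x}, {y})"
    return "not a point"


print(describe(Point(1, 2)))
print(describe((1, 2)))
```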

I only have a problem with using "|" to combine patterns because I'd 
really like to have expressions as patterns :-)

The way that exceptions to the usual rules of reading Python got more 
numerous and more complicated the further I read through the PEP makes 
me think the approach to when to use name-binding and when to use values 
may be arse-backwards.  The PEP justifies its approach by pointing out 
that name patterns are more common in typical code, which is fine for 
name patterns, but looks weird for class patterns and really weird when 
you involve named subpatterns.

Here's a quick sketch of rearranging the syntax with that in mind.  Bits 
of it aren't lovely, but I still think they read more naturally than the 
current PEP.

Literal patterns: as before.  It's a classic.

Constant value patterns: just use the name:

   BLACK = 1

   match colour:
     case "Not a colour, guv":
       print("Um")
     case BLACK:
       print("Paint It Black!")

There's an obvious generalisation to constant expressions that would be 
really nice from my point of view.

Class patterns: don't use syntax that looks like instantiation!

   case Point as x, y:

(I used "as" because it's short.  Someone else suggested "with", which 
probably makes more sense.)  This gets a little messy when patterns 
start nesting:

   case Line as (start := (Point as x, y), end) if start == end:

but the original example in the PEP is just as messy and puts a lot more 
of the wrong thoughts in my head.

Name patterns fall naturally out of this, even if they look a little 
unusual:

   case int as x:

The catch-all

   case object as obj:

looks odd, but actually I don't have a problem with that.  How often 
should code be catching any old thing rather than something of a more 
specific class or classes?  If the catch-all looks odd, perhaps it will 
dissuade people from using it thoughtlessly.

Sequence and mapping patterns are a bit more than just syntactic sugar 
for class patterns for list, tuple, and dict.  I wouldn't change them, 
given other changes to the definition of "pattern" here.

To me, this is just easier to wrap my head around.  Patterns are either 
expressions, classes or sequence/mapping combinations of patterns. 
There's no awkwardness about when a name is a value and when it is 
something to be bound, there's no proliferation of special cases, and it 
is pretty readable, I think.  It could be argued to be verbose, but 
terseness is not one of Python's objectives, and I think the consistency 
is worth it.
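
The three-way rule is small enough to sketch as a plain function.  Everything here is an assumption for illustration: matches is a hypothetical helper, it only answers yes or no, and it does not model the name binding that "as" would do in the syntax sketched above.

```python
def matches(pattern, value):
    """Hypothetical check: does value fit pattern under the three-way rule?"""
    # A class used as a pattern matches any instance of that class.
    if isinstance(pattern, type):
        return isinstance(value, pattern)
    # A sequence pattern matches a sequence of the same length, item by item.
    if isinstance(pattern, (list, tuple)):
        return (isinstance(value, (list, tuple))
                and len(pattern) == len(value)
                and all(matches(p, v) for p, v in zip(pattern, value)))
    # A mapping pattern matches if every listed key is present and fits.
    if isinstance(pattern, dict):
        return (isinstance(value, dict)
                and all(k in value and matches(p, value[k])
                        for k, p in pattern.items()))
    # Anything else is an expression, compared by value.
    return pattern == value


BLACK = 1
print(matches(BLACK, 1))
print(matches((int, "x"), (3, "x")))
print(matches(object, None))
```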

-- 
Rhodri James *-* Kynesim Ltd