Hi,

I read the PEP, and a few thoughts:

-----

I think one of the examples is some lib2to3 code? I think the matcher syntax is really great for that case (parse trees). The matcher syntax is definitely an improvement over the litany of helper functions and conditionals otherwise needed.

That said, I have a hard time seeing a particular use for this complicated pattern matching outside "heterogeneous trees" (for lack of a better term) of objects. I've only really dealt with that problem with parse trees, but perhaps that's just an artifact of the domains I've ended up working in.

In any case, it might be useful to include some/more examples or use cases that aren't as parser-centric.

-----

Question: How are True, False, None, ..., etc handled? What does this do?

match whatever:
  case True: ...
  case False: ...
  case None: ...
  case ...:

I would expect they would be treated as literals the same as e.g. numbers/strings, yes? Sorry if I missed this in the PEP.

-----

I, too, had trouble understanding the __match__ protocol from the PEP text. Brett's comments largely capture my thoughts about this.

-----

The need to use "." to indicate "look up name" to avoid "match anything" seems like a big foot gun. Simple examples such as:

FOO = 1
match get_case():
  case FOO:
    print("you chose one")

clearly illustrate this, but the problem is present in any case expression: a missing dot changes the meaning from "match this specific value" to almost the opposite: "match any value". And all you really need to do is miss a single leading dot anywhere in the case expression to trigger this. I agree with Barry (I think he said this) that it seems like an easy cause of mysterious bugs.

I think the foot-gun aspect derives directly from the change in how a symbol is interpreted. Everywhere else in the language (or nearly everywhere; I can't think of a counterexample at the moment), when you see "foo", you know it means some sort of lookup of the name "foo" is occurring. The exception to this is fairly simple: when there is some "assignment cue", e.g. "as", :=, =, import, etc., and those assignment cues are always very close by (pretty much always the leading/following token?). Anyway, my point is that assignment has a cue close by.

The proposed syntax flips that and mixes it, so it's very confusing. Sometimes a symbol is a lookup, sometimes it's an assignment.

The PEP talks a bit about this in the "alternatives for constant value pattern" section. I don't find the rationale in that section particularly convincing. It basically says using "$FOO" to act as "look up value named FOO" is rejected because "it is new syntax for a narrow use case" and "name patterns are common in typical code ... so special syntax for the common case would be weird".

I don't find that convincing because it seems more weird to change the (otherwise consistent) lookup/assignment behavior of the language for a specific sub-syntax.

Anyways, when I rewrite the examples and use a token to indicate "matcher", I personally find them easier to read. I feel this is because it makes the matcher syntax feel more like templates or string interpolation (or things of that nature) that have some "placeholder" that gets "bound" to a value after being given some "input".

It also sort of honors the "assignment only happens with a localized cue" behavior that already exists.

ORIGIN = 0
match get_point():
  case Point(ORIGIN, $end):
    ...
  case $default:
    print(default)

I will admit this gives me PHP flashbacks, but it's also very clear where assignments are happening, and I can just use the usual name-lookup rules. I just used $ since the PEP did.

As a bonus, I also think this largely mitigates the foot-gun problem because there's now a cue that a binding is happening, so it's easy to trigger the "is that name already taken, is it safe to assign?" check we mentally perform.

In any case, this seems like a pretty fundamental either/or design decision someone will have to make:

Either:
  names mean assignment, and the rules of what is a lookup vs assignment are different with some special case support (i.e. leading dot).
Or:
  use some character to indicate assignment, and the lookup rules are the same.

-----

Related to the above: I also raise this because, in my usage, I doubt I'll be using it as much more than a switch statement. I rarely have to match complicated patterns, but very often have a set of values that I need to test against. The combination of Literal and exhaustive-case checking is very appealing.

So I'm very often going to want to type, e.g.

ValidModes = Union[Literal[A], Literal[B], etc etc]
def foo(mode: ValidModes):
  match mode:
    case A: ...
    case B: ...
    case etc etc

And eventually I'm going to foot-gun myself with a missing dot.

-----

Related to the above, I don't find it particularly confusing that e.g. "case Point(...)" doesn't initialize a Point. This feels like it might be inconsistent with my whole thing above, but :shrug:. FWIW, I suspect it's just that the leading "case" cue makes it easy to entirely turn off the "parentheses means code gets called" logic in my mind-parser.

-----

Related to the above, perhaps an unadorned name shouldn't be allowed? e.g. this should be invalid:

match get_shape():
  case shape:
    print(shape)

I raise this idea because of the foot-gun issue, but also because it creates more ways of doing the same thing: binding the name to a value. Using := doesn't seem like a particularly burdensome solution:

match shape := get_shape():
  case: # or *, or _, or whatever
    print(shape)

And then either only dotted names or patterns are allowed in cases, not plain names.

-----

Making underscore a special match-anything-but-don't-bind struck me as a bit odd. Aside from the language grammar rules, there aren't really any "this is an OK name, this isn't" type of rules.

I think someone else mentioned using "*" instead of "_"? I had the exact same thought. If it's not going to be bound to a name, why use an otherwise valid name to not bind it to? I get the ergonomics of it, but it seems like another special case of how things get processed inside the case expression.

-----

Why | instead of "or" ? "or" is used in other conditionals. This strikes me as another special case of the syntax that differs from elsewhere in the language.

-----

I agree with not having flat indentation. I think having "case" indented from "match" makes it more readable overall.

-----

Anyways, thanks for reading. HTH.


On Tue, Jun 23, 2020 at 9:08 AM Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.

Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).

I'll mostly let the PEP speak for itself:
- Published: https://www.python.org/dev/peps/pep-0622/ (*)
- Source: https://github.com/python/peps/blob/master/pep-0622.rst

(*) The published version will hopefully be available soon.

I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!

I'd like to end with the contents of the README of the repo where we've worked on the draft, which is shorter and gives a gentler introduction than the PEP itself:


# Pattern Matching

This repo contains a draft PEP proposing a `match` statement.

Origins
-------

The work has several origins:

- Many statically compiled languages (especially functional ones) have
  a `match` expression, for example
  [Scala](http://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html),
  [Rust](https://doc.rust-lang.org/reference/expressions/match-expr.html),
  [F#](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-matching);
- Several extensive discussions on python-ideas, culminating in a
  summarizing
  [blog post](https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python/)
  by Tobias Kohn;
- An independently developed [draft
  PEP](https://github.com/ilevkivskyi/peps/blob/pattern-matching/pep-9999.rst)
  by Ivan Levkivskyi.

Implementation
--------------

A full reference implementation written by Brandt Bucher is available
as a [fork](https://github.com/brandtbucher/cpython/tree/patma) of
the CPython repo.  This is readily converted to a [pull
request](https://github.com/brandtbucher/cpython/pull/2).

Examples
--------

Some [example code](https://github.com/gvanrossum/patma/tree/master/examples/) is available from this repo.

Tutorial
--------

A `match` statement takes an expression and compares it to successive
patterns given as one or more `case` blocks.  This is superficially
similar to a `switch` statement in C, Java or JavaScript (and many
other languages), but much more powerful.

The simplest form compares a target value against one or more literals:

```py
def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 401:
            return "Unauthorized"
        case 403:
            return "Forbidden"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something else"
```

Note the last block: the "variable name" `_` acts as a *wildcard* and
never fails to match.

You can combine several literals in a single pattern using `|` ("or"):

```py
        case 401|403|404:
            return "Not allowed"
```

Patterns can look like unpacking assignments, and can be used to bind
variables:

```py
# The target is an (x, y) tuple
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")
```

Study that one carefully!  The first pattern has two literals, and can
be thought of as an extension of the literal pattern shown above.  But
the next two patterns combine a literal and a variable, and the
variable is *extracted* from the target value (`point`).  The fourth
pattern is a double extraction, which makes it conceptually similar to
the unpacking assignment `(x, y) = point`.

If you are using classes to structure your data (e.g. data classes)
you can use the class name followed by an argument list resembling a
constructor, but with the ability to extract variables:

```py
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def whereis(point):
    match point:
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point():
            print("Somewhere else")
        case _:
            print("Not a point")
```

We can use keyword parameters too.  The following patterns are all
equivalent (and all bind the `y` attribute to the `var` variable):

```py
Point(1, var)
Point(1, y=var)
Point(x=1, y=var)
Point(y=var, x=1)
```

Patterns can be arbitrarily nested.  For example, if we have a short
list of points, we could match it like this:

```py
match points:
    case []:
        print("No points")
    case [Point(0, 0)]:
        print("The origin")
    case [Point(x, y)]:
        print(f"Single point {x}, {y}")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two on the Y axis at {y1}, {y2}")
    case _:
        print("Something else")
```

We can add an `if` clause to a pattern, known as a "guard".  If the
guard is false, `match` goes on to try the next `case` block.  Note
that variable extraction happens before the guard is evaluated:

```py
match point:
    case Point(x, y) if x == y:
        print(f"Y=X at {x}")
    case Point(x, y):
        print("Not on the diagonal")
```

Several other key features:

- Like unpacking assignments, tuple and list patterns have exactly the
  same meaning and actually match arbitrary sequences.  An important
  exception is that they don't match iterators or strings.
  (Technically, the target must be an instance of
  `collections.abc.Sequence`.)

- Sequence patterns support wildcards: `[x, y, *rest]` and `(x, y,
  *rest)` work similarly to wildcards in unpacking assignments.  The
  name after `*` may also be `_`, so `(x, y, *_)` matches a sequence
  of at least two items without binding the remaining items.

- Mapping patterns: `{"bandwidth": b, "latency": l}` extracts the
  `"bandwidth"` and `"latency"` values from a dict.  Unlike sequence
  patterns, extra keys are ignored.  A wildcard `**rest` is also
  supported.  (But `**_` would be redundant, so it is not allowed.)

- Subpatterns may be extracted using the walrus (`:=`) operator:

  ```py
  case (Point(x1, y1), p2 := Point(x2, y2)): ...
  ```

- Patterns may use named constants.  These must be dotted names; a
  single name can be made into a constant value by prefixing it with a
  dot to prevent it from being interpreted as a variable extraction:

  ```py
  RED, GREEN, BLUE = 0, 1, 2

  match color:
      case .RED:
          print("I see red!")
      case .GREEN:
          print("Grass is green")
      case .BLUE:
          print("I'm feeling the blues :(")
  ```

- Classes can customize how they are matched by defining a
  `__match__()` method.
  Read the [PEP](https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specification) for details.



--
--Guido van Rossum (python.org/~guido)