Thank you very much to Brandt, Tobias, Ivan, Guido, and Talin for the extensive work on this PEP. The attention to detail and effort that went into establishing the real-world usefulness of this feature (such as the many excellent examples and code analysis) helps a lot to justify the complexity of the proposed pattern matching feature(s). The concern about the added complexity to the language is certainly reasonable (particularly in the sphere of researchers and other non-developers), but I think this is a case where practicality will ultimately win.

Overall, I am +1 for this PEP. That being said, I would like to state a few opinions (with #2 being my strongest one):

1) I was initially in agreement about the usage of "else:" instead of "case _:", but upon further consideration, I don't think "else:" holds up very well in the context of pattern matching. Although it could be confusing at first glance (admittedly, it threw me off at first), an underscore makes far more sense as a wildcard match, especially considering the existing precedent.

2) Regarding the constant value pattern semantics, I'm okay with the usage of the "." in general, but I completely agree with several others that it's rather difficult to read when there's a leading period with a single word, e.g. ".CONSTANT". To some degree, this could probably be less problematic with some reasonably good syntax highlighting to draw attention to the leading period. However, I don't think it should be at all necessary for people to rely on syntax highlighting to be able to clearly see something that's part of a core Python language feature. It seems especially detrimental for those with visual impairment. As someone with relatively poor eyesight who typically has to blow up the font size for my code to be readable (and often finds syntax highlighting to be distracting), I'm not really looking forward to squinting for missed leading periods where a constant reference was intended. Even if it's a relatively uncommon case, with a core feature, it's bound to happen enough to cause some headaches.

From the "Rejected Ideas" section:
Introduce a special symbol, for example $ or ^ to indicate that a given name is a constant to be matched against, not to be assigned to:
    FOO = 1
    value = 0

    match value:
        case $FOO:  # This would not be matched
            ...
        case BAR:   # This would be matched
            ...
The problem with this approach is that introducing a new syntax for such narrow use-case is probably an overkill.
I can certainly understand that it seems overkill to have a separate symbol for this, but in my biased opinion, I think it's worth a stronger consideration from the perspective of those with some degree of visual impairment. I don't have a strong opinion about the specific larger symbol that should be used instead, but either of "$" or "^" would be perfectly fine by me. I'd be on board with anything that doesn't have a strong existing purpose in the language.

3) Regarding the 6 before vs. after examples provided by Guido, I have some thoughts on the last one:
Original:
    def flatten(self) -> Rhs:
        # If it's a single parenthesized group, flatten it.
        rhs = self.rhs
        if (
            not self.is_loop()
            and len(rhs.alts) == 1
            and len(rhs.alts[0].items) == 1
            and isinstance(rhs.alts[0].items[0].item, Group)
        ):
            rhs = rhs.alts[0].items[0].item.rhs
        return rhs
Converted (note that I had to name the classes Alt and NamedItem, which are anonymous in the original):
    def flatten(self) -> Rhs:
        # If it's a single parenthesized group, flatten it.
        rhs = self.rhs
        if not self.is_loop():
            match rhs.alts:
                case [Alt(items=[NamedItem(item=Group(rhs=r))])]:
                    rhs = r
        return rhs
I think part of it is just that I tend to find anything that has 4+ layers of nested parentheses and/or brackets to be a bit difficult to mentally parse, but my reaction to seeing something like "case [Alt(items=[NamedItem(item=Group(rhs=r))])]:" in the wild without anything to compare it to would probably be o_0. I definitely find the 4-part conditional in the "Original" version to be a lot easier to quickly understand, even if it's a bit redundant and requires some guard checks. So IMHO, that specific example isn't particularly convincing.

That being said, I found the other 5 examples to be very easy to understand, with the second one being the one to really win me over. The proposed class matching is a drastic improvement over a massive wall of "if/elif isinstance(...):" conditionals, and I really like the way it lines up visually with the constants. Also, in time, I could very well change my mind about the last example after getting more used to the proposed syntax.

On Tue, Jun 23, 2020 at 12:04 PM Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself:
- Published: https://www.python.org/dev/peps/pep-0622/ (*)
- Source: https://github.com/python/peps/blob/master/pep-0622.rst
(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!
I'd like to end with the contents of the README of the repo where we've worked on the draft, which is shorter and gives a gentler introduction than the PEP itself:
# Pattern Matching
This repo contains a draft PEP proposing a `match` statement.
Origins
-------
The work has several origins:
- Many statically compiled languages (especially functional ones) have a `match` expression, for example [Scala](http://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html), [Rust](https://doc.rust-lang.org/reference/expressions/match-expr.html), [F#](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-ma...);
- Several extensive discussions on python-ideas, culminating in a summarizing [blog post](https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python...) by Tobias Kohn;
- An independently developed [draft PEP](https://github.com/ilevkivskyi/peps/blob/pattern-matching/pep-9999.rst) by Ivan Levkivskyi.
Implementation
--------------
A full reference implementation written by Brandt Bucher is available as a [fork](https://github.com/brandtbucher/cpython/tree/patma) of the CPython repo. This is readily converted to a [pull request](https://github.com/brandtbucher/cpython/pull/2).
Examples
--------
Some [example code](https://github.com/gvanrossum/patma/tree/master/examples/) is available from this repo.
Tutorial
--------
A `match` statement takes an expression and compares it to successive patterns given as one or more `case` blocks. This is superficially similar to a `switch` statement in C, Java or JavaScript (and many other languages), but much more powerful.
The simplest form compares a target value against one or more literals:
```py
def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 401:
            return "Unauthorized"
        case 403:
            return "Forbidden"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something else"
```
Note the last block: the "variable name" `_` acts as a *wildcard* and never fails to match.
You can combine several literals in a single pattern using `|` ("or"):
```py
case 401|403|404:
    return "Not allowed"
```
Patterns can look like unpacking assignments, and can be used to bind variables:
```py
# The target is an (x, y) tuple
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")
```
Study that one carefully! The first pattern has two literals, and can be thought of as an extension of the literal pattern shown above. But the next two patterns combine a literal and a variable, and the variable is *extracted* from the target value (`point`). The fourth pattern is a double extraction, which makes it conceptually similar to the unpacking assignment `(x, y) = point`.
If you are using classes to structure your data (e.g. data classes) you can use the class name followed by an argument list resembling a constructor, but with the ability to extract variables:
```py
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def whereis(point):
    match point:
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point():
            print("Somewhere else")
        case _:
            print("Not a point")
```
We can use keyword parameters too. The following patterns are all equivalent (and all bind the `y` attribute to the `var` variable):
```py
Point(1, var)
Point(1, y=var)
Point(x=1, y=var)
Point(y=var, x=1)
```
Patterns can be arbitrarily nested. For example, if we have a short list of points, we could match it like this:
```py
match points:
    case []:
        print("No points")
    case [Point(0, 0)]:
        print("The origin")
    case [Point(x, y)]:
        print(f"Single point {x}, {y}")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two on the Y axis at {y1}, {y2}")
    case _:
        print("Something else")
```
We can add an `if` clause to a pattern, known as a "guard". If the guard is false, `match` goes on to try the next `case` block. Note that variable extraction happens before the guard is evaluated:
```py
match point:
    case Point(x, y) if x == y:
        print(f"Y=X at {x}")
    case Point(x, y):
        print(f"Not on the diagonal")
```
Several other key features:
- Like unpacking assignments, tuple and list patterns have exactly the same meaning and actually match arbitrary sequences. An important exception is that they don't match iterators or strings. (Technically, the target must be an instance of `collections.abc.Sequence`.)
- Sequence patterns support wildcards: `[x, y, *rest]` and `(x, y, *rest)` work similarly to wildcards in unpacking assignments. The name after `*` may also be `_`, so `(x, y, *_)` matches a sequence of at least two items without binding the remaining items.
- Mapping patterns: `{"bandwidth": b, "latency": l}` extracts the `"bandwidth"` and `"latency"` values from a dict. Unlike sequence patterns, extra keys are ignored. A wildcard `**rest` is also supported. (But `**_` would be redundant, so it is not allowed.)
- Subpatterns may be extracted using the walrus (`:=`) operator:
  ```py
  case (Point(x1, y1), p2 := Point(x2, y2)):
      ...
  ```
- Patterns may use named constants. These must be dotted names; a single name can be made into a constant value by prefixing it with a dot to prevent it from being interpreted as a variable extraction:
  ```py
  RED, GREEN, BLUE = 0, 1, 2

  match color:
      case .RED:
          print("I see red!")
      case .GREEN:
          print("Grass is green")
      case .BLUE:
          print("I'm feeling the blues :(")
  ```
- Classes can customize how they are matched by defining a `__match__()` method. Read the [PEP](https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificatio...) for details.
--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)