PEP 622: Structural Pattern Matching
On 2020-06-23 17:01, Guido van Rossum wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself:
- Published: https://www.python.org/dev/peps/pep-0622/ (*)
- Source: https://github.com/python/peps/blob/master/pep-0622.rst

(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!
I'd like to end with the contents of the README of the repo where we've worked on the draft, which is shorter and gives a gentler introduction than the PEP itself:
# Pattern Matching
This repo contains a draft PEP proposing a `match` statement.
Origins
-------
The work has several origins:
- Many statically compiled languages (especially functional ones) have a `match` expression, for example [Scala](http://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html), [Rust](https://doc.rust-lang.org/reference/expressions/match-expr.html), [F#](https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-ma...);
- Several extensive discussions on python-ideas, culminating in a summarizing [blog post](https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python...) by Tobias Kohn;
- An independently developed [draft PEP](https://github.com/ilevkivskyi/peps/blob/pattern-matching/pep-9999.rst) by Ivan Levkivskyi.
Implementation
--------------
A full reference implementation written by Brandt Bucher is available as a [fork](https://github.com/brandtbucher/cpython/tree/patma) of the CPython repo. This is readily converted to a [pull request](https://github.com/brandtbucher/cpython/pull/2).
Examples
--------
Some [example code](https://github.com/gvanrossum/patma/tree/master/examples/) is available from this repo.
Tutorial
--------
A `match` statement takes an expression and compares it to successive patterns given as one or more `case` blocks. This is superficially similar to a `switch` statement in C, Java or JavaScript (and many other languages), but much more powerful.
The simplest form compares a target value against one or more literals:
```py
def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 401:
            return "Unauthorized"
        case 403:
            return "Forbidden"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something else"
```
Note the last block: the "variable name" `_` acts as a *wildcard* and never fails to match.
You can combine several literals in a single pattern using `|` ("or"):
```py
    case 401|403|404:
        return "Not allowed"
```
Patterns can look like unpacking assignments, and can be used to bind variables:
```py
# The target is an (x, y) tuple
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")
```
Study that one carefully! The first pattern has two literals, and can be thought of as an extension of the literal pattern shown above. But the next two patterns combine a literal and a variable, and the variable is *extracted* from the target value (`point`). The fourth pattern is a double extraction, which makes it conceptually similar to the unpacking assignment `(x, y) = point`.
If you are using classes to structure your data (e.g. data classes) you can use the class name followed by an argument list resembling a constructor, but with the ability to extract variables:
```py
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def whereis(point):
    match point:
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point():
            print("Somewhere else")
        case _:
            print("Not a point")
```
We can use keyword parameters too. The following patterns are all equivalent (and all bind the `y` attribute to the `var` variable):
```py
Point(1, var)
Point(1, y=var)
Point(x=1, y=var)
Point(y=var, x=1)
```
Patterns can be arbitrarily nested. For example, if we have a short list of points, we could match it like this:
```py
match points:
    case []:
        print("No points")
    case [Point(0, 0)]:
        print("The origin")
    case [Point(x, y)]:
        print(f"Single point {x}, {y}")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two on the Y axis at {y1}, {y2}")
    case _:
        print("Something else")
```
We can add an `if` clause to a pattern, known as a "guard". If the guard is false, `match` goes on to try the next `case` block. Note that variable extraction happens before the guard is evaluated:
```py
match point:
    case Point(x, y) if x == y:
        print(f"Y=X at {x}")
    case Point(x, y):
        print(f"Not on the diagonal")
```
Several other key features:
- Like unpacking assignments, tuple and list patterns have exactly the same meaning and actually match arbitrary sequences. An important exception is that they don't match iterators or strings. (Technically, the target must be an instance of `collections.abc.Sequence`.)
- Sequence patterns support wildcards: `[x, y, *rest]` and `(x, y, *rest)` work similarly to wildcards in unpacking assignments. The name after `*` may also be `_`, so `(x, y, *_)` matches a sequence of at least two items without binding the remaining items.
- Mapping patterns: `{"bandwidth": b, "latency": l}` extracts the `"bandwidth"` and `"latency"` values from a dict. Unlike sequence patterns, extra keys are ignored. A wildcard `**rest` is also supported. (But `**_` would be redundant, so it is not allowed.)
- Subpatterns may be extracted using the walrus (`:=`) operator:
```py
    case (Point(x1, y1), p2 := Point(x2, y2)):
        ...
```
- Patterns may use named constants. These must be dotted names; a single name can be made into a constant value by prefixing it with a dot to prevent it from being interpreted as a variable extraction:
```py
RED, GREEN, BLUE = 0, 1, 2

match color:
    case .RED:
        print("I see red!")
    case .GREEN:
        print("Grass is green")
    case .BLUE:
        print("I'm feeling the blues :(")
```
- Classes can customize how they are matched by defining a `__match__()` method. Read the [PEP](https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificatio...) for details.
Why are:

    case ._:

not OK?

In this:

    case .BLACK:
        ...
    case BLACK:
        ...

the first matches the value against 'BLACK' and the second succeeds and binds the value to 'BLACK'.

Replacing 'BLACK' with '_':

    case ._:
        ...
    case _:
        ...

I'd expect something similar, except for the binding part.

I think the same could be said for:

    case Color.BLACK:

and:

    case _.BLACK:

I also wonder whether "or" would be clearer than "|":

    case .BLACK or Color.BLACK:
(I'm replying to several messages in one reply. But there is too much to respond to so this is merely the first batch.)

On Tue, Jun 23, 2020 at 10:10 AM MRAB <python@mrabarnett.plus.com> wrote:
Why are:
case ._:
not OK?
In this:
    case .BLACK:
        ...
    case BLACK:
        ...
the first matches the value against 'BLACK' and the second succeeds and binds the value to 'BLACK'.
Replacing 'BLACK' with '_':
    case ._:
        ...
    case _:
        ...
I'd expect something similar, except for the binding part.
I think the same could be said for:
case Color.BLACK:
and:
case _.BLACK:
We disallow ._ and _.xxx because when plain _ is used as a pattern it is a *wildcard*, not an identifier. The pattern compiler must special-case _ as a target because [x, x] is an invalid pattern (you can't bind the same variable twice) but [_, _] is valid (it means "any non-string sequence of two elements"). There are some other edge cases where _ is special-cased too. But basically we want to prevent users doing a double take when they see _ used as a wildcard on one line and as a variable on the next.
I also wonder whether "or" would be clearer than "|":
case .BLACK or Color.BLACK:
That's just bikeshedding. I personally don't think it would be clearer. Other languages with pattern matching tend to use "|" (as do regular expressions :-).

On Tue, Jun 23, 2020 at 10:28 AM Antoine Pitrou <solipsis@pitrou.net> wrote:
* What you call "Constant Value Patterns" can really refer to any local or non-local name, regardless of how complex the referred object is, right? Also, __eq__ is called in that case, not __match__?
Yeah, we're considering renaming these to "Value Patterns". There's nothing particularly constant about them. And yes, they use __eq__. Basically these and literals are processed exactly the same once the value is obtained.
* If I understand correctly, these:
case b"": print("it's an empty bytes object")
and
case bytes(): print("it's a bytes object")
have entirely different meanings. Am I right? This sounds like I have to switch contexts when reading code, based on whether I am reading regular code or a match clause, given that semantics are quite different.
Yes, you're right. The first is a literal pattern, the second a class pattern. Switching how you interpret code when reading it based on context is common -- seeing "x, y" on the LHS of an assignment is different than on the RHS, and seeing "x=5" in a "def" line is completely different from seeing it in a call.
Instead, it seems like the latter would be more explicitly spelled out as, e.g.:
case instanceof(bytes): print("it's a bytes object")
But that's arbitrary syntax. The beauty of using SomeClass() is that the pattern (if you squint :-) looks like a constructor for an object. SomeClass() is just an edge case of the general form SomeClass(subpattern1, subpattern2, ..., arg5=subp5, arg6=subp6, ...).
* The acronym "ADT" is never defined.
Yes it is, you missed this sentence: A design pattern where a group of record-like classes is combined into a union is popular in other languages that support pattern matching and is known under a name of algebraic data types [2]_ or ADTs.
* """If there are more positional items than the length of __match_args__, an ImpossibleMatchError is raised."""
What if there are less positional items than ``len(__match_args__)``? Can the match succeed rather than raise ImpossibleMatchError? This seems potentially error-prone.
Yes, it can succeed. Honestly, we could have gone the other way on this one, but we figured that there are plenty of functions and class constructors that can be called with a variable number of positional arguments, since some arguments have default values. We even invented an optional attribute __match_args_required__ (an int) that would have given the minimal number of positional arguments required, but it was deemed too obscure (and the name is ugly). Also we definitely wanted to be able to write `case Point():` for the isinstance check.
Overall, my main concern with this PEP is that the matching semantics and pragmatics are different from everything else in the language. When reading and understanding a match clause, there's a cognitive overhead because suddenly `Point(x, 0)` means something entirely different (it doesn't call Point.__new__, it doesn't look up `x` in the locals or globals...). Obviously, there are cases where this is worthwhile, but still.
Quite a few other languages have done this and survived. And Python already has LVALUES and RVALUES that look the same but have different meanings. (In fact the pattern syntax for sequences was derived from those.)
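For readers who have not thought about this before, the same spelling already means two different things in today's Python, depending on which side of `=` it is on. A tiny illustration with made-up values:

```python
point = (3, 4)

x, y = point              # on the left: a target list, binds x and y
swapped = y, x            # on the right: an expression, builds a new tuple

first, *rest = [1, 2, 3]  # starred target: collects the tail into a list
packed = [first, *rest]   # starred expression: unpacks it again
```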
It may be useful to think about different notations for these new things, rather than re-use the object construction notation.
For example:
case Point with (x, y): print(f"Got a point with x={x}, y={y}")
or:
case Point @ (x, y): print(f"Got a point with x={x}, y={y}")
(yes, "@" is the matrix multiplication operator... but it's probably much less likely to appear in common code and especially with a class object at the left)
Believe me, we did plenty of bikeshedding in private. But if one of your proposals gets overwhelming support we can revisit this.

On Tue, Jun 23, 2020 at 11:41 AM Ethan Furman <ethan@stoneleaf.us> wrote:
Testing my understanding -- the following snippet from the PEP
    match group_shapes():
        case [], [point := Point(x, y), *other]:
will succeed if group_shapes() returns two lists, the first one being empty and the second one starting with a Point() ?
Correct. And it binds four variables: point, x, y, and other.
--- Runtime Specifications
The __match__ protocol --- Suffers from several indentation errors (the nested lists are not).
I don't see this any more. Maybe someone already fixed it?
-------------------------------------------------------------------------
My biggest complaint is the use of
case _:
Unless I'm missing something, every other control flow statement in Python that can have an "else" type branch uses "else" to denote it:
if/else
for/else
while/else
try/except/else
Since "else" perfectly sums up what's happening, why "case _" ?
    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
        else:
            print("Something else")
Because it's not needed. In all those cases you mention the else clause provides a feature that could not be expressed otherwise. But "case _:" is going to work regardless, so we might as well use it. Rust, Scala and F# do this too.

On Tue, Jun 23, 2020 at 12:07 PM Barry Warsaw <barry@python.org> wrote:
Couldn’t you adopt a flat indentation scheme with the minor change of moving the expression into a `match:` clause? E.g.
    match:
        expression
    case a:
        foo()
    case b:
        bar()
    else:
        baz()
I didn’t see that in the rejected alternatives.
We discussed it, but ultimately rejected it because the first block would be a novelty in Pythonic syntax: an indented block whose content is a single expression rather than a sequence of statements. Do you think we need to add this to the Rejected Ideas section?
I’m with others who think that `else:` would be a better choice than `case _:`. Given my background in i18n (see the flufl.i18n library, etc.), my red flags go up when I see bare underscores being given syntactic meaning.
Ah, but bare _ has meaning throughout patterns -- it's a wildcard. This follows the convention (common outside i18n work) that _ is a throwaway target, e.g. `for x, _, _ in points: print(x)`.

For example, the restriction against `case _.a:` *could* interact badly with my library. There, _ is the most common name binding for an object that implements the translation semantics. Therefore, it has attributes. I can’t think of a concrete example right now, but e.g. what if I wanted to match against `_.code`? That wouldn’t be legal if I’m understanding correctly (`code` being an attribute of the object typically bound to _).
Correct. You'd have to create an alias f = _ before entering the match and then write `case f.code`. MRAB brought this up too -- honestly if there's enough support for allowing ._ and _. we could easily allow it, it's currently just forbidden to prevent user confusion, not for any deep reason.
I’m also concerned about the .BLACK vs BLACK example. I get why the distinction is there, and I get that the PEP proposes that static analyzers help the ambiguity, but it strikes me as a potential gotcha and a source of mysterious errors.
That's definitely a possibility (and some of the PEP's authors, being the first people playing with a working implementation, have already experienced this). In fact this is one of the most debated issues for this PEP, and no solution is entirely satisfactory. (My personal favorite was checking the first letter of undotted names -- if it's a lowercase letter it's a variable to be bound, if it's Uppercase it's a value to be loaded. This follows PEP 8 recommendations for naming constants.)
Why not just bare `*` instead of `*_` in patterns?
Because we're trying to make sequence patterns look like sequence unpacking assignments, and they support *rest. And in mapping patterns we support **rest.
The PEP is unclear about what kind of method __match__() is. As I was reading along, I suspected it must be a static or class method, explicitly cannot be an instance method because the class in the case statement is never instantiated, but it’s not until I got to the object.__match__() discussion that this distinction was made clear. I think the PEP should just be explicit upfront about that.
Good point. I'll add a clarification.
As a side note, I’m also concerned that `case Point(x, y)` *looks* like it instantiates `Point` and that it’s jarring to Python developers that they have to mentally switch models when reading that code.
Other languages using the same convention (e.g. Scala) seem to have no problem with this. Basically we'll all have to learn that what comes after "case" is *not* an expression. (Just like you have to learn that inside f-strings {...} is an interpolation, not a dictionary. :-)
I was also unclear about that __match__() had to return an object until much later in the PEP. My running notes asked:
    # Is returning None better than raising an exception? This won’t work:

    class C:
        def __init__(self, x=None):
            self.x = x

        @staticmethod
        def __match__(obj):
            # Should just return obj?
            return obj.x

    match C():
        case x:
            print(x)
But once I read on, I realized that __match__() should return `obj` in this case. Still, the fact that returning None to signal a case arm not matching feels like there’s a gotcha lurking in there somewhere.
Hm, there's a whole section on the result value of __match__, but perhaps it came too late. Brett also found the description of __match__ hard to read -- I will try to add an example and mention ahead of time that __match__ must return an object or None.
Should @sealed be in a separate PEP? It’s relevant to the discussion in 622, but seems like it would have use outside of the match feature so should possibly be proposed separately.
Hm, that would be a very short PEP, and it's really not all that useful without a match statement.
The PEP is unclear whether `case` is also a soft keyword. I’m guessing it must be.
Yes, will clarify. (Though it is mentioned under Backwards Compatibility. :-)

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 06/23/2020 04:26 PM, Guido van Rossum wrote:
On Tue, Jun 23, 2020 at 11:41 AM Ethan Furman wrote:
Testing my understanding -- the following snippet from the PEP
    match group_shapes():
        case [], [point := Point(x, y), *other]:
will succeed if group_shapes() returns two lists, the first one being empty and the second one starting with a Point() ?
Correct. And it binds four variables: point, x, y, and other.
Okay, so following that example some more:

`other` is all the other items in `group_shape`'s second list

`x` and `y` are the x,y values from the second list's first element (which is a Point)

`point` is... the first element of the second list? Or a __match__ object?

--
~Ethan~
On Tue, Jun 23, 2020 at 5:21 PM Ethan Furman <ethan@stoneleaf.us> wrote:
Okay, so following that example some more:
`other` is all the other items in `group_shape`'s second list
`x` and `y` are the x,y values from the second list's first element (which is a Point)
`point` is... the first element of the second list? Or a __match__ object?
The former.

--
--Guido van Rossum (python.org/~guido)
On 24/06/20 11:26 am, Guido van Rossum wrote:
A design pattern where a group of record-like classes is combined into a union is popular in other languages that support pattern matching and is known under a name of algebraic data types [2]_ or ADTs.
Whoo, that's confusing! I read "ADT" as "Abstract Data Type" and totally missed that you were using it for something different.

I think it would be better to just call them "algebraic types" (which is the term I learned) and not use an acronym.

--
Greg
On Wed, Jun 24, 2020 at 5:30 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 24/06/20 11:26 am, Guido van Rossum wrote:
A design pattern where a group of record-like classes is combined into a union is popular in other languages that support pattern matching and is known under a name of algebraic data types [2]_ or ADTs.
Whoo, that's confusing! I read "ADT" as "Abstract Data Type" and totally missed that you were using it for something different.
I think it would be better to just call them "algebraic types" (which is the term I learned) and not use an acronym.
You're right, that *is* confusing. We'll fix it.

--
--Guido van Rossum (python.org/~guido)
On Jun 23, 2020, at 16:26, Guido van Rossum <guido@python.org> wrote:
(I'm replying to several messages in one reply. But there is too much to respond to so this is merely the first batch.)
I’m really glad to know that the PEP 622 team is deliberating internally about all the suggestions and comments made on this thread. Thank you, and I look forward to the updated PEP.
On Tue, Jun 23, 2020 at 12:07 PM Barry Warsaw <barry@python.org> wrote:

Couldn’t you adopt a flat indentation scheme with the minor change of moving the expression into a `match:` clause? E.g.

    match:
        expression
    case a:
        foo()
    case b:
        bar()
    else:
        baz()
I didn’t see that in the rejected alternatives.
I’ll just answer the one question Guido asked (Apple Mail didn’t seem to quote it properly):
We discussed it, but ultimately rejected it because the first block would be a novelty in Pythonic syntax: an indented block whose content is a single expression rather than a sequence of statements. Do you think we need to add this to the Rejected Ideas section?
Yes, I think with some short rationale, it will short-circuit this question in subsequent discussions.

Cheers,
-Barry
On Tue, Jun 23, 2020 at 10:10 AM MRAB <python@mrabarnett.plus.com> wrote:
Why are:
case ._:
not OK?
In this:
    case .BLACK:
        ...
    case BLACK:
        ...
the first matches the value against 'BLACK' and the second succeeds and binds the value to 'BLACK'.
Replacing 'BLACK' with '_':
    case ._:
        ...
    case _:
        ...
I'd expect something similar, except for the binding part.
I think the same could be said for:
case Color.BLACK:
and:
case _.BLACK:
The PEP authors discussed this last night and (with a simple majority) we agreed that this restriction isn't all that important, so we're dropping it.

https://github.com/python/peps/commit/410ba6dd4841dc445ce8e0cd3e63ade2fa92dd...

(We're considering other feedback carefully, but most of it requires more deliberation.)

--
--Guido van Rossum (python.org/~guido)
Some comments:

* What you call "Constant Value Patterns" can really refer to any local or non-local name, regardless of how complex the referred object is, right? Also, __eq__ is called in that case, not __match__?

* If I understand correctly, these:

    case b"": print("it's an empty bytes object")

  and

    case bytes(): print("it's a bytes object")

  have entirely different meanings. Am I right? This sounds like I have to switch contexts when reading code, based on whether I am reading regular code or a match clause, given that semantics are quite different.

  Instead, it seems like the latter would be more explicitly spelled out as, e.g.:

    case instanceof(bytes): print("it's a bytes object")

* The acronym "ADT" is never defined.

* """If there are more positional items than the length of __match_args__, an ImpossibleMatchError is raised."""

  What if there are less positional items than ``len(__match_args__)``? Can the match succeed rather than raise ImpossibleMatchError? This seems potentially error-prone.

Overall, my main concern with this PEP is that the matching semantics and pragmatics are different from everything else in the language. When reading and understanding a match clause, there's a cognitive overhead because suddenly `Point(x, 0)` means something entirely different (it doesn't call Point.__new__, it doesn't look up `x` in the locals or globals...). Obviously, there are cases where this is worthwhile, but still.

It may be useful to think about different notations for these new things, rather than re-use the object construction notation.

For example:

    case Point with (x, y): print(f"Got a point with x={x}, y={y}")

or:

    case Point @ (x, y): print(f"Got a point with x={x}, y={y}")

(yes, "@" is the matrix multiplication operator... but it's probably much less likely to appear in common code and especially with a class object at the left)

Regards

Antoine.

On Tue, 23 Jun 2020 09:01:11 -0700 Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself: - Published: https://www.python.org/dev/peps/pep-0622/ (*) - Source: https://github.com/python/peps/blob/master/pep-0622.rst
(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!
On 2020-06-23 18:20, Antoine Pitrou wrote:
Some comments:
* What you call "Constant Value Patterns" can really refer to any local or non-local name, regardless of how complex the referred object is, right? Also, __eq__ is called in that case, not __match__?
* If I understand correctly, these:
case b"": print("it's an empty bytes object")
and
case bytes(): print("it's a bytes object")
have entirely different meanings. Am I right? This sounds like I have to switch contexts when reading code, based on whether I am reading regular code or a match clause, given that semantics are quite different.
Instead, it seems like the latter would be more explicitly spelled out as, e.g.:
case instanceof(bytes): print("it's a bytes object")
* The acronym "ADT" is never defined.
* """If there are more positional items than the length of __match_args__, an ImpossibleMatchError is raised."""
What if there are less positional items than ``len(__match_args__)``? Can the match succeed rather than raise ImpossibleMatchError? This seems potentially error-prone.
Overall, my main concern with this PEP is that the matching semantics and pragmatics are different from everything else in the language. When reading and understanding a match clause, there's a cognitive overhead because suddenly `Point(x, 0)` means something entirely different (it doesn't call Point.__new__, it doesn't look up `x` in the locals or globals...). Obviously, there are cases where this is worthwhile, but still.
It may be useful to think about different notations for these new things, rather than re-use the object construction notation.
For example:
case Point with (x, y): print(f"Got a point with x={x}, y={y}")
or:
case Point @ (x, y): print(f"Got a point with x={x}, y={y}")
(yes, "@" is the matrix multiplication operator... but it's probably much less likely to appear in common code and especially with a class object at the left)
Or:

    case Point as (x, y):
        print(f"Got a point with x={x}, y={y}")

perhaps, as 'as' is already used for binding.

The disadvantage there is with nested patterns:

    case Coordinate(Point(x1, y1), Point(x2, y2)):

where you're matching a coordinate and binding to x1, y1, x2 and y2.
On 24/06/20 5:20 am, Antoine Pitrou wrote:
suddenly `Point(x, 0)` means something entirely different (it doesn't call Point.__new__, it doesn't look up `x` in the locals or globals...).
This is one reason I would rather see something explicitly marking names to be bound, rather than making the binding case the default. E.g.

    case Point(?x, 0):

This would also eliminate the need for the awkward leading-dot workaround for names to be looked up rather than bound.

The PEP says this was rejected on the grounds that binding is more common than matching constants, but code is read more often than it is written, and readability counts.

It would also help with the problem that in

    case Spam(foo = blarg):

the name being bound is on the right, whereas in

    case Spam(foo := Blarg()):

the name being bound is on the left. I found myself having to think quite hard while reading the PEP to keep these two straight.

With explicit marking of bound names, they would be

    case Spam(foo = ?blarg):

    case Spam(?foo := Blarg()):

which, particularly in the first case, would make it much clearer what's being bound.

One other thing that the PEP doesn't make clear -- is it possible to combine '=' and ':=' to match a keyword argument with a sub pattern and capture the result? I.e. can you write

    case Spam(foo = foo_value := Blarg()):

?

-- Greg

Obviously, there are cases where this is worthwhile, but still.
It may be useful to think about different notations for these new things, rather than re-use the object construction notation.
For example:
case Point with (x, y): print(f"Got a point with x={x}, y={y}")
or:
case Point @ (x, y): print(f"Got a point with x={x}, y={y}")
(yes, "@" is the matrix multiplication operator... but it's probably much less likely to appear in common code and especially with a class object at the left)
Regards
Antoine.
On Tue, 23 Jun 2020 09:01:11 -0700 Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself: - Published: https://www.python.org/dev/peps/pep-0622/ (*) - Source: https://github.com/python/peps/blob/master/pep-0622.rst
(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/MVQJA7LD... Code of Conduct: http://python.org/psf/codeofconduct/
On Wed, 24 Jun 2020 21:54:24 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 24/06/20 5:20 am, Antoine Pitrou wrote:
suddenly `Point(x, 0)` means something entirely different (it doesn't call Point.__new__, it doesn't look up `x` in the locals or globals...).
This is one reason I would rather see something explicitly marking names to be bound, rather than making the binding case the default. E.g.
case Point(?x, 0):
This would also eliminate the need for the awkward leading-dot workaround for names to be looked up rather than bound.
That looks quite a bit better indeed, because it strongly suggests that something unusual is happening from the language's POV. Thank you for suggesting this.
One other thing that the PEP doesn't make clear -- is it possible to combine '=' and ':=' to match a keyword argument with a sub pattern and capture the result? I.e. can you write
case Spam(foo = foo_value := Blarg()):
Yuck :-S Regards Antoine.
On 2020-06-24 13:37, Antoine Pitrou wrote:
On Wed, 24 Jun 2020 21:54:24 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 24/06/20 5:20 am, Antoine Pitrou wrote:
suddenly `Point(x, 0)` means something entirely different (it doesn't call Point.__new__, it doesn't look up `x` in the locals or globals...).
This is one reason I would rather see something explicitly marking names to be bound, rather than making the binding case the default. E.g.
case Point(?x, 0):
This would also eliminate the need for the awkward leading-dot workaround for names to be looked up rather than bound.
That looks quite a bit better indeed, because it strongly suggests that something unusual is happening from the language's POV. Thank you for suggesting this.
Could the name be omitted when you're not interested in the value?

    case Point(?, 0):
One other thing that the PEP doesn't make clear -- is it possible to combine '=' and ':=' to match a keyword argument with a sub pattern and capture the result? I.e. can you write
case Spam(foo = foo_value := Blarg()):
Yuck :-S
On Wed, Jun 24, 2020 at 2:56 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
One other thing that the PEP doesn't make clear -- is it possible to combine '=' and ':=' to match a keyword argument with a sub pattern and capture the result? I.e. can you write
case Spam(foo = foo_value := Blarg()):
?
The full grammar in the Appendix makes this clear -- you can't write it like that, but you could write

    case Spam(foo=(foo_value := Blarg())):

I'll get to your other points eventually.
On 06/23/2020 09:01 AM, Guido van Rossum wrote:

Very nice! I am totally in favor (with some bike-shedding, of course).

First, a formatting comment: A new sentence immediately following formatted code needs more space -- it looks like the same sentence otherwise. Will putting two spaces after the period help in this case?

Testing my understanding -- the following snippet from the PEP

    match group_shapes():
        case [], [point := Point(x, y), *other]:

will succeed if group_shapes() returns two lists, the first one being empty and the second one starting with a Point() ?

---
Runtime Specifications

The __match__ protocol
---

Suffers from several indentation errors (the nested lists are not).

-------------------------------------------------------------------------

My biggest complaint is the use of

    case _:

Unless I'm missing something, every other control flow statement in Python that can have an "else" type branch uses "else" to denote it:

    if/else
    for/else
    while/else
    try/except/else

Since "else" perfectly sums up what's happening, why "case _" ?

    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
        else:
            print("Something else")

-- ~Ethan~
On Tue, 23 Jun 2020 at 17:07, Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself: - Published: https://www.python.org/dev/peps/pep-0622/ (*) - Source: https://github.com/python/peps/blob/master/pep-0622.rst
(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!
I'd like to end with the contents of the README of the repo where we've worked on the draft, which is shorter and gives a gentler introduction than the PEP itself:
I'm just going to say that I *really* like the look of this proposal. I suspect there's going to be a fairly extensive, and detailed, discussion over some of the edge cases, but they can be worked out. Overall, though, I love the general idea. I'll hold off on saying anything more until I've read the PEP properly. In particular, the "Deferred Ideas" section seems to cover a lot of my initial "what about X" questions. Paul
I will say that trying to follow https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificatio... was really hard. Any chance of getting some pseudo-code that shows how a match is performed? Otherwise all of that wording tries so hard to be a spec that I found it hard to follow in my head how things function.

For instance, "When __match_args__ is missing (as is the default) or None, a single positional sub-pattern is allowed to be passed to the call" is really misleading, as it seems that a "sub-pattern" in this case is just going to be a constant like `[1, 2, 3]`. Otherwise how does `["<"|">"]` or `[1, 2, *_]` get represented as a "single positional sub-pattern" (if either of those examples is possible)? The use of the term "sub-pattern" feels misleading because while you may consider even constant patterns a "pattern", going that generic feels like any pattern should fit in that definition when in fact it seems to only be an object where a direct equality check is done.

It seems the way things work is basically:

1. `__match__(obj)` returns a proxy object to have Python match against; it is passed in the thing that `match` is running against, returning `None` if it knows there's no chance a match will work
2. If `__match_args__` is present, then it is used to map positional arguments in the pattern to attributes on the proxy object
3. From there the `match` functionality does a bunch of comparisons against attributes on the proxy object to see if the match works

Is that right? That suggests all the work in implementing this for objects is coming up with a way to serialize an object to a proxy that makes pattern matching possible.

One thing I see mentioned in examples but not in the `__match__` definitions is how mappings work. Are you using `__match_args__` to map keys to attributes? Or are you using `__getitem__` and that just isn't directly mentioned?
Otherwise the section on how `__match__` is used only mentioned attributes and never talks about keys.
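The three steps in Brett's summary can be sketched in plain Python. This is only an illustration of his reading of the draft, not the PEP's actual algorithm: `sketch_class_match` is a hypothetical helper, sub-patterns are simplified to plain values compared with `==`, the case where `__match_args__` is missing is not modelled, and `TypeError` stands in for the draft's ImpossibleMatchError.

```python
class Point:
    __match_args__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @staticmethod
    def __match__(obj):
        # Step 1: return a proxy to match against, or None for "no chance".
        return obj if isinstance(obj, Point) else None


def sketch_class_match(cls, subject, positional=(), keywords=None):
    """Hypothetical helper: does subject match cls(*positional, **keywords)?"""
    proxy = cls.__match__(subject)
    if proxy is None:
        return False

    # Step 2: map positional sub-patterns to attribute names.
    names = getattr(cls, "__match_args__", None) or ()
    if len(positional) > len(names):
        # Stand-in for the ImpossibleMatchError quoted earlier in the thread.
        raise TypeError("more positional sub-patterns than __match_args__")
    expected = dict(zip(names, positional))
    expected.update(keywords or {})

    # Step 3: compare each expected value with the proxy's attribute.
    missing = object()
    return all(
        getattr(proxy, name, missing) == value
        for name, value in expected.items()
    )
```

For example, `sketch_class_match(Point, Point(1, 0), (1, 0))` corresponds roughly to `case Point(1, 0):`, and passing fewer positional values than `__match_args__` simply checks fewer attributes in this sketch.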
On Jun 23, 2020, at 09:01, Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Congratulations to the authors. This is a well written, complex PEP for a powerful feature. Here are some of my initial comments/questions.

Couldn’t you adopt a flat indentation scheme with the minor change of moving the expression into a `match:` clause? E.g.

    match:
        expression
    case a:
        foo()
    case b:
        bar()
    else:
        baz()

I didn’t see that in the rejected alternatives.

I’m with others who think that `else:` would be a better choice than `case _:`. Given my background in i18n (see the flufl.i18n library, etc.), my red flags go up when I see bare underscores being given syntactic meaning.

For example, the restriction against `case _.a:` *could* interact badly with my library. There, _ is the most common name binding for an object that implements the translation semantics. Therefore, it has attributes. I can’t think of a concrete example right now, but e.g. what if I wanted to match against `_.code`? That wouldn’t be legal if I’m understanding correctly (`code` being an attribute of the object typically bound to _).

I’m also concerned about the .BLACK vs BLACK example. I get why the distinction is there, and I get that the PEP proposes that static analyzers help the ambiguity, but it strikes me as a potential gotcha and a source of mysterious errors.

Why not just bare `*` instead of `*_` in patterns?

The PEP is unclear about what kind of method __match__() is. As I was reading along, I suspected it must be a static or class method, and explicitly cannot be an instance method because the class in the case statement is never instantiated, but it’s not until I got to the object.__match__() discussion that this distinction was made clear. I think the PEP should just be explicit upfront about that.

As a side note, I’m also concerned that `case Point(x, y)` *looks* like it instantiates `Point` and that it’s jarring to Python developers that they have to mentally switch models when reading that code.

I was also unclear that __match__() had to return an object until much later in the PEP. My running notes asked:

    # Is returning None better than raising an exception?  This won’t work:

    class C:
        def __init__(self, x=None):
            self.x = x

        @staticmethod
        def __match__(obj):
            # Should just return obj?
            return obj.x

    match C():
        case x:
            print(x)

But once I read on, I realized that __match__() should return `obj` in this case. Still, the fact that returning None signals a case arm not matching feels like there’s a gotcha lurking in there somewhere.

Should @sealed be in a separate PEP? It’s relevant to the discussion in 622, but seems like it would have use outside of the match feature so should possibly be proposed separately.

The PEP is unclear whether `case` is also a soft keyword. I’m guessing it must be.

Those are my current thoughts on first read through.

Cheers, -Barry
On 23/06/2020 17:01, Guido van Rossum wrote:
You can combine several literals in a single pattern using `|` ("or"):
```py
case 401|403|404:
    return "Not allowed"
```
The PEP is great, but this strikes me as horribly confusing, given that 401|403|404 is already legal syntax. IIUC any legal expression can come between `case` and `:`, but expressions that contain `|` at their outermost level are *interpreted differently* than in other contexts.

Presumably adding parentheses:

    case (401|403|404):

would make it equivalent to

    case 407:

Is a separator (other than whitespace) actually needed? Can the parser cope with

    case 401 403 404:

Failing that, IMO preferable, albeit not ideal, possibilities would be

1) Use colon as the separator.
2) Use comma as the separator - this is already legal syntax too, but IMO it reads more naturally. (And IIRC there are already contexts where brackets are necessary to indicate a tuple.)

Perhaps someone can think of something better.

I also (with others) prefer `else:` or perhaps `case else:` to using the `_` variable. The latter is obscure, and wouldn't sit well with code that already uses that variable for its own purposes.

Rob Cliffe
On Wed, Jun 24, 2020 at 5:30 AM Rob Cliffe via Python-Dev <python-dev@python.org> wrote:
The PEP is great, but this strikes me as horribly confusing, given that 401|403|404 is already legal syntax. IIUC any legal expression can come between `case` and `:`, but expressions that contain `|` at their outermost level are interpreted differently than in other contexts. Presumably adding parentheses: case (401|403|404): would make it equivalent to case 407:
Is a separator (other than whitespace) actually needed? Can the parser cope with case 401 403 404:
Failing that IMO preferable, albeit not ideal, possibilities would be 1) Use colon as the separator. 2) Use comma as the separator - this is already legal syntax too, but IMO it reads more naturally. (And IIRC there are already contexts where brackets are necessary to indicate a tuple.) Perhaps someone can think of something better.
I also (with others) prefer `else:` or perhaps `case else:` to using the `_` variable. The latter is obscure, and wouldn't sit well with code that already uses that variable for its own purposes.
It's not really arbitrary expressions, though. It's more like an assignment target list, but with some handling of constants.

    case (x, y):

is very similar to

    (x, y) = ...

There is the definite risk of confusion with 'if' statements, because you can say "case 401|403|404:" but you can't say "if x == 401|403|404:", and people already try to do that (usually with the 'or' keyword). I think the risk is worth it, given the expressiveness gained, but I can see the line of argument that it's going to cause confusion.

ChrisA
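The `if` pitfalls ChrisA mentions can be checked with ordinary Python, no `match` statement required. This is a minimal sketch; the variable names are chosen here for illustration.

```python
x = 403

# Bitwise | binds tighter than ==, so this compares x with 407.
bitwise_test = (x == 401 | 403 | 404)

# `or` returns its first truthy operand, so this evaluates to 403
# for any x other than 401: the test is always truthy.
or_test = (x == 401 or 403 or 404)

# What the or-pattern expresses, spelled as an ordinary condition.
membership_test = x in (401, 403, 404)

# The comparison ChrisA draws: a sequence pattern reads like unpacking.
(a, b) = (1, 2)
```

Only `membership_test` says what `case 401|403|404:` says, which is why the pattern spelling can mislead readers who carry it over to `if`.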
On 23/06/2020 20:35, Chris Angelico wrote:
On Wed, Jun 24, 2020 at 5:30 AM Rob Cliffe via Python-Dev <python-dev@python.org> wrote:
The PEP is great, but this strikes me as horribly confusing, given that 401|403|404 is already legal syntax. IIUC any legal expression can come between `case` and `:`, but expressions that contain `|` at their outermost level are interpreted differently than in other contexts. Presumably adding parentheses: case (401|403|404): would make it equivalent to case 407:
Is a separator (other than whitespace) actually needed? Can the parser cope with case 401 403 404:
Failing that IMO preferable, albeit not ideal, possibilities would be 1) Use colon as the separator. 2) Use comma as the separator - this is already legal syntax too, but IMO it reads more naturally. (And IIRC there are already contexts where brackets are necessary to indicate a tuple.) Perhaps someone can think of something better.
I also (with others) prefer `else:` or perhaps `case else:` to using the `_` variable. The latter is obscure, and wouldn't sit well with code that already uses that variable for its own purposes.
It's not really arbitrary expressions, though. It's more like an assignment target list, but with some handling of constants.
case (x, y):
is very similar to
(x, y) = ...
If arbitrary expressions are not allowed
- the power of this new feature is reduced
- we have to remember another set of rules about what is allowed and what isn't.

Just as we did with decorator syntax - until that restriction was done away with.
On Jun 23, 2020, at 12:27, Rob Cliffe via Python-Dev <python-dev@python.org> wrote:
On 23/06/2020 17:01, Guido van Rossum wrote:
You can combine several literals in a single pattern using `|` ("or"):
```py
case 401|403|404:
    return "Not allowed"
```
The PEP is great, but this strikes me as horribly confusing, given that 401|403|404 is already legal syntax. IIUC any legal expression can come between `case` and `:`, but expressions that contain `|` at their outermost level are interpreted differently than in other contexts. Presumably adding parentheses: case (401|403|404): would make it equivalent to case 407:
Is a separator (other than whitespace) actually needed? Can the parser cope with case 401 403 404:
Maybe the PEP could support this syntax: case in 401, 403, 404: That seems like it would be both unambiguous and semantically obvious. Cheers, -Barry
On Jun 23, 2020, at 13:50, Ethan Furman <ethan@stoneleaf.us> wrote:
On 06/23/2020 12:49 PM, Barry Warsaw wrote:
Maybe the PEP could support this syntax: case in 401, 403, 404: That seems like it would be both unambiguous and semantically obvious.
Unfortunately, that means:
Is the object a three-tuple with the values 401, 403, and 404?
It would mean: is the object *in* a 3-tuple with values 401, 403, and 404, right? -Barry
On 06/23/2020 02:48 PM, Barry Warsaw wrote:
On Jun 23, 2020, at 13:50, Ethan Furman <ethan@stoneleaf.us> wrote:
On 06/23/2020 12:49 PM, Barry Warsaw wrote:
Maybe the PEP could support this syntax: case in 401, 403, 404: That seems like it would be both unambiguous and semantically obvious.
Unfortunately, that means:
Is the object a three-tuple with the values 401, 403, and 404?
It would mean: is the object *in* a 3-tuple with values 401, 403, and 404, right?
If I understand the PEP correctly, then

    some_obj == (401, 403, 404)

    match some_obj:
        case 401, 403, 404: # this matches

-- ~Ethan~
On 2020-06-23 20:27, Rob Cliffe via Python-Dev wrote:
On 23/06/2020 17:01, Guido van Rossum wrote:
You can combine several literals in a single pattern using `|` ("or"):
```py
case 401|403|404:
    return "Not allowed"
```
The PEP is great, but this strikes me as horribly confusing, given that 401|403|404 is already legal syntax. IIUC any legal expression can come between `case` and `:`, but expressions that contain `|` at their outermost level are *interpreted differently* than in other contexts.
The grammar shows that 'case' is followed by a series of alternatives, separated by '|', but the alternatives aren't expressions, basically only literals and instances of a given class. You can't say "case 1 + 2:", for example.
Presumably adding parentheses: case (401|403|404): would make it equivalent to case 407:
I'm not sure whether that's legal, but it's not equivalent.
Is a separator (other than whitespace) actually needed? Can the parser cope with case 401 403 404:
Failing that IMO preferable, albeit not ideal, possibilities would be 1) Use colon as the separator. 2) Use comma as the separator - this is already legal syntax too, but IMO it reads more naturally. (And IIRC there are already contexts where brackets are necessary to indicate a tuple.) Perhaps someone can think of something better.
I also (with others) prefer `else:` or perhaps `case else:` to using the `_` variable. The latter is obscure, and wouldn't sit well with code that already uses that variable for its own purposes.
I think that's done for consistency. '_' is a wildcard and you can have:

    case (_, _):

to match any 2-tuple, so:

    case _:

would match any value, and can thus already serve as the default.

I wouldn't object to 'else', though.
I also (with others) prefer `else:` or perhaps `case else:` to using the `_` variable. The latter is obscure, and wouldn't sit well with code that already uses that variable for its own purposes.
I think that's done for consistency. '_' is a wildcard and you can have:
case (_, _):
to match any 2-tuple, so:
case _:
would match any value, and can thus already serve as the default.
Consistency with what? Where else is `_` currently used as a wildcard?
On Tue, Jun 23, 2020 at 10:59 PM Rob Cliffe via Python-Dev < python-dev@python.org> wrote:
I also (with others) prefer `else:` or perhaps `case else:` to using the `_` variable. The latter is obscure, and wouldn't sit well with code that already uses that variable for its own purposes.
I think that's done for consistency. '_' is a wildcard and you can have:
case (_, _):
to match any 2-tuple, so:
case _:
would match any value, and can thus already serve as the default.
Consistency with what? Where else is `_` currently used as a wildcard?
Calling it a wildcard for the purposes of matching is more about using the terminology someone who wants to match a thing would search for. In other contexts it's called a throwaway, but it serves the exact same purpose. Calling it a throwaway would confuse people who don't know how language features are linked, but just know they want a catchall/wildcard. I wonder if it's time to officially designate _ as a reserved name.
On Tue, Jun 23, 2020 at 11:27 PM Emily Bowman <silverbacknet@gmail.com> wrote:
I wonder if it's time to officially designate _ as a reserved name.
Alas, it's too late for that. The i18n community uses _("message text") to mark translatable text. You may want to look into the gettext stdlib module.
Great timing! Last week I was trying to emulate a certain Rust example in Python. Rust has a way to implement families of classes without subclassing, which I think could be a great addition to Python someday. I'll explain below.

Here is a Rust example (https://youtu.be/WDkv2cKOxx0?t=3795) that demonstrates a way to implement classes without subclassing.

    # Rust Code
    enum Shape {
        Circle(f32),
        Square(f32),
        Rectangle(f32, f32)
    }

    impl Shape {
        fn area(self) -> f32 {
            match self {
                Shape::Circle(r) => 3.1 * r * r,
                Shape::Square(l) => l * l,
                Shape::Rectangle(l, w) => l * w
            }
        }
    }

    fn main () {
        let c = Shape::Circle(3.0);
        let s = Shape::Square(3.0);
        let r = Shape::Rectangle(3.0, 7.5);
        println!("○ {}", c.area());
        println!("□ {}", s.area());
        println!("▭ {}", r.area());
    }

    # Output
    ○ 27.899998
    □ 9
    ▭ 22.5

The general idea is:
- declare classes that share a certain type
- extend common methods with a match-case style

That's it. No sub-classing required.

I wondered if someday, can we do this in Python? This match-case proposal seems to fit well in this example.

While I know this PEP does not focus on detailed applications, does anyone believe we can achieve elegant class creation (sans subclassing) with match-cases?

On Wed, Jun 24, 2020 at 2:04 PM Guido van Rossum <guido@python.org> wrote:
On Tue, Jun 23, 2020 at 11:27 PM Emily Bowman <silverbacknet@gmail.com> wrote:
I wonder if it's time to officially designate _ as a reserved name.
Alas, it's too late for that. The i18n community uses _("message text") to mark translatable text. You may want to look into the gettext stdlib module.
I declare this post Off Topic. Please open another thread (preferably on python-ideas). Everybody else, please don't respond to this particular post by the OP. On Wed, Jun 24, 2020 at 11:12 AM pylang <pylang3@gmail.com> wrote:
Great timing! Last week I was trying to emulate a certain Rust example in Python. Rust has a way to implement families of classes without subclassing, which I think could be a great addition to Python someday. I'll explain below.
Here is a Rust example (https://youtu.be/WDkv2cKOxx0?t=3795) that demonstrates a way to implement classes without subclassing.
# Rust Code enum Shape { Circle(f32), Square(f32), Rectangle(f32, f32) }
impl Shape { fn area(self) -> f32 { match self { Shape::Circle(r) => 3.1 * r * r, Shape::Square(l) => l * l, Shape::Rectangle(l, w) => l * w } } }
fn main () { let c = Shape::Circle(3.0); let s = Shape::Square(3.0); let r = Shape::Rectangle(3.0, 7.5); println!("○ {}", c.area()); println!("□ {}", s.area()); println!("▭ {}", r.area()); }
# Output ○ 27.899998 □ 9 ▭ 22.5
The general idea is: - declare classes that share a certain type - extend common methods with a match-case style
That's it. No sub-classing required.
I wondered if someday, can we do this in Python? This match-case proposal seems to fit well in this example.
While I know this PEP does not focus on detailed applications, does anyone believe we can achieve elegant class creation (sans subclassing) with match-cases?
On Wed, Jun 24, 2020 at 2:04 PM Guido van Rossum <guido@python.org> wrote:
On Tue, Jun 23, 2020 at 11:27 PM Emily Bowman <silverbacknet@gmail.com> wrote:
I wonder if it's time to officially designate _ as a reserved name.
Alas, it's too late for that. The i18n community uses _("message text") to mark translatable text. You may want to look into the gettext stdlib module.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/> _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/P6EKIKFP... Code of Conduct: http://python.org/psf/codeofconduct/
On Tue, Jun 23, 2020 at 2:10 PM MRAB <python@mrabarnett.plus.com> wrote:
I think that's done for consistency. '_' is a wildcard and you can have:
case (_, _):
to match any 2-tuple, so:
case _:
would match any value, and can thus already serve as the default.
I wouldn't object to 'else', though.
Can you have case (x,x): ? I haven't tried the implementation, but it's not addressed in the PEP that I see, and if that's legal, then _ is effectively just a style choice, rather than a functional one, and there's no reason it shouldn't also be a named match. +1 for including "else:" for consistency even if that's just a shim for "case _:"
On 2020-06-23 22:53, Emily Bowman wrote:
On Tue, Jun 23, 2020 at 2:10 PM MRAB <python@mrabarnett.plus.com <mailto:python@mrabarnett.plus.com>> wrote:
I think that's done for consistency. '_' is a wildcard and you can have:
case (_, _):
to match any 2-tuple, so:
case _:
would match any value, and can thus already serve as the default.
I wouldn't object to 'else', though.
Can you have case (x,x): ? I haven't tried the implementation, but it's not addressed in the PEP that I see, and if that's legal, then _ is effectively just a style choice, rather than a functional one, and there's no reason it shouldn't also be a named match.
You cannot bind to the same name twice, so "case (x,x):" is an error.
+1 for including "else:" for consistency even if that's just a shim for "case _:"
On Tue, Jun 23, 2020 at 3:06 PM Emily Bowman <silverbacknet@gmail.com> wrote:
Can you have case (x,x): ? I haven't tried the implementation, but it's not addressed in the PEP that I see, and if that's legal, then _ is effectively just a style choice, rather than a functional one, and there's no reason it shouldn't also be a named match.
Good question. It's explicitly forbidden by the PEP, in the "Name pattern" section:

    While matching against each case clause, a name may be bound at most once, having two name patterns with coinciding names is an error. An exception is made for the special single underscore (``_``) name; in patterns, it's a wildcard that *never* binds::

        match data:
            case [x, x]:  # Error!
                ...
            case [_, _]:
                print("Some pair")
                print(_)  # Error!

    Note: one can still match on a collection with equal items using `guards`_. Also, ``[x, y] | Point(x, y)`` is a legal pattern because the two alternatives are never matched at the same time.

I should add that if you want to check for two values in different positions being equal, you need to use a guard:

    match data:
        case [x, y] if x == y:
            print("Two equal values")

On Tue, Jun 23, 2020 at 12:11 PM Brett Cannon <brett@python.org> wrote:
I will say that trying to follow https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificatio... was really hard. Any chance of getting some pseudo-code that shows how a match is performed? Otherwise all of that wording tries so hard to be a spec that I found it hard to follow in my head in how things function.
Sorry about that. This section was subject to heavy editing recently and lost clarity. I will try to make it better! Writing it as pseudo code will take a little time, but I will give it a try.
For instance, "When __match_args__ is missing (as is the default) or None, a single positional sub-pattern is allowed to be passed to the call" is really misleading as it seems that a "sub-pattern" in this case is just going to be a constant like `[1, 2, 3]`. Otherwise how does `["<"|">"]` or `[1, 2, *_]` get represented as a "single positional sub-pattern" (if either of those examples is possible)? The use of the term "sub-pattern" feels misleading because while you may consider even constant patterns a "pattern", going that generic feels like any pattern should fit in that definition when in fact it seems to only be an object where a direct equality check is done.
It seems the way things work is basically:
1. `__match__(obj)` returns a proxy object to have Python match against; it is passed in the thing that `match` is running against, returning `None` if it knows there's no chance a match will work
2. If `__match_args__` is present, then it is used to map positional arguments in the pattern to attributes on the proxy object
3. From there the `match` functionality does a bunch of comparisons against attributes on the proxy object to see if the match works
Is that right? That suggests all the work in implementing this for objects is coming up with a way to serialize an object to a proxy that makes pattern matching possible.
Yes, that's right, and the protocol was defined carefully so that the author of __match__ doesn't have to do any pattern matching -- all they have to do is produce an object that has the right attributes, and the interpreter does the rest. Note that it is __match__'s responsibility to check isinstance()! This is because __match__ may not want to use isinstance() but instead check for the presence of certain attributes -- IOW, the class pattern supports duck typing! (This was a little easter egg. :-)
One thing I see mentioned in examples but not in the `__match__` definitions is how mappings work. Are you using `__match_args__` to map keys to attributes? Or are you using `__getitem__` and that just isn't directly mentioned? Otherwise the section on how `__match__` is used only mentioned attributes and never talks about keys.
Oh, __match__ is *only* used for class patterns. Mapping patterns are done differently. They don't use __getitem__ exactly -- the PEP says

    Matched key-value pairs must already be present in the mapping, and not created on-the-fly by ``__missing__`` or ``__getitem__``. For example, ``collections.defaultdict`` instances will only match patterns with keys that were already present when the ``match`` block was entered.

You shouldn't try to depend on exactly what methods will be called -- you should just faithfully implement the Mapping protocol.

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 24/06/20 11:57 am, Guido van Rossum wrote:
Matched key-value pairs must already be present in the mapping, and not created on-the-fly by ``__missing__`` or ``__getitem__``. For example, ``collections.defaultdict`` instances will only match patterns with keys that were already present when the ``match`` block was entered.
Does that mean the pattern matching logic is in cahoots with collections.defaultdict? What if you want to match against your own defaultdict-like type? -- Greg
On Wed, Jun 24, 2020 at 4:05 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 24/06/20 11:57 am, Guido van Rossum wrote:
Matched key-value pairs must already be present in the mapping, and not created on-the-fly by ``__missing__`` or ``__getitem__``. For example, ``collections.defaultdict`` instances will only match patterns with keys that were already present when the ``match`` block was entered.
Does that mean the pattern matching logic is in cahoots with collections.defaultdict? What if you want to match against your own defaultdict-like type?
IIUC the pattern matching uses either .get(key, <sentinel>) or .__contains__(key) followed by .__getitem__(key). Neither of those will auto-add the item to a defaultdict (and the Mapping protocol supports both). @Brandt: what does your implementation currently do? Do you think we need to specify this in the PEP? -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Guido van Rossum wrote:
IIUC the pattern matching uses either .get(key, <sentinel>) or .__contains__(key) followed by .__getitem__(key). Neither of those will auto-add the item to a defaultdict (and the Mapping protocol supports both). @Brandt: what does your implementation currently do? Do you think we need to specify this in the PEP?
I still prefer under-specifying rather than over-specifying, in order to allow the implementation some flexibility. Basically, "if you've got a funny mapping, it may behave funny".

For example, we currently perform a length check before even getting to this point, which is allowed because the spec isn't very strict. This is a good thing; we already have a section in the PEP that explicitly allows the compiler to perform transformations such as C(x) | C(y) -> C(x | y).

If we have to specify it, __contains__ followed by __getitem__ is the way to do it. The current behavior is subtly different (and probably wrong), but I was going to change it today anyways... I had an outstanding TODO for it.
On Wed, Jun 24, 2020 at 11:05 AM Brandt Bucher <brandtbucher@gmail.com> wrote:
Guido van Rossum wrote:
IIUC the pattern matching uses either .get(key, <sentinel>) or .__contains__(key) followed by .__getitem__(key). Neither of those will auto-add the item to a defaultdict (and the Mapping protocol supports both). @Brandt: what does your implementation currently do? Do you think we need to specify this in the PEP?
I still prefer under-specifying rather than over-specifying, in order to allow the implementation some flexibility. Basically, "if you've got a funny mapping, it may behave funny".
For example, we currently perform a length check before even getting to this point, which is allowed because the spec isn't very strict. This is a good thing; we already have a section in the PEP that explicitly allows the compiler to perform transformations such as C(x) | C(y) -> C(x | y).
If we have to specify it, __contains__ followed by __getitem__ is the way to do it. The current behavior is subtly different (and probably wrong), but I was going to change it today anyways... I had an outstanding TODO for it.
So why not .get(key, <sentinel>)? You can reuse the sentinel, and this way it's a single call instead of two -- e.g. the code in Mapping implements both __contains__() and get() by calling __getitem__() and catching KeyError. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Guido van Rossum wrote:
So why not .get(key, <sentinel>)? You can reuse the sentinel, and this way it's a single call instead of two -- e.g. the code in Mapping implements both __contains__() and get() by calling __getitem__() and catching KeyError.
Good point. Another option I've considered is using the `keys` method, since most non-dict mappings would get this called anyways for patterns with `**rest`. All the more reason why we should allow the implementation some flexibility. ;)
On 25/06/20 6:36 am, Brandt Bucher wrote:
Another option I've considered is using the `keys` method, since most non-dict mappings would get this called anyways for patterns with `**rest`.
All the more reason why we should allow the implementation some flexibility. ;)
My concern was only that it might be using some private feature of collections.defaultdict to get the specified behaviour. But if it's only using the standard mapping protocol, I don't think it matters exactly how it's using it. -- Greg
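The behaviour discussed in this sub-thread can be checked with nothing more than the standard mapping protocol. The sketch below is not the reference implementation; `lookup` and `_MISSING` are hypothetical names, and it only illustrates that `.get(key, <sentinel>)` and `in` do not create entries in a `collections.defaultdict`, whereas plain indexing does.

```python
from collections import defaultdict

_MISSING = object()  # hypothetical sentinel, reusable across lookups


def lookup(mapping, key):
    """Return (found, value) without triggering __missing__."""
    value = mapping.get(key, _MISSING)
    if value is _MISSING:
        return False, None
    return True, value


counts = defaultdict(int, {"a": 1})

print(lookup(counts, "a"))   # (True, 1)
print(lookup(counts, "b"))   # (False, None)
print("b" in counts)         # False: neither call above created the key

print(counts["b"])           # 0: plain indexing calls __missing__ ...
print("b" in counts)         # True: ... and inserts the key
```

Any type that faithfully implements the Mapping protocol behaves the same way under this lookup, which is why no private `defaultdict` feature is needed.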
On Tue, Jun 23, 2020 at 5:08 PM Guido van Rossum <guido@python.org> wrote:
[SNIP] On Tue, Jun 23, 2020 at 12:11 PM Brett Cannon <brett@python.org> wrote:
I will say that trying to follow https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificatio... was really hard. Any chance of getting some pseudo-code that shows how a match is performed? Otherwise all of that wording tries so hard to be a spec that I found it hard to follow in my head in how things function.
Sorry about that. This section was subject to heavy editing recently and lost clarity. I will try to make it better! Writing it as pseudo code will take a little time, but I will give it a try.
I leave the decision about pseudocode up to you if you think you can just find words to clarify that section.
For instance, "When __match_args__ is missing (as is the default) or None, a single positional sub-pattern is allowed to be passed to the call" is really misleading as it seems that a "sub-pattern" in this case is just going to be a constant like `[1, 2, 3]`. Otherwise how does `["<"|">"]` or `[1, 2, *_]` get represented as a "single positional sub-pattern" (if either of those examples is possible)? The use of the term "sub-pattern" feels misleading because while you may consider even constant patterns a "pattern", going that generic feels like any pattern should fit in that definition when in fact it seems to only be an object where a direct equality check is done.
It seems the way things work is basically:
1. `__match__(obj)` returns a proxy object to have Python match against; it is passed in the thing that `match` is running against, returning `None` if it knows there's no chance a match will work
2. If `__match_args__` is present, then it is used to map positional arguments in the pattern to attributes on the proxy object
3. From there the `match` functionality does a bunch of comparisons against attributes on the proxy object to see if the match works
Is that right? That suggests all the work in implementing this for objects is coming up with a way to serialize an object to a proxy that makes pattern matching possible.
Yes, that's right, and the protocol was defined carefully so that the author of __match__ doesn't have to do any pattern matching -- all they have to do is produce an object that has the right attributes, and the interpreter does the rest. Note that it is __match__'s responsibility to check isinstance()! This is because __match__ may not want to use isinstance() but instead check for the presence of certain attributes -- IOW, the class pattern supports duck typing! (This was a little easter egg. :-)
Ah, so that's how you're going to have people simply match against protocols. :)
One thing I see mentioned in examples but not in the `__match__` definitions is how mappings work. Are you using `__match_args__` to map keys to attributes? Or are you using `__getitem__` and that just isn't directly mentioned? Otherwise the section on how `__match__` is used only mentioned attributes and never talks about keys.
Oh, __match__ is *only* used for class patterns. Mapping patterns are done differently. They don't use __getitem__ exactly -- the PEP says
Matched key-value pairs must already be present in the mapping, and not created on-the-fly by ``__missing__`` or ``__getitem__``. For example, ``collections.defaultdict`` instances will only match patterns with keys that were already present when the ``match`` block was entered.
You shouldn't try to depend on exactly what methods will be called -- you should just faithfully implement the Mapping protocol.
OK, so `__contains__` then. -Brett
2) Use comma as the separator - this is already legal syntax too, but IMO it reads more naturally. (And IIRC there are already contexts where brackets are necessary to indicate a tuple.)
This would be my preferred option. Another possibility would be to allow cases to share the same suite by stacking them:

    case 401:
    case 402:
    case 403:
        ...

Or perhaps both. The stacking option could be useful if the cases are complicated.

-- Greg
On Wed, 24 Jun 2020 at 11:44, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Another possibility would be to allow cases to share the same suite by stacking them:

    case 401:
    case 402:
    case 403:
        ...
I feel as though this could be useful (although I don't have a specific use case in mind). The obvious alternative would be to define a named function:

    def handle_40x():
        ...

    case 401:
        handle_40x()
    case 402:
        handle_40x()
    case 403:
        handle_40x()

But in spite of arguments about naming functions being cheap and good practice, I can still see myself missing the "stacked alternatives" form sometimes.

Covering a few other points made in the thread:

* I can't see much value in having `else:` - `case _` is a common way of writing a "catch all" pattern in many languages (although certainly "else" is used too) and I'd rather there be a single valid way, to avoid endless debates over style. We can't realistically disallow `case _`, as that's just a consequence of the general rules, so `else` should be the one to drop.
* The use of `_` as wildcard feels fine to me (i18n notwithstanding). It's a common convention in Python, and in other languages that have match constructs.
* The .VALUE syntax makes me sad, although I don't have a particularly good alternative to suggest. I'd certainly be able to learn to live with it, though.
* Putting the expression on a line below the match keyword feels less natural to me than the current proposal. I don't have a particular logic or analogy for that (arguments about "it's like try...except" don't feel compelling to me), I just prefer the syntax in the PEP.

Paul
On Wed, Jun 24, 2020 at 2:04 AM Guido van Rossum <guido@python.org> wrote:
    def http_error(status):
        match status:
            case 404:
                return "Not found"
            case 418:
                return "I'm a teapot"
            case _:
                return "Something else"
Note the last block: the "variable name" `_` acts as a *wildcard* and never fails to match.
I can't find it among the rejected alternatives, but was it considered to use "..." as the wildcard, rather than "_"? It carries similar meaning but its special case of "this will never be bound" is simply preventing an error, rather than making one otherwise-valid name special.
Raw strings and byte strings are supported. F-strings are not allowed (since in general they are not really literals).
It won't come up often, but are triple-quoted strings allowed? ChrisA
On Jun 23, 2020, at 14:31, Chris Angelico <rosuav@gmail.com> wrote:
I can't find it among the rejected alternatives, but was it considered to use "..." as the wildcard, rather than "_"? It carries similar meaning but its special case of "this will never be bound" is simply preventing an error, rather than making one otherwise-valid name special.
I thought of that too as I was reading the PEP, but forgot to add it to my notes. I do like ellipsis more than underscore here. -Barry
On 2020-06-23 22:50, Barry Warsaw wrote:
On Jun 23, 2020, at 14:31, Chris Angelico <rosuav@gmail.com> wrote:
I can't find it among the rejected alternatives, but was it considered to use "..." as the wildcard, rather than "_"? It carries similar meaning but its special case of "this will never be bound" is simply preventing an error, rather than making one otherwise-valid name special.
I thought of that too as I was reading the PEP, but forgot to add it to my notes. I do like ellipsis more than underscore here.
+1 However, what if you wanted to match Ellipsis? This could lead to bugs:
>>> ...
Ellipsis
>>> Ellipsis = 0
>>> Ellipsis
0
>>> ...
Ellipsis
If you can have "case False:" and "case True:", should 'Ellipsis' become a keyword so that you could have "case Ellipsis:"? Or do they have to be "case .False:", "case .True:", in which case it could remain "case .Ellipsis:"?
On Tue, Jun 23, 2020 at 4:41 PM MRAB <python@mrabarnett.plus.com> wrote:
On 2020-06-23 22:50, Barry Warsaw wrote:
On Jun 23, 2020, at 14:31, Chris Angelico <rosuav@gmail.com> wrote:
I can't find it among the rejected alternatives, but was it considered to use "..." as the wildcard, rather than "_"? It carries similar meaning but its special case of "this will never be bound" is simply preventing an error, rather than making one otherwise-valid name special.
I thought of that too as I was reading the PEP, but forgot to add it to my notes. I do like ellipsis more than underscore here.
+1
The problem is that ellipsis already has a number of other meanings, *and* is easily confused in examples and documentation with leaving things out that should be obvious or uninteresting. Also, if I saw [a, ..., z] in a pattern I would probably guess that it meant "any sequence of length > 2, and capture the first and last element" rather than "a sequence of length three, and capture the first and third elements". (The first meaning is currently spelled as [a, *_, z].) So I'm not a fan. _ is what all other languages with pattern matching seem to use, and I like that case (x, _, _) resembles the use of (x, _, _) in assignment targets.
However, what if you wanted to match Ellipsis?
This could lead to bugs:
>>> ...
Ellipsis
>>> Ellipsis = 0
>>> Ellipsis
0
>>> ...
Ellipsis
Now you're just being silly. If you want to use Ellipsis as a variable you can't also use it to refer to the "..." token.
If you can have "case False:" and "case True:", should 'Ellipsis' become a keyword so that you could have "case Ellipsis:"? Or do they have to be "case .False:", "case .True:", in which case it could remain "case .Ellipsis:"?
I don't think this problem is at all bad enough to make "Ellipsis" a keyword. It is much, much less used than True, False or None. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Wed, Jun 24, 2020 at 10:11 AM Guido van Rossum <guido@python.org> wrote:
On Tue, Jun 23, 2020 at 4:41 PM MRAB <python@mrabarnett.plus.com> wrote:
On 2020-06-23 22:50, Barry Warsaw wrote:
On Jun 23, 2020, at 14:31, Chris Angelico <rosuav@gmail.com> wrote:
I can't find it among the rejected alternatives, but was it considered to use "..." as the wildcard, rather than "_"? It carries similar meaning but its special case of "this will never be bound" is simply preventing an error, rather than making one otherwise-valid name special.
I thought of that too as I was reading the PEP, but forgot to add it to my notes. I do like ellipsis more than underscore here.
+1
The problem is that ellipsis already has a number of other meanings, *and* is easily confused in examples and documentation with leaving things out that should be obvious or uninteresting. Also, if I saw [a, ..., z] in a pattern I would probably guess that it meant "any sequence of length > 2, and capture the first and last element" rather than "a sequence of length three, and capture the first and third elements". (The first meaning is currently spelled as [a, *_, z].)
Ah, yes, very good point. Agreed. ChrisA
On 2020-06-24 01:09, Guido van Rossum wrote:
On Tue, Jun 23, 2020 at 4:41 PM MRAB <python@mrabarnett.plus.com <mailto:python@mrabarnett.plus.com>> wrote: [snip]
However, what if you wanted to match Ellipsis?
This could lead to bugs:
>>> ...
Ellipsis
>>> Ellipsis = 0
>>> Ellipsis
0
>>> ...
Ellipsis
Now you're just being silly. If you want to use Ellipsis as a variable you can't also use it to refer to the "..." token. [snip]
The point here is that printing ... shows "Ellipsis" (not "..."), printing None shows "None", etc. Printing Ellipsis also shows "Ellipsis", but you can bind to it. You can't bind to None, etc.
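The asymmetry described here can be checked directly. This is only a small sketch; the `namespace` dictionary and the `"<example>"` filename are arbitrary choices for the demonstration.

```python
import builtins

# The `...` token always evaluates to the built-in singleton.
assert ... is builtins.Ellipsis

# `Ellipsis` is an ordinary name, so it can be rebound in a namespace,
# while the `...` token keeps referring to the singleton.
namespace = {}
exec("Ellipsis = 0\nresult = (Ellipsis, ...)", namespace)
print(namespace["result"])  # (0, Ellipsis)

# None, True and False are keywords: assigning to them does not compile.
for source in ("None = 0", "True = 0", "False = 0"):
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        print(f"{source!r}: SyntaxError")
```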
On 24/06/20 9:31 am, Chris Angelico wrote:
I can't find it among the rejected alternatives, but was it considered to use "..." as the wildcard, rather than "_"?
It currently has a value as an expression, so you might want to match against it. (I don't think the syntax in the PEP currently allows it to be used that way, but it could do). -- Greg
Hi, I read the PEP, and a few thoughts:

-----

I think one of the examples is some lib2to3 code? I think the matcher syntax is really great for that case (parse trees). The matcher syntax is definitely an improvement over the litany of helper functions and conditionals otherwise needed.

That said, I have a hard time seeing a particular use of this complicated pattern matching outside "heterogeneous trees" (for lack of a better term) of objects? I've only really dealt with that problem with parse trees, but perhaps that's just an artifact of the domains I've ended up working in. In any case, it might be useful to include some/more examples or use cases that aren't as parser-centric.

-----

Question: How are True, False, None, ..., etc handled? What does this do?

    match whatever:
        case True: ...
        case False: ...
        case None: ...
        case ...:

I would expect they would be treated as literals the same as e.g. numbers/strings, yes? Sorry if I missed this in the PEP.

-----

I, too, had trouble understanding the __match__ protocol from the PEP text. Brett's comments largely capture my thoughts about this.

-----

The need to use "." to indicate "look up name" to avoid "match anything" seems like a big foot gun. Simple examples such as:

    FOO = 1
    match get_case():
        case FOO:
            print("you chose one")

clearly illustrate this, but the problem is present in any case expression: a missing dot changes the meaning from "match this specific value" to almost the opposite: "match any value". And all you really need to do is miss a single leading dot anywhere in the case expression to trigger this. I agree with Barry (I think he said this) that it seems like an easy cause of mysterious bugs.

I think the foot-gun aspect derives directly from the change in how a symbol is interpreted. i.e., Everywhere (predominantly? everything I can think of atm) else in the language when you see "foo", you know it means some sort of lookup of the name "foo" is occurring. The exception to this is fairly simple: when there is some "assignment cue", e.g. "as", :=, =, import, etc, and those assignment cues are always very close by (pretty much always the leading/following token?). Anyways, my point is assignment has a cue close by. The proposed syntax flips that and mixes it, so it's very confusing. Sometimes a symbol is a lookup, sometimes it's an assignment.

The PEP talks a bit about this in the "alternatives for constant value pattern" section. I don't find the rationale in that section particularly convincing. It basically says using "$FOO" to act as "look up value named FOO" is rejected because "it is new syntax for a narrow use case" and "name patterns are common in typical code ... so special syntax for the common case would be weird". I don't find that convincing because it seems *more weird* to change the (otherwise consistent) lookup/assignment behavior of the language for a specific sub-syntax.

Anyways, when I rewrite the examples and use a token to indicate "matcher", I personally find them easier to read. I feel this is because it makes the matcher syntax feel more like templates or string interpolation (or things of that nature) that have some "placeholder" that gets "bound" to a value after being given some "input". It also sort of honors the "assignment only happens with a localized cue" behavior that already exists.

    ORIGIN = 0
    match get_point():
        case Point(ORIGIN, $end):
            ...
        case $default:
            print(default)

I will admit this gives me PHP flashbacks, but it's also very clear where assignments are happening, and I can just use the usual name-lookup rules. I just used $ since the PEP did.

As a bonus, I also think this largely mitigates the foot gun problem because there's now a cue a binding is happening, so it's easy to trigger our "is that name already taken, is it safe to assign?" check we mentally perform.

In any case, this seems like a pretty fundamental either/or design decision *someone* will have to make:

Either: names mean assignment, and the rules of what is a lookup vs assignment are different with some special case support (i.e. leading dot).

Or: use some character to indicate assignment, and the lookup rules are the same.

-----

Related to the above: I also raise this because, in my usage, I doubt I'll be using it as much more than a switch statement. I rarely have to match complicated patterns, but very often have a set of values that I need to test against. The combination of Literal and exhaustive-case checking is very appealing. So I'm very often going to want to type, e.g.

    ValidModes = Union[Literal[A], Literal[B], etc etc]

    def foo(mode: ValidModes):
        match mode:
            case A: ...
            case B: ...
            case etc etc

And eventually I'm going to foot-gun myself with a missing dot.

-----

Related to the above, I *don't* find that e.g. "case Point(...)" *not* initializing a Point particularly confusing. This feels like it might be inconsistent with my whole thing above, but :shrug:. FWIW, I suspect it's just that the leading "case" cue makes it easy to entirely turn off the "parentheses means code gets called" logic in my mind-parser.

-----

Related to the above, perhaps an unadorned name shouldn't be allowed? e.g. this should be invalid:

    match get_shape():
        case shape:
            print(shape)

I raise this idea because of the foot-gun issue, but also because it creates more ways of doing the same thing: binding the name to a value. Using := doesn't seem like a particularly burdensome solution:

    match shape := get_shape():
        case:  # or *, or _, or whatever
            print(shape)

And then either only dotted names or patterns are allowed in cases, not plain names.

-----

Making underscore a special match-anything-but-don't-bind struck me as a bit odd. Aside from the language grammar rules, there aren't really any "this is an OK name, this isn't" type of rules. I think someone else mentioned using "*" instead of "_"? I had the exact same thought. If it's not going to be bound to a name, why use an otherwise valid name to not bind it to? I get the ergonomics of it, but it seems like another special-case of how things get processed inside the case expression.

-----

Why | instead of "or" ? "or" is used in other conditionals. This strikes me as another special case of the syntax that differs from elsewhere in the language.

-----

I agree with not having flat indentation. I think having "case" indented from "match" makes it more readable overall.

-----

Anyways, thanks for reading. HTH.

On Tue, Jun 23, 2020 at 9:08 AM Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself: - Published: https://www.python.org/dev/peps/pep-0622/ (*) - Source: https://github.com/python/peps/blob/master/pep-0622.rst
(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!
I'd like to end with the contents of the README of the repo where we've worked on the draft, which is shorter and gives a gentler introduction than the PEP itself:
# Pattern Matching
This repo contains a draft PEP proposing a `match` statement.
Origins -------
The work has several origins:
- Many statically compiled languages (especially functional ones) have a `match` expression, for example [Scala]( http://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html ), [Rust](https://doc.rust-lang.org/reference/expressions/match-expr.html), [F#]( https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-ma... ); - Several extensive discussions on python-ideas, culminating in a summarizing [blog post]( https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python... ) by Tobias Kohn; - An independently developed [draft PEP]( https://github.com/ilevkivskyi/peps/blob/pattern-matching/pep-9999.rst) by Ivan Levkivskyi.
Implementation
--------------
A full reference implementation written by Brandt Bucher is available as a [fork](https://github.com/brandtbucher/cpython/tree/patma) of the CPython repo. This is readily converted to a [pull request](https://github.com/brandtbucher/cpython/pull/2).
Examples
--------
Some [example code](https://github.com/gvanrossum/patma/tree/master/examples/) is available from this repo.
Tutorial
--------
A `match` statement takes an expression and compares it to successive patterns given as one or more `case` blocks. This is superficially similar to a `switch` statement in C, Java or JavaScript (and many other languages), but much more powerful.
The simplest form compares a target value against one or more literals:
```py
def http_error(status):
    match status:
        case 400:
            return "Bad request"
        case 401:
            return "Unauthorized"
        case 403:
            return "Forbidden"
        case 404:
            return "Not found"
        case 418:
            return "I'm a teapot"
        case _:
            return "Something else"
```
Note the last block: the "variable name" `_` acts as a *wildcard* and never fails to match.
You can combine several literals in a single pattern using `|` ("or"):
```py
case 401|403|404:
    return "Not allowed"
```
Patterns can look like unpacking assignments, and can be used to bind variables:
```py
# The target is an (x, y) tuple
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")
```
Study that one carefully! The first pattern has two literals, and can be thought of as an extension of the literal pattern shown above. But the next two patterns combine a literal and a variable, and the variable is *extracted* from the target value (`point`). The fourth pattern is a double extraction, which makes it conceptually similar to the unpacking assignment `(x, y) = point`.
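As a rough mental model, the tuple example behaves much like a chain of length checks followed by unpacking. The sketch below is hand-written for illustration and is not what the reference implementation generates; the name `describe_point` is made up, and it returns strings instead of printing so the result is easy to check.

```python
from collections.abc import Sequence


def describe_point(point):
    """Hand-written approximation of the `match point:` tuple example."""
    # A two-element sequence pattern needs a non-string sequence of length 2.
    is_pair = (
        isinstance(point, Sequence)
        and not isinstance(point, str)
        and len(point) == 2
    )
    if is_pair and point[0] == 0 and point[1] == 0:  # case (0, 0)
        return "Origin"
    if is_pair and point[0] == 0:                     # case (0, y)
        y = point[1]
        return f"Y={y}"
    if is_pair and point[1] == 0:                     # case (x, 0)
        x = point[0]
        return f"X={x}"
    if is_pair:                                       # case (x, y)
        x, y = point
        return f"X={x}, Y={y}"
    raise ValueError("Not a point")                   # case _


print(describe_point((0, 5)))
print(describe_point([3, 4]))
```

The point of the comparison is how much bookkeeping the pattern form hides: each `case` is a shape test plus an extraction in one line.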
If you are using classes to structure your data (e.g. data classes) you can use the class name followed by an argument list resembling a constructor, but with the ability to extract variables:
```py
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def whereis(point):
    match point:
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point():
            print("Somewhere else")
        case _:
            print("Not a point")
```
We can use keyword parameters too. The following patterns are all equivalent (and all bind the `y` attribute to the `var` variable):
```py
Point(1, var)
Point(1, y=var)
Point(x=1, y=var)
Point(y=var, x=1)
```
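One way to read these four spellings is as the same instance check plus attribute lookups. The following is a minimal sketch under that assumption (it ignores any customization a class might do); the helper name `match_point_x1` and its `(matched, var)` return convention are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def match_point_x1(target):
    """Approximate `case Point(x=1, y=var)`; returns (matched, var)."""
    if not isinstance(target, Point):   # the class name acts as a type test
        return False, None
    if target.x != 1:                   # literal sub-pattern: compare by value
        return False, None
    var = target.y                      # name sub-pattern: bind the attribute
    return True, var


print(match_point_x1(Point(1, 7)))
print(match_point_x1(Point(2, 7)))
```

Because the lookups are by attribute name, the keyword forms keep working if the class later gains more fields.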
Patterns can be arbitrarily nested. For example, if we have a short list of points, we could match it like this:
```py
match points:
    case []:
        print("No points")
    case [Point(0, 0)]:
        print("The origin")
    case [Point(x, y)]:
        print(f"Single point {x}, {y}")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two on the Y axis at {y1}, {y2}")
    case _:
        print("Something else")
```
We can add an `if` clause to a pattern, known as a "guard". If the guard is false, `match` goes on to try the next `case` block. Note that variable extraction happens before the guard is evaluated:
```py
match point:
    case Point(x, y) if x == y:
        print(f"Y=X at {x}")
    case Point(x, y):
        print(f"Not on the diagonal")
```
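The ordering of extraction and guard can be made explicit by writing the two cases out by hand. This is only an illustrative sketch, assuming class patterns reduce to an `isinstance` test and attribute reads; `diagonal_or_not` is a hypothetical name, and it returns `None` when no case applies.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


def diagonal_or_not(point):
    """Hand-written approximation of the guarded `match point:` example."""
    # case Point(x, y) if x == y
    if isinstance(point, Point):
        x, y = point.x, point.y      # extraction happens first...
        if x == y:                   # ...then the guard sees the bound names
            return f"Y=X at {x}"
    # case Point(x, y)
    if isinstance(point, Point):
        x, y = point.x, point.y
        return "Not on the diagonal"
    return None                      # no case matched


print(diagonal_or_not(Point(2, 2)))
print(diagonal_or_not(Point(1, 2)))
```

A failed guard simply falls through to the next case, which is why the second block runs for `Point(1, 2)`.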
Several other key features:
- Like unpacking assignments, tuple and list patterns have exactly the same meaning and actually match arbitrary sequences. An important exception is that they don't match iterators or strings. (Technically, the target must be an instance of `collections.abc.Sequence`.)
- Sequence patterns support wildcards: `[x, y, *rest]` and `(x, y, *rest)` work similarly to wildcards in unpacking assignments. The name after `*` may also be `_`, so `(x, y, *_)` matches a sequence of at least two items without binding the remaining items.
- Mapping patterns: `{"bandwidth": b, "latency": l}` extracts the `"bandwidth"` and `"latency"` values from a dict. Unlike sequence patterns, extra keys are ignored. A wildcard `**rest` is also supported. (But `**_` would be redundant, so it is not allowed.)
- Subpatterns may be extracted using the walrus (`:=`) operator:
```py
case (Point(x1, y1), p2 := Point(x2, y2)):
    ...
```
- Patterns may use named constants. These must be dotted names; a single name can be made into a constant value by prefixing it with a dot to prevent it from being interpreted as a variable extraction:
```py
RED, GREEN, BLUE = 0, 1, 2

match color:
    case .RED:
        print("I see red!")
    case .GREEN:
        print("Grass is green")
    case .BLUE:
        print("I'm feeling the blues :(")
```
- Classes can customize how they are matched by defining a `__match__()` method. Read the [PEP]( https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificatio...) for details.
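The sequence and mapping rules in the list above can be approximated with ordinary checks. This is a sketch based only on the rules as stated there (a `collections.abc.Sequence` that is not a string; mappings where extra keys are ignored); both function names are hypothetical.

```python
from collections.abc import Mapping, Sequence


def matches_sequence_pattern(target, min_len):
    """Approximate the shape test behind `[x, y, *rest]` (min_len == 2)."""
    return (
        isinstance(target, Sequence)
        and not isinstance(target, str)   # strings are excluded on purpose
        and len(target) >= min_len
    )


def match_bandwidth_latency(target):
    """Approximate `{"bandwidth": b, "latency": l, **rest}`."""
    if not isinstance(target, Mapping):
        return None
    if "bandwidth" not in target or "latency" not in target:
        return None
    b = target["bandwidth"]
    l = target["latency"]
    # Extra keys do not prevent a match; `**rest` just collects them.
    rest = {k: v for k, v in target.items() if k not in ("bandwidth", "latency")}
    return b, l, rest


print(matches_sequence_pattern([1, 2, 3], 2))
print(matches_sequence_pattern("abc", 2))
print(matches_sequence_pattern(iter([1, 2, 3]), 2))
print(match_bandwidth_latency({"bandwidth": 10, "latency": 5, "jitter": 1}))
```

Note that an iterator fails the `Sequence` test, so it is never consumed just to find out whether a pattern applies.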
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/> _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RFW56R7L... Code of Conduct: http://python.org/psf/codeofconduct/
On Jun 23, 2020, at 14:34, Richard Levasseur <richardlev@gmail.com> wrote:
I agree with not having flat indentation. I think having "case" indented from "match" makes it more readable overall.
I don’t know whether my suggestion to use `match:` and putting the expression inside this stanza can be accomplished, but I do want to add another point about that suggestion that I’ve heard from several folks I’ve already chatted with about this PEP. Using something like

    match:
        expression
    case arm1:
        pass
    case arm2:
        pass
    else:
        pass

nicely mirrors try/except and if/elif/else constructs so it looks quite natural.

Cheers,
-Barry
On 6/23/2020 5:56 PM, Barry Warsaw wrote:
    match:
        expression
    case arm1:
        pass
    case arm2:
        pass
    else:
        pass
nicely mirrors try/except and if/elif/else constructs so it looks quite natural.
Agreed as to the look of the form. Some observations:

1. 'else' is equivalent of 'elif True', so two alternative keywords are not strictly needed. The proposal uses 'case _' as the parallel catchall to 'elif True'. If '_' were not otherwise being used as a wildcard, I might suggest that 'case' by itself be used to always match.

2. 'expression' is *not* a general statement, let alone a suite of such, and it is not natural in non-interactive code that the object resulting from an expression be magically captured to be *implicitly* referenced in later code (the patterns). Even though match may be thought of as a type of multiple elifs, this has no parallel in if/elif/else statements. However, in try/except statements, raised exceptions *are* magically captured, to be *explicitly* referenced in 'except' tuples. 'Try' and 'match' both consist of a setup, multiple test:action pairs, and a default action. So a similar indent structure can easily seem appropriate.

3. A bonus of 'match' by itself is that re-based syntax colorizers, as in IDLE, could pretend that the keyword is 'match:' and avoid colorizing re.match, etc. Or a colorizer could check that 'match' is followed by ':'. Always colorizing 'case' seems like less of an issue as I think it is less common as a name.

--
Terry Jan Reedy
On Tue, 23 Jun 2020 14:56:47 -0700 Barry Warsaw <barry@python.org> wrote:
On Jun 23, 2020, at 14:34, Richard Levasseur <richardlev@gmail.com> wrote:
I agree with not having flat indentation. I think having "case" indented from "match" makes it more readable overall.
I don’t know whether my suggestion to use `match:` and putting the expression inside this stanza can be accomplished, but I do want to add another point about that suggestion that I’ve heard from several folks I’ve already chatted with about this PEP. Using something like
    match:
        expression
    case arm1:
        pass
    case arm2:
        pass
    else:
        pass
nicely mirrors try/except and if/elif/else constructs so it looks quite natural.
I find it quite un-natural. Regards Antoine.
This is awesome!

What I love about this is that it strongly encourages people not to do EAFP with types (which I've seen many times), which causes problems when doing type annotations. Instead, if they use pattern matching, they're essentially forced to do isinstance without even realizing it. I love features that encourage good coding practices by design.

My question is how does this work with polymorphic types? Typically, you might not want to fix everywhere the exact order of the attributes. It would be a shame for you to be dissuaded from adding an attribute to a dataclass because it would mess up every one of your case statements. Do case statements support extracting by attribute name somehow?

    case Point(x, z):

means extract into x and y positionally. What if I want to extract by keyword somehow? Can something like this work?

    case Point(x=x, z=z):

That way, if I add an attribute y, my case statement is just fine.

I like the design choices. After reading a variety of comments, I'm looking forward to seeing the updated PEP with discussion regarding:

    case _: vs else:
    _ vs ...
    case x | y: vs case x or y:

Best,
Neil
On Tue, Jun 23, 2020 at 9:45 PM Neil Girdhar <mistersheik@gmail.com> wrote:
This is awesome!
What I love about this is that it strongly encourages people not to do EAFP with types (which I've seen many times), which causes problems when doing type annotations. Instead, if they use pattern matching, they're essentially forced to do isinstance without even realizing it. I love features that encourage good coding practices by design.
My question is how does this work with polymorphic types? Typically, you might not want to fix everywhere the exact order of the attributes. It would be a shame for you to be dissuaded from adding an attribute to a dataclass because it would mess up every one of your case statements. Do case statements support extracting by attribute name somehow?
case Point(x, z):
means extract into x and y positionally. What if I want to extract by keyword somehow? Can something like this work?
case Point(x=x, z=z):
That way, if I add an attribute y, my case statement is just fine.
Ah, never mind! You guys thought of everything.
I like the design choices. After reading a variety of comments, I'm looking forward to seeing the updated PEP with discussion regarding:

    case _: vs else:
    _ vs ...
    case x | y: vs case x or y:
Best, Neil
    def whereis(point):
        match point:
            case MovingPoint(0, 0):
                print("Origin")
            case MovingPoint(0, y):
                print(f"Y={y}")
            case MovingPoint(x, 0):
                print(f"X={x}")
            case MovingPoint(1, 1):
                print("Diagonal at units")
            case MovingPoint():
                print("Somewhere else")
            case _:
                print("Not a point")

What is the expected/intended behavior of this kind of point? Both "What matches?" and "What is the value of point.x and point.y afterwards?"

    class MovingPoint:
        def __init__(self, x, y):
            self._x = x
            self._y = y

        @property
        def x(self):
            x = self._x
            self._x += 1
            return x

        @property
        def y(self):
            y = self._y
            self._y += 1
            return y

    point = MovingPoint(0, 0)
    whereis(point)

--
The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
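To make the question concrete, here are two hypothetical hand-written expansions of that `match` statement: one re-reads the attributes for every case, the other reads each attribute once. Neither is claimed to be what the PEP specifies or what the reference implementation does; the function names are invented, and the sketch only shows that the answer depends on how often attributes are read.

```python
class MovingPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        x = self._x
        self._x += 1   # every read moves the point
        return x

    @property
    def y(self):
        y = self._y
        self._y += 1
        return y


def whereis_rereading(point):
    """Hypothetical expansion: every case reads the attributes again."""
    if isinstance(point, MovingPoint) and point.x == 0 and point.y == 0:
        return "Origin"
    if isinstance(point, MovingPoint) and point.x == 0:
        return f"Y={point.y}"
    if isinstance(point, MovingPoint):
        x = point.x                      # bind x, then test y against 0
        if point.y == 0:
            return f"X={x}"
    if isinstance(point, MovingPoint) and point.x == 1 and point.y == 1:
        return "Diagonal at units"
    if isinstance(point, MovingPoint):
        return "Somewhere else"
    return "Not a point"


def whereis_caching(point):
    """Hypothetical expansion: each attribute is read exactly once."""
    if not isinstance(point, MovingPoint):
        return "Not a point"
    x, y = point.x, point.y
    if (x, y) == (0, 0):
        return "Origin"
    if x == 0:
        return f"Y={y}"
    if y == 0:
        return f"X={x}"
    if (x, y) == (1, 1):
        return "Diagonal at units"
    return "Somewhere else"


p, q = MovingPoint(1, 1), MovingPoint(1, 1)
print(whereis_rereading(p), (p._x, p._y))
print(whereis_caching(q), (q._x, q._y))
```

For a point created at `(1, 1)`, the re-reading variant never reaches "Diagonal at units" because `x` has already advanced by the time that case is tried, while the caching variant does.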
On Wed., 24 Jun. 2020, 2:07 am Guido van Rossum, <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Very nice! As with some others, the main thing that gives me pause is the elevation of "_" to pseudo keyword status by making it the wildcard character for pattern matching. Users already find the existing common usages at least somewhat confusing [1], so adding a 5th use case will definitely make that situation worse.

The first alternative I can see would be to adopt regex wildcard notation for match patterns as well:

* exactly one arbitrary element: .
* any number of arbitrary elements: .*
* one or more arbitrary elements: .+
* zero or 1 arbitrary elements: .?

And then frame the ".name" reference syntax as another form of element constraint (matching a value lookup rather than being completely arbitrary).

Alternatively, the file globbing notation, "?", could be used as a new non-identifier symbol to indicate items that are present, but neither bound nor constrained (exactly as "_" is currently used in the PEP). This syntax would have the added bonus of potentially being added to iterable unpacking, such that "a, b, *? = iterable" would mean "retrieve the first two items without trying to retrieve more than that" (whereas the existing throwaway variable convention still exhausts the RHS even when you only care about the first few items).

That said, I don't think the extra confusion generated by using "_" would be intolerable - I just think the possibility of using regex or globbing inspired wildcard notations is worth considering, and unless I missed something, the PEP doesn't currently cover any possible alternatives to using "_".

Cheers,
Nick.

[1] https://stackoverflow.com/questions/5893163/what-is-the-purpose-of-the-singl...
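The remark about the throwaway-variable convention exhausting the right-hand side can be checked with a small experiment. In this sketch the generator `counting` and the log lists are made up for illustration, and `itertools.islice` stands in for the behaviour the hypothetical `*?` syntax is meant to provide.

```python
from itertools import islice


def counting(n, log):
    """Yield 0..n-1 and record every item that was actually produced."""
    for i in range(n):
        log.append(i)
        yield i


# Existing convention: the starred throwaway still drains the iterator.
seen_star = []
a, b, *_ = counting(5, seen_star)

# Taking only the first two items leaves the rest unproduced.
seen_slice = []
c, d = islice(counting(5, seen_slice), 2)

print(seen_star)
print(seen_slice)
```

Here `seen_star` records all five items while `seen_slice` records only the first two, which is the difference the proposal is pointing at.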
On 06/23/2020 09:01 AM, PEP 622 wrote:
    from enum import Enum

    class Color(Enum):
        BLACK = 1
        RED = 2

    BLACK = 1
    RED = 2

    match color:
        case .BLACK | Color.BLACK:
            print("Black suits every color")
        case BLACK:  # This will just assign a new value to BLACK.
            ...
As others have noted, the leading dot to disambiguate between a name assignment and a value check is going to be a problem. I think it's also unnecessary because instead of

    case BLACK:
        blahblah()

we can do

    case _:  # look ma! BLACK is just "color"!
        BLACK = color  # if you really want it bound to another name

In other words, the PEP is currently building in two ways to do the same thing -- make a default case. One of those ways is going to be a pain; the other, once renamed to "else", will be perfect! :-) As a bonus, no special casing for leading dots.

--
~Ethan~
On Wed, Jun 24, 2020 at 3:21 PM Ethan Furman <ethan@stoneleaf.us> wrote:
As others have noted, the leading dot to disambiguate between a name assignment and a value check is going to be a problem. I think it's also unnecessary because instead of
case BLACK: blahblah()
we can do
    case _:  # look ma! BLACK is just "color"!
        BLACK = color  # if you really want it bound to another name
In other words, the PEP is currently building in two ways to do the same thing -- make a default case. One of those ways is going to be a pain; the other, once renamed to "else", will be perfect! :-) As a bonus, no special casing for leading dots.
But what if that's composed into something else?

    class Room(Enum):
        LIBRARY = 1
        BILLIARD_ROOM = 2
        ...

    match accusation:
        case (Color.SCARLETT, Room.BILLIARD_ROOM):
            print("Correct")
        case (Color.SCARLETT, _):
            print("Not there!")
        case (_, Room.BILLIARD_ROOM):
            print("Wrong person!")
        case (_, _):
            print("Nope. Just nope.")

Without the dots, there's no way to tell whether you're matching specific values in the tuple, or matching by length alone and then capturing. You can't use the 'else' keyword for a partial match.

ChrisA
On 06/23/2020 10:31 PM, Chris Angelico wrote:
But what if that's composed into something else?
    class Room(Enum):
        LIBRARY = 1
        BILLIARD_ROOM = 2
        ...

    match accusation:
        case (Color.SCARLETT, Room.BILLIARD_ROOM):
            print("Correct")
        case (Color.SCARLETT, _):
            print("Not there!")
        case (_, Room.BILLIARD_ROOM):
            print("Wrong person!")
        case (_, _):
            print("Nope. Just nope.")
Without the dots, there's no way to tell whether you're matching specific values in the tuple, or matching by length alone and then capturing. You can't use the 'else' keyword for a partial match.
Well, your example isn't using leading dots like `.BLACK` is, and your example isn't using `case _` as a final catch-all, and your example isn't using "case some_name_here" as an always True match. In other words, your example isn't talking about what I'm talking about. ;-) -- ~Ethan~
On Wed, Jun 24, 2020 at 3:49 PM Ethan Furman <ethan@stoneleaf.us> wrote:
> Well, your example isn't using leading dots like `.BLACK` is, and your example isn't using `case _` as a final catch-all, and your example isn't using "case some_name_here" as an always True match. In other words, your example isn't talking about what I'm talking about. ;-)

But it IS using "_" as a catch-all. The simple "case _:" case is using _ the same way that "case (Color.SCARLETT, _):" is.

ChrisA
On 06/23/2020 10:11 PM, Ethan Furman wrote:
> As others have noted, the leading dot to disambiguate between a name assignment and a value check is going to be a problem.

I suspect I wasn't as clear as I could have been, so let me try again (along with some thinking-out-loud...).

First premise: the visual difference between BLACK and .BLACK is minuscule, and will trip people up.

Second premise: there is no practical difference between

    match color:
        case BLACK:
            # do stuff

and

    match color:
        case _:
            BLACK = color
            # do stuff

Conclusion: there is no need for an always true match, since that's basically the same thing as an "else" (or "case _", currently).

Since we don't need an always True match, we don't need to allow a single name after "case" as an assignment target, which means we don't need to support ".name" as a value target -- the plain name will work as a value target.

Am I just stuck on the single-name scenario and missing the bigger picture? What happens here:

    aught = 0
    match an_obj:
        case Point(aught, 6):
            # do stuff

For the aught in Point to match the global aught it needs the dot prefix, doesn't it. But this is what guard clauses are for, right?

    aught = 0
    match an_obj:
        case Point(x, 6) if x == aught:
            # do stuff

So we still don't need a leading dot.

Going back to the example and taking a different tack [1]:

    match color:
        case BLACK:
            # BLACK is the global variable

is really the same as

    match color:
        case color == BLACK:

We've been dropping the "color ==" part, but what if we only drop the "color" part?

    match color:
        case == BLACK:
            # comparing against the global variable

and then

    match color:
        case BLACK:
            # always True match with `color` assigned to `BLACK`

Equal signs are much easier to notice than a single dot.

Okay, I've pretty much done a 180 here -- a bare name should be an assignment target, but instead of having a leading dot be the "this is really an existing variable switch", have an operator be that switch: ==, <, >=, etc. Of course "==" will be the most common, so it can be dropped as well when the meaning doesn't change in its absence (kind of like parentheses usually being optional).

--
~Ethan~

[1] https://en.wikipedia.org/wiki/Tack_(sailing)
On Wed, 24 Jun 2020 at 07:40, Ethan Furman <ethan@stoneleaf.us> wrote:
> Second premise: there is no practical difference between
>
>     match color:
>         case BLACK:
>             # do stuff
>
> and
>
>     match color:
>         case _:
>             BLACK = color
>             # do stuff

You've already changed your position, so this is only marginally relevant now, but if the match expression is long and complex, duplicating it is messy. Using the walrus operator is an alternative, but it's not clear to me how readable that would be if it were only needed for one case out of many. I'd need to see a "real" example to have a good feel on that.

(FWIW, I do find that with my ageing eyes, it's easy to miss the dot prefix. And I feel that the use of a dot would make Uncle Timmy sad :-()

Paul
On Tue, Jun 23, 2020 at 9:12 AM Guido van Rossum <guido@python.org> wrote:
> I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
> ...
> I'll mostly let the PEP speak for itself:
> - Published: https://www.python.org/dev/peps/pep-0622/ (*)
> - Source: https://github.com/python/peps/blob/master/pep-0622.rst

I have an exploratory question. In this section:

> The alternatives may bind variables, as long as each alternative binds the same set of variables (excluding _). For example:
>
>     match something:
>         ...
>         case Foo(arg=x) | Bar(arg=x):  # Valid, both arms bind 'x'
>             ...
>         ...

Tweaking the above example slightly, would there be a way to modify the following so that, if the second alternative matched, then 'x' would have the value, say, None assigned to it?

    match something:
        ...
        case Foo(arg=x) | Bar() (syntax assigning, say, None to x?)
            ...
        ...

That would let Bar be handled by the Foo case even if Bar doesn't take an argument. I'm not sure if this would ever be needed, but it's something I was wondering. I didn't see this covered but could have missed it.

--Chris
On Wed, Jun 24, 2020 at 3:02 AM Chris Jerdonek <chris.jerdonek@gmail.com> wrote:
> I have an exploratory question. In this section:
>
>> The alternatives may bind variables, as long as each alternative binds the same set of variables (excluding _). For example:
>>
>>     match something:
>>         ...
>>         case Foo(arg=x) | Bar(arg=x):  # Valid, both arms bind 'x'
>>             ...
>>         ...
>
> Tweaking the above example slightly, would there be a way to modify the following so that, if the second alternative matched, then 'x' would have the value, say, None assigned to it?
>
>     match something:
>         ...
>         case Foo(arg=x) | Bar() (syntax assigning, say, None to x?)
>             ...
>         ...
>
> That would let Bar be handled by the Foo case even if Bar doesn't take an argument. I'm not sure if this would ever be needed, but it's something I was wondering. I didn't see this covered but could have missed it.

That sounds like you are planning to put an 'if x is not None' check in the block. In most cases it would probably be cleaner to separate this out into two cases. (And yes, I can think of counterexamples, but they don't feel compelling enough to try and invent such syntax.)

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
A cursory search of this thread suggests that no one has mentioned this yet, but apologies if I missed one of the existing replies about this.

In regards to https://www.python.org/dev/peps/pep-0622/#alternatives-for-constant-value-pa..., was this alternative considered?

```
match obj:
    case SomeClass(field := _):  # is this already allowed by the PEP?
        pass
    case some_constant:  # my proposal: do not require `.some_constant` here
        pass
    case (other := _):  # is this already allowed by the PEP? If so, do we need the extra `case other:` spelling?
        print(other)
```

It seems like `:=` already provides all the necessary syntax for distinguishing bindings from constants, admittedly at the cost of 6 characters per binding (eg `Point(x := _, y := _)`) - so introducing additional syntax seems unnecessary.

If this was considered but rejected for verbosity concerns, it would be nice to see it mentioned in the rejected alternatives section.

Eric
On Wed, Jun 24, 2020 at 5:23 AM Eric Wieser <wieser.eric+numpy@gmail.com> wrote:
> In regards to https://www.python.org/dev/peps/pep-0622/#alternatives-for-constant-value-pa..., was this alternative considered?
>
> ```
> match obj:
>     case SomeClass(field := _):  # is this already allowed by the PEP?
>         pass
>     case some_constant:  # my proposal: do not require `.some_constant` here
>         pass
>     case (other := _):  # is this already allowed by the PEP? If so, do we need the extra `case other:` spelling?
>         print(other)
> ```
>
> It seems like `:=` already provides all the necessary syntax for distinguishing bindings from constants, admittedly at the cost of 6 characters per binding (eg `Point(x := _, y := _)`) - so introducing additional syntax seems unnecessary.

In languages that have pattern matching, it is the primary way to extract pieces of a compound data structure into individual variables. For the use cases where match would be a good fit in Python, the same will be true. So using your proposed syntax here is too verbose to consider.

> If this was considered but rejected for verbosity concerns, it would be nice to see it mentioned in the rejected alternatives section.

We can't discuss every single idea in that section. I don't think anyone else has proposed this, so I don't think it needs to be discussed for posterity. There are plenty of better ideas in this thread that deserve a mention there.

--
--Guido van Rossum (python.org/~guido)
Thanks for the reply.

Independent of whether the spelling is encouraged, does the PEP in its current form consider `(name := _)` to be legal, or is `_` forbidden on the RHS of a `:=` by a similar argument to forbidding `_` in `**_`?

Eric
On Wed, Jun 24, 2020 at 8:26 AM Eric Wieser <wieser.eric+numpy@gmail.com> wrote:
> Thanks for the reply.
>
> Independent of whether the spelling is encouraged, does the PEP in its current form consider `(name := _)` to be legal, or is `_` forbidden on the RHS of a `:=` by a similar argument to forbidding `_` in `**_`?

It is legal.

--
--Guido van Rossum (python.org/~guido)
As a tangential follow-up to my thought that _"the `:=` walrus operator seems to be usable as a substitute for new syntax"_, you can actually build a seemingly complete (if somewhat error-prone) _run-time_ pattern matching API using it: https://gist.github.com/eric-wieser/da679ff9b1b1e99aa660d54cb0dbd517

Taking the `group_shapes` example from the PEP, the gist lets you write

```python
g = group_shapes()
if g in M(([], [point := M[Point](x := M._, y := M._), *(other := M._)])):
    print(f"Got {point.value} in the second group")
    process_coordinates(x.value, y.value)
elif g in M(...):
    # etc
```

This isn't an argument against adding new syntax, but I figured those discussing the PEP might be interested in what can already be done with today's syntax.

Eric
Hello everyone, this is my first crack at commenting on a PEP, so apologies for mistaking any developer colloquialisms, or if this is the wrong channel to go through.

In a nutshell, I was mulling over my initial difficulty in understanding name patterns and had the thought of ‘declaring’ so-called ‘placeholder’ variables at the beginning of the match block, so that the reader is ‘primed’ for their use throughout:

    x_literal = a_value
    y_literal = another_value

    match point:
        proxy x, y
        case (x_literal, y):
            print(f"Y={y} at literal X")
        case (x, y_literal):
            print(f"X={x} at literal Y")
        case (x_literal, y_literal):
            print("That is the literal point")
        case (x, y):
            print(f"Arbitrary point: X={x}, Y={y}")
        case _:
            raise ValueError("Not a point")

The use of ‘proxy’ instead of ‘placeholder’ is simply an attempt at brevity without diverging substantially from the meaning.

The use of a placeholder variable reminded me a lot of comprehensions, which are explicit in where placeholders come from, but only towards the end of the comprehension:

    y = [f(a, b) for a, b in iterable]

A contrasting example is the lambda function, which ‘declares’ its placeholders upfront:

    y = map(lambda x, y: f(x, y), iterable)

In reading these two examples left to right, I believe that it is easier to understand the lambda function in a first pass, whereas the list comprehension often requires looking back at the expression after reading the ‘for’ clause at the end. This difference in first-pass understanding becomes more apparent when the expression or function is more complex. In this vein, ‘declaring’ placeholder variables upfront makes sense to me when the context in which they can be used is (relatively) large and not self-contained.

Now, this does add “new syntax”, which is addressed in the ‘Alternatives for constant value pattern’ section of the PEP; but in reading this thread, it seems like such a solution remains appealing.

For myself, a one-off ‘declaration’ with a reasonably unambiguous keyword like proxy makes the code easier to follow as one reads through it, without the need for variables to be typed out differently (consistent with all other variable usage). I reckon something like this idea would improve comprehensibility (not just raw readability) and is more in line with other prose-like constructs in Python (try/except, with, and/or). This is particularly important for people that are new to either this feature (everyone at first), or the language as a whole.

Not being all too familiar with CPython implementation details myself, my idea is certainly open to technical critique (and of course qualitative impressions). As a bit of a stab in the dark, ‘proxy’ could be a soft keyword within a match block, similar to ‘match’ being a soft global keyword.
On 23/06/2020 17:01, Guido van Rossum wrote:
> I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.

Can I just say thanks for all this work. I really like the concept, but like everyone else I have opinions on the details :-)

I didn't have time to read the PEP until about midnight last night, and responding then seemed like a bad idea, so apologies about being late to this game.

My basic problem can be summed up as the more I read, the more it seemed like exceptions were breeding.

Running through the basics, I'm very happy with the indentation of

    match expression:
        case pattern1:
            suite1()
        case pattern2:
            suite2()

since it fits exactly how I indent switch statements in C :-)

The patterns are where things start coming unstuck.

Literal patterns: fine, no problem, everything works as you would expect. Not being able to use expressions is a disappointment I'm used to.

Name patterns: um. OK, I can cope with that. If you squint, it looks like an assignment and/or unpacking, _except_ for "_". Which, what? Just because I use "_" as a throwaway doesn't mean I'll never want to use it as an actual name. Do we actually need a non-binding wildcard anyway? It may make some matches a little faster, I suppose, but we're always being told names are cheap.

And yes, I think "case _:" is a rubbish way of spelling "else:". I'd honestly be more likely to write "case everything_else:" or even "case dummy:" than "case _:" just to be more readable.

Constant value patterns: now I'm getting really uneasy. A leading dot is all but invisible to the reader, and we are compounding the specialness of "_". I'm having to squint harder to see name patterns as assignments. The exceptions to how I would normally read Python code are breeding and getting more complicated.

Sequence patterns: and we have more exceptions. I guess the current syntax would need some cooperation from the string classes, but I'd quite like to be able to take byte protocols apart and do something like

    match msg:
        case bytes((len, 0, cmd, *rest)):
            process_command[cmd](len, rest)
        case bytes((len1, len2, 0, cmd, *rest)) if len1 >= 0x80:
            process_command[cmd]((len1 & 0x7f) | (len2 << 7), rest)
        else:
            handle_comms_error(msg)

(the protocol I'm currently working on is that horrid :-)

Mapping patterns: makes sense, pace my uneasiness about names.

Class patterns: this crosses the line from uneasy to outright "no". I'm fairly confident I will never read "case Point(x,y):" without thinking first there's an instantiation happening. It gets even worse when you add named subpatterns, because in

    case name := Class(x, y)

the "Class(x,y)" part both is and is not an instantiation. I honestly stared at the example in the PEP for a good ten minutes before I grokked it properly.

I only have a problem with using "|" to combine patterns because I'd really like to have expressions as patterns :-)

The way that exceptions to the usual rules of reading Python got more numerous and more complicated the further I read through the PEP makes me think the approach to when to use name-binding and when to use values may be arse-backwards. The PEP justifies its approach by pointing out that name patterns are more common in typical code, which is fine for name patterns, but looks weird for class patterns and really weird when you involve named subpatterns.

Here's a quick sketch of rearranging the syntax with that in mind. Bits of it aren't lovely, but I still think they read more naturally than the current PEP.

Literal patterns: as before. It's a classic.

Constant value patterns: just use the name:

    BLACK = 1
    match colour:
        case "Not a colour, guv":
            print("Um")
        case BLACK:
            print("Paint It Black!")

There's an obvious generalisation to constant expressions that would be really nice from my point of view.

Class patterns: don't use syntax that looks like instantiation!

    case Point as x, y:

(I used "as" because it's short. Someone else suggested "with", which probably makes more sense.) This gets a little messy when patterns start nesting.

    case Line as (start := (Point as x, y), end) if start == end:

but the original example in the PEP is just as messy and puts a lot more of the wrong thoughts in my head.

Name patterns fall naturally out of this, even if they look a little unusual:

    case int as x:

The catch-all

    case object as obj:

looks odd, but actually I don't have a problem with that. How often should code be catching any old thing rather than something of a more specific class or classes? If the catch-all looks odd, perhaps it will dissuade people from using it thoughtlessly.

Sequence and mapping patterns are a bit more than just syntactic sugar for class patterns for list, tuple, and dict. I wouldn't change them, given other changes to the definition of "pattern" here.

To me, this is just easier to wrap my head around. Patterns are either expressions, classes or sequence/mapping combinations of patterns. There's no awkwardness about when is a name a value and when is it something to be bound, there's no proliferation of special cases, and it is pretty readable I think. It could be argued to be verbose, but terseness is not one of Python's objectives, and I think the consistency is worth it.

--
Rhodri James *-* Kynesim Ltd
On 06/23/2020 09:01 AM, Guido van Rossum wrote:
> PEP 622

Okay, I took Paul Moore's advice and went looking in existing code for some examples, and came across the following:

    if isinstance(v, int):
        # v is an offset
    elif isinstance(v, str):
        # v is a docstring
    elif isinstance(v, tuple) and len(v) in (2, 3) and isinstance(v[0], baseinteger) and isinstance(v[1], (basestring, NoneType)):
        # v is an offset, a docstring, and (maybe) a default
    elif isinstance(v, tuple) and len(v) in (1, 2) and isinstance(v[0], (basestring, NoneType)):
        # v is a docstring and (maybe) a default

That seems like it would be a perfect match (hah) for the new syntax, but I am not confident in my efforts.

This is what I started with:

    match v:  # goal here is to turn v into an (offset, docstring, default value)
        case int:
            v = v, None, None
        case str:
            v = None, v, None
        case (str, ):  # how to combine with above case?
            v = None, v[0], None
        case (int, str):
            v += (None, )
        case (int, str, default):
            pass
        case (str, default):
            v = None, v[0], v[1]

Which got me to here:

    match v:  # goal here is to turn v into an (offset, docstring, default value)
        case int(offset):
            v = offset, None, None
        case str(doc) | (str(doc), ):
            v = None, doc, None
        case (int(offset), str(doc)):
            v = offset, doc, None
        case (int(offset), str(doc), default):
            # already correct
            pass
        case (str(doc), default):
            v = None, doc, default

Is this correct?

Side note: I would much rather read "case str(doc) or (str(doc), )" instead of a |.

--
~Ethan~
On Wed, Jun 24, 2020 at 7:40 AM Ethan Furman <ethan@stoneleaf.us> wrote:
> Okay, I took Paul Moore's advice and went looking in existing code for some examples, and came across the following:
>
>     if isinstance(v, int):
>         # v is an offset
>     elif isinstance(v, str):
>         # v is a docstring
>     elif isinstance(v, tuple) and len(v) in (2, 3) and isinstance(v[0], baseinteger) and isinstance(v[1], (basestring, NoneType)):
>         # v is an offset, a docstring, and (maybe) a default
>     elif isinstance(v, tuple) and len(v) in (1, 2) and isinstance(v[0], (basestring, NoneType)):
>         # v is a docstring and (maybe) a default
>
> That seems like it would be a perfect match (hah) for the new syntax, but I am not confident in my efforts.
>
> This is what I started with:
>
>     match v:  # goal here is to turn v into an (offset, docstring, default value)
>         case int:
>             v = v, None, None
>         case str:
>             v = None, v, None
>         case (str, ):  # how to combine with above case?
>             v = None, v[0], None
>         case (int, str):
>             v += (None, )
>         case (int, str, default):
>             pass
>         case (str, default):
>             v = None, v[0], v[1]
>
> Which got me to here:
>
>     match v:  # goal here is to turn v into an (offset, docstring, default value)
>         case int(offset):
>             v = offset, None, None
>         case str(doc) | (str(doc), ):
>             v = None, doc, None
>         case (int(offset), str(doc)):
>             v = offset, doc, None
>         case (int(offset), str(doc), default):
>             # already correct
>             pass
>         case (str(doc), default):
>             v = None, doc, default
>
> Is this correct?

Yes.

> Side note: I would much rather read "case str(doc) or (str(doc), )" instead of a |.

Duly noted, we'll come back to this.

--
--Guido van Rossum (python.org/~guido)
Wow, so 19 years after PEP 275, we are indeed getting a switch statement. Nice :-)

Something which struck me as odd when first scanning through the PEP is the default case compared to other Python block statements:

    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
        case _:
            print("Something else")

rather than what a Pythonista would probably expect:

    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
        else:
            print("Something else")

Was there a reason for using a special value "_" as match-all value ? I couldn't find any explanation for this in the PEP.

Cheers.

On 23.06.2020 18:01, Guido van Rossum wrote:
> I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jun 24 2020)
Python Projects, Coaching and Support ... https://www.egenix.com/ Python Product Development ... https://consulting.egenix.com/
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 https://www.egenix.com/company/contact/ https://www.malemburg.com/
On 24.06.2020 16:27, M.-A. Lemburg wrote:
Wow, so 19 years after PEP 275, we are indeed getting a switch statement. Nice :-)
Something which struck me as odd when first scanning through the PEP is the default case compared to other Python block statements:
    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
        case _:
            print("Something else")
rather than what a Pythonista would probably expect:
    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
    else:
        print("Something else")
Was there a reason for using a special value "_" as match-all value ? I couldn't find any explanation for this in the PEP.
To clarify: The Python compiler could turn the "else:" into what "case _:" would produce. The syntax would just look more intuitive, IMO. The question was not about using "_" as match-all in general.
Cheers.
On Wed, Jun 24, 2020 at 7:27 AM M.-A. Lemburg <mal@egenix.com> wrote:
Wow, so 19 years after PEP 275, we are indeed getting a switch statement. Nice :-)
Indeed. Fortunately there are now some better ideas to steal from other languages than C's switch. :-)

Something which struck me as odd when first scanning through the PEP is the default case compared to other Python block statements:
    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
        case _:
            print("Something else")
rather than what a Pythonista would probably expect:
    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
    else:
        print("Something else")
Was there a reason for using a special value "_" as match-all value ? I couldn't find any explanation for this in the PEP.
Nearly every other language whose pattern matching syntax we've examined uses _ as the wildcard.

The authors don't feel very strongly about whether to use `else:` or `case _:`. The latter would be possible even if we added an explicit `else` clause, and we like TOOWTDI. But it's clear that a lot of people *expect* to see `else`, and maybe seeing `case _:` is not the best introduction to wildcards for people who haven't seen a match statement before.

A wrinkle with `else` is that some of the authors would prefer to see it aligned with `match` rather than with the list of cases, but for others it feels like a degenerate case and should be aligned with those. (I'm in the latter camp.)

There still is a lively internal discussion going on, and we'll get back here when we have a shared opinion.

--
--Guido van Rossum (python.org/~guido)
On 24.06.2020 20:08, Guido van Rossum wrote:
On Wed, Jun 24, 2020 at 7:27 AM M.-A. Lemburg <mal@egenix.com> wrote:
Wow, so 19 years after PEP 275, we are indeed getting a switch statement. Nice :-)
Indeed. Fortunately there are now some better ideas to steal from other languages than C's switch. :-)
Your PEP certainly is a lot more powerful than the good ol' C switch :-)

Something I know the Perl camp is always very fond of is matching on regexps. Is this possible using the proposal? (Sorry, I don't quite understand the __match__() protocol yet.)
Something which struck me as odd when first scanning through the PEP is the default case compared to other Python block statements:
    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
        case _:
            print("Something else")
rather than what a Pythonista would probably expect:
    match something:
        case 0 | 1 | 2:
            print("Small number")
        case [] | [_]:
            print("A short sequence")
        case str() | bytes():
            print("Something string-like")
    else:
        print("Something else")
Was there a reason for using a special value "_" as match-all value ? I couldn't find any explanation for this in the PEP.
Nearly every other language whose pattern matching syntax we've examined uses _ as the wildcard.
The authors don't feel very strongly about whether to use `else:` or `case _:`. The latter would be possible even if we added an explicit `else` clause, and we like TOOWTDI. But it's clear that a lot of people *expect* to see `else`, and maybe seeing `case _:` is not the best introduction to wildcards for people who haven't seen a match statement before.
A wrinkle with `else` is that some of the authors would prefer to see it aligned with `match` rather than with the list of cases, but for others it feels like a degenerate case and should be aligned with those. (I'm in the latter camp.)
There still is a lively internal discussion going on, and we'll get back here when we have a shared opinion.
Great. Thanks for considering it. I'd make "else" match its use in other statements, which would mean aligning it with "match", but I don't feel strongly either way.

The problem I see with "case _" is that it's just too easy to miss when looking at the body of "match", even more so since people will not necessarily put it at the end, or may add it as an or'ed add-on to some other case, e.g.

    match something:
        case 0 | 1 | 2 | _:
            print("Small number or something else")
        case [] | [_]:
            print("A short sequence")
        case _:
            print("Not sure what this is")
        case str() | bytes():
            print("Something string-like")

You could even declare the above stand-alone or or'ed use of "_" illegal and force use of "else" instead to push for TOOWTDI.

Oh, and thanks for not having continue / break in the switch cases! Those often cause subtle bugs in C applications (i.e. a missing break).

--
Marc-Andre Lemburg
On Wed, 24 Jun 2020 at 19:49, M.-A. Lemburg <mal@egenix.com> wrote:
    match something:
        case 0 | 1 | 2 | _:
            print("Small number or something else")
        case [] | [_]:
            print("A short sequence")
        case _:
            print("Not sure what this is")
        case str() | bytes():
            print("Something string-like")
Because the semantics is "first matching clause applies", putting `case _` anywhere but at the end wouldn't work as expected. The `str() | bytes()` case above would never match.

Paul
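The ordering point can be checked without any new syntax. The sketch below is an emulation, not the PEP's implementation: `first_match`, `BUGGY` and `FIXED` are hypothetical names, and each clause is approximated by a plain predicate (for instance, `[] | [_]` is reduced to "a list of at most one item"). Clauses are tried in order and the first hit wins.

```python
def first_match(cases, value):
    """Return the message of the first (predicate, message) pair that accepts value."""
    for predicate, message in cases:
        if predicate(value):
            return message
    return None


def is_small(v):
    return v in (0, 1, 2)


def is_short_list(v):
    return isinstance(v, list) and len(v) <= 1


def is_stringlike(v):
    return isinstance(v, (str, bytes))


def anything(v):
    return True  # plays the role of the wildcard `_`


# Clause order from the example above: the first clause contains `_`,
# so it accepts every value and nothing below it is ever reached.
BUGGY = [
    (lambda v: is_small(v) or anything(v), "Small number or something else"),
    (is_short_list, "A short sequence"),
    (anything, "Not sure what this is"),
    (is_stringlike, "Something string-like"),
]

# Same clauses with the wildcard moved to the end.
FIXED = [
    (is_small, "Small number"),
    (is_short_list, "A short sequence"),
    (is_stringlike, "Something string-like"),
    (anything, "Something else"),
]
```

With `BUGGY`, `first_match(BUGGY, "text")` never gets as far as the string-like clause; with `FIXED` it does.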
On Wed, Jun 24, 2020 at 11:47 AM M.-A. Lemburg <mal@egenix.com> wrote:
Something I know the Perl camp is always very fond of is the matching on regexps. Is this possible using the proposal (sorry, I don't quite understand the __match__() protocol yet) ?
No, that's left for another day. Scala has string matching built into its pattern matching syntax, but for Python it would be an uphill battle to try to compete with regular expression syntax.
The problem I see with "case _" is that it's just too easy to miss when looking at the body of "match", even more so, since people will not necessarily put it at the end, or add it as or'ed add-on to some other case, e.g.
    match something:
        case 0 | 1 | 2 | _:
            print("Small number or something else")
        case [] | [_]:
            print("A short sequence")
        case _:
            print("Not sure what this is")
        case str() | bytes():
            print("Something string-like")
That's just a bug in the user's code. We can't *stop* users from writing "case _:" -- note there are even more ways to spell it, e.g. "case x:" (where x is otherwise unused), or "case object():".
You could even declare the above stand-alone or or'ed use of "_" illegal and force use of "else" instead to push for TOOWTDI.
I guess we *could* syntactically disallow 0|_, but why bother? I don't expect anyone is going to write that and then expect the next case to be reachable. When it comes to catching unreachable code I think we have bigger fish to fry (e.g. f.close without the ()).

That said, we're on the fence on adding else.

--
--Guido van Rossum (python.org/~guido)
On 06/24/2020 11:08 AM, Guido van Rossum wrote:
Nearly every other language whose pattern matching syntax we've examined uses _ as the wildcard.
The authors don't feel very strongly about whether to use `else:` or `case _:`. The latter would be possible even if we added an explicit `else` clause, and we like TOOWTDI. But it's clear that a lot of people *expect* to see `else`, and maybe seeing `case _:` is not the best introduction to wildcards for people who haven't seen a match statement before.
It seems to me that TOOWTDI is already violated. I don't think most people will see a significant difference between:

    case _:

and

    case a_name:

As far as alignment, I think "else" should align with "match" -- it more strongly signifies that no match was found - at least to me ;) .

--
~Ethan~
Wow, this looks fantastic, I've wanted this for a long time! I really appreciate the work that has been put in here. As a bit of context for this and future emails, I first wanted to say:

* I'm strongly positive about this even if I propose changes
* I've explored a bit on my own how to do this, so I'm (painfully) aware of some of the design tensions, in particular the "a pattern is neither an expression (something that you'd find on the RHS of an assignment) nor a target (something you'd find on the LHS of an assignment), but has a little bit of both"

That being said, here are my comments (which I'll split into separate emails so we can thread separately):

*The "We use a name token to denote a capture variable and special syntax to denote matching the value of that variable" feels a bit like a foot-gun*

Other people have said this, so this is mostly a +1, but I wanted to provide alternatives, and also reinforce that deciding in this way is much more likely to cause mistakes than the opposite. The most likely mistake seems to me to be using the normal instead of the "special" syntax (the dot prefix); if I write "case constant" by mistake now, 1) I'm getting a match when there likely wasn't one and 2) I've clobbered my constant. Both are hard to debug. If we had special syntax for capture and I wrote "case variable" by mistake, I would likely get a NameError, which should be easier to figure out.
I saw some of the rejected approaches (like capturing with $var) and I found them visually ugly, so I want to put two others on the table, which I find "reasonable":

- 1.A: use angle brackets for capture:

      match get_node():
          case Node(value=<x>, color=RED):
              print(f"a red node with value {x}")
          case Node(value=<x>, color=BLACK):
              print(f"a black node with value {x}")
          case <n>:
              print(f"This is a funny colored node with value {n.value}")

  This again adds new syntax like the original proposal, but I think it makes the capture quite visible without feeling noisy; it reminds me of metaparameters in the help for command line tools (i.e. "cp <src-file> <dest-file>"). Even if we had this, I'm happy with "_" as a placeholder (rather than "<_>"). It should be possible to have <*x> or <**x> in other patterns (rather than *<x> or **<x>)... and I would probably be happy using standalone * and ** rather than *_ and **_ (this last suggestion could work with the current syntax too).

- 1.B: use a "capture object". No "new" syntax (syntax for patterns is new, but looks like other expressions):

      match get_node() into c:
          case Node(value=c.x, color=RED):
              print(f"a red node with value {c.x}")
          case Node(value=c.x, color=BLACK):
              print(f"a black node with value {c.x}")
          case c.n:
              print(f"This is a funny colored node with value {c.n.value}")

  The idea here is that instead of spreading captured names into the local namespace, we only have a single capture object in the locals, and all captures happen inside it. This also makes it possible to recognize syntactically (although not in the grammar) what is a variable capture and what isn't. This one is somewhat more verbose (especially if you use a longer name for the "capture") but looks much more familiar to Pythonistas (and to IDE syntax highlighters ;-) ).
I added the "into c" syntax without thinking too much; perhaps "as c", "in c", or "match(c)" could be better. I didn't want to dwell on that part before discussing the idea of using a "capture object". The capture object is mostly an attribute placeholder (I might like it to have an extra attribute to get the original matched value, which is generally useful and might be easier than using a walrus, but this is a minor feature).

What do the rest of the community (and the original authors) think about these alternatives?
match get_node() into c:
+1 to scoping name pattern related objects upfront. (mirrors my post, so no bias :P)

Using a namespace to group capture variables is a good idea, though new attributes are introduced throughout the match block. In my view, this is very similar to the use of a special character like '$', the difference being that such a modifier can be 'named'.

I second your idea of an 'into' keyword to align the match statement with others in Python. It was a consideration of mine, but I was likewise wary of the first-line verbosity. I think it's worth it for the increased familiarity, especially since capture variables are in a sense auxiliary to the match statement (so just pop them at the end). Flattening your capture object into individual variables does this while still being explicit from the beginning about what variables can be used for name matching:

    match get_node() into a, b, c:

As far as the naming of the 'into' keyword, I think yours is a solid candidate. Others are sure to present a healthy range of alternatives :). My only preference would be that the qualification of the match statement is appended, rather than done in place like

    match(a, b, c) get_node():
Guido van Rossum wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin. Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-). I'll mostly let the PEP speak for itself:
Wow, I totally didn't see this coming, not after seeing what seems like a lot of rejected ideas on this topic (there was at least one PEP already that proposed this, right?). I have to admire the authors' determination to write such a lengthy and (from skimming it) complex and comprehensive proposal *and* providing a reference implementation on top of that, the amount of work (including internal bikeshedding) must've been substantial.

Needless to say it's +1 from my humble person, big time, and I wouldn't want the comment below to detract from that.

So, now for the one thing that makes me unhappy: the rejected idea to make it an expression. In my short experience with pattern matching, mainly in Rust, roughly half (very vague estimate) of its usefulness came from it being an expression. It's even small things like

    let i = match i {
        9 => 10,
        10 => 9,
        _ => i,
    };

and

    let mut file: Box<Write> = match filename.as_ref() {
        "-" => Box::new(io::stdout()),
        _ => Box::new(File::create(filename).expect("Cannot open file for writing")),
    };

and it adds up. I'm not sure how to approach this with Python syntax and I'll think about this, but I feel that it'd be a huge missed opportunity to not have this.

Jakub
(Jakub, next time please trim the original post from your quote to what's necessary.) On Wed, Jun 24, 2020 at 9:11 AM <jakub@stasiak.at> wrote:
Wow, I totally didn't see this coming, not after seeing what seems like a lot of rejected ideas on this topic (there was at least one PEP already that proposed this, right?). I have to admire the authors' determination to write such a lengthy and (from skimming it) complex and comprehensive proposal *and* providing a reference implementation on top of that, the amount of work (including internal bikeshedding) must've been substantial.
Needless to say it's +1 from my humble person, big time, and I wouldn't want the comment below to detract from that.
So, now for the one thing that makes me unhappy: the rejected idea to make it an expression. In my short experience with pattern matching, mainly in Rust, roughly half (very vague estimate) of its usefulness came from it being an expression. It's even small things like
let i = match i { 9 => 10, 10 => 9, _ => i, };
and
let mut file: Box<Write> = match filename.as_ref() { "-" => Box::new(io::stdout()), _ => Box::new(File::create(filename).expect("Cannot open file for writing")), };
and it adds up. I'm not sure how to approach this with Python syntax and I'll think about this, but I feel that it'd be a huge missed opportunity to not have this.
We considered it, but it simply doesn't work, for the same reason that we haven't been able to find a suitable multi-line lambda expression. Since Python fundamentally is not an expression language, this is no great loss -- you simply write a match statement that assigns a value to the variable in each branch. Alternatively, the match could be inside a function and each block could return a value.

--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 24 Jun 2020, at 19:14, Guido van Rossum <guido@python.org> wrote:
(Jakub, next time please trim the original post from your quote to what's necessary.)
Apologies, unintentional, I was replying from the web interface since I wasn’t subscribed to this group when the thread was started and I clicked some wrong buttons.
We considered it, but it simply doesn't work, for the same reason that we haven't been able to find a suitable multi-line lambda expression. Since Python fundamentally is not an expression language, this is no great loss -- you simply write a match statement that assigns a value to the variable in each branch. Alternatively, the match could be inside a function and each block could return a value.
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?)
That's fair, but there's a way in which this doesn't have to be equivalent to multi-line lambda expressions. Granted, I should've clarified that I thought about it being an expression in a very limited, special way. Let's take one example from the PEP text:

    match greeting:
        case "":
            print("Hello!")
        case name:
            print(f"Hi {name}!")

Let's say we allow prepending "match …" with an assignment, where the value of the assignment is the value of the last statement/expression in the block that's selected. This allows for the following hypothetical code:

    message = match greeting:
        case "":
            "Hello!"
        case name:
            f"Hi {name}!"

So I didn't express this clearly: it's not about full-blown match expressions but rather an optional "assigning form" of match statements. This seems like it wouldn't affect parsing massively.

Jakub
Thanks a lot for making this. I've been excited ever since I heard about it several hours ago! I'm a researcher (and also a student) in a field dedicated to the study of programming language constructs, including pattern matching. **Python Pattern Matching** is something special to me; it ultimately shaped the route of my life.

I'd say the design is quite clean and impressive. Still, I found a number of issues, and I wrote a blog post to present my points clearly to you, the promoters and developers of PEP 622: https://thautwarm.github.io/Site-32/Design/PEP622-1.html

The summary of the key points in my blog post:

1. There is a scoping issue that the specification of PEP 622 does not say how to resolve, and it can be a dangerous bug.
2. The reason for accepting AND patterns, and their use case for enhancing the composability of programs.
3. Guards as patterns can be useful for pattern matching in Python.
4. An alternative '__match__' protocol which can be beneficial.
5. Reasons for voting for an 'else' clause, just as Ethan and other kind people proposed.

I would also like to work on implementing PEP 622, and I'm familiar with the steps involved in the implementation.
On 25/06/20 3:31 am, Taine Zhao wrote:
In your AND pattern example, you seem to envisage there being a Urlopen class whose __match__ method has the side effect of opening the URL. This makes it much more than just a pattern to be matched, as it's also doing some of the work of your program. I would say this goes against the spirit of what the PEP proposes, which is that patterns should just be passive declarations of things to look for and should not have side effects. That's not to say there are no uses for deconstructing the same object more than one way, but your case would be more persuasive if you could come up with a side-effect-free example. -- Greg
On Tue, Jun 23, 2020 at 9:10, Guido van Rossum (<guido@python.org>) wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Thanks to Guido and all the others working on this! It's going to be a great addition to the language.
One piece of bikeshedding: I agree with the previous posters who said that the ".x" syntax for referring to variables isn't great, and I'd prefer marking variables that are being assigned to with a special symbol. So instead of:

    y = 3
    case Point(x, .y): ...  # x is assigned to, y is looked up

we'd have

    y = 3
    case Point($x, y): ...  # x is assigned to, y is looked up

The trouble with the current syntax is that if you forget the ".", you always get a hard-to-detect bug: your pattern unexpectedly matches and "y" suddenly has a different value. Even if you find the bug, it's hard to find out where exactly the mistake happened. But with the proposed "$" syntax, if you forget the "$", you probably will just immediately get a NameError that will tell you exactly where your bug is. (Except of course if you happen to already have the name "x" in scope, but that's hopefully not very common, and it's already what happens if you typo a local variable name.)
Thank you Guido, Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin for this fun and very useful new feature. I do enjoy pattern matching a lot in Elixir—my favorite language these days, after Python. I don't want to start a discussion, but I just want to say that as an instructor I fear this core language addition may make the language less approachable to the non-IT professionals, researchers etc. who have saved Python from the decline that we can observe happening in Ruby—a language of similar age, with similar strengths and weaknesses, but never widely adopted outside of the IT profession. After I wrap up Fluent Python 2e (which is aimed at professional developers) I hope I can find the time to tackle the challenge of creating introductory Python content that manages to explain pattern matching and other recent developments in a way that is accessible to all. Cheers, Luciano On Tue, Jun 23, 2020 at 1:04 PM Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
-- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Technical Principal at ThoughtWorks | Twitter: @ramalhoorg
On Wed, Jun 24, 2020 at 1:38 PM Luciano Ramalho <luciano@ramalho.org> wrote:
Thank you Guido, Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin for this fun and very useful new feature.
I do enjoy pattern matching a lot in Elixir—my favorite language these days, after Python.
I don't want to start a discussion, but I just want to say that as an instructor I fear this core language addition may make the language less approachable to the non-IT professionals, researchers etc. who have saved Python from the decline that we can observe happening in Ruby—a language of similar age, with similar strengths and weaknesses, but never widely adopted outside of the IT profession.
After I wrap up Fluent Python 2e (which is aimed at professional developers) I hope I can find the time to tackle the challenge of creating introductory Python content that manages to explain pattern matching and other recent developments in a way that is accessible to all.
Hold on a while. This feature does not exist. This PEP has not been accepted. Don't count your chickens.py before they hatch. The last P in PEP stands for Proposal for a reason. Rejection is a perfectly valid option. As is a significant reworking of the proposal so that it doesn't turn into a nightmare. This one needs a lot of work at a minimum. As proposed, it is extremely non-approachable and non-trivial. -gps
You got everything right the first time ;-) The PEP is an extended illustration of "although that way may not be obvious at first unless you're Dutch". I too thought "why not else:?" at first. But "case _:" covers it in the one obvious way after grasping how general wildcard matches are. Introducing "else:" too would be adding a wart (redundancy) just to stop shallow-first-impression whining. "|" is also a fine way to express alternatives. "case" has its own sub-language with its own rules, and "|" is widely used to express alternatives (whether in regexps, formal grammars, ...). Spell it, e.g., "or", and then I wonder "what does short-circuiting have to do with it?". All reuse of symbols carries baggage. ".NAME" grated at first, but extends the idea that dotted names are always constant value patterns to "if and only if". So it has mnemonic value. When context alone can't distinguish whether a name is meant as (in effect) an lvalue or an rvalue, no syntax decorations can prevent coding errors. Names in destructuring constructs are overwhelmingly intended as lvalues, so adding extra cruft to say "no, I meant rvalue" is the pragmatic choice.
I'm also not a big fan of the 'else' proposal. Context matters and in the context of match expressions a case with a wildcard pattern makes sense, an else, not so much.
On 6/24/2020 1:49 PM, Tim Peters wrote:
".NAME" grated at first, but extends the idea that dotted names are always constant value patterns to "if and only if". So it has mnemonic value. When context alone can't distinguish whether a name is meant as (in effect) an lvalue or an rvalue, no syntax decorations can prevent coding errors. Names in destructuring constructs are overwhelmingly intended as lvalues, so adding extra cruft to say "no, I meant rvalue" is the pragmatic choice.
This is just a bunch of words to me, without meaning.
I'd like to understand it. What do you mean by "the idea that dotted names are always constant value patterns"? What do you mean by 'extends (the above) to "if and only if" '? As a result of not understanding the above, I see no mnemonic value. My understanding of the "." as proposed is that it is optional, except in cases where it would be ambiguous... seems like it would be better if it were required for one case or the other, so that there would be no need to determine whether or not it is ambiguous by the surrounding state/code/declarations.
[Tim]
".NAME" grated at first, but extends the idea that dotted names are always constant value patterns to "if and only if". So it has mnemonic value. When context alone can't distinguish whether a name is meant as (in effect) an lvalue or an rvalue, no syntax decorations can prevent coding errors. Names in destructuring constructs are overwhelmingly intended as lvalues, so adding extra cruft to say "no, I meant rvalue" is the pragmatic choice.
[Glenn Linderman <v+python@g.nevcal.com>]
This is just a bunch of words to me, without meaning.
I'd like to understand it.
Did you read the PEP?
What do you mean by "the idea that dotted names are always constant value patterns"?
Under the PEP's "Constant Value Pattern" section: Every dotted name in a pattern is looked up using normal Python name resolution rules, and the value is used for comparison by equality with the matching expression (same as for literals). That's what I meant. "Contains a dot" implies "constant value pattern".
What do you mean by 'extends (the above) to "if and only if" '?
Because the next sentence from the PEP: As a special case to avoid ambiguity with name patterns, simple names must be prefixed with a dot to be considered a reference: completes turning "contains a dot" into a necessary and sufficient ("if and only if") condition for distinguishing a constant value pattern from a name pattern. Where "constant value pattern" and "name pattern" are again used with the PEP's meanings.
As a result of not understanding the above, I see no mnemonic value.
While I do. "If I want `xyz` to be interpreted as a constant value pattern, it needs to contain a dot: `.xyz` should do it. If I want `enums.HI` to be interpreted as a constant value, it already contains a dot, so it will be."
My understanding of the "." as proposed is that it is optional, except in cases where it would be ambiguous... seems like it would be better if it were required for one case or the other, so that there would be no need to determine whether or not it is ambiguous by the surrounding state/code/declarations.
A dot is required, when and only when you want the chunk of syntax to be interpreted as a constant value pattern. I said nothing at all about "_leading_" dots, which appear to be all you have in mind there. _Some_ dot is mandatory to make it a constant value pattern; a _leading_ dot may or may not be required.
On 06/24/2020 01:49 PM, Tim Peters wrote:
I too thought "why not else:?" at first. But "case _:" covers it in the one obvious way after grasping how general wildcard matches are.
"case _:" is easy to miss -- I missed it several times reading through the PEP.
Introducing "else:" too would be adding a wart (redundancy) just to stop shallow-first-impression whining.
Huh. I would consider "case _:" to be the wart, especially since "case default:" or "case anything:" or "case i_dont_care:" all do basically the same thing (although they bind to the given name, while _ does not bind to anything, but of what practical importance is that?).
"|" is also a fine way to express alternatives. "case" has its own sub-language with its own rules, and "|" is widely used to express alternatives (whether in regexps, formal grammars, ...). Spell it, e.g., "or", and then I wonder "what does short-circuiting have to do with it?". All reuse of symbols carries baggage.
Well, the PEP says the alternatives are short-circuiting, so it's okay if you notice. ;-) Besides which, if we use "|" instead of "or" then we can't later allow more general expressions.
".NAME" grated at first, but extends the idea that dotted names are always constant value patterns to "if and only if". So it has mnemonic value.
How do you get from "." to "iff" ? -- ~Ethan~
[Ethan Furman <ethan@stoneleaf.us>]
"case _:" is easy to miss -- I missed it several times reading through the PEP.
As I said, I don't care about "shallow first impressions". I care about how a thing hangs together _after_ climbing its learning curve - which in this case is about a nanometer tall ;-) You're not seriously going to maintain that you're struggling to grasp the meaning of "case _:" now, right?
Huh. I would consider "case _:" to be the wart, especially since "case default:" or "case anything:" or "case i_dont_care:" all do basically the same thing (although they bind to the given name,
Having climbed the trivial learning curve, only a deliberate wise ass would code garbage like "case i_dont_care:". I don't care about them either. The one obvious way to do it has already been made clear to them. You may as well, e.g., complain that there's nothing to stop a wise ass from writing "-5+6" where everyone else writes "+1".
while _ does not bind to anything, but of what practical importance is that?) .
One obvious way to do it is of major practical importance.
... Besides which, if we use "|" instead of "or" then we can't later allow more general expressions.
Good! The PEP is quite complicated enough already. But if you want to pursue this seriously, you're going to have your work cut out for you to explain why "|" is more sacred than "or" with respect to "more general expressions". If you don't want to preclude anything, then you need to invent syntax that has no meaning at all now.
".NAME" grated at first, but extends the idea that dotted names are always constant value patterns to "if and only if". So it has mnemonic value.
How do you get from "." to "iff" ?
See reply to Glenn. Can you give an example of a dotted name that is not a constant value pattern? An example of a non-dotted name that is? If you can't do either (and I cannot), then that's simply what "if and only if" means.
On 25/06/2020 00:54, Tim Peters wrote:
[Ethan Furman <ethan@stoneleaf.us>]
"case _:" is easy to miss -- I missed it several times reading through the PEP.
As I said, I don't care about "shallow first impressions". I care about how a thing hangs together _after_ climbing its learning curve - which in this case is about a nanometer tall ;-)
You're not seriously going to maintain that you're struggling to grasp the meaning of "case _:" now, right?
I'm seriously going to maintain that I will forget the meaning of "case _:" quickly and regularly, just as I quickly and regularly forget to use "|" instead of "+" for set union. More accurately, I will quickly and regularly forget that in this one place, "_" is special.
while _ does not bind to anything, but of what practical importance is that?) .
One obvious way to do it is of major practical importance.
Yeah, but the "obvious" is being contended, and saying "but it's obvious" doesn't really constitute an argument to those of us for whom it isn't obvious.
".NAME" grated at first, but extends the idea that dotted names are always constant value patterns to "if and only if". So it has mnemonic value.
How do you get from "." to "iff" ?
See reply to Glenn. Can you give an example of a dotted name that is not a constant value pattern? An example of a non-dotted name that is? If you can't do either (and I cannot)), then that's simply what "if
case long.chain.of.attributes: or more likely case (foo.x, foo.y) for the first. For the second, it's a no-brainer that you can't have a non-dotted name as a constant value pattern, since the current constant value pattern mandates a leading dot. -- Rhodri James *-* Kynesim Ltd
[Rhodri James <rhodri@kynesim.co.uk>]
I'm seriously going to maintain that I will forget the meaning of "case _:" quickly and regularly,
Actually, you won't - trust me ;-)
just as I quickly and regularly forget to use "|" instead of "+" for set union. More accurately, I will quickly and regularly forget that in this one place, "_" is special.
Because that's the opposite of "accurate". There's nothing special about "_" "in this one place". It's but a single application of the rule that "_" is used as a wildcard in _all_ matching contexts throughout the PEP. And it's not even new with this PEP. "_" is routinely used already in lots of code to mean "the syntax requires a binding target here, but I don't care about the binding", from

    lists = [[] for _ in range(100)]

to

    first, _, third = triple

The last is especially relevant, because that's already a form of destructuring. The only thing new about this use of "_" in the PEP is that it specifies no binding will occur. Binding does occur in the examples above (because there's nothing AT ALL special about "_" now - it's just a one-character identifier, and all the rest is convention, including that the REPL uses it to store the value of the last-displayed object).
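To make the "binding does occur" point concrete, here is a minimal sketch in plain Python, outside any match statement. The names `triple` and `middle` are made up for illustration.

```python
# Outside of a match statement, "_" is an ordinary identifier:
# these statements really do bind it.
lists = [[] for _ in range(3)]   # "_" here is scoped to the comprehension

triple = ("a", "b", "c")
first, _, third = triple         # binds "_" to "b" in the current scope

middle = _                       # the "ignored" value is still retrievable
```

So the "don't care" meaning is convention only; the interpreter treats `_` like any other target in these statements.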
See reply to Glenn. Can you give an example of a dotted name that is not a constant value pattern? An example of a non-dotted name that is? If you can't do either (and I cannot)), then that's simply what "if
case long.chain.of.attributes:
That's a dotted name and so is a constant value pattern - read the PEP. Every dotted name in a pattern is looked up using normal Python name resolution rules, and the value is used for comparison by equality with the matching expression (same as for literals).
or more likely
case (foo.x, foo.y)
Ditto.
for the first. For the second, it's a no-brainer that you can't have a non-dotted name as a constant value pattern, since the current constant value pattern mandates a leading dot.
Not so. _Some_ dot is necessary and sufficient to identify a constant value pattern now. A leading dot is only _required_ in case an intended constant value pattern would have no dots otherwise.
On 25/06/2020 15:40, Tim Peters wrote:
[Rhodri James <rhodri@kynesim.co.uk>]
See reply to Glenn. Can you give an example of a dotted name that is not a constant value pattern? An example of a non-dotted name that is? If you can't do either (and I cannot)), then that's simply what "if
case long.chain.of.attributes:
That's a dotted name and so is a constant value pattern - read the PEP.
Every dotted name in a pattern is looked up using normal Python name resolution rules, and the value is used for comparison by equality with the matching expression (same as for literals).
Then I am surprised, which is worse. "long.chain.of.attributes" looks like an assignment target, and I would have expected the case to have been a name pattern. -- Rhodri James *-* Kynesim Ltd
[Tim]
See reply to Glenn. Can you give an example of a dotted name that is not a constant value pattern? An example of a non-dotted name that is? If you can't do either (and I cannot)), then that's simply what "if
[Rhodri James <rhodri@kynesim.co.uk>]
case long.chain.of.attributes:
[Tim]
That's a dotted name and so is a constant value pattern - read the PEP.
Every dotted name in a pattern is looked up using normal Python name resolution rules, and the value is used for comparison by equality with the matching expression (same as for literals).
[Rhodri]
Then I am surprised, which is worse. "long.chain.of.attributes" looks like an assignment target, and I would have expected the case to have been a name pattern.
As always, I don't care whether something is obvious at first glance. I care whether something can be learned with reasonable effort, and "sticks" _after_ it's learned. There's essentially nothing truly obvious about programming.

This, from the PEP, is the entire grammar for a "name pattern":

    name_pattern: NAME !('.' | '(' | '=')

That's it. A plain name not followed by a dot, left paren, or equality sign.

While it may or may not surprise any given person at first glance, it's very simple. Put a fraction of the effort into learning it as you're willing to expend on articulating surprise, and it would already be far behind you ;-)
On 25/06/2020 16:48, Tim Peters wrote:
[Tim]
See reply to Glenn. Can you give an example of a dotted name that is not a constant value pattern? An example of a non-dotted name that is? If you can't do either (and I cannot)), then that's simply what "if
[Rhodri James <rhodri@kynesim.co.uk>]
case long.chain.of.attributes:
[Tim]
That's a dotted name and so is a constant value pattern - read the PEP.
Every dotted name in a pattern is looked up using normal Python name resolution rules, and the value is used for comparison by equality with the matching expression (same as for literals).
[Rhodri]
Then I am surprised, which is worse. "long.chain.of.attributes" looks like an assignment target, and I would have expected the case to have been a name pattern.
As always, I don't care whether something is obvious at first glance. I care whether something can be learned with reasonable effort, and "sticks" _after_ it's learned. There's essentially nothing truly obvious about programming.
This, from the PEP, is the entire grammar for a "name pattern":
name_pattern: NAME !('.' | '(' | '=')
That's it. A plain name not followed by a dot, left paren, or equality sign.
While it may or may not surprise any given person at first glance, it's very simple. Put a fraction of the effort into learning it as you're willing to expend on articulating surprise, and it would already be far behind you ;-)
Well, now is the time for expressing surprise :-p

As I've said before, one of my main problems with the PEP is as you go through it, more and more special cases and surprises appear, and the consequences of earlier surprises generate more special cases and surprises. You claim not unreasonably that it's easy to remember that "_" is special in matches. Maybe you're right, but that decision has consequences spelled out later in the PEP that are less easy to remember. Another example: I had not previously thought the definition of name patterns to be surprising, but apparently they are (it just surprised me, at any rate). That consequently makes the definition of constant value patterns, which I was already iffy about, really quite surprising.

Each individual learning curve might be small, but the cumulative total by the time you reach the end of the PEP is large. Simple match statements will, with adequate squinting, look recognisably like other areas of Python. Complex match statements won't. And that's a problem for anyone who wants to be able to read someone else's code.

Bear in mind I am predominantly a C programmer who uses Python from time to time for tools and glue. If I have to put in effort to learn new special-case rules in Python, that's an active discouragement; I'm frankly unlikely to bother, and more likely to write those tools and glue in C instead. I'm certainly much less likely to use someone else's tools and glue if I have to re-read the spec to remind myself what all the gotchas are.

-- Rhodri James *-* Kynesim Ltd
On Thu, Jun 25, 2020 at 9:21 AM Rhodri James <rhodri@kynesim.co.uk> wrote:
Well, now is the time for expressing surprise :-p
As I've said before, one of my main problems with the PEP is as you go through it, more and more special cases and surprises appear, and the consequences of earlier surprises generate more special cases and surprises. You claim not unreasonably that it's easy to remember that "_" is special in matches. Maybe you're right, but that decision has consequences spelled out later in the PEP that are less easy to remember. Another example: I had not previously thought the definition of name patterns to be surprising, but apparently they are (it just surprised me, at any rate). That consequently makes the definition of constant value patterns, which I was already iffy about, really quite surprising.
Each individual learning curve might be small, but cumulative total by the time you reach the end of the PEP is large. Simple match statements will, with adequate squinting, look recognisably like other areas of Python. Complex match statements won't. And that's a problem for anyone who wants to be able to read someone else's code.
Bear in mind I am predominantly a C programmer who uses Python from time to time for tools and glue. If I have to put in effort to learn new special-case rules in Python, that's an active discouragement; I'm frankly unlikely to bother, and more likely to write those tools and glue in C instead. I'm certainly much less likely to use someone else's tools and glue if I have to re-read the spec to remind myself what all the gotchas are.
On my personal "potentially inscrutable uses of a tool" this still rates well below list comprehensions, so there's that; the biggest pet peeve I have anymore is understanding at a glance what is and isn't an assignment. This is a draft PEP and there is a lot of discussion around making assignment vs matched classes more explicit, so it's not like this is going to be set in stone, and I doubt that most people will ever use the more esoteric parts of the syntax. One way or another, this is going to be a far more capable, and thus complex, tool than a switch statement, so there's only so much obviousness you can ask for coming in blind.
On 25/06/2020 23:20, Emily Bowman wrote:
On my personal "potentially inscrutable uses of a tool" this still rates well below list comprehensions, so there's that; the biggest pet peeve I
Clearly YMMV. To me list comprehensions like "[f(x) for x in l]" were obviously related to the "f(x) ∀ x ∊ l" familiar from my maths degree.
have anymore is understanding at a glance what is and isn't an assignment.
Yes, this does seem to be a lot of people's problem. My point is we get to that position one step at a time, so maybe we should be examining the first steps in that chain and re-evaluating whether they were in fact the right ones, given where we end up. I accept the PEP's general point that name patterns will be common, but I don't think something like "case int as x:" is hard to write and it brings in the idea that we are talking about types right at the start. -- Rhodri James *-* Kynesim Ltd
On 26/06/20 1:18 am, Rhodri James wrote:
I will quickly and regularly forget that in this one place, "_" is special.
You don't have to remember that it's special to understand what 'case _' does. Even if it were treated as an ordinary name, it would still have the effect of matching anything. -- Greg
On 25/06/2020 15:42, Greg Ewing wrote:
On 26/06/20 1:18 am, Rhodri James wrote:
I will quickly and regularly forget that in this one place, "_" is special.
You don't have to remember that it's special to understand what 'case _' does. Even if it were treated as an ordinary name, it would still have the effect of matching anything.
Maybe. It's still ugly. -- Rhodri James *-* Kynesim Ltd
On 6/25/20 10:42 AM, Greg Ewing wrote:
On 26/06/20 1:18 am, Rhodri James wrote:
I will quickly and regularly forget that in this one place, "_" is special.
You don't have to remember that it's special to understand what 'case _' does. Even if it were treated as an ordinary name, it would still have the effect of matching anything.
Actually, you could make _ less special by still binding the value to it, just make it special in that you allow several values to be bound, and maybe just define that the result will be just one of the values, maybe even specify which if you want. -- Richard Damon
On Thu, Jun 25, 2020 at 3:41 PM Richard Damon <Richard@damon-family.org> wrote:
Actually, you could make _ less special by still binding the value to
it, just make it special in that you allow several values to be bound, and maybe just define that the result will be just one of the values, maybe even specify which if you want.
Like Guido said above, the problem is that _ is already effectively reserved for translated text. Combining the two would feel a bit weird, but should still be possible.
On 6/25/20 6:48 PM, Emily Bowman wrote:
On Thu, Jun 25, 2020 at 3:41 PM Richard Damon <Richard@damon-family.org <mailto:Richard@damon-family.org>> wrote:
Actually, you could make _ less special by still binding the value to
it, just make it special in that you allow several values to be bound, and maybe just define that the result will be just one of the values, maybe even specify which if you want.
Like Guido said above, the problem is that _ is already effectively reserved for translated text. Combining the two would feel a bit weird, but should still be possible.
I thought _ was also commonly used as:

    first, _, last = (1, 2, 3)

as a generic "don't care" assignment target. I guess that since the above creates a local, it does not overwrite a 'global' function _ for translations, so this usage works as long as that function (or whatever namespace you are in) doesn't use _ for translations. As long as the bindings in match also make the symbol a local (which seems reasonable), you would get a similar restriction.

-- Richard Damon
On Thu, Jun 25, 2020 at 8:31 PM Richard Damon <Richard@damon-family.org> wrote:
I thought _ was also commonly used as:
first, _, last = (1, 2, 3)
as a generic don't care about assignment. I guess since the above will create a local, so not overwrite a 'global' function _ for translations, so the above usage works as long as that function (or whatever namespace you are in) doesn't use _ for translations. As long as the bindings in match also make the symbol a local (which seems reasonable) then you would get a similar restriction.
The PEP currently says: "The named class must inherit from type. It may be a single name or a dotted name (e.g. some_mod.SomeClass or mod.pkg.Class). The leading name must not be _, so e.g. _(...) and _.C(...) are invalid. Use object(foo=_) to check whether the matched object has an attribute foo."
Whoops, meant to reply to Gregory on that one, sorry Richard. On Thu, Jun 25, 2020 at 7:15 PM Gregory P. Smith <greg@krypto.org> wrote:
Can I use an i18n'd _("string") within a case without jumping through hoops to assign it to a name before the match:?
The PEP currently says:
"The named class must inherit from type. It may be a single name or a dotted name (e.g. some_mod.SomeClass or mod.pkg.Class). The leading name must not be _, so e.g. _(...) and _.C(...) are invalid. Use object(foo=_) to check whether the matched object has an attribute foo."
Richard Damon writes:
I thought _ was also commonly used as:
first, _, last = (1, 2, 3)
as a generic don't care about assignment.
It is. But there are other options (eg, 'ignored') if '_' is used for translation in the same scope.
I guess since the above will create a local, so not overwrite a 'global' function _ for translations, so the above usage works as long as that function (or whatever namespace you are in) doesn't use _ for translations.
Exactly.
As long as the bindings in match also make the symbol a local (which seems reasonable) then you would get a similar restriction.
It's quite different. First, it surely won't make other symbols match-local. Of course there will be times when you do all the work inside the match statement. But often you'll want to do bindings in a match statement, then use those outside. The second problem is that this use of '_' isn't optional. It's part of the syntax. That means that you can't use the traditional marking of a translateable string (and it's not just tradition; there is a lot of external software that expects it) in that scope. So it's practically important, if not theoretically necessary, that 'case _' not bind '_'. Steve
On 6/27/20 5:36 AM, Stephen J. Turnbull wrote:
Richard Damon writes:
I thought _ was also commonly used as:
first, _, last = (1, 2, 3)
as a generic don't care about assignment.
It is. But there are other options (eg, 'ignored') if '_' is used for translation in the same scope.
I guess since the above will create a local, so not overwrite a 'global' function _ for translations, so the above usage works as long as that function (or whatever namespace you are in) doesn't use _ for translations.
Exactly.
As long as the bindings in match also make the symbol a local (which seems reasonable) then you would get a similar restriction.
It's quite different. First, it surely won't make other symbols match-local. Of course there will be times when you do all the work inside the match statement. But often you'll want to do bindings in a match statement, then use those outside. The second problem is that this use of '_' isn't optional. It's part of the syntax. That means that you can't use the traditional marking of a translateable string (and it's not just tradition; there is a lot of external software that expects it) in that scope.
So it's practically important, if not theoretically necessary, that 'case _' not bind '_'.
Steve
I wasn't implying local to the match statement. But suppose the match is used inside a function, where using the binding operator = will create a local name even if there is a corresponding global name that matches (unless you use the global statement). Will a match statement that binds to a name that hasn't been made a local name by an explicit assignment actually bind to a global that might be present, or will it create a local? My first feeling is that binding to the global would be surprising.

i.e.

    foo = 1

    def bar(baz):
        match baz:
            case 1: print('baz was one')
            case foo: print('baz was ', foo)

    bar(2)
    print(foo)

Will this script create a new foo name inside bar, so that when we return, the module global foo is still 1, or did we bind to the global and change it?

Rebinding a global without a global statement would be unexpected (normally we can mutate the global, but not rebind it)

-- Richard Damon
Em sáb., 27 de jun. de 2020 às 11:12, Richard Damon < Richard@damon-family.org> escreveu:
On 6/27/20 5:36 AM, Stephen J. Turnbull wrote:
Richard Damon writes:
I thought _ was also commonly used as:
first, _, last = (1, 2, 3)
as a generic don't care about assignment.
It is. But there are other options (eg, 'ignored') if '_' is used for translation in the same scope.
I guess since the above will create a local, so not overwrite a 'global' function _ for translations, so the above usage works as long as that function (or whatever namespace you are in) doesn't use _ for translations.
Exactly.
As long as the bindings in match also make the symbol a local (which seems reasonable) then you would get a similar restriction.
It's quite different. First, it surely won't make other symbols match-local. Of course there will be times when you do all the work inside the match statement. But often you'll want to do bindings in a match statement, then use those outside. The second problem is that this use of '_' isn't optional. It's part of the syntax. That means that you can't use the traditional marking of a translateable string (and it's not just tradition; there is a lot of external software that expects it) in that scope.
So it's practically important, if not theoretically necessary, that 'case _' not bind '_'.
Steve
I wasn't implying local to the match statement. But suppose the match is used inside a function, where using the binding operator = will create a local name even if there is a corresponding global name that matches (unless you use the global statement). Will a match statement that binds to a name that hasn't been made a local name by an explicit assignment actually bind to a global that might be present, or will it create a local? My first feeling is that binding to the global would be surprising.
i.e.
    foo = 1
    def bar(baz):
        match baz:
            case 1: print('baz was one')
            case foo: print('baz was ', foo)
    bar(2)
    print(foo)
Will this script create a new foo name inside bar, so that when we return, the module global foo is still 1, or did we bind to the global and change it?
Rebinding a global without a global statement would be unexpected (normally we can mutate the global, but not rebind it)
-- Richard Damon
I think that global binding makes no sense; it would break a lot of code silently. Think about this:

    def bar(baz):
        match baz:
            case bar: pass

IMHO, the most obvious solution is that the binding should be available only inside the case block, and if you need to change a global or a nonlocal you do so explicitly inside the case block. In that case you can pick a bound name that doesn't shadow the desired variable. This way the intention to overwrite a global/nonlocal is clear in the code.

-- “If you're going to try, go all the way. Otherwise, don't even start. ..." Charles Bukowski
Daniel. writes:
IMHO, the most obvious solution is that the bind should be available only inside case block and if you need to change a global or a nonlocal you do this explicitly inside the case block,
Do you mean

    case x: x = x

?
if this is the case you can pickup a bind name that doesn't shadow the desired variable.
But you can already do that in the case clause itself.
This way the intention to overwrite a global/nonlocal is clear in code
But you can't "overwrite" a global or nonlocal without a declaration. You can only shadow it, and as I wrote above, if you want to avoid shadowing that global or nonlocal you choose a different name in the case clause.

Regarding the proposal itself, it's a major change, because that's not the way binding works anywhere else in Python. There are two differences. The first is that there are no statements besides class and def that create scopes, and the other two scopes that I know of are module and comprehension (I seem to recall that comprehension scope is based on the idea that the code inside the brackets is actually syntactic sugar for a generator def, so that special case would be derivative of function scope).

The second difference is that you can't change the binding of a name in an outer local scope without a nonlocal declaration. That means if case scope follows that rule and you want to have a binding inside the case scope that persists outside it, you'd need to declare it. Probably you don't want to have to declare those "external" names, so the names bound by the case clause would be very special. I haven't thought carefully about it, but it seems to me this could be a bug magnet.

So, you could do this, but it would make scoping much more complex than it currently is, and require a lot of redundancy (every binding you want to preserve for later use would need to be made twice, once in the case clause and again in the case body).

And I don't think it's that useful. It doesn't help with the I18N problem, since you might want to use a marked string in the "case _" suite. And the main complexity from the current scoping rules comes from the fact that it seems likely that different arms of the match will bind different sets of names, so NameErrors become more likely in the following code. But your proposal doesn't help with that, and may make it more likely by proliferating names.

Regards, Steve
Richard Damon writes:
I wasn't imply local to the match statement, but if the match is used inside a function, where using the binding operatior = will create a local name, even if there is a corresponding global name that matches (unless you use the global statement), will a match statement that binds to a name that hasn't bee made a local name by having an explicit assignment to it, actually bind to a global that might be present, or will it create a local?
All names are global, in some relevant sense. It's the bindings to objects that are scoped, and a binding in an inner scope shadows that of an outer scope. This just works, as I'm sure you've experienced. In this case, the match statement will create a binding in the current scope (in the case you present, local scope for that function). The problem for internationalization is not your example:
    match baz:
        case 1: print('baz was one')
        case foo: print('baz was ', foo)
but this kind of situation:

    match baz:
        case 1: print(_('baz was one'))
        case _: print(_('baz was '), _(foo))

where _() marks a string that should be translated to another language, and also implements the lookup at runtime. If "case _" binds "_", the _() in the print statement in that arm of the match will very likely raise, and later _() will as well, until the end of the scope. It's very unlikely it will do what's desired!

Do translatable strings have to be marked with _()? In theory, no, in fact "_" is an alias for the gettext function, which could also be used. But in practice the marking aspect is used by a wide variety of translation support software searching for strings to translate, and it's also important to the readability of strings in the source that the mark be as lightweight as possible.

So for internationalization it's useful that "case _" does not bind an object to the name "_". It's a very special case, and it's fortunate that it works out this way that there's no conflict between the two uses of "_". Or maybe Lady Fortune is Dutch. :-)

Steve
On 27/06/2020 10:36, Stephen J. Turnbull wrote:
Richard Damon writes:
As long as the bindings in match also make the symbol a local (which seems reasonable) then you would get a similar restriction.
It's quite different. First, it surely won't make other symbols match-local. Of course there will be times when you do all the work inside the match statement. But often you'll want to do bindings in a match statement, then use those outside. The second problem is that this use of '_' isn't optional. It's part of the syntax. That means that you can't use the traditional marking of a translateable string (and it's not just tradition; there is a lot of external software that expects it) in that scope.
So it's practically important, if not theoretically necessary, that 'case _' not bind '_'.
That's the clearest explanation of why "_" needs to be treated carefully, but I don't think it argues for the PEP's special treatment. Those people like me who just write for ourselves and don't care about internationalisation use "_" like any other variable with a strong implication that it's a dummy, so don't really care. Those people like you who care about internationalisation presumably avoid using "_" anyway, so the PEP's usage goes against your current instincts. -- Rhodri James *-* Kynesim Ltd
Rhodri James writes:
That's the clearest explanation of why "_" needs to be treated carefully, but I don't think it argues for the PEP's special treatment.
That depends on whether you care about taking advantage of the convention that "_" is a dummy. In fact, _ = gettext partakes of that convention: all the programmer need know about internationalization is that non-English speakers might like to read the string in their own language. From her point of view, _() is a no-op aka dummy.
Those people like me who just write for ourselves and don't care about internationalisation use "_" like any other variable with a strong implication that it's a dummy, so don't really care. Those people like you who care about internationalisation presumably avoid using "_" anyway, so the PEP's usage goes against your current instincts.
I can't speak for others, but I use "_" as a dummy all the time. Of course that means I need to take care to use a different convention in code that assumes _ == gettext, but it's rarely needed in my experience. But if the use of _ as a dummy in "case _" becomes syntax, I can't use a different dummy, can I? Or can I use a different dummy (such as "xx" or "__") at the expense of binding it? With the non-binding treatment of "case _", I don't have to worry about it. Steve
On Thu, 25 Jun 2020, Richard Damon wrote:
On 6/25/20 10:42 AM, Greg Ewing wrote:
On 26/06/20 1:18 am, Rhodri James wrote:
I will quickly and regularly forget that in this one place, "_" is special.
You don't have to remember that it's special to understand what 'case _' does. Even if it were treated as an ordinary name, it would still have the effect of matching anything.
Actually, you could make _ less special by still binding the value to it, just make it special in that you allow several values to be bound, and maybe just define that the result will be just one of the values, maybe even specify which if you want.
We already allow (x, x) = (1, 2) So, why do we need to disallow binding several values to the same name ? Without the restriction, there's no need for _ to be special, and anyone using _ for something else, can use some other dummy for matching. /Paul
On 26/06/20 1:08 pm, Paul Svensson wrote:
We already allow (x, x) = (1, 2) So, why do we need to disallow binding several values to the same name ?
I think it was done because people might expect that to match only if the *same* value appears in both places. Some other languages have pattern matching that works that way. I think the intention is to leave open the possibility of implementing it in the future. -- Greg
On 2020-06-24 23:14, Ethan Furman wrote:
On 06/24/2020 01:49 PM, Tim Peters wrote:
I too thought "why not else:?" at first. But "case _:" covers it in the one obvious way after grasping how general wildcard matches are.
"case _:" is easy to miss -- I missed it several times reading through the PEP.
Introducing "else:" too would be adding a wart (redundancy) just to stop shallow-first-impression whining.
Huh. I would consider "case _:" to be the wart, especially since "case default:" or "case anything:" or "case i_dont_care:" all do basically the same thing (although they bind to the given name, while _ does not bind to anything, but of what practical importance is that?) .[snip]
The point of '_' is that it can be used any number of times in a pattern:

    case (_, _):

This is not allowed:

    case (x, x):

When a pattern matches, binding occurs, and why bind to a name when you don't need/want the value?
Ethan Furman writes:
_ does not bind to anything, but of what practical importance is that?
*sigh* English speakers ... mutter ... mutter ... *long sigh* It's absolutely essential to the use of the identifier "_", otherwise the I18N community would riot in the streets of Pittsburgh. Not good TV for Python (and if Python isn't the best TV, what good is it? ;-) Steve
On Wed, Jun 24, 2020 at 7:58 PM Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Ethan Furman writes:
_ does not bind to anything, but of what practical importance is that?
*sigh* English speakers ... mutter ... mutter ... *long sigh*
It's absolutely essential to the use of the identifier "_", otherwise the I18N community would riot in the streets of Pittsburgh. Not good TV for Python (and if Python isn't the best TV, what good is it? ;-)
Can I use an i18n'd _("string") within a case without jumping through hoops to assign it to a name before the match:?
Steve
Litmus test: Give someone who does not know Python this code example from the PEP and ask them what it does and why it does what it does:

    match get_shape():
        case Line(start := Point(x, y), end) if start == end:
            print(f"Zero length line at {x}, {y}")

I expect confusion to be the result. If they don't blindly assume the variables come from somewhere not shown to stop their anguish.

With Python experience, my own reading is:

* I see start actually being assigned.
* I see nothing giving values to end, x, or y.
* Line and Point are things being called, probably class constructions due to being Capitalized.
* But where did the parameter values come from and why and how can end be referred to in a conditional when it doesn't exist yet? They appear to be magic!

Did get_shape() return these? (I think not.) Something magic and *implicit rather than explicit* happens in later lines. The opposite of what Python is known for.

Where's the pseudo-code describing *exactly* what the above looks like logically speaking? (There's a TODO in the PEP for the __match__ protocol code so I assume it will come, thanks!) I can guess _only_ after reading a bunch of these discussions and bits of the PEP. Is it this? I can't tell.

    shape = get_shape()
    values_or_none = Line.__match__(shape)
    if values_or_none:
        start, end = values_or_none
        if start == end:
            if x, y := Point.__match__(shape):
                print(...)
                del x, y
            else:
                print(...)
        del start, end
    else:
        # ... onto the next case: ?

Someone unfamiliar with Python wouldn't even have a chance of seeing that. I had to rewrite the above many times, I'm probably still wrong.

That sample is very confusing code. It makes me lean -1 on the PEP overall today. This syntax does not lead to readable logically understandable code. I wouldn't encourage anyone to write code that way because it is not understandable to others. We must never assume others are experts in the language they are working on code in if we want it to be maintainable.

I wouldn't approve a code review containing that example.

It would help *in part* if ()s were not used to invoke the __match__ protocol. I think a couple others also mentioned this earlier. Don't make it look like a call. Use different tokens than (). Point{x, y} for example. Or some way to use another token unused in that context in our toolbox such as @ to signify "get a matcher for this class" instead of "construct this class". For example ClassName@() as our match protocol indicator, shown here with explicit assignments for clarity:

    match get_shape() as shape:
        case start, end := Line@(shape):

No implicit assignments, it is clear where everything comes from. It is clear it isn't a constructor call. Downside? Possibly a painful bug when someone forgets to type the @. But the point of it not being construction needs to be made. Not using ()s but instead using ClassName@{} or just ClassName{} would prevent that.

The more nested things get with sub-patterns, the worse the confusion becomes. The nesting sounds powerful but is frankly something I'd want to forbid anyone from using when the assignment consequences are implicit. So why implement sub-patterns at all? All I see right now is pain.

-gps
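For readers asking the same "what does it expand to?" question, here is one hand-written expansion of the PEP's `Line(start := Point(x, y), end) if start == end` case. It is only a sketch under stated assumptions: it follows the class-pattern behaviour that eventually shipped in Python 3.10 (an `isinstance()` check followed by attribute lookups), not the `__match__` protocol of this draft, and the `Point` and `Line` dataclasses and the `describe` function are illustrative stand-ins rather than anything from the PEP.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Line:
    start: Point
    end: Point


def describe(shape):
    """Spell out, step by step, what the single case clause checks."""
    # Line(...): the subject must be a Line instance. Nothing is constructed.
    if isinstance(shape, Line):
        # The two positional sub-patterns read the `start` and `end` attributes.
        start = shape.start
        end = shape.end
        # start := Point(x, y): the `start` attribute must itself be a Point,
        # and its `x` and `y` attributes are bound to the names x and y.
        if isinstance(start, Point):
            x, y = start.x, start.y
            # The guard runs last, after every name has been bound.
            if start == end:
                return f"Zero length line at {x}, {y}"
    return "no match"
```

The point of the expansion is that every name on the pattern side is an assignment target filled in from the subject; none of them is read as an argument.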
On Fri, 26 Jun 2020 at 09:07, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 26/06/20 2:10 pm, Gregory P. Smith wrote:
    match get_shape() as shape:
        case start, end := Line@(shape):
This looks just as inscrutable to me in its own way.
Absolutely, but that's kind of the point I think: there is no possible way to mistake it for something other than what it means.

Arthur
On Fri, 26 Jun 2020 at 03:37, Gregory P. Smith <greg@krypto.org> wrote:
Litmus test: Give someone who does not know Python this code example from the PEP and ask them what it does and why it does what it does:
    match get_shape():
        case Line(start := Point(x, y), end) if start == end:
            print(f"Zero length line at {x}, {y}")
I expect confusion to be the result. If they don't blindly assume the variables come from somewhere not shown to stop their anguish.
With Python experience, my own reading is:
With Python experience *and limited use of other languages with match constructs*, this reads naturally to me and makes instant sense. Clearly it's unreasonable to expect all users to have multi-language experience, but I'd argue that my experience demonstrates that "limited use of constructs like this" is enough to understand the proposed syntax. And therefore, that once it's added to Python, it won't take long for people to become sufficiently familiar with it to handle fairly complex examples. Paul
On Wed, Jun 24, 2020 at 3:15 PM Ethan Furman <ethan@stoneleaf.us> wrote:
I too thought "why not else:?" at first. But "case _:" covers it in the one obvious way after grasping how general wildcard matches are.
"case _:" is easy to miss -- I missed it several times reading through the PEP.
Introducing "else:" too would be adding a wart (redundancy) just to stop shallow-first-impression whining.
Huh. I would consider "case _:" to be the wart, especially since "case default:" or "case anything:" or "case i_dont_care:" all do basically the same thing (although they bind to the given name, while _ does not bind to anything, but of what practical importance is that?)
There's always making everyone equally annoyed, and allowing 'case else:' (not a very serious suggestion, though it does have a certain sort of symmetry).
e.g., "or", and then I wonder "what does short-circuiting have to do with it?". All reuse of symbols carries baggage.
"or" brings an intuition of the execution order of pattern matching, just like how people already know about "short-circuiting".

The operator precedence of "or" also suggests the syntax of OR patterns.

As we have "|" as an existing operator, there might be cases where the precedence of "|" in a pattern is not consistent with its precedence in an expression. This will mislead users.

You said "All reuse of symbols carries baggage". I'd say: all **inconsistent** reuse of symbols carries baggage, but consistent reuse builds good intuitive sense and shows the good taste of designers.
[Taine Zhao <yaoxiansamma@gmail.com>]
"or" brings an intuition of the execution order of pattern matching, just like how people already know about "short-circuiting".
The operator precedence of "or" also suggests the syntax of OR patterns.
As we have "|" as an existing operator, there might be cases where the precedence of "|" in a pattern is not consistent with its precedence in an expression. This will mislead users.
You said "All reuse of symbols carries baggage", I'd say,
All **inconsistent** reuse of symbols carries baggage, but the consistent reuse builds good intuitive sense and shows the good taste of designers.
We're not talking about abstract computation here: this is a specific feature, and "|" is the _only_ infix operator. The PEP considered and rejected "&" and a unary "not", so that's the universe we're left with. With only one "operator", it's really hard to "mislead" ;-)

In any case, the model here is far more regular expressions than Python int arithmetic or set unions. "|" means essentially the same thing in the PEP as it does in Python regexps: try subpatterns one at a time, left to right.
I don't mean to be rude, but I would like to chip in and back up Taine here.

The 'or' operator:
- Already unambiguously associated with a logical OR, which is effectively what takes place in this circumstance. Using a different symbol to have the same effect is bound to be confusing to a reasonably large number of people.

The '|' character:
- Saves one character, but I think it would be easier to miss compared to a keyword which is easy to highlight in an editor.
- Its existing usages are very context-specific (between two dict objects, within a regex expression) - using it here I think would start to bring it into the language in a more general way. Because of this, I think a more holistic discussion on its place within Python is appropriate before it is reached for as a bespoke solution.
After reading a GitHub discussion on the matter (https://github.com/gvanrossum/patma/issues/93), '|' now makes sense to me instead of 'or':

- The use of the '|' character in Python seems to be going towards a union-like operator (dict merging, PEP 604), which is definitely appropriate here.
- The clarity of '|' and 'or' can go either way subjectively, but a use-case I hadn't considered in my previous comment was nested pattern matching, e.g.

      case (1 | 2 | 3, x) | (4 | 5 | 6, x):
      case (1 or 2 or 3, x) or (4 or 5 or 6, x):

  To me, '|' seems at least as clear as 'or' in this case, since it can be read faster and doesn't chew up space. Spaces before and after '|' definitely help.
- As Tim noted, 'or' would exist as a solitary logical function in the current setup. Although, a different GitHub discussion could possibly resurrect 'and' (https://github.com/gvanrossum/patma/issues/97).
On Wed, Jun 24, 2020 at 7:34 PM Taine Zhao <yaoxiansamma@gmail.com> wrote:
e.g., "or", and then I wonder "what does short-circuiting have to do with it?". All reuse of symbols carries baggage.
"or" brings an intuition of the execution order of pattern matching, just like how people already know about "short-circuiting".
The operator precedence of "or" also suggests the syntax of OR patterns.
As we have "|" as an existing operator, there might be cases where the precedence of "|" in a pattern is not consistent with its precedence in an expression. This will mislead users.
I prefer "or" to "|" as a combining token as there is nothing bitwise going on here. "or" reads better. Which is why Python used it for logic operations in the first place. It is simple English. "|" does not read like or to anyone but a C-based language programmer. Something Python users should never need to know.

There is no existing pythonic way to write "evaluate all of these things at once in no specific order". And in reality, there will be an order. It'll be sequential, and if it isn't the left to right order that things are written with the "|" between them, it will break someone's assumptions and make optimization harder. Some too-clever-for-the-world's-own-good author is going to implement __match__ classmethods that have side effects and make state changes that impact the behavior of later matcher calls (no sympathy for them). Someone else is going to order them most likely to least likely for performance (we should have sympathy for that).

Given we only propose to allow a single trailing "if" guard per case, using "or" instead of "|" won't be confused with an if's guard condition.

-gps
On Jun 25, 2020, at 21:33, Gregory P. Smith <greg@krypto.org> wrote:
I prefer "or" to "|" as a combining token as there is nothing bitwise going on here. "or" reads better. Which is why Python used it for logic operations in the first place. It is simple English. "|" does not read like or to anyone but a C based language programmer. Something Python users should never need to know.
Hasn't "|" been used in a similar way for decades in Unix/POSIX shell patterns, including with the shell case statement? From IEEE Std 1003.1 -> Shell Command Language:

"The conditional construct case shall execute the compound-list corresponding to the first one of several patterns (see Pattern Matching Notation) that is matched by the string resulting from the tilde expansion, parameter expansion, command substitution, arithmetic expansion, and quote removal of the given word. The reserved word in shall denote the beginning of the patterns to be matched. Multiple patterns with the same compound-list shall be delimited by the '|' symbol. The control operator ')' terminates a list of patterns corresponding to a given action. The compound-list for each list of patterns, with the possible exception of the last, shall be terminated with ";;". The case construct terminates with the reserved word esac (case reversed).

The format for the case construct is as follows:

    case word in
        [(] pattern1 ) compound-list ;;
        [[(] pattern[ | pattern] ... ) compound-list ;;] ...
        [[(] pattern[ | pattern] ... ) compound-list]
    esac
"

    case ${branch} in
        37|3.7)
            pushd "$HOME/dev/37/source"
            ;;
        38|3.8)
            pushd "$HOME/dev/38/source"
            ;;
        39|3.9)
            pushd "$HOME/dev/39/source"
            ;;
        3x|3.x|master)
            pushd "$HOME/dev/3x/source"
            ;;
        *)
            echo "unknown branch ${branch}"
            ;;
    esac

-- Ned Deily nad@python.org -- []
On 25/06/20 8:49 am, Tim Peters wrote:
Spell it, e.g., "or", and then I wonder "what does short-circuiting have to do with it?".
Well, it does short-circuit -- if one alternative matches, it doesn't bother with any subsequent ones. Which could be considered an argument for using "or" instead of "|".
Names in destructuring constructs are overwhelmingly intended as lvalues,
That assumes the match statement is going to be used in a predominantly destructuring way, which I don't think is a foregone conclusion. In a destructuring assignment, *all* the names on the LHS are assignment targets, and *none* on the RHS are, which makes it easy to see what is being assigned to. But with these new pattern expressions, either can appear anywhere in any mixture. While trying to grok the PEP, I was really struggling to look at something like Point(x = a) and decide whether x or a or neither was being assigned to. Which makes me think there should be something *really* clear and unambiguous to indicate when assignment is happening. -- Greg
Thank you very much to Brandt, Tobias, Ivan, Guido, and Talin for the extensive work on this PEP. The attention to detail and effort that went into establishing the real-world usefulness of this feature (such as the many excellent examples and code analysis) helps a lot to justify the complexity of the proposed pattern matching feature(s). The concern about the added complexity to the language is certainly reasonable (particularly in the sphere of researchers and other non-developers), but I think this is a case where practicality will ultimately win. Overall, I am +1 for this PEP.

That being said, I would like to state a few opinions (with #2 being my strongest one):

1) I was initially in agreement about the usage of "else:" instead of "case _:", but upon further consideration, I don't think "else:" holds up very well in the context of pattern matching. Although it could be confusing at a first glance (admittedly, it threw me off at first), an underscore makes far more sense as a wildcard match; especially considering the existing precedent.

2) Regarding the constant value pattern semantics, I'm okay with the usage of the "." in general, but I completely agree with several others that it's rather difficult to read when there's a leading period with a single word, e.g. ".CONSTANT". To some degree, this could probably be less problematic with some reasonably good syntax highlighting to draw attention to the leading period.

However, I don't think it should be at all necessary for people to rely on syntax highlighting to be able to clearly see something that's part of a core Python language feature. It seems especially detrimental for those with visual impairment. As someone with relatively poor eye-sight who typically has to blow up the font size for my code to be readable (and often finds syntax highlighting to be distracting), I'm not really looking forward to squinting for missed leading periods when it was intended to refer to a constant reference. Even if it's a relatively uncommon case, with a core feature, it's bound to happen enough to cause some headaches.

From the "Rejected Ideas" section:
Introduce a special symbol, for example $ or ^ to indicate that a given name is a constant to be matched against, not to be assigned to:
    FOO = 1
    value = 0

    match value:
        case $FOO:  # This would not be matched
            ...
        case BAR:   # This would be matched
            ...
The problem with this approach is that introducing a new syntax for such narrow use-case is probably an overkill.
I can certainly understand that it seems overkill to have a separate symbol for this, but in my biased opinion, I think it's worth a stronger consideration from the perspective of those with some degree of visual impairment. I don't have a strong opinion about the specific larger symbol that should be used instead, but either of "$" or "^" would be perfectly fine by me. I'd be on-board with anything that doesn't have a strong existing purpose in the language.

3) Regarding the 6 before vs. after examples provided by Guido, I have some thoughts on the last one:
Original:
    def flatten(self) -> Rhs:
        # If it's a single parenthesized group, flatten it.
        rhs = self.rhs
        if (
            not self.is_loop()
            and len(rhs.alts) == 1
            and len(rhs.alts[0].items) == 1
            and isinstance(rhs.alts[0].items[0].item, Group)
        ):
            rhs = rhs.alts[0].items[0].item.rhs
        return rhs
Converted (note that I had to name the classes Alt and NamedItem, which are anonymous in the original):
    def flatten(self) -> Rhs:
        # If it's a single parenthesized group, flatten it.
        rhs = self.rhs
        if not self.is_loop():
            match rhs.alts:
                case [Alt(items=[NamedItem(item=Group(rhs=r))])]:
                    rhs = r
        return rhs
I think part of it is just that I tend to find anything that has 4+ layers deep of nested parentheses and/or brackets to be a bit difficult to mentally parse, but my reaction to seeing something like "case [Alt(items=[NamedItem(item=Group(rhs=r))])]:" in the wild without anything to compare it to would probably be o_0. I definitely find the 4-part conditional in the "Original" version to be a lot easier to quickly understand, even if it's a bit redundant and requires some guard checks. So IMHO, that specific example isn't particularly convincing. That being said, I found the other 5 examples to be very easy to understand, with the second one being the one to really win me over. The proposed class matching is a drastic improvement over a massive wall of "if/elif isinstance(...):" conditionals, and I really like the way it lines up visually with the constants. Also, in time, I could very well change my mind about the last example after getting more used to the proposed syntax.
On 6/25/2020 3:55 AM, Kyle Stanley wrote:
2) Regarding the constant value pattern semantics, I'm okay with the usage of the "." in general, but I completely agree with several others that it's rather difficult to read when there's a leading period with a single word, e.g. ".CONSTANT". To some degree, this could probably be less problematic with some reasonably good syntax highlighting to draw attention to the leading period.
However, I don't think it should be at all necessary for people to rely on syntax highlighting to be able to clearly see something that's part of a core Python language feature. It seems especially detrimental for those with visual impairment. As someone with relatively poor eye-sight who typically has to blow up the font size for my code to be readable (and often finds syntax highlighting to be distracting), I'm not really looking forward to squinting for missed leading periods when it was intended to refer to a constant reference. Even if it's a relatively uncommon case, with a core feature, it's bound to happen enough to cause some headaches.
A missing . is exactly the type of mistake I tend to make. It is also the type of mistake that I could stare at endlessly and not notice. Surely there could be a much more obvious way of doing things. Other than this . issue, the PEP is great! I look forward to using match. --Edwin Zimmerman
Some proof-reading of the PEP. I apologise if this is out of date.

1) In the beginning of the "Mapping Pattern" section:

    "{" (pattern ":" pattern)+ "}"

This is spelt inconsistently: there is a `+` before the closing `}` but not after the opening `{`.

2) The second code snippet in the "Guards" section:

    values = [0]

    match value:
        case [x] if x:
            ...  # This is not executed
        case _:
            ...
    print(x)  # This will print "0"

Inconsistent spelling: `values` and `value`

3) At the end of the "Named sub-patterns" section: "PEP 572"

It would be more helpful to say "PEP 572 (Assignment Expressions)"

Rob Cliffe
On Fri, Jun 26, 2020 at 12:42 AM Rob Cliffe via Python-Dev < python-dev@python.org> wrote:
1) In the beginning of the "Mapping Pattern" section: "{" (pattern ":" pattern)+ "}" This is spelt inconsistently: there is a `+` before the closing `}` but not after the opening `{`.
That grammar is more like a regex: + means "accept one or more of the previous" here. Maybe grammar inserts should be their own subsections to avoid confusion, or at least highlighted, instead of mixing into the text.
In this example code from the PEP:

    match shape:
        case Point(x, y):
            ...
        case Rectangle(x0, y0, x1, y1, painted=True):

What is the "painted=True" portion doing? Is it requiring that the painted attribute of the shape object be True in order to match?

-- ~Ethan~
Just a minor editorial thing on the PEP text: The section https://www.python.org/dev/peps/pep-0622/#case-clauses presents a simplified syntax. That one mentions "group_pattern", but the document never mentions (in prose) what a group pattern is. It confused me until I found the definition in the full grammar, which seems to refer to those sequence patterns using () rather than []. Probably it makes more sense for a quick read to remove the "| group_pattern" from the simplified grammar, it looks more like an intermediate construct. On Tue, 23 Jun 2020 at 17:04, Guido van Rossum <guido@python.org> wrote:
I'm happy to present a new PEP for the python-dev community to review. This is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern matching similar to that found in Scala, Rust, F#, Haskell and other languages with a functional flavor. The topic has come up regularly on python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself: - Published: https://www.python.org/dev/peps/pep-0622/ (*) - Source: https://github.com/python/peps/blob/master/pep-0622.rst
(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is enormous. For many key decisions the authors have clashed, in some cases we have gone back and forth several times, and a few uncomfortable compromises were struck. It is quite possible that some major design decisions will have to be revisited before this PEP can be accepted. Nevertheless, we're happy with the current proposal, and we have provided ample discussion in the PEP under the headings of Rejected Ideas and Deferred Ideas. Please read those before proposing changes!
I'd like to end with the contents of the README of the repo where we've worked on the draft, which is shorter and gives a gentler introduction than the PEP itself:
# Pattern Matching
This repo contains a draft PEP proposing a `match` statement.
Origins -------
The work has several origins:
- Many statically compiled languages (especially functional ones) have a `match` expression, for example [Scala]( http://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html ), [Rust](https://doc.rust-lang.org/reference/expressions/match-expr.html), [F#]( https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-ma... ); - Several extensive discussions on python-ideas, culminating in a summarizing [blog post]( https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-python... ) by Tobias Kohn; - An independently developed [draft PEP]( https://github.com/ilevkivskyi/peps/blob/pattern-matching/pep-9999.rst) by Ivan Levkivskyi.
Implementation --------------
A full reference implementation written by Brandt Bucher is available as a [fork]((https://github.com/brandtbucher/cpython/tree/patma)) of the CPython repo. This is readily converted to a [pull request](https://github.com/brandtbucher/cpython/pull/2)).
Examples --------
Some [example code]( https://github.com/gvanrossum/patma/tree/master/examples/) is available from this repo.
Tutorial --------
A `match` statement takes an expression and compares it to successive patterns given as one or more `case` blocks. This is superficially similar to a `switch` statement in C, Java or JavaScript (an many other languages), but much more powerful.
The simplest form compares a target value against one or more literals:
```py def http_error(status): match status: case 400: return "Bad request" case 401: return "Unauthorized" case 403: return "Forbidden" case 404: return "Not found" case 418: return "I'm a teapot" case _: return "Something else" ```
Note the last block: the "variable name" `_` acts as a *wildcard* and never fails to match.
You can combine several literals in a single pattern using `|` ("or"):
```py
case 401|403|404:
    return "Not allowed"
```
Patterns can look like unpacking assignments, and can be used to bind variables:
```py
# The target is an (x, y) tuple
match point:
    case (0, 0):
        print("Origin")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Not a point")
```
Study that one carefully! The first pattern has two literals, and can be thought of as an extension of the literal pattern shown above. But the next two patterns combine a literal and a variable, and the variable is *extracted* from the target value (`point`). The fourth pattern is a double extraction, which makes it conceptually similar to the unpacking assignment `(x, y) = point`.
If you are using classes to structure your data (e.g. data classes) you can use the class name followed by an argument list resembling a constructor, but with the ability to extract variables:
```py
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

def whereis(point):
    match point:
        case Point(0, 0):
            print("Origin")
        case Point(0, y):
            print(f"Y={y}")
        case Point(x, 0):
            print(f"X={x}")
        case Point():
            print("Somewhere else")
        case _:
            print("Not a point")
```
We can use keyword parameters too. The following patterns are all equivalent (and all bind the `y` attribute to the `var` variable):
```py
Point(1, var)
Point(1, y=var)
Point(x=1, y=var)
Point(y=var, x=1)
```
Patterns can be arbitrarily nested. For example, if we have a short list of points, we could match it like this:
```py
match points:
    case []:
        print("No points")
    case [Point(0, 0)]:
        print("The origin")
    case [Point(x, y)]:
        print(f"Single point {x}, {y}")
    case [Point(0, y1), Point(0, y2)]:
        print(f"Two on the Y axis at {y1}, {y2}")
    case _:
        print("Something else")
```
We can add an `if` clause to a pattern, known as a "guard". If the guard is false, `match` goes on to try the next `case` block. Note that variable extraction happens before the guard is evaluated:
```py
match point:
    case Point(x, y) if x == y:
        print(f"Y=X at {x}")
    case Point(x, y):
        print(f"Not on the diagonal")
```
Several other key features:
- Like unpacking assignments, tuple and list patterns have exactly the same meaning and actually match arbitrary sequences. An important exception is that they don't match iterators or strings. (Technically, the target must be an instance of `collections.abc.Sequence`.)
- Sequence patterns support wildcards: `[x, y, *rest]` and `(x, y, *rest)` work similarly to wildcards in unpacking assignments. The name after `*` may also be `_`, so `(x, y, *_)` matches a sequence of at least two items without binding the remaining items.
- Mapping patterns: `{"bandwidth": b, "latency": l}` extracts the `"bandwidth"` and `"latency"` values from a dict. Unlike sequence patterns, extra keys are ignored. A wildcard `**rest` is also supported. (But `**_` would be redundant, so it is not allowed.)
- Subpatterns may be extracted using the walrus (`:=`) operator:
  ```py
  case (Point(x1, y1), p2 := Point(x2, y2)):
      ...
  ```
- Patterns may use named constants. These must be dotted names; a single name can be made into a constant value by prefixing it with a dot to prevent it from being interpreted as a variable extraction:
  ```py
  RED, GREEN, BLUE = 0, 1, 2

  match color:
      case .RED:
          print("I see red!")
      case .GREEN:
          print("Grass is green")
      case .BLUE:
          print("I'm feeling the blues :(")
  ```
- Classes can customize how they are matched by defining a `__match__()` method. Read the [PEP](https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificatio...) for details.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/> _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RFW56R7L... Code of Conduct: http://python.org/psf/codeofconduct/
Yes, my brain went through the same path.
Another minor nitpick: It would be kinda nice if the various types of pattern were listed in the grammar in the same order as the corresponding paragraphs subsequently appear.
On 26/06/2020 11:53, Daniel Moisset wrote:
Just a minor editorial thing on the PEP text:
The section https://www.python.org/dev/peps/pep-0622/#case-clauses presents a simplified syntax. That one mentions "group_pattern", but the document never mentions (in prose) what a group pattern is. It confused me until I found the definition in the full grammar, which seems to refer to those sequence patterns using () rather than []. Probably it makes more sense for a quick read to remove the "| group_pattern" from the simplified grammar, it looks more like an intermediate construct.
Hi,
For a PEP to succeed it needs to show two things.
1. Exactly what problem is being solved, or need is to be fulfilled, and that it is a sufficiently large problem, or need, to merit the proposed change.
2. That the proposed change is the best known solution for the problem being addressed.
IMO, PEP 622 fails on both counts. This email addresses point 1.
Given the positive response to the PEP, it may well be that it does address a need. However, the PEP itself fails to show that.

Abstract
--------
This PEP proposes adding pattern matching statements [1] to Python in order to create more expressive ways of handling structured heterogeneous data. The authors take a holistic approach, providing both static and runtime specifications.
What does "static and dynamic specifications" mean? Surely, there are just specifications. Python does not have a static checking phase, so static analysis tools need to understand the dynamic behaviour of the program, not have their own alternate semantics. There is no "static specification" of `isinstance()`, yet static analysis tools understand it.
PEP 275 and PEP 3103 previously proposed similar constructs, and were rejected. Instead of targeting the optimization of if ... elif ... else statements (as those PEPs did), this design focuses on generalizing sequence, mapping, and object destructuring. It uses syntactic features made possible by PEP 617, which introduced a more powerful method of parsing Python source code.
Why couple the choice part (a sort of enhanced elif) with destructuring (a sort of enhanced unpacking)? We could have a "switch" statement that chooses according to value, and we could have "destructuring" that pulls values apart. Why do they need to be coupled?

Rationale and Goals
-------------------
Let us start from some anecdotal evidence: isinstance() is one of the most called functions in large scale Python code-bases (by static call count). In particular, when analyzing some multi-million line production code base, it was discovered that isinstance() is the second most called builtin function (after len()). Even taking into account builtin classes, it is still in the top ten. Most of such calls are followed by specific attribute access.
Why use anecdotal evidence? I don't doubt the numbers, but it would be better to use the standard library, or the top N most popular packages from GitHub.
There are two possible conclusions that can be drawn from this information:
- Handling of heterogeneous data (i.e. situations where a variable can take values of multiple types) is common in real world code.
- Python doesn't have expressive ways of destructuring object data (i.e. separating the content of an object into multiple variables).
I don't see how the second conclusion can be drawn. How does the prevalence of `isinstance()` suggest that Python doesn't have expressive ways of destructuring object data? That `len()` is also common does suggest that some more expressive unpacking syntax might be useful. However, since `len()` only applies to sequences, it suggests to me that unpacking of non-sequences isn't generally useful.
This is in contrast with the opposite sides of both aspects:
This sentence makes no sense. What is "this"? Both aspects of what?
- Its success in the numeric world indicates that Python is good when working with homogeneous data. It also has builtin support for homogeneous data structures such as e.g. lists and arrays, and semantic constructs such as iterators and generators.
- Python is expressive and flexible at constructing objects. It has syntactic support for collection literals and comprehensions. Custom objects can be created using positional and keyword calls that are customized by special __init__() method.
This PEP aims at improving the support for destructuring heterogeneous data by adding a dedicated syntactic support for it in the form of pattern matching. On a very high level it is similar to regular expressions, but instead of matching strings, it will be possible to match arbitrary Python objects.
An explanation is needed of why "destructuring" needs to be so tightly coupled with matching by class or value.
We believe this will improve both readability and reliability of relevant code. To illustrate the readability improvement, let us consider an actual example from the Python standard library:
    def is_tuple(node):
        if isinstance(node, Node) and node.children == [LParen(), RParen()]:
            return True
        return (isinstance(node, Node)
                and len(node.children) == 3
                and isinstance(node.children[0], Leaf)
                and isinstance(node.children[1], Node)
                and isinstance(node.children[2], Leaf)
                and node.children[0].value == "("
                and node.children[2].value == ")")
Just one example? The PEP needs to show that this sort of pattern is widespread.
With the syntax proposed in this PEP it can be rewritten as below. Note that the proposed code will work without any modifications to the definition of Node and other classes here:
Without modifying Node or Leaf, the matching code will need to access attributes. You should at least mention side effects and exceptions. E.g. matching on ORM objects might be problematic.
    def is_tuple(node: Node) -> bool:
        match node:
            case Node(children=[LParen(), RParen()]):
                return True
            case Node(children=[Leaf(value="("), Node(), Leaf(value=")")]):
                return True
            case _:
                return False
Python's support for OOP provides an alternative to ADTs. For example, by adding a simple "matches" method to Node and Leaf, `is_tuple` can be rewritten as something like:

    def is_tuple(node):
        if not isinstance(node, Node):
            return False
        return node.matches("(", ")") or node.matches("(", ..., ")")
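To make the suggestion above concrete, here is one way such a `matches` method could look. This is only a sketch under stated assumptions: the `Node` and `Leaf` classes below are simplified stand-ins, not the lib2to3 classes, and the reading of `...` as "any `Node` in this position" is a guess at what the message intends.

```python
class Leaf:
    """Hypothetical stand-in for a token leaf with a string value."""

    def __init__(self, value):
        self.value = value


class Node:
    """Hypothetical stand-in for an interior tree node."""

    def __init__(self, children):
        self.children = children

    def matches(self, *shape):
        """Return True if the children line up with `shape`.

        A string must equal the value of a Leaf in that position;
        Ellipsis (...) is assumed to stand for any Node.
        """
        if len(self.children) != len(shape):
            return False
        for child, want in zip(self.children, shape):
            if want is ...:
                if not isinstance(child, Node):
                    return False
            elif not (isinstance(child, Leaf) and child.value == want):
                return False
        return True


def is_tuple(node):
    if not isinstance(node, Node):
        return False
    return node.matches("(", ")") or node.matches("(", ..., ")")
```

The trade-off is that every class hierarchy needs its own `matches` convention, whereas the `match` statement in the PEP gives one shared notation.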
See the syntax sections below for a more detailed specification.
Similarly to how constructing objects can be customized by a user-defined __init__() method, we propose that destructuring objects can be customized by a new special __match__() method. As part of this PEP we specify the general __match__() API, its implementation for object.__match__(), and for some standard library classes (including PEP 557 dataclasses). See runtime section below.
You should mention that we already have the ability to "destructure", aka unpack, objects using __iter__.

    t = 1, 2  # Creation
    a, b = t  # "Destructuring"
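The same mechanism extends to user-defined classes: any object that defines `__iter__` can appear on the right-hand side of an unpacking assignment. A minimal sketch, using a hypothetical `Point` class that is not taken from the PEP:

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __iter__(self):
        # Unpacking calls iter() on the object and consumes the items in order.
        yield self.x
        yield self.y


x, y = Point(3, 4)  # "destructuring" through __iter__

# The number of targets must match the number of items produced.
try:
    a, b, c = Point(3, 4)
except ValueError:
    mismatch_detected = True
else:
    mismatch_detected = False
```

Unlike a class pattern, this gives only one positional view of the object and performs no type check on the right-hand side.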
Finally, we aim to provide a comprehensive support for static type checkers and similar tools. For this purpose we propose to introduce a @typing.sealed class decorator that will be a no-op at runtime, but will indicate to static tools that all subclasses of this class must be defined in the same module. This will allow effective static exhaustiveness checks, and together with dataclasses, will provide a nice support for algebraic data types [2]. See the static checkers section for more details.
Shouldn't this be in a separate PEP? It seems only loosely related, and would have some value regardless of whether the rest of the PEP is accepted.
In general, we believe that pattern matching has been proved to be a useful and expressive tool in various modern languages. In particular, many aspects of this PEP were inspired by how pattern matching works in Rust [3] and Scala [4].
Both those languages are statically typed, which allows the compiler to perform much of the pattern matching at compile time. You should give examples from dynamically typed languages instead, e.g. Clojure.
Cheers, Mark.
On Fri, Jun 26, 2020 at 3:38 PM Mark Shannon <mark@hotpy.org> wrote:
What does "static and dynamic specifications" mean? Surely, there are just specifications.
There are specifications for both the runtime and the static aspects of the Python programming language.
Python does not have a static checking phase,
The (C)Python *interpreter* doesn't. Other Python implementations (existing or hypothetical) may or may not have a static checking phase. But static tools need specifications beyond (i.e. in addition to) runtime specifications, which are defined in PEP 483, 484, 585 and others.
Let us start from some anecdotal evidence: isinstance() is one of the most called functions in large scale Python code-bases (by static call count). In particular, when analyzing some multi-million line production code base, it was discovered that isinstance() is the second most called builtin function (after len()). Even taking into account builtin classes, it is still in the top ten. Most of such calls are followed by specific attribute access.
Why use anecdotal evidence? I don't doubt the numbers, but it would be better to use the standard library, or the top N most popular packages from GitHub.
Maybe a scientific paper could be written on this subject. I'm guessing the "multi-million line production code base" in question is the Dropbox code base, and maybe Dropbox has an idiomatic way of writing Python with lots of "isinstance()"s.
In general, we believe that pattern matching has been proved to be a useful and expressive tool in various modern languages. In particular, many aspects of this PEP were inspired by how pattern matching works in Rust [3] and Scala [4].
Both those languages are statically typed, which allows the compiler to perform much of the pattern matching at compile time.
You should give examples from dynamically typed languages instead, e.g. Clojure.
Here's one example: https://github.com/clojure/core.match (in particular: https://github.com/clojure/core.match/wiki/Understanding-the-algorithm ).
Also some insights from https://softwareengineering.stackexchange.com/questions/237023/pattern-match...
In this video <https://www.youtube.com/watch?v=dGVqrGmwOAw> I watched recently, Rich Hickey comments that he likes the destructuring part of languages like Scala, but not so much the pattern matching part, and he designed Clojure accordingly. That probably explains why the pattern matching is in a library and not as robust, although the kind of problems seen in the post you mentioned are clearly bugs.
What Rich Hickey mentions as an alternative to pattern matching is multimethods <http://clojure.org/multimethods>. Most languages let you do polymorphic dispatch based on type. Some languages let you also do it based on a value. Using multimethods, Clojure lets you do it based on any arbitrary function. That's a pretty powerful concept.
It comes down to the principle that programmers using a language should use the language's own best idioms. Trying to write Scala-like code in Clojure is going to have its difficulties, and vice versa.
S.
--
Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier
Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/
Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/
Founder & Organiser, PyParis & PyData Paris - http://pyparis.org/ & http://pydata.fr/
On 2020-06-26 16:54, Stéfane Fermigier wrote: [...]
Here's one example:
https://github.com/clojure/core.match (in particular: https://github.com/clojure/core.match/wiki/Understanding-the-algorithm ).
Also some insights from https://softwareengineering.stackexchange.com/questions/237023/pattern-match...
In this video <https://www.youtube.com/watch?v=dGVqrGmwOAw> I watched recently, Rich Hickey comments that he likes the destructuring part of languages like Scala, but not so much the pattern matching part, and he designed Clojure accordingly. That probably explains why the pattern matching is in a library and not as robust, although the kind of problems seen in the post you mentioned are clearly bugs.
What Rich Hickey mentions as an alternative to pattern matching is multimethods <http://clojure.org/multimethods>. Most languages let you do polymorphic dispatch based on type. Some languages let you also do it based on a value. Using multimethods, Clojure lets you do it based on any arbitrary function. That's a pretty powerful concept.
It comes down to the principle that programmers using a language should use the language's own best idioms. Trying to write Scala-like code in Clojure is going to have its difficulties, and vice versa.
It does look like the PEP tries to do two different things: "switch" instead of if/elif, and destructuring.
Would it be useful to introduce an operator for "isinstance", if it's so commonly used? Are the calls to it (in the relevant codebase) actually used in complex code that needs destructuring, or could we live with this (IS_A being a placeholder for bikeshedding, of course):

    if shape IS_A Point:
        x, y = shape
        ...
    elif shape IS_A Rectangle:
        x, y, w, h = shape
        ...
    elif shape IS_A Line:
        x, y = shape.start
        if shape.start == shape.end:
            print(f"Zero length line at {x}, {y}")

or:

    queue: Union[Queue[int], Queue[str]]
    if queue IS_A IntQueue:
        # Type-checker detects unreachable code
        ...

There aren't many convincing examples for destructuring in the PEP, IMO. The "mapping pattern" one could be rewritten as:

    if route := config.get('route'):
        process_route(route)
    if subconfig := config.pop(constants.DEFAULT_PORT):
        process_config(subconfig, config)

Sequence destructuring examples ([_] for "short sequence") don't seem too useful. Would they actually improve lots of existing code?
Complex object/tree destructuring (like the is_tuple) is painful in Python, but then again, the new syntax also becomes quite inscrutable for complex cases. Is code like the is_tuple example in the Rationale actually common?
The "Sealed classes as algebraic data types" example looks like a good candidate for a dump() method or PEP 443 single dispatch, both of which should be amenable to static analysis.
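As an illustration of the single-dispatch alternative mentioned above, here is a minimal sketch using `functools.singledispatch` (the standard-library implementation of PEP 443). The `Point` and `Rectangle` classes and the `dump` function are hypothetical examples, not code from the PEP.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Point:
    x: int
    y: int


@dataclass
class Rectangle:
    x: int
    y: int
    w: int
    h: int


@singledispatch
def dump(shape):
    # Fallback for types with no registered implementation.
    raise TypeError(f"cannot dump {type(shape).__name__}")


@dump.register
def _(shape: Point):
    return f"Point at {shape.x}, {shape.y}"


@dump.register
def _(shape: Rectangle):
    return f"Rectangle at {shape.x}, {shape.y} sized {shape.w}x{shape.h}"
```

Dispatch here is on the type of the first argument only; it does not inspect attribute values or nested structure, which is the part the `match` statement adds.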
On Fri, Jun 26, 2020 at 6:42 AM Mark Shannon <mark@hotpy.org> wrote:
Let us start from some anecdotal evidence: isinstance() is one of the most called functions in large scale Python code-bases (by static call count). In particular, when analyzing some multi-million line production code base, it was discovered that isinstance() is the second most called builtin function (after len()). Even taking into account builtin classes, it is still in the top ten. Most of such calls are followed by specific attribute access.
Why use anecdotal evidence? I don't doubt the numbers, but it would be better to use the standard library, or the top N most popular packages from GitHub.
Agreed. This anecdote felt off to me and made for a bad introductory feeling. I know enough of who is involved to read it as likely "within the internal Dropbox code base we found isinstance() to be the second most called built-in by static call counts". It'd be better worded as such instead of left opaque if you are going to use this example at all. [but read on below, I'm not sure the anecdotal evidence is even relevant to state]
Also if using this, please include text explaining what "static call count" means. Was that "number of grep 'isinstance[(]' matches in all .py files which we reasonably assume are calls"? Or was that "measuring a running application and counting cumulative calls of every built-in for the lifetime of the large application"? Include a footnote on whether you have removed all uses of six and py2->py3-isms. Both six and manual py2->3 porting often wound up adding isinstance in places where they'll rightfully be refactored out when cleaning up the py2 dead code legacy becomes anyone's priority.
A very rough grep of our much larger Python codebase within Google shows isinstance *call site counts* to likely be lower than int or len and similar to print. With a notable percentage of isinstance usage clearly related to py2 -> py3 compatibility, suggesting many can now go away. I'm not going to spend much time looking further as I don't think actual numbers matter: *Confirmed, isinstance gets used a lot.* We can simply state that as a truth and move on without needing a lot of justification.
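One possible reading of "static call count" is the number of call sites in the source text, regardless of how often each site runs. The sketch below counts such sites with the standard `ast` module; it is an assumption about the methodology, not a description of how the PEP authors or anyone in this thread actually measured. The function name and the sample source are made up for the illustration.

```python
import ast
import builtins
from collections import Counter


def count_builtin_call_sites(source: str) -> Counter:
    """Count call sites whose callee is a bare name that is also a builtin."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Caveat: a local name that shadows a builtin is miscounted.
            if hasattr(builtins, node.func.id):
                counts[node.func.id] += 1
    return counts


SAMPLE = '''
def total(items):
    if isinstance(items, list):
        return len(items)
    for _ in range(1000):
        pass
    return len(list(items))
'''

counts = count_builtin_call_sites(SAMPLE)
```

In the sample, `range(1000)` counts as a single call site even though the loop body runs a thousand times, which is the difference between a static count and a runtime profile.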
There are two possible conclusions that can be drawn from this information:
- Handling of heterogeneous data (i.e. situations where a variable can take values of multiple types) is common in real world code.
- Python doesn't have expressive ways of destructuring object data (i.e. separating the content of an object into multiple variables).
I don't see how the second conclusion can be drawn. How does the prevalence of `isinstance()` suggest that Python doesn't have expressive ways of destructuring object data?
...
We believe this will improve both readability and reliability of relevant code. To illustrate the readability improvement, let us consider an actual example from the Python standard library:

    def is_tuple(node):
        if isinstance(node, Node) and node.children == [LParen(), RParen()]:
            return True
        return (isinstance(node, Node)
                and len(node.children) == 3
                and isinstance(node.children[0], Leaf)
                and isinstance(node.children[1], Node)
                and isinstance(node.children[2], Leaf)
                and node.children[0].value == "("
                and node.children[2].value == ")")
Just one example? The PEP needs to show that this sort of pattern is widespread.
Agreed. I don't find application code following this pattern to be common. Yes it exists, but I would not expect to encounter it frequently if I were doing random people's Python code reviews. The supplied "stdlib" code example is lib2to3.fixer_util.is_tuple. Using that as an example of code "in the standard library" is *technically* correct. But lib2to3 is an undocumented deprecated library that we have slated for removal. That makes it a bit weak to cite. Better practical examples don't have to be within the stdlib. Randomly perusing some projects I know that I expect to have such constructs, here's a possible example: https://github.com/PyCQA/pylint/blob/master/pylint/checkers/logging.py#L231. There are also code patterns in pytype such as https://github.com/google/pytype/blob/master/pytype/vm.py#L480 and https://github.com/google/pytype/blob/master/pytype/vm.py#L1088 that might make sense. Though I realize you were probably in search of a simple one for the PEP in order to write a before and after example. -gps
I've been going over the PEP this weekend, trying to get a deeper understanding of its main ideas and consequences, and wrote some notes. I'm not posting the notes directly to this list because it's a bit of a long read, but I also tried to make it helpful as an analysis for people involved in the discussion. So here's a link: https://github.com/dmoisset/notebook/blob/811bf66/python/pep622/understanding-pep-622.md . I may update this in master, but for clarity I'm permalinking the current version.
I'll soon switch to "proposing solutions" mode (rather than "analysis mode" as this text is), but needed to do this first, and hopefully this helps someone else in this list organise ideas.
Best, D.
On Tue, 23 Jun 2020 at 17:04, Guido van Rossum <guido@python.org> wrote:
[...]
On 28/06/2020 21:47, Daniel Moisset wrote:
I've been going over the PEP this weekend, trying to get a deeper understanding of its main ideas and consequences, and wrote some notes. I'm not posting the notes directly to this list because it's a bit of a long read, but I also tried to make it helpful as an analysis for people involved in the discussion. So here's a link: https://github.com/dmoisset/notebook/blob/811bf66/python/pep622/understanding-pep-622.md . I may update this in master, but for clarity I'm permalinking the current version.
I'll soon switch to "proposing solutions" mode (rather than "analysis mode" as this text is), but needed to do this first, and hopefully this helps someone else in this list organise ideas.
Thank you for that, Daniel. That's a very nice analysis that makes my own misgivings clearer and puts some of them to rest. I think you are right that generalised destructuring is probably the thing to concentrate on; once we have something cohesive there, pattern syntax should become a lot more obvious. -- Rhodri James *-* Kynesim Ltd
On 29/06/20 8:47 am, Daniel Moisset wrote:
<https://github.com/dmoisset/notebook/blob/811bf66/python/pep622/understanding-pep-622.md> .
You seem to be trying to shoehorn all Python data structures into looking like algebraic types, for the sole purpose of being able to claim that PEP 622 is really about algebraic types rather than pattern matching. I don't think that's a helpful way of looking at things. Pattern matching with destructuring is a more general concept. Algebraic types is just one of its applications.
I think your viewpoint is coloured by languages in which algebraic types play a much more central role than they do in Python. For example, in Haskell, the usual notation for lists is syntactic sugar for an algebraic type representing a linked list. But Python lists are not linked lists, they're flexible-sized arrays, and you have to squint very hard indeed to see them as being fundamentally an algebraic type. Yet pattern matching on them makes perfectly good sense.
returning by default an object __dict__ or some sort of mapping view on the attributes could still be fine. It's not clear to me why the "keys" of this structure are placed separately.
I think the PEP explains the rationale behind the design of the matching protocol quite well. The goal is to make it as simple as possible to implement in the most common cases.
For me, there should be an instance method in object (that subclasses can override) that returns the algebraic structure of the value. The PEP as-is creates different destructuring views depending on which matching class you use (which I think relates to something that was mentioned but not discussed a lot in the python-dev list about Liskov substitutability).
I think the PEP has this right. Liskov substitutability doesn't apply to constructors -- they're not methods, and the constructor of a subclass doesn't have to accept the same arguments as that of its base class. The same thing applies to deconstructors, since they have to mirror the signature of their corresponding constructors. For example, consider

    class Oval:
        def __init__(self, centre, width, height):
            ...

    class Circle(Oval):
        def __init__(self, centre, radius):
            ...

    match shape:
        case Oval(c, w, h):
            ...

If shape happens to be a Circle, you still want to deconstruct it as an Oval and get centre, width, height, not centre, radius. There's no way that can happen if the object itself is responsible for its deconstruction.
(Incidentally, I do think the post that mentioned Liskov substitutability has a point, but it's a different one -- the default single-positional-argument deconstruction is probably a bad idea, because it will be wrong for a large number of existing classes.)
-- Greg
Hi, thank you for the comments.
On Tue, 30 Jun 2020 at 07:18, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 29/06/20 8:47 am, Daniel Moisset wrote:
< https://github.com/dmoisset/notebook/blob/811bf66/python/pep622/understanding-pep-622.md> .
You seem to be trying to shoehorn all Python data structures into looking like algebraic types, for the sole purpose of being able to claim that PEP 622 is really about algebraic types rather than pattern matching.
There may be a bit of this; I like "unifying concepts". But if I have a bias, I was heavily pushed towards it by the writing of the PEP. Their inspirations are explicitly Rust and Scala (which have a very strong "algebraic type" core), and in discussions I've seen the authors discuss F# and Haskell (again with a strong algebraic type influence). If they had started with "we were inspired by C's and Javascript's switch statement and then added some extra features", I would have a different vision: that the focus is a multiple-choice conditional and the rest are extras. If they had mentioned Javascript destructuring operations as inspiration, I would think instead that the focus is decomposing builtin types and the rest are extras. The motivation starts by discussing isinstance() checks and extracting attributes, which sounds more like "we added this to have algebraic data types and hey, now that we're here we can also include some unpacking and have a switch statement too". The goal of my notes was to read between the lines of the PEP, so there's some personal guesswork and bias, but it's not out of the blue :)
I don't think that's a helpful way of looking at things. Pattern matching with destructuring is a more general concept. Algebraic types is just one of its applications.
I agree on this statement (except the first sentence :) )... what I'm trying to say is that the PEP has some underlying algebraic type style and making it explicit is a way to understand it with different eyes.
I think your viewpoint is coloured by languages in which algebraic types play a much more central role than they do in Python. For example, in Haskell, the usual notation for lists is syntactic sugar for an algebraic type representing a linked list.
Haskell also does some shoe-horning... integers in Haskell are supposed to be an algebraic type made by the union of infinite constructors named "1", "2", "3", ... :) Even if the implementation is nothing like that, this kind of shoe-horning is useful: it allows you to have a coherent story and design, so I'm looking for something close to that in Python.
But Python lists are not linked lists, they're flexible-sized arrays, and you have to squint very hard indeed to see them as being fundamentally an algebraic type. Yet pattern matching on them makes perfectly good sense.
True. And Python has already had that for ages. I'm *guessing* intent here again, but I believe that was included in the PEP because it was easy, not because it was the main concern to address.
returning by default an object __dict__ or some sort of mapping view
on the attributes could still be fine. It's not clear to me why the "keys" of this structure are placed separately.
I think the PEP explains the rationale behind the design of the matching protocol quite well. The goal is to make it as simple as possible to implement in the most common cases.
I have improved my understanding of this. I still find the protocol weak (but mostly the match, not the matched_args) even for the cases that are desired to be covered, but I'm already discussing those directly with the authors.
For me, there should be an instance method in object (that subclasses can override) that returns the algebraic structure of the value. The PEP as-is creates different destructuring views depending on which matching class you use (which I think relates to something that was mentioned but not discussed a lot in the python-dev list about Liskov substitutability).
I think the PEP has this right. Liskov substitutability doesn't apply to constructors -- they're not methods, and the constructor of a subclass doesn't have to accept the same arguments as that of its base class. The same thing applies to deconstructors, since they have to mirror the signature of their corresponding constructors. (...)
You're right, it was not Liskov related, but the single argument default behaviour. I was wrong about this.
participants (45)
- Ambient Nuance
- Antoine Pitrou
- Arthur Darcet
- Barry Warsaw
- Brandt Bucher
- Brett Cannon
- Chris Angelico
- Chris Jerdonek
- Daniel Moisset
- Daniel.
- David Mertz
- Edwin Zimmerman
- Emily Bowman
- Eric Wieser
- Ethan Furman
- Glenn Linderman
- Greg Ewing
- Gregory P. Smith
- Guido van Rossum
- Jakub Stasiak
- jakub@stasiak.at
- Jelle Zijlstra
- Jim F.Hilliard
- Kyle Stanley
- Luciano Ramalho
- M.-A. Lemburg
- Mark Shannon
- MRAB
- Ned Deily
- Neil Girdhar
- Nick Coghlan
- Paul Moore
- Paul Svensson
- Petr Viktorin
- pylang
- Rhodri James
- Richard Damon
- Richard Levasseur
- Rob Cliffe
- Stephen J. Turnbull
- Stéfane Fermigier
- Taine Zhao
- Taine Zhao
- Terry Reedy
- Tim Peters