PEP 622 version 2 (Structural Pattern Matching)
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching. As authors we welcome Daniel F Moisset in our midst. Daniel wrote a lot of the new text in this version, which introduces the subject matter much more gently than the first version did. He also convinced us to drop the `__match__` protocol for now: the proposal stands quite well without that kind of extensibility, and postponing it will allow us to design it at a later time when we have more experience with how `match` is being used.

That said, the new version does not differ dramatically in what we propose. Apart from dropping `__match__` we’re dropping the leading dot to mark named constants, without a replacement, and everything else looks like we’re digging in our heels. Why is that? Given the firestorm of feedback we received and the numerous proposals (still coming) for alternative syntax, it seems a bad tactic not to give up something more substantial in order to get this proposal passed. Let me explain.

Language design is not like politics. It’s not like mathematics either, but I don’t think this situation is at all similar to negotiating a higher minimum wage in exchange for a lower pension, where you can definitely argue about exactly how much lower/higher you’re willing to go. So I don’t think it’s right to propose making the feature a little bit uglier just to get it accepted.

Frankly, 90% of the issue is about what among the authors we’ve dubbed the “load/store” problem (although Tobias never tires of explaining that the “load” part is really “load-and-compare”). There’s a considerable section devoted to this topic in the PEP, but I’d like to give it another try here.

In case you’ve been avoiding python-dev lately, the problem is this.
Pattern matching lets you capture values from the subject, similar to sequence unpacking, so that you can write for example

```
x = range(4)
match x:
    case (a, b, *rest):
        print(f"first={a}, second={b}, rest={rest}")  # 0, 1, [2, 3]
```

Here the `case` line captures the contents of the subject `x` in three variables named `a`, `b` and `rest`. This is easy to understand by pretending that a pattern (i.e., what follows `case`) is like the LHS of an assignment.

However, in order to make pattern matching more useful and versatile, the pattern matching syntax also allows using literals instead of capture variables. This is really handy when you want to distinguish different cases based on some value, for example

```
match t:
    case ("rect", real, imag):
        return complex(real, imag)
    case ("polar", r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

You might not even notice anything funny here if I didn’t point out that `"rect"` and `"polar"` are literals -- it’s really quite natural for patterns to support this once you think about it.

The problem that everybody’s been concerned about is that Python programmers, like C programmers before them, aren’t too keen to have literals like this all over their code, and would rather give names to the literals, for example

```
USE_POLAR = "polar"
USE_RECT = "rect"
```

Now we would like to be able to replace those literals with the corresponding names throughout our code and have everything work like before:

```
match t:
    case (USE_RECT, real, imag):
        return complex(real, imag)
    case (USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

Alas, the compiler doesn’t know that we want `USE_RECT` to be a constant value to be matched while we intend `real` and `imag` to be variables to be given the corresponding values captured from the subject. So various clever ways have been proposed to distinguish the two cases.
This discussion is not new to the authors: before we ever published the first version of the PEP we vigorously debated this (it is Issue 1 in our tracker!), and other languages before us have also had to come to grips with it. Even many statically compiled languages! The reason is that, for usability, it’s usually deemed important that their equivalent of `case` auto-declare the captured variables, and variable declarations may hide (override) like-named variables in outer scopes.

Scala, for example, uses several different rules: first, capture variable names must start with a lowercase letter (so it would handle the above example as intended); next, capture variables cannot be dotted names (like `mod.var`); finally, you can enclose any variable in backticks to force the compiler to see it as a load instead of a store. Elixir uses another form of markup for loads: `x` is a capture variable, but `^x` loads and compares the value of `x`.

There are a number of dead ends when looking for a solution that works for Python. Checking at runtime whether a name is defined or not is one of these: there are numerous reasons why this could be confusing, not the least of which being that the `match` may be executed in a loop and the variable may already be bound by a previous iteration. (True, this has to do with the scope we’ve adopted for capture variables. But believe me, giving each case clause its own scope is quite horrible by itself, and there are other action-at-a-distance effects that are equally bad.)

It’s been proposed to explicitly state the names of the variables bound in a header of the `match` statement; but this doesn’t scale when the number of cases becomes larger, and requires users to do bookkeeping the compiler should be able to do. We’re really looking for a solution that tells you, when you’re looking at an individual `case`, which variables are captured and which are used for load-and-compare.

Marking up the capture variables with some sigil (e.g. `$x` or `x?`) or other markup (e.g. backticks or `<x>`) makes this common case ugly and inconsistent: it’s unpleasant to see for example

```
case %x, %y:
    print(x, y)
```

No other language we’ve surveyed uses special markup for capture variables; some use special markup for load-and-compare, so we’ve explored this. In fact, in version 1 of the PEP our long-debated solution was to use a leading dot. This was however booed off the field, so for version 2 we reconsidered. In the end nothing struck our fancy (if `.x` is unacceptable, it’s unclear why `^x` would be any better), and we chose a simpler rule: named constants are only recognized when referenced via some namespace, such as `mod.var` or `Color.RED`.

We believe it’s acceptable that things looking like `mod.var` are never considered capture variables -- the common use cases for `match` are such that one would almost never want to capture into a different namespace. (Just like you very rarely see `for self.i in …` and never `except E as scope.var` -- the latter is illegal syntax and sets a precedent.) One author would dearly have seen Scala’s uppercase rule adopted, but in the end was convinced by the others that this was a bad idea, both because there’s no precedent in Python’s syntax, and because many human languages simply don’t make the distinction between lowercase and uppercase in their writing systems.

So what should you do if you have a local variable (say, a function argument) that you want to use as a value in a pattern? One solution is to capture the value in another variable and use a guard to compare that variable to the argument:

```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            ...  # Match
```

If this really is a deal-breaker after all other issues have been settled, we could go back to considering some special markup for load-and-compare of simple names (even though we expect this case to be very rare).
But there’s no pressing need to decide to do this now -- we can always add new markup for this purpose in a future version, as long as we continue to support dotted names without markup, since that *is* a commonly needed case.

There’s one other issue where in the end we could be convinced to compromise: whether to add an `else` clause in addition to `case _`. In fact, we probably would already have added it, except for one detail: it’s unclear whether the `else` should be aligned with `case` or `match`. If we are to add this we would have to ask the Steering Council to decide for us, as the authors deadlocked on this question.

Regarding the syntax for wildcards and OR patterns, the PEP explains why `_` and `|` are the best choices here: no other language surveyed uses anything but `_` for wildcards, and the vast majority uses `|` for OR patterns. A similar argument applies to class patterns.

If you've made it this far, here are the links to check out, with an open mind. As a reminder, the introductory sections (Abstract, Overview, and Rationale and Goals) have been entirely rewritten and also serve as introduction and tutorial.

- PEP 622: https://www.python.org/dev/peps/pep-0622/
- Playground: https://mybinder.org/v2/gh/gvanrossum/patma/master?urlpath=lab/tree/playgrou...

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Thanks for explaining the open issues so well, Guido. My 2¢ on the `else` matter: On Wed, Jul 8, 2020 at 12:05 PM Guido van Rossum <guido@python.org> wrote:
[...] it’s unclear whether the `else` should be aligned with `case` or `match`.
I strongly favor the option of aligning the `else` clause with `match`, because `else` is a special clause and therefore it should look special.

"Designers need to ensure that controls and displays for different purposes are significantly different from one another." —Donald Norman, The Design of Everyday Things

As I first read a `match` statement, and I see an `else` clause, I know for sure that *something* will happen. If no `else` clause is present, I know it's possible nothing will happen. It's the same thing with `else` in `if`, `while`, `for`, `try` statements, where the `else` is aligned with the opening keyword.

Cheers, Luciano

-- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Technical Principal at ThoughtWorks | Twitter: @ramalhoorg
On 07/08/2020 08:30 AM, Luciano Ramalho wrote:
As I first read a `match` statement, and I see an `else` clause, I know for sure that *something* will happen. If no `else` clause is present, I know it's possible nothing will happen. It's the same thing with `else` in `if`, `while`, `for`, `try` statements, where the `else` is aligned with the opening keyword.
If `else` is added, I agree with aligning with `match`, for the same reasons -- especially if we can nest match statements. -- ~Ethan~
On 9/07/20 3:30 am, Luciano Ramalho wrote:
I strongly favor the option of aligning the `else` clause with `match`, because `else` is a special clause therefore it should look special.
But on the other hand, it's semantically equivalent to 'case _:', so it's not all that special. -- Greg
[... snip explanation of key sticking points ...] Thank you for an excellent write-up combining background context with possible solutions. Now I need to actually read the PEP ;) TJG
On 08/07/2020 16:02, Guido van Rossum wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching.
Thank you very much to everyone who has been working on this, it is much appreciated.

I have one suggestion for the text: could the section on Capture Patterns emphasise that only simple (i.e. not dotted) names are capture patterns? The simplified grammar is (fairly) clear and the later section on Constant Value Patterns should make it obvious, but somehow when reading version 1 I still managed to miss it. I was quite surprised when it was pointed out that

```
case (point.x, point.y):
```

wasn't going to do what I expected!

(PS: I'm still pushing for an "else" clause, and I can see arguments for it going at either indentation level. Since putting the clause at the wrong level would be a syntax error, I don't see it being a particularly big issue where it goes.)

-- Rhodri James *-* Kynesim Ltd
On Wed, 8 Jul 2020 18:38:12 +0100 Rhodri James <rhodri@kynesim.co.uk> wrote:
On 08/07/2020 16:02, Guido van Rossum wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching.
Thank you very much to everyone who has been working on this, it is much appreciated. I have one suggestion for the text: could the section on Capture Patterns emphasise that only simple (i.e not dotted) names are capture patterns? The simplified grammar is (fairly) clear and the later section on Constant Value Patterns should make it obvious, but somehow when reading version 1 I still managed to miss it. I was quite surprised when it was pointed out that
case (point.x, point.y):
wasn't going to do what I expected!
Why did you expect? It's not clear to me what it should do at all :-) Regards Antoine.
On 08/07/2020 19:08, Antoine Pitrou wrote:
On Wed, 8 Jul 2020 18:38:12 +0100 Rhodri James <rhodri@kynesim.co.uk> wrote:
On 08/07/2020 16:02, Guido van Rossum wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching.
Thank you very much to everyone who has been working on this, it is much appreciated. I have one suggestion for the text: could the section on Capture Patterns emphasise that only simple (i.e not dotted) names are capture patterns? The simplified grammar is (fairly) clear and the later section on Constant Value Patterns should make it obvious, but somehow when reading version 1 I still managed to miss it. I was quite surprised when it was pointed out that
case (point.x, point.y):
wasn't going to do what I expected!
Why did you expect? It's not clear to me what it should do at all :-)
I was expecting it to unpack a 2-tuple(/sequence?) into the x and y attributes of this point object I happened to have in my back pocket (assuming it matched, of course). It actually matches a 2-tuple against the values of those attributes (i.e. they are constant value patterns). It's obvious enough once you have time to read the PEP properly -- I blame only having so many minutes of reading time in my compile/download/test cycle!

To use code(ish) examples rather than confusable words, suppose we have:

```
class Point:
    def __init__(self):
        self.x = 0
        self.y = 0

point = Point()
INCOMING = (1, 2)

match INCOMING:
    case (point.x, point.y):
        print("Point @", point.x, point.y)
    case _:
        print("Default")
```

I expected a printout of:

```
Point @ 1 2
```

but I would actually get

```
Default
```

If on the other hand INCOMING was (0, 0), I would get

```
Point @ 0 0
```

because the first case is in fact the equivalent of

```
case (0, 0):
```

Obviously this is a pointless example (pun intended) because you would use a class pattern if you really wanted to do something like what I first thought of.

-- Rhodri James *-* Kynesim Ltd
On Wed, 8 Jul 2020 20:08:34 +0200 Antoine Pitrou <solipsis@pitrou.net> wrote:
On Wed, 8 Jul 2020 18:38:12 +0100 Rhodri James <rhodri@kynesim.co.uk> wrote:
On 08/07/2020 16:02, Guido van Rossum wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching.
Thank you very much to everyone who has been working on this, it is much appreciated. I have one suggestion for the text: could the section on Capture Patterns emphasise that only simple (i.e not dotted) names are capture patterns? The simplified grammar is (fairly) clear and the later section on Constant Value Patterns should make it obvious, but somehow when reading version 1 I still managed to miss it. I was quite surprised when it was pointed out that
case (point.x, point.y):
wasn't going to do what I expected!
Why did you expect? It's not clear to me what it should do at all :-)
Sorry: /what/ did you expect? Regards Antoine.
On 07/08/2020 08:02 AM, Guido van Rossum wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching.
All in all I like it a lot!
As authors we welcome Daniel F Moisset in our midst
Welcom, Daniel, and thank you!
That said, the new version does not differ dramatically in what we propose. Apart from dropping `__match__` we’re dropping the leading dot to mark named constants,
Excellent!
without a replacement,
So namespaced variables only... is there a recommendation on handling globals() and locals() type variables?

The only thing I didn't see addressed was the concern raised by Pablo:
On 06/25/2020 04:07 PM, Brandt Bucher wrote:
Pablo Galindo Salgado wrote:
...users can do a positional match against the proxy with a name pattern:
```
match input:
    case datetime.date(dt):
        print(f"The date {dt.isoformat()}")
```
...if 'datetime.date' were updated to implement a non-default __match_args__, allowing individual fields to be pulled out of it like this, then the first block would be valid, correct code before the change, but would raise an ImpossibleMatch after the change because 'dt' is not a field in __match_args__. Is this argument misinterpreting something about the PEP, or missing some important detail?
Well yeah, it's actually a fair bit worse than you describe. Since dt is matched positionally, it wouldn't raise during matching - it would just succeed as before, but instead binding the year attribute (not the whole object) to the name "dt". So it wouldn't fail until later, when your method call raises a TypeError.
Why is this no longer an issue? My apologies if I missed it in the PEP. -- ~Ethan~ P.S. Thanks for all your hard work! I am very much looking forward to using this.
Ethan Furman wrote:
Why is this no longer an issue? My apologies if I missed it in the PEP.
This problem was an artifact of the default `object.__match__` implementation, which allowed one positional argument by default when `__match_args__` was missing or `None`. Since we've removed `__match__` from the proposal (and therefore the default `__match__` implementation from `object`), this issue no longer exists. (Note that most common built-in types like `int` and `tuple` will still work this way, but this behavior is not inherited by *all* objects anymore).
On 07/08/2020 10:44 AM, Ethan Furman wrote:
So namespaced variables only... is there a recommendation on handling global() and local() type variables?
Okay, some off-list discussion clarified that for me: - easiest way is to use a guard
```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            ...  # Match
```
If there's a bunch, then SimpleNamespace can be used:
```
from types import SimpleNamespace

def foo(x, spam):
    L = SimpleNamespace(**locals())
    match x:
        case Point(p, q, context=L.spam):
            ...  # Match
```
So there we have it -- two ways to do it! ;-) -- ~Ethan~
On 09/07/2020 01:27, Ethan Furman wrote:
On 07/08/2020 10:44 AM, Ethan Furman wrote:
So namespaced variables only... is there a recommendation on handling global() and local() type variables?
Okay, some off-list discussion clarified that for me:
- easiest way is to use a guard
```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            ...  # Match
```
I like this one. Doesn't it also solve the issue of store vs. load? Everything is stored but the guard clause can look-up.
On 9 Jul 2020, at 17:49, Federico Salerno <salernof11@gmail.com> wrote:
On 09/07/2020 01:27, Ethan Furman wrote:
On 07/08/2020 10:44 AM, Ethan Furman wrote:
So namespaced variables only... is there a recommendation on handling global() and local() type variables?
Okay, some off-list discussion clarified that for me:
- easiest way is to use a guard
```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            ...  # Match
```
I like this one. Doesn't it also solve the issue of store vs. load? Everything is stored but the guard clause can look-up.
I have to say I find this to be the most satisfactory solution – everything else (dot previously, no dot now, any other single character hypothetically) provides users with IMO too big of a footgun to shoot themselves with. Jakub
One thing I don't understand about the PEP:

```
case [x, y]:
```

IIUC matches any 2-element sequence. How would you match specifically a 2-item list (say)? Would it be

```
case list([x, y]):
```

I would appreciate it if some kind person could enlighten me. TIA Rob Cliffe
Yes, you’ve got that exactly right! I think the PEP has an example for `tuple((x, y))`. On Thu, Jul 9, 2020 at 13:08 Rob Cliffe via Python-Dev <python-dev@python.org> wrote:
One thing I don't understand about the PEP:
case [x,y]:
IIUC matches any 2-element sequence. How would you match specifically a 2-item list (say)? Would it be
case list([x,y]):
I would appreciate it if some kind person could enlighten me. TIA Rob Cliffe
-- --Guido (mobile)
One microscopic point: [Guido]
... (if `.x` is unacceptable, it’s unclear why `^x` would be any better),
As Python's self-appointed spokesperson for the elderly, there's one very clear difference: a leading "." is - literally - one microscopic point, all but invisible. A leading caret is far easier to see, on a variety of devices and using a variety of fonts. Indeed, I missed the leading dot in ".x" in your email the first two times I read that sentence.

But a caret is harder to type. So here's an off-the-wall idea: use an ellipsis. If you're still using a maximal-munch lexer, an ellipsis followed by an identifier is currently a syntax error. "...x" is far easier to see than ".x", easier to type than "^x", and retains the mnemonic connection that "something is a named load pattern if and only if it has dots".

"..." is also a mnemonic for "OK, here I want to match ... umm ... let me think ... I know! A fixed value." ;-)
On Wed, 8 Jul 2020 at 16:05, Guido van Rossum <guido@python.org> wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching. As authors we welcome Daniel F Moisset in our midst. Daniel wrote a lot of the new text in this version, which introduces the subject matter much more gently than the first version did. He also convinced us to drop the `__match__` protocol for now: the proposal stands quite well without that kind of extensibility, and postponing it will allow us to design it at a later time when we have more experience with how `match` is being used.
That said, the new version does not differ dramatically in what we propose. Apart from dropping `__match__` we’re dropping the leading dot to mark named constants, without a replacement, and everything else looks like we’re digging in our heels. Why is that? Given the firestorm of feedback we received and the numerous proposals (still coming) for alternative syntax, it seems a bad tactic not to give up something more substantial in order to get this proposal passed. Let me explain.
Language design is not like politics. It’s not like mathematics either, but I don’t think this situation is at all similar to negotiating a higher minimum wage in exchange for a lower pension, where you can definitely argue about exactly how much lower/higher you’re willing to go. So I don’t think it’s right to propose making the feature a little bit uglier just to get it accepted.
Frankly, 90% of the issue is about what among the authors we’ve dubbed the “load/store” problem (although Tobias never tires to explain that the “load” part is really “load-and-compare”). There’s a considerable section devoted to this topic in the PEP, but I’d like to give it another try here.
In case you’ve been avoiding python-dev lately, the problem is this. Pattern matching lets you capture values from the subject, similar to sequence unpacking, so that you can write for example ``` x = range(4) match x: case (a, b, *rest): print(f"first={a}, second={b}, rest={rest}") # 0, 1, [2, 3] ``` Here the `case` line captures the contents of the subject `x` in three variables named `a`, `b` and `rest`. This is easy to understand by pretending that a pattern (i.e., what follows `case`) is like the LHS of an assignment.
However, in order to make pattern matching more useful and versatile, the pattern matching syntax also allows using literals instead of capture variables. This is really handy when you want to distinguish different cases based on some value, for example ``` match t: case ("rect", real, imag): return complex(real, imag) case ("polar", r, phi): return complex(r * cos(phi), r * sin(phi)) ``` You might not even notice anything funny here if I didn’t point out that `"rect"` and `"polar"` are literals -- it’s really quite natural for patterns to support this once you think about it.
The problem that everybody’s been concerned about is that Python programmers, like C programmers before them, aren’t too keen to have literals like this all over their code, and would rather give names to the literals, for example ``` USE_POLAR = "polar" USE_RECT = "rect" ``` Now we would like to be able to replace those literals with the corresponding names throughout our code and have everything work like before: ``` match t: case (USE_RECT, real, imag): return complex(real, imag) case (USE_POLAR, r, phi): return complex(r * cos(phi), r * sin(phi)) ```
Forgive the intrusion, in case this wasn't already mentioned (I only read a fraction of emails on this): we could say that a name enclosed in parentheses would mean loading a constant, instead of storing in a variable:

```
match t:
    case ((USE_RECT), real, imag):  # matches the constant USE_RECT literal value
        return complex(real, imag)
    case (USE_POLAR, r, phi):  # the USE_POLAR portion matches anything and stores it in a USE_POLAR variable
        return complex(r * cos(phi), r * sin(phi))
```

Advantages: in Python (and most programming languages), (x) is the same thing as x. So no new syntax, or weird symbols, need to be introduced. But the parser can distinguish (I hope), and guide the match statement generation to the appropriate behaviour.

Yes, it's more typing, but hopefully the case is uncommon enough that the extra typing is not a lot of burden. Yes, it's easy to type USE_RECT when you really meant (USE_RECT). Hopefully linters can catch this case and warn you.

OK, that's it, I just thought it was worth throwing yet another idea into the pot :-)

-- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert
*facepalm* This is right there in the PEP, already, as one possible alternative. Apologies for the noise. :-/
On 08/07/2020 19:27, Gustavo Carneiro wrote:
Forgive the intrusion, in case this wasn't already mentioned (I only read a fraction of emails on this), we could say that name enclosed in parenthesis would mean loading a constant, instead of storing in a variable:
It's discussed as the third bullet point under "Alternatives for constant value pattern": https://www.python.org/dev/peps/pep-0622/#id74 Basically it looks odd and we may need parentheses to manage grouping in patterns. -- Rhodri James *-* Kynesim Ltd
On 08/07/2020 16:02, Guido van Rossum wrote:
```
USE_POLAR = "polar"
USE_RECT = "rect"
```

Now we would like to be able to replace those literals with the corresponding names throughout our code and have everything work like before:

```
match t:
    case (USE_RECT, real, imag):
        return complex(real, imag)
    case (USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

Alas, the compiler doesn’t know that we want `USE_RECT` to be a constant value to be matched while we intend `real` and `imag` to be variables to be given the corresponding values captured from the subject. So various clever ways have been proposed to distinguish the two cases.
I apologise for posting a second message re the same idea, but I can't contain my enthusiasm for it :-) and I want to make sure it's not overlooked: *Use '==' to mark* (when necessary) *load-and-compare items*:

match t:
    case (==USE_RECT, real, imag):
        return complex(real, imag)
    case (==USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))

allowing incidentally a possible future extension to other relational operators:

case Point(x, >YMAX):
case >= 42:
If this really is a deal-breaker after all other issues have been settled, we could go back to considering some special markup for load-and-compare of simple names (even though we expect this case to be very rare). But there’s no pressing need to decide to do this now -- we can always add new markup for this purpose in a future version, as long as we continue to support dotted names without markup, since that *is* a commonly needed case.
Except that if this idea were taken to its logical conclusion:

    mod.var would be a capture variable (contrary to the PEP)
    ==mod.var would be a load-and-compare value

Which may be controversial, but seems to have more overall consistency.
On Wed, Jul 8, 2020 at 7:23 PM Rob Cliffe <rob.cliffe@btinternet.com> wrote:
*Use '==' to mark* (when necessary) *load-and-compare items*:

match t:
    case (==USE_RECT, real, imag):
        return complex(real, imag)
    case (==USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))
allowing incidentally a possible future extension to other relational operators:

case Point(x, >YMAX):
case >= 42:
The problem with this is that value patterns don't just appear at the top level. Consider this example from the PEP's deferred ideas section:

case BinaryOp(left=Number(value=x), op=op, right=Number(value=y)):

Using your notation, this would become:

case BinaryOp(left=Number(value===x), op===op, right=Number(value===y)):

The tokenizer, which is eager, would interpret '===' as '==' followed by '=' and it would treat this as a syntax error. Also, it looks a lot like a JavaScript equivalency (?) operator. A single '=' prefix suffers from pretty much the same thing -- Python's tokenizer, as well as the tokenizer in most people's heads, would read 'x==op' as containing '=='. Please drop it. -- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?) <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Just my 2 cents: I find it kind of annoying that the whole structure requires two levels of indentation to actually reach the operational code. This would be a first in Python. I would prefer an option akin to if elif elif else where each block is only one level deep. On Wed, 8 Jul 2020 at 16:10, Guido van Rossum <guido@python.org> wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching. As authors we welcome Daniel F Moisset in our midst. Daniel wrote a lot of the new text in this version, which introduces the subject matter much more gently than the first version did. He also convinced us to drop the `__match__` protocol for now: the proposal stands quite well without that kind of extensibility, and postponing it will allow us to design it at a later time when we have more experience with how `match` is being used.
That said, the new version does not differ dramatically in what we propose. Apart from dropping `__match__` we’re dropping the leading dot to mark named constants, without a replacement, and everything else looks like we’re digging in our heels. Why is that? Given the firestorm of feedback we received and the numerous proposals (still coming) for alternative syntax, it seems a bad tactic not to give up something more substantial in order to get this proposal passed. Let me explain.
Language design is not like politics. It’s not like mathematics either, but I don’t think this situation is at all similar to negotiating a higher minimum wage in exchange for a lower pension, where you can definitely argue about exactly how much lower/higher you’re willing to go. So I don’t think it’s right to propose making the feature a little bit uglier just to get it accepted.
Frankly, 90% of the issue is about what among the authors we’ve dubbed the “load/store” problem (although Tobias never tires of explaining that the “load” part is really “load-and-compare”). There’s a considerable section devoted to this topic in the PEP, but I’d like to give it another try here.
In case you’ve been avoiding python-dev lately, the problem is this. Pattern matching lets you capture values from the subject, similar to sequence unpacking, so that you can write for example
```
x = range(4)
match x:
    case (a, b, *rest):
        print(f"first={a}, second={b}, rest={rest}")  # 0, 1, [2, 3]
```
Here the `case` line captures the contents of the subject `x` in three variables named `a`, `b` and `rest`. This is easy to understand by pretending that a pattern (i.e., what follows `case`) is like the LHS of an assignment.
However, in order to make pattern matching more useful and versatile, the pattern matching syntax also allows using literals instead of capture variables. This is really handy when you want to distinguish different cases based on some value, for example
```
match t:
    case ("rect", real, imag):
        return complex(real, imag)
    case ("polar", r, phi):
        return complex(r * cos(phi), r * sin(phi))
```
You might not even notice anything funny here if I didn’t point out that `"rect"` and `"polar"` are literals -- it’s really quite natural for patterns to support this once you think about it.
The problem that everybody’s been concerned about is that Python programmers, like C programmers before them, aren’t too keen to have literals like this all over their code, and would rather give names to the literals, for example
```
USE_POLAR = "polar"
USE_RECT = "rect"
```
Now we would like to be able to replace those literals with the corresponding names throughout our code and have everything work like before:
```
match t:
    case (USE_RECT, real, imag):
        return complex(real, imag)
    case (USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))
```
Alas, the compiler doesn’t know that we want `USE_RECT` to be a constant value to be matched while we intend `real` and `imag` to be variables to be given the corresponding values captured from the subject. So various clever ways have been proposed to distinguish the two cases.
This discussion is not new to the authors: before we ever published the first version of the PEP we vigorously debated this (it is Issue 1 in our tracker!), and other languages before us have also had to come to grips with it. Even many statically compiled languages! The reason is that for reasons of usability it’s usually deemed important that their equivalent of `case` auto-declare the captured variables, and variable declarations may hide (override) like-named variables in outer scopes.
Scala, for example, uses several different rules: first, capture variable names must start with a lowercase letter (so it would handle the above example as intended); next, capture variables cannot be dotted names (like `mod.var`); finally, you can enclose any variable in backticks to force the compiler to see it as a load instead of a store. Elixir uses another form of markup for loads: `x` is a capture variable, but `^x` loads and compares the value of `x`.
There are a number of dead ends when looking for a solution that works for Python. Checking at runtime whether a name is defined or not is one of these: there are numerous reasons why this could be confusing, not the least of which being that the `match` may be executed in a loop and the variable may already be bound by a previous iteration. (True, this has to do with the scope we’ve adopted for capture variables. But believe me, giving each case clause its own scope is quite horrible by itself, and there are other action-at-a-distance effects that are equally bad.)
It’s been proposed to explicitly state the names of the variables bound in a header of the `match` statement; but this doesn’t scale when the number of cases becomes larger, and requires users to do bookkeeping the compiler should be able to do. We’re really looking for a solution that tells you when you’re looking at an individual `case` which variables are captured and which are used for load-and-compare.
Marking up the capture variables with some sigil (e.g. `$x` or `x?`) or other markup (e.g. backticks or `<x>`) makes this common case ugly and inconsistent: it’s unpleasant to see for example
```
case %x, %y:
    print(x, y)
```
No other language we’ve surveyed uses special markup for capture variables; some use special markup for load-and-compare, so we’ve explored this. In fact, in version 1 of the PEP our long-debated solution was to use a leading dot. This was however booed off the field, so for version 2 we reconsidered. In the end nothing struck our fancy (if `.x` is unacceptable, it’s unclear why `^x` would be any better), and we chose a simpler rule: named constants are only recognized when referenced via some namespace, such as `mod.var` or `Color.RED`.
We believe it’s acceptable that things looking like `mod.var` are never considered capture variables -- the common use cases for `match` are such that one would almost never want to capture into a different namespace. (Just like you very rarely see `for self.i in …` and never `except E as scope.var` -- the latter is illegal syntax and sets a precedent.)
One author would dearly have seen Scala’s uppercase rule adopted, but in the end was convinced by the others that this was a bad idea, both because there’s no precedent in Python’s syntax, and because many human languages simply don’t make the distinction between lowercase and uppercase in their writing systems.
So what should you do if you have a local variable (say, a function argument) that you want to use as a value in a pattern? One solution is to capture the value in another variable and use a guard to compare that variable to the argument:
```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            # Match
```
If this really is a deal-breaker after all other issues have been settled, we could go back to considering some special markup for load-and-compare of simple names (even though we expect this case to be very rare). But there’s no pressing need to decide to do this now -- we can always add new markup for this purpose in a future version, as long as we continue to support dotted names without markup, since that *is* a commonly needed case.
There’s one other issue where in the end we could be convinced to compromise: whether to add an `else` clause in addition to `case _`. In fact, we probably would already have added it, except for one detail: it’s unclear whether the `else` should be aligned with `case` or `match`. If we are to add this we would have to ask the Steering Council to decide for us, as the authors deadlocked on this question.
Regarding the syntax for wildcards and OR patterns, the PEP explains why `_` and `|` are the best choices here: no other language surveyed uses anything but `_` for wildcards, and the vast majority uses `|` for OR patterns. A similar argument applies to class patterns.
If you've made it this far, here are the links to check out, with an open mind. As a reminder, the introductory sections (Abstract, Overview, and Rationale and Goals) have been entirely rewritten and also serve as introduction and tutorial.
- PEP 622: https://www.python.org/dev/peps/pep-0622/ - Playground: https://mybinder.org/v2/gh/gvanrossum/patma/master?urlpath=lab/tree/playgrou...
-- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?) _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LOXEATGF... Code of Conduct: http://python.org/psf/codeofconduct/
-- Kind regards, Stefano Borini
On 7/10/2020 1:21 AM, Stefano Borini wrote:
Just my 2 cents: I find it kind of annoying that the whole structure requires two levels of indentation to actually reach the operational code. This would be a first in Python.
I would prefer an option akin to if elif elif else where each block is only one level deep. Me too.
That would also sidestep the dilemma of whether else: (if implemented) should be indented like case: or like match:, because they would be the same.

match: t
case ("rect", real, imag):
    return complex(real, imag)
case ("polar", r, phi):
    return complex(r * cos(phi), r * sin(phi))
else:
    return None

but it does make the match: block not a statement group, which was disturbing to some. On the other hand, this has a correspondence to:

try:
    raise expression
except (type of expression) as exc1:
    blah blah1
except (another type) as exc2:
    blah blah2
else:
    blah blah3

In fact, one _could_ wrap this whole feature into the try: syntax... the match statement would be tried, and the cases would be special types of exception handlers:

try:
    match expression
case ("rect", real, imag):
    return complex(real, imag)
case ("polar", r, phi):
    return complex(r * cos(phi), r * sin(phi))
else:
    return None

If the expression could fail to be calculated, one could have a mix of except clauses also to catch those, rather than needing to wrap the whole match expression in a separate try to handle that case [making the nesting even deeper :( ]

There might even be a use for using case clauses to extend "normal" exception handling, where the exception object could be tested for its content as well as its class to have different handling.

try:
    raise Exception("msg", 35, things)
case Exception(x, "widgets"):
    blah blah 1
case Exception(x, "characters"):
    blah blah 2
else:
    blah blah 3

In this not-fully-thought-through scenario, maybe the keyword match isn't even needed: "raise expression" could do the job, or they could be aliases to signify intent. In other words, a match expression would always "fail". The only mismatch here is that it points out the difference between try-else and match-else: try-else is executed if there is no failure, but if match always fails, else would never be appropriate, and case _: would be.
In any case, it does seem there is a strong correlation between match processing and try processing, that I didn't see during other discussions of the possible structural similarities. "match 3 / 0:" would clearly need to be wrapped in a try:

try:
    match x / y:
case 43:
    print("wow, it is 43")
case 22:
    print("22 seemed less likely than 43 for some reason")
case _:
    print("You get what you get")
except ZeroDivisionError as exc:
    print(f"But sometimes you get an exception {exc}")

or:

try:
    raise x / y
case 43:
    print("wow, it is 43")
case 22:
    print("22 seemed less likely than 43 for some reason")
case exc := ZeroDivisionError:
    print(f"But sometimes you get an exception: {exc}")
case _:
    print("You get what you get")
If we are still not certain about the exact language to describe match then I would ask if the 'case' token is really required. It seems that I would prefer

match expr:
    pattern0:
        block0
    pattern1:
        block1
    .....
    else:
        blockdefault

where the else: clause is optional. Also for me the unusual case is the assignment to names in the pattern, and I would prefer that it be marked in some way; I didn't like .name, but ?name seems OK (or perhaps => name). Also the restriction that assigned vars should only occur once in a pattern seems wrong. I would regard it as an additional constraint on the match, but I do admit I don't fully understand what's allowed in patterns. Please disregard if the above is totally stupid.
A thought about the indentation level of a speculated "else" clause... Some people have argued that "else" should be at the outer level, because that's the way it is in all the existing compound statements. However, in those statements, all the actual code belonging to the statement is indented to the same level:

if a:
    ....
elif b:
    ....
else:
    ....
    ^
    |
    Code all indented to this level

But if we were to indent "else" to the same level as "match", the code under it would be at a different level from the rest.

match a:
    case 1:
        ....
    case 2:
        ....
else:
    ....
    ^   ^
    |   |
    Code indented to two different levels

This doesn't seem right to me, because all of the cases, including the else, are on the same footing semantically, just as they are in an "if" statement. -- Greg
On Fri, 10 Jul 2020 at 12:08, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
A thought about the indentation level of a speculated "else" clause...
Some people have argued that "else" should be at the outer level, because that's the way it is in all the existing compound statements.
However, in those statements, all the actual code belonging to the statement is indented to the same level:
if a:
    ....
elif b:
    ....
else:
    ....
    ^
    |
    Code all indented to this level
But if we were to indent "else" to the same level as "match", the code under it would be at a different level from the rest.
match a:
    case 1:
        ....
    case 2:
        ....
else:
    ....
    ^   ^
    |   |
    Code indented to two different levels
This doesn't seem right to me, because all of the cases, including the else, are on the same footing semantically, just as they are in an "if" statement.
That's a good point - and sufficiently compelling that (if "align else with match" ends up being the syntax) I'd always use "case _" rather than else. Equally, of course, it means that aligning else with match gives users a choice of which indentation they prefer: * Align with cases - use "case _" * Align with match - use "else" I've pretty much convinced myself that whatever happens, I'll ignore else and just use "case _" everywhere (and mandate it in projects I work on, where I have control over style). One thought - what will tools like black do in the "case _ vs else" debate? I can foresee some amusing flamewars if linters and formatters end up preferring one form over the other... Paul
On 10/07/2020 12:33, Greg Ewing wrote:
A thought about the indentation level of a speculated "else" clause...
Some people have argued that "else" should be at the outer level, because that's the way it is in all the existing compound statements.
However, in those statements, all the actual code belonging to the statement is indented to the same level:
if a:
    ....
elif b:
    ....
else:
    ....
    ^
    |
    Code all indented to this level
But if we were to indent "else" to the same level as "match", the code under it would be at a different level from the rest.
match a:
    case 1:
        ....
    case 2:
        ....
else:
    ....
    ^   ^
    |   |
    Code indented to two different levels
This doesn't seem right to me, because all of the cases, including the else, are on the same footing semantically, just as they are in an "if" statement.
I feel all those who aren't directly arguing against it are working off the assumption that it is needed for match and case to have different levels of indentation, but is this really true? Is there anything (bar tradition or other subjective arguments) that speaks in favour of this, especially in light of the fact that having the same indentation level would also solve other problems? A few emails ago I proposed something like this (and I'm probably only the last one to do so amongst many), but if anyone made an argument against it I must have missed it:

match:
    a
case 1:
    ...
case 2:
    ...
else:
    ...

(The a on a separate line being arguable.) I think it would look neater, be reminiscent of the if/elif/else syntax we're all familiar with, and solve the issue of where to indent the else.
Federico Salerno wrote:
Is there anything (bar tradition or other subjective arguments) that speaks in favour of this, especially in light of the fact that having the same indentation level would also solve other problems? ...if anyone made an argument against it I must have missed it:
We spend a fair bit of time discussing this exact proposal in the PEP (I just searched for "indent"): https://www.python.org/dev/peps/pep-0622/#use-a-flat-indentation-scheme
Hello, On Sat, 11 Jul 2020 00:35:39 +0200 Federico Salerno <salernof11@gmail.com> wrote: []
A few emails ago I proposed something like this (and I'm probably only the last one to do so amongst many), but if anyone made an argument against it I must have missed it:
The PEP itself in "rejected" ideas makes an argument against it: indented stuff after a line ending with ":" must be a *statement*. It would be totally nuts for that to be something else, e.g. an expression:
match:
    a
case 1:
    ...
case 2:
    ...
else:
    ...
(The a on a separate line being arguable.)
That of course leads us to the obvious idea:

match a:
case 1:
    ...
case 2:
    ...
else:
    ...

Of course, the PEP smartly has an argument against that too, in the vein of "after a line ending with ':', there should be an indented suite (list of statements)". But that's where it goes sideways. That argument is no better than the argument "there should be no normal-looking identifiers with magic behavior", but look, this very PEP does exactly that with the identifier "_". And if the above snippet looks weird to anybody, it's only because of all the "case" business. There wouldn't be such a problem if it was instead:

match a:
| 1:
    ...
| 2:
    ...
|:
    ...

The above ML-like syntax should be perfect for almost everyone, ... except the PEP authors, because they have it in "rejected ideas" too. -- Best regards, Paul mailto:pmiscml@gmail.com
I did make the following arguments about less indentation in https://github.com/gvanrossum/patma/issues/59 To recap:

1. Similarity to if/elif/else and try/except/finally statements in how code lines up
2. Less apparent complexity, since indentation is a visual signal for such
3. Smaller, more meaningful diffs when refactoring if/elif/else chains

Just to be clear, I wanted to capture these as possible objections; I'm not greatly in favor of one indentation scheme or the other -- there are good arguments for the indentation scheme of the current PEP (which it makes). - Jim On Fri, Jul 10, 2020 at 5:11 PM Paul Sokolovsky <pmiscml@gmail.com> wrote:
Hello,
On Sat, 11 Jul 2020 00:35:39 +0200 Federico Salerno <salernof11@gmail.com> wrote:
[]
A few emails ago I proposed something like this (and I'm probably only the last one to do so amongst many), but if anyone made an argument against it I must have missed it:
The PEP itself in "rejected" ideas makes an argument against it: indented stuff after a line ending with ":" must be a *statement*. It would be totally nuts for that to be something else, e.g. an expression:
match:
    a
case 1:
    ...
case 2:
    ...
else:
    ...
(The a on a separate line being arguable.)
That of course leads us to the obvious idea:
match a:
case 1:
    ...
case 2:
    ...
else:
    ...
Of course, the PEP smartly has an argument against that too, in the vein of "after a line ending with ':', there should be an indented suite (list of statements)". But that's where it goes sideways. That argument is no better than the argument "there should be no normal-looking identifiers with magic behavior", but look, this very PEP does exactly that with the identifier "_".
And if the above snippet looks weird to anybody, it's only because of all the "case" business. There wouldn't be such a problem if it was instead:
match a:
| 1:
    ...
| 2:
    ...
|:
    ...
The above ML-like syntax should be perfect for almost everyone, ... except the PEP authors, because they have it in "rejected ideas" too.
-- Best regards, Paul mailto:pmiscml@gmail.com
On 11/07/20 10:35 am, Federico Salerno wrote:
I feel all those who aren't directly arguing against it are working off the assumption that it is needed for match and case to have different levels of indentation, but is this really true? Is there anything (bar tradition or other subjective arguments) that speaks in favour of this,
I can't think of one at the moment, but I don't think you should dismiss tradition so easily. One of the arguments used to justify significant indentation in Python is that "you're going to indent it for readability anyway, so the compiler might as well take notice of it". For the most part, Python indentation follows what people would naturally do even if they didn't have to. So I think it's worth looking at what people typically do in other languages that don't have mandatory indentation. Taking C, for example, switch statements are almost always written like this:

switch (x) {
    case 1:
        ...
    case 2:
        ...
    default:
        ...
}

I've rarely if ever seen one written like this:

switch (x) {
    case 1:
        ...
    case 2:
        ...
default:
    ...
}

or like this:

switch (x) {
case 1:
    ...
case 2:
    ...
default:
    ...
}

This suggests to me that most people think of the cases as being subordinate to the switch, and the default being on the same level as the other cases. -- Greg
Hello, On Sat, 11 Jul 2020 22:49:09 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote: []
For the most part, Python indentation follows what people would naturally do even if they didn't have to. So I think it's worth looking at what people typically do in other languages that don't have mandatory indentation.
Taking C, for example, switch statements are almost always written like this:
switch (x) {
    case 1:
        ...
    case 2:
        ...
    default:
        ...
}
I've rarely if ever seen one written like this:
switch (x) {
    case 1:
        ...
    case 2:
        ...
default:
    ...
}
Indeed, that's unheard of (outside of random pupil dirtcode). Actually, the whole argument in PEP 622 regarding "else:", that its placement is ambiguous, sounds like a rather artificial write-off. Individual "case"'s are aligned together, but suddenly it's unclear how to align the default case, introduced by "else"? Who in good faith would align it with "match"?
or like this:
switch (x) {
case 1:
    ...
case 2:
    ...
default:
    ...
}
Oh really, you never saw that? Well, they say that any programmer should eyeball the source code of the most popular open-source OS at least once: https://elixir.bootlin.com/linux/latest/source/kernel/sys.c#L2144 And a lot of projects follow the Linux codestyle, because it's familiar to many people and offers ready/easy to use infrastructure for code style control.
This suggests to me that most people think of the cases as being subordinate to the switch, and the default being on the same level as the other cases.
And to me it suggests that well established projects, which have thought out it all, aren't keen to use more indentation than really needed.
-- Greg
[] -- Best regards, Paul mailto:pmiscml@gmail.com
On 07/11/2020 04:20 AM, Paul Sokolovsky wrote:
Actually, the whole argument in PEP 622 regarding "else:", that its placement is ambiguous sounds like a rather artificial write-off. Individual "case"'s are aligned together, but suddenly, it's unclear how to align the default case, introduced by "else"? Who in good faith would align it with "match"?
I would. -- ~Ethan~
On Sat, 11 Jul 2020 at 14:39, Ethan Furman <ethan@stoneleaf.us> wrote:
On 07/11/2020 04:20 AM, Paul Sokolovsky wrote:
Actually, the whole argument in PEP 622 regarding "else:", that its placement is ambiguous sounds like a rather artificial write-off. Individual "case"'s are aligned together, but suddenly, it's unclear how to align the default case, introduced by "else"? Who in good faith would align it with "match"?
I would.
How do you feel about the fact that

match EXPR:
    case 1:
        print("One")
    case _:
        print("default")

and

match EXPR:
    case 1:
        print("One")
else:
    print("default")

are semantically completely identical, but syntactically must be indented differently? That for me is probably the most compelling reason for preferring to indent else to the same level as case¹. I'm curious to understand how people who prefer aligning else with match view this. (Not least, because I anticipate some "interesting" code style flamewars over this ;-)) Paul ¹ When I say "most compelling" I mean "inclines me to have a mild preference" :-) In reality I mostly don't care, and I'll probably just use "case _" in any projects I work on and ignore the existence of "else" altogether.
To me, "else:" has a slightly different meaning than "case _:" case _: essentially a default, ensuring that the match logic is complete. else: OK, the subject of this match failed, here is our fallback logic. Whether this distinction is important enough to express in code is another question, as is whether or not anyone but me would follow this "obvious" convention. So I'm not convinced the difference justifies the existence a second syntax. But I'm also not sure it doesn't, particularly if that distinction were given in the PEP and in documentation for the match statement. -jJ
On 07/11/2020 10:29 AM, Jim J. Jewett wrote:
To me, "else:" has a slightly different meaning than "case _:"
case _: essentially a default, ensuring that the match logic is complete.
else: OK, the subject of this match failed, here is our fallback logic.
Whether this distinction is important enough to express in code is another question, as is whether or not anyone but me would follow this "obvious" convention. So I'm not convinced the difference justifies the existence a second syntax. But I'm also not sure it doesn't, particularly if that distinction were given in the PEP and in documentation for the match statement.
This is exactly how I would use it. -- ~Ethan~
On Sat, Jul 11, 2020 at 2:45 PM Ethan Furman <ethan@stoneleaf.us> wrote:
On 07/11/2020 10:29 AM, Jim J. Jewett wrote:
To me, "else:" has a slightly different meaning than "case _:"
case _: essentially a default, ensuring that the match logic is complete.
else: OK, the subject of this match failed, here is our fallback logic.
Whether this distinction is important enough to express in code is another question, as is whether or not anyone but me would follow this "obvious" convention. So I'm not convinced the difference justifies the existence a second syntax. But I'm also not sure it doesn't, particularly if that distinction were given in the PEP and in documentation for the match statement.
This is exactly how I would use it.
Hm... Just the fact that people have been arguing both sides so convincingly makes me worry that something bigger is amiss. I think we're either better off without `else` (since the indentation of `case _` cannot be disputed :-), or we have to revisit the reasons for indenting `case` relative to `match`. As MRAB said, it's a case of picking the least inelegant one. Let me add that the parser can easily deal with whatever we pick -- this is purely about human factors. -- --Guido van Rossum (python.org/~guido) Pronouns: he/him (why is my pronoun here?)
On 07/11/2020 03:31 PM, Guido van Rossum wrote:
On Sat, Jul 11, 2020 at 2:45 PM Ethan Furman wrote:
On 07/11/2020 10:29 AM, Jim J. Jewett wrote:
To me, "else:" has a slightly different meaning than "case _:"
case _: essentially a default, ensuring that the match logic is complete.
else: OK, the subject of this match failed, here is our fallback logic.
Whether this distinction is important enough to express in code is another question, as is whether or not anyone but me would follow this "obvious" convention. So I'm not convinced the difference justifies the existence a second syntax. But I'm also not sure it doesn't, particularly if that distinction were given in the PEP and in documentation for the match statement.
This is exactly how I would use it.
Hm... Just the fact that people have been arguing both sides so convincingly makes me worry that something bigger is amiss.
I think it just means there are several "right" ways to do it, we just need to pick one. I'll be happy to use whatever we end up with*. -- ~Ethan~ * I hope. ;-)
On 7/11/2020 6:31 PM, Guido van Rossum wrote:
Hm... Just the fact that people have been arguing both sides so convincingly makes me worry that something bigger is amiss. I think we're either better off without `else` (since the indentation of `case _` cannot be disputed :-), or we have to revisit the reasons for indenting `case` relative to `match`. As MRAB said, it's a case of picking the least inelegant one.
The more I think about adjusting IDLE's smart indenter to indent case suites once, with 2 half-size indents, the more I would prefer to special-case 'match' to have no indent (if there were no suite allowed) than to special-case 'match' and 'case' to indent by half an indent each.

Problem 1. Indent increments can be set to an odd number of spaces! (.rst files use 3-space indents.)

Problem 2. Dedent (with Backspace) could no longer be the simple expression I expect it currently is, such as new-indent = current-indent spaces // current indent size. IDLE would have to search backwards to find the appropriate header line, which might not be the most recent one. A stack of indent (delta, size) pairs would have to be recalculated from the most recent non-indented compound header line whenever the cursor is moved around. Note that the current indent may not be a multiple of the indent size due to PEP 8 vertical alignments, such as with function parameters or arguments:

```
if a:
    def fg(param1,  # comment
           param2,  # 11-space indent
           |        # Backspace moves cursor left 3 spaces.
```

The flexibility of the PEG parser needs to be used with care because it can allow constructs that are difficult for people and non-PEG code to read and process.

-- Terry Jan Reedy
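The "simple expression" Backspace rule Terry describes can be sketched as follows (a hypothetical illustration of the arithmetic, not IDLE's actual code). Half-size indents would break its core assumption that every indent stop is a multiple of the indent size:

```python
def simple_dedent(current_indent: int, indent_size: int = 4) -> int:
    # If the cursor sits past an indent stop (e.g. a PEP 8 alignment
    # at column 11), Backspace drops back to the previous multiple
    # of the indent size.
    if current_indent % indent_size:
        return (current_indent // indent_size) * indent_size
    # Otherwise drop one full indent level (never below column 0).
    return max(current_indent - indent_size, 0)

print(simple_dedent(11))  # 8: the 11-space alignment in the example above
print(simple_dedent(8))   # 4
```

With half-size `case` indents, valid stops would no longer all be multiples of `indent_size`, so this pure function of the current column would no longer suffice; the editor would have to inspect preceding header lines.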
On 07/11/2020 10:29 AM, Jim J. Jewett wrote:
To me, "else:" has a slightly different meaning than "case _:"
case _: essentially a default, ensuring that the match logic is complete.
else: OK, the subject of this match failed, here is our fallback logic.
Is there anywhere else where Python goes out of its way to provide two ways of doing the same thing just because they might feel semantically different? -- Greg
On 11/07/20 11:20 pm, Paul Sokolovsky wrote:
On Sat, 11 Jul 2020 22:49:09 +1200 Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
or like this:
```
switch (x) {
case 1:
    ...
case 2:
    ...
default:
    ...
}
```
Oh really, you never saw that? Well, they say that any programmer should eyeball the source code of the most popular open-source OS at least once: https://elixir.bootlin.com/linux/latest/source/kernel/sys.c#L2144
I stand corrected -- it seems I haven't looked at other people's switch statements all that much. I can see that being a reasonable choice if you're using 8-space indents, but I don't see that done much in Python. Also, directly translating this into Python leads to something that looks like a mistake:

```
match x:
case 1:
    ...
case 2:
    ...
```

and as has been pointed out, the alternative of putting x on the next line is unprecedented in Python.

-- Greg
On 7/11/2020 7:54 PM, Greg Ewing wrote:
I can see that being a reasonable choice if you're using 8-space indents, but I don't see that done much in Python.
Also, directly translating this into Python leads to something that looks like a mistake:
```
match x:
case 1:
    ...
case 2:
    ...
```
and as has been pointed out, the alternative of putting x on the next line is unprecedented in Python.
If the 2 levels of indenting are really offensive, surely we could teach editors, black, ourselves, etc. to indent the match statement as:

```
match pt:
  case (x, y):               # <-- indent by two spaces
    return Point3d(x, y, 0)  # <-- indent by 2 more spaces, for a total of 4
if x:
    return x                 # <-- normally indent by 4 spaces
```

I used to do something similar with C switch statements. I guess you couldn't use this trick if you were using tabs. Another reason to not use them!

Eric
On 11/07/2020 12:20, Paul Sokolovsky wrote:
Actually, the whole argument in PEP 622 regarding "else:", that its placement is ambiguous sounds like a rather artificial write-off. Individual "case"'s are aligned together, but suddenly, it's unclear how to align the default case, introduced by "else"? Who in good faith would align it with "match"?
I would, if I'd used an "else" with a "for" recently. I would have a strong tendency to align the "else" with the "case" statements, but I can see how the other way around makes sense too. (I can't see how anyone likes the Linux case indentation style at all. It's horrible to read.) -- Rhodri James *-* Kynesim Ltd
Given that case will be a keyword, what's the case (pun unintentional) for indenting the case clauses? What's wrong with 'case' and 'else' both indented the same as match? Without the keyword there'd be a case for indenting, but with it I don't see the necessity. Kind regards, Steve On Fri, Jul 10, 2020 at 12:09 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
A thought about the indentation level of a speculated "else" clause...
Some people have argued that "else" should be at the outer level, because that's the way it is in all the existing compound statements.
However, in those statements, all the actual code belonging to the statement is indented to the same level:
```
if a:
    ....
elif b:
    ....
else:
    ....

    ^
    |
    Code all indented to this level
```
But if we were to indent "else" to the same level as "match", the code under it would be at a different level from the rest.
```
match a:
    case 1:
        ....
    case 2:
        ....
else:
    ....

    ^   ^
    |   |
    Code indented to two different levels
```
This doesn't seem right to me, because all of the cases, including the else, are on the same footing semantically, just as they are in an "if" statement.
-- Greg _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ACH4QXCU... Code of Conduct: http://python.org/psf/codeofconduct/
On 2020-07-11 20:56, Steve Holden wrote:
Given that case will be a keyword, what's the case (pun unintentional) for indenting the case clauses? What's wrong with 'case' and 'else' both indented the same as match? Without the keyword there'd be a case for indenting, but with it I don't see the necessity.
Kind regards, Steve
The issue is as follows. In Python, the style is that multi-line statements start with a keyword and that logical line ends with a colon. The next line is a statement, which is indented. The other parts of the statement structure are indented the same amount as its first line:

```
if ...:
    ...
elif ...:
    ...
else:
    ...
```

The 'match' statement (or a 'switch' statement), however, can't follow the existing style. It's either:

```
match ...:
case ...:
case ...:
```

where the second line isn't indented (unlike all other structures), or:

```
match ...:
    case ...:
    case ...:
```

where the other parts of the structure (the cases) aren't indented the same as the first line. Another possibility is:

```
match:
    ...
case ...:
case ...:
```

but the second line is an expression, not a statement (unlike all other structures). An alternative is:

```
match ...
case ...:
case ...:
```

no colon ending the first line, and no indenting of the second line, but that's unlike all other structures too. None of the possibilities are formatted like the existing style. So it's a case of picking the least inelegant one.
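For comparison, the status-quo dispatch that the thread keeps measuring these layouts against (the PEP's running rect/polar example) spelled as a plain if/elif chain. This is a sketch; the `convert` helper name is mine, not from the PEP:

```python
from math import cos, sin

def convert(t):
    # Dispatch on the tag element by hand, the way Python code does
    # today without a match statement.
    if t[0] == "rect":
        _, real, imag = t
        return complex(real, imag)
    elif t[0] == "polar":
        _, r, phi = t
        return complex(r * cos(phi), r * sin(phi))
    else:
        return None

print(convert(("rect", 1.0, 2.0)))   # (1+2j)
print(convert(("polar", 2.0, 0.0)))  # (2+0j)
```

Note the if/elif spelling needs both an explicit tag test and a separate unpacking step per branch, which is exactly the duplication a `case` pattern folds into one line.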
On Fri, Jul 10, 2020 at 12:09 PM Greg Ewing <greg.ewing@canterbury.ac.nz <mailto:greg.ewing@canterbury.ac.nz>> wrote:
A thought about the indentation level of a speculated "else" clause...
Some people have argued that "else" should be at the outer level, because that's the way it is in all the existing compound statements.
However, in those statements, all the actual code belonging to the statement is indented to the same level:
```
if a:
    ....
elif b:
    ....
else:
    ....

    ^
    |
    Code all indented to this level
```
But if we were to indent "else" to the same level as "match", the code under it would be at a different level from the rest.
```
match a:
    case 1:
        ....
    case 2:
        ....
else:
    ....

    ^   ^
    |   |
    Code indented to two different levels
```
This doesn't seem right to me, because all of the cases, including the else, are on the same footing semantically, just as they are in an "if" statement.
On Jul 11, 2020, at 13:28, MRAB <python@mrabarnett.plus.com> wrote:
Another possibility is:
```
match:
    ...
case ...:
case ...:
```
It’s ugly, but you could introduce and require a (soft) keyword on the line after match, e.g.

```
match:
    as expression  # Can’t really use `with` here although I think it reads better.
    case …
```

I still wish cases lined up under match, but it’s not a deal breaker for me.

-Barry
While I understand the point of view that says that match ... : should encapsulate a sequence of indented suites, it seems to me that match/case/case/.../else has a natural affinity with try/except/except/.../finally/else, and nobody seems to think that the excepts should be indented. Or the finally. And naturally the match/else case are at the same indentation level, just as for/else, while/else and try/finally. So why, exactly, should case be indented? My apologies for being a Bear of Very Little Brain. On Fri, Jul 10, 2020 at 12:09 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
A thought about the indentation level of a speculated "else" clause...
Some people have argued that "else" should be at the outer level, because that's the way it is in all the existing compound statements.
However, in those statements, all the actual code belonging to the statement is indented to the same level:
```
if a:
    ....
elif b:
    ....
else:
    ....

    ^
    |
    Code all indented to this level
```
But if we were to indent "else" to the same level as "match", the code under it would be at a different level from the rest.
```
match a:
    case 1:
        ....
    case 2:
        ....
else:
    ....

    ^   ^
    |   |
    Code indented to two different levels
```
This doesn't seem right to me, because all of the cases, including the else, are on the same footing semantically, just as they are in an "if" statement.
-- Greg
On 16/07/2020 17:37, Steve Holden wrote:
While I understand the point of view that says that match ... : should encapsulate a sequence of indented suites, it seems to me that match/case/case/.../else has a natural affinity with try/except/except/.../finally/else, and nobody seems to think that the excepts should be indented. Or the finally. And naturally the match/else case are at the same indentation level, just as for/else, while/else and try/finally. So why, exactly, should case be indented?
My take on the difference would be that "try" tries out a suite, while "match" matches an expression. If we did:

```
match:
    <expression>
case <pattern>:
    <suite>
```

then having an indented section which must be a single expression would be unique in Python syntax. I could easily see people being confused by the slew of statements they would inevitably decide they must be able to put there, and soon we'd have cats and dogs living together and the downfall of civilisation as we know it. Alternatively:

```
match <expression>:
case <pattern>:
    <suite>
```

would be the one place in Python where you end a line with a colon and *don't* indent the following line. Writers of simple formatters and the like (such as Python-mode in Emacs) would curse your name, etc, etc.
My apologies for being a Bear of Very Little Brain.
Nah, don't apologise. This is one of those things that everyone has opinions on, because there doesn't seem to be an obvious Right Answer. -- Rhodri James *-* Kynesim Ltd
On 16/07/2020 19:00, Rhodri James wrote:
On 16/07/2020 17:37, Steve Holden wrote:
While I understand the point of view that says that match ... : should encapsulate a sequence of indented suites, it seems to me that match/case/case/.../else has a natural affinity with try/except/except/.../finally/else, and nobody seems to think that the excepts should be indented. Or the finally. And naturally the match/else case are at the same indentation level, just as for/else, while/else and try/finally. So why, exactly, should case be indented? [...] If we did:
```
match:
    <expression>
case <pattern>:
    <suite>
```
then having an indented section which must be a single expression would be unique in Python syntax.
[...]
```
match <expression>:
case <pattern>:
    <suite>
```
would be the one place in Python where you end a line with a colon and **don't** indent the following line.
It seems relevant to mention that before Python's unique syntax for a ternary operator (x = value if condition else default_value), you would never find an if or else without a colon and an indented block, and the order (value, condition, default) is different from what is usually found in other languages (condition, value, default). That hasn't stopped Python from implementing the feature in its own way, and I don't see why this PEP should be different, since Python is not other languages and match suites are not other suites.
I could easily see people being confused by the slew of statements they would inevitably decide they must be able to put there, and soon we'd have cats and dogs living together and the downfall of civilisation as we know it. [...] Writers of simple formatters and the like (such as Python-mode in Emacs) would curse your name, etc, etc.

Tools should adapt to the language, not the other way around. If things had to be done the way they had always been done, without any change, for fear of people not being used to it, we wouldn't even have Python at all. People learn and adapt. It seems like a small price to pay in exchange for consistency and removal of ambiguity, considering people will still have to learn the new feature one way or another.
On Fri, Jul 17, 2020 at 3:25 AM Federico Salerno <salernof11@gmail.com> wrote:
Tools should adapt to the language, not the other way around. If things had to be done the way they had always been done, without any change, for fear of people not being used to it, we wouldn't even have Python at all. People learn and adapt. It seems like a small price to pay in exchange for consistency and removal of ambiguity, considering people will still have to learn the new feature one way or another.
But consistency is exactly what you'd be destroying here. Python is extremely consistent in that you ALWAYS indent after a line ends with a colon, and what comes after it is logically contained within that statement. It's about whether *people* can handle it, more than whether tools can, and the consistency helps a lot with that. ChrisA
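Chris's "you ALWAYS indent after a line ends with a colon" rule is visible in the token stream itself. A small sketch using the stdlib tokenize module (my own illustration, not anything from the PEP):

```python
import io
import tokenize

# Tokenize a minimal snippet: the NEWLINE that ends a block-opening
# colon line is immediately followed by an INDENT token.
src = "if a:\n    pass\n"
toks = [tokenize.tok_name[tok.type]
        for tok in tokenize.generate_tokens(io.StringIO(src).readline)]
first_newline = toks.index("NEWLINE")  # the NEWLINE ending "if a:"
print(toks[first_newline + 1])  # INDENT
```

A `match ...:` line whose cases were not indented would be the one place where this pairing of a block-opening colon with a following INDENT breaks down.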
On Jul 16, 2020, at 1:36 PM, Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Jul 17, 2020 at 3:25 AM Federico Salerno <salernof11@gmail.com> wrote:
Tools should adapt to the language, not the other way around. If things had to be done the way they had always been done, without any change, for fear of people not being used to it, we wouldn't even have Python at all. People learn and adapt. It seems like a small price to pay in exchange for consistency and removal of ambiguity, considering people will still have to learn the new feature one way or another.
But consistency is exactly what you'd be destroying here. Python is extremely consistent in that you ALWAYS indent after a line ends with a colon, and what comes after it is logically contained within that statement. It's about whether *people* can handle it, more than whether tools can, and the consistency helps a lot with that.
ChrisA
One question that comes to mind, does match NEED a colon at the end of it? If it didn’t have the colon, it wouldn’t need the indent for the following lines.
On 2020-07-16 19:05, Richard Damon wrote:
On Jul 16, 2020, at 1:36 PM, Chris Angelico <rosuav@gmail.com> wrote:
On Fri, Jul 17, 2020 at 3:25 AM Federico Salerno <salernof11@gmail.com> wrote:
Tools should adapt to the language, not the other way around. If things had to be done the way they had always been done, without any change, for fear of people not being used to it, we wouldn't even have Python at all. People learn and adapt. It seems like a small price to pay in exchange for consistency and removal of ambiguity, considering people will still have to learn the new feature one way or another.
But consistency is exactly what you'd be destroying here. Python is extremely consistent in that you ALWAYS indent after a line ends with a colon, and what comes after it is logically contained within that statement. It's about whether *people* can handle it, more than whether tools can, and the consistency helps a lot with that.
ChrisA
One question that comes to mind, does match NEED a colon at the end of it? If it didn’t have the colon, it wouldn’t need the indent for the following lines.
All of the others have the colon.
On Thu, 16 Jul 2020 at 19:09, Richard Damon <Richard@damon-family.org> wrote:
One question that comes to mind, does match NEED a colon at the end of it? If it didn’t have the colon, it wouldn’t need the indent for the following lines.
Or something like

```
match t
case ("rect", real, imag):
    return complex(real, imag)
case ("polar", r, phi):
    return complex(r * cos(phi), r * sin(phi))
else:
    pass
```

-- Kind regards, Stefano Borini
On 7/16/2020 10:00 AM, Rhodri James wrote:
On 16/07/2020 17:37, Steve Holden wrote:
While I understand the point of view that says that match ... : should encapsulate a sequence of indented suites, it seems to me that match/case/case/.../else has a natural affinity with try/except/except/.../finally/else, and nobody seems to think that the excepts should be indented. Or the finally. And naturally the match/else case are at the same indentation level, just as for/else, while/else and try/finally. So why, exactly, should case be indented?
My take on the difference would be that "try" tries out a suite, while "match" matches an expression. If we did:
```
match:
    <expression>
case <pattern>:
    <suite>
```
then having an indented section which must be a single expression would be unique in Python syntax. I could easily see people being confused when the slew of statements they would inevitably decide they must be able to put there, and soon we'd have cats and dogs living together and the downfall of civilisation as we know it.
Alternatively:
```
match <expression>:
case <pattern>:
    <suite>
```
would be the one place in Python where you end a line with a colon and *don't* indent the following line. Writers of simple formatters and the like (such as Python-mode in Emacs) would curse your name, etc, etc.
My apologies for being a Bear of Very Little Brain.
Nah, don't apologise. This is one of those things that everyone has opinions on, because there doesn't seem to be an obvious Right Answer.
Speaking of things unique in Python syntax, isn't the double indentation of a single control flow structure introduced by match also unique?
On 2020-07-16 17:37, Steve Holden wrote:
While I understand the point of view that says that match ... : should encapsulate a sequence of indented suites, it seems to me that match/case/case/.../else has a natural affinity with try/except/except/.../finally/else, and nobody seems to think that the excepts should be indented. Or the finally. And naturally the match/else case are at the same indentation level, just as for/else, while/else and try/finally. So why, exactly, should case be indented?
My apologies for being a Bear of Very Little Brain.
[snip] For all other statement structures (if, while, try, etc.), the first line ends with a colon and the second line is indented (and is a statement). Therefore the cases should be indented. However, for all other statement structures (if, while, try, etc.), the other parts of the structure itself (elif, else, except, etc.) aren't indented. Therefore the cases shouldn't be indented. Either way, it's inelegant.
On 16/07/2020 18:13, MRAB wrote:
On 2020-07-16 17:37, Steve Holden wrote:
While I understand the point of view that says that match ... : should encapsulate a sequence of indented suites, it seems to me that match/case/case/.../else has a natural affinity with try/except/except/.../finally/else, and nobody seems to think that the excepts should be indented. Or the finally. And naturally the match/else case are at the same indentation level, just as for/else, while/else and try/finally. So why, exactly, should case be indented?
My apologies for being a Bear of Very Little Brain.
[snip] For all other statement structures (if, while, try, etc.), the first line ends with a colon and the second line is indented (and is a statement). Therefore the cases should be indented.
However, for all other statement structures (if, while, try, etc.), the other parts of the structure itself (elif, else, except, etc.) aren't indented. Therefore the cases shouldn't be indented.
Either way, it's inelegant. Absolutely true. However:
I think that equal indentation suggests suites of equal status. Increased indentation suggests suites are subordinate to a previous suite. Consider these examples:

(1) if...elif...else: Suites are definitely co-equal (they are alternatives, only one of which is chosen). So if/elif/else have equal indentation.

(2) for...else or while...else: It's arguable IMO (YMMV); Python has chosen to indent them equally. I don't think it would have been outrageous to indent the 'else:'; one reason not to is that the 'else' might not stand out (it would typically be indented equally with the preceding line of code), unless it was allowed/forced to be indented *less than* 'for'.

(3) try...except...else: IMO also arguable (at most one of 'except' and 'else' can be executed).

(4) try...finally: The suites have equal status (both are always executed), so they have equal indentation.

Now to me, 'case' clauses are *subordinate* to 'match'. After all, the value after 'match' applies to all the following 'case's. So I would argue in favour of indenting 'case' statements. This would make them stand out *more* (making a virtue out of 'match' *not* being followed by an indented suite). A Good Thing. One of the purposes of indentation is to make program structure clearly visible, which IMO this would do.

Or from a slightly different point of view: "match...case...case..." is a single construct, a self-contained program component. Indenting the 'case's would stop them from distracting visually from other suites that are indented the same as 'match'. (I.e. other program components of equal status.)

(If 'else' were allowed after 'match', I would argue that it should also be indented, for the same reasons, and because it is logically a case, albeit a special one.)

Rob Cliffe
On Fri, 10 Jul 2020 at 10:33, Glenn Linderman <v+python@g.nevcal.com> wrote:
On 7/10/2020 1:21 AM, Stefano Borini wrote:
Just my 2 cents, I find it kind of annoying that the whole structure requires two levels of indentation to actually reach the operational code. This would be a first in python.
I would prefer an option akin to if elif elif else where each block is only one level deep. Me too.
That would also sidestep the dilemma of whether else: (if implemented) should be indented like case: or like match: because they would be the same.
```
match: t
case ("rect", real, imag):
    return complex(real, imag)
case ("polar", r, phi):
    return complex(r * cos(phi), r * sin(phi))
else:
    return None
```
but it does make the match: block not a statement group, which was disturbing to some.
On the other hand, this has a correspondence to:
```
try:
    raise expression
except (type of expression) as exc1:
    blah blah1
except (another type) as exc2:
    blah blah2
else:
    blah blah3
```
The problem of the try...except structure, with less indentation, is that, yes, it is OK for exceptions because normally you have 2 or 3 `except XXX` clauses, so it is usually easy to follow, as long as the number of vertical lines in the entire try...except block is low enough.

But I have had cases with catching many exception types, each with its own block of 4 or 5 lines, adding up to a block of try-excepts that doesn't even fit in a single window of my editor. In that case, I have always wished for except clauses to be extra indented, to more easily distinguish where the try...except block ends. Therefore, I posit that the style of try...except indentation only works when the number of cases is small.

But for the case of pattern matching, I expect the number of cases to be matched to be a lot higher than in exception handling. Having the cases to be matched indented is, IMHO, a nice visual cue to help the reader understand where the pattern matching block ends.
In fact, one _could_ wrap this whole feature into the try: syntax... the match statement would be tried, and the cases would be special types of exception handlers:
```
try:
    match expression
case ("rect", real, imag):
    return complex(real, imag)
case ("polar", r, phi):
    return complex(r * cos(phi), r * sin(phi))
else:
    return None
```
If the expression could fail to be calculated, one could have a mix of except clauses also to catch those, rather than needing to wrap the whole match expression in a separate try to handle that case [making the nesting even deeper :( ]
There might even be a use for using case clauses to extend "normal" exception handling, where the exception object could be tested for its content as well as its class to have different handling.
```
try:
    raise Exception("msg", 35, things)
case Exception(x, "widgets"):
    blah blah 1
case Exception(x, "characters"):
    blah blah 2
else:
    blah blah 3
```
In this not-fully-thought-through scenario, maybe the keyword match isn't even needed: "raise expression" could do the job, or they could be aliases to signify intent.
In other words, a match expression would always "fail". The only mismatch here is that it points out the difference between try-else and match-else: try-else is executed if there is no failure, but if match always fails, else would never be appropriate, and case _: would be.
In any case, it does seem there is a strong correlation between match processing and try processing, that I didn't see during other discussions of the possible structural similarities. "match 3 / 0:" would clearly need to be wrapped in a try:
```
try:
    match x / y:
        case 43:
            print("wow, it is 43")
        case 22:
            print("22 seemed less likely than 43 for some reason")
        case _:
            print("You get what you get")
except ZeroDivisionError as exc:
    print(f"But sometimes you get an exception {exc}")
```
or:
```
try:
    raise x / y
case 43:
    print("wow, it is 43")
case 22:
    print("22 seemed less likely than 43 for some reason")
case exc := ZeroDivisionError:
    print(f"But sometimes you get an exception: {exc}")
case _:
    print("You get what you get")
```
-- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert
On 7/10/2020 3:15 AM, Gustavo Carneiro wrote:
On Fri, 10 Jul 2020 at 10:33, Glenn Linderman <v+python@g.nevcal.com> wrote:
On 7/10/2020 1:21 AM, Stefano Borini wrote:

> Just my 2 cents, I find it kind of annoying that the whole structure requires two levels of indentation to actually reach the operational code. This would be a first in python.
> I would prefer an option akin to if elif elif else where each block is only one level deep.

Me too.
That would also sidestep the dilemma of whether else: (if implemented) should be indented like case: or like match: because they would be the same.
```
match: t
case ("rect", real, imag):
    return complex(real, imag)
case ("polar", r, phi):
    return complex(r * cos(phi), r * sin(phi))
else:
    return None
```
but it does make the match: block not a statement group, which was disturbing to some.
On the other hand, this has a correspondence to:
```
try:
    raise expression
except (type of expression) as exc1:
    blah blah1
except (another type) as exc2:
    blah blah2
else:
    blah blah3
```
The problem of the try...except structure, with less indentation, is that, yes, it is OK for exceptions because normally you have 2 or 3 `except XXX` clauses, therefore it is usually easy to follow, if the number of vertical lines in the entire block of try-catch is low enough.
But I have had cases with catching many exception types, each with its own block of 4 or 5 lines, adding up to a block of try-excepts that doesn't even fit in a single window of my editor. In that case, I always have wished for except clauses to be extra indented, to more easily distinguish where the try..except block ends.
Therefore, I posit that the style of try...except indentation only works where the number of cases is small.
But for the case of pattern matching, I expect the number of cases to be matched to be a lot higher than exception handling cases. Having cases to be matched be indented is, IMHO, a nice visual cue to help the reader understand where the pattern matching block ends.
Actually, the current if/elif/elif/else chain, used now because Python has no switch/match/case, has exactly the same issue as you describe as a problem with try if there were more cases... and if often has more cases, just like match will. So your concern seems nebulous.

You may have wished for extra indentation... but it is simple to get more indentation: use 8 spaces instead of 4. So if you really wanted it, you could have had it. It is much harder to get less indentation when the language structures prescribe it.
In fact, one _could_ wrap this whole feature into the try: syntax... the match statement would be tried, and the cases would be special types of exception handlers:
try: match expression case ("rect", real, imag): return complex(real, imag) case ("polar", r, phi): return complex( r* cos(phi), r*sin(phi) else: return None
If the expression could fail to be calculated, one could have a mix of except clauses also to catch those, rather than needing to wrap the whole match expression in a separate try to handle that case [making the nesting even deeper :( ]
There might even be a use for using case clauses to extend "normal" exception handling, where the exception object could be tested for its content as well as its class to have different handling.
try: raise Exception("msg", 35, things) case Exception( x, "widgets"): blah blah 1 case Exception( x, "characters"): blah blah 2 else: blah blah 3
In this not-fully-thought-through scenario, maybe the keyword match isn't even needed: "raise expression" could do the job, or they could be aliases to signify intent.
In other words, a match expression would always "fail". The only mismatch here is that it points out the difference between try-else and match-else: try-else is executed if there is no failure, but if match always fails, else would never be appropriate, and case _: would be.
In any case, it does seem there is a strong correlation between match processing and try processing, that I didn't see during other discussions of the possible structural similarities. "match 3 / 0:" would clearly need to be wrapped in a try:
try: match x / y: case 43: print("wow, it is 43") case 22: print("22 seemed less likely than 43 for some reason") case _: print("You get what you get") except ZeroDivisionError as exc: print(f"But sometimes you get an exception {exc}")
or:
try: raise x / y case 43: print("wow, it is 43") case 22: print("22 seemed less likely than 43 for some reason") case exc := ZeroDivisionError: print(f"But sometimes you get an exception: {exc}") case _: print("You get what you get") _______________________________________________ Python-Dev mailing list -- python-dev@python.org <mailto:python-dev@python.org> To unsubscribe send an email to python-dev-leave@python.org <mailto:python-dev-leave@python.org> https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GDP2KKB3... Code of Conduct: http://python.org/psf/codeofconduct/
--
Gustavo J. A. M. Carneiro
Gambit Research
"The universe is always one step beyond logic." -- Frank Herbert
Glenn Linderman wrote:
On 7/10/2020 3:15 AM, Gustavo Carneiro wrote: ...
Therefore, I posit that the style of try...except indentation only works where the number of cases is small. But for the case of pattern matching, I expect the number of cases to be matched to be a lot higher than exception handling cases. Having cases to be matched be indented is, IMHO, a nice visual cue to help the reader understand where the pattern matching block ends.
Actually, the current if/elif/elif/elif/else chain, used now because Python has no switch/match/case, has exactly the same issue as you describe as a problem with try if there were more cases... and if often has more cases, just like match will.
True
So your concern seems nebulous. You may have wished for extra indentation.... but it is simple to get more indentation: use 8 spaces instead of 4. So if you really wanted it, you could have had it.
Not so true.

```
if a: ...
elif b: ...
elif c: ...
else:
```

is not valid. In practice, I interpret wanting to indent at a place that doesn't require it as a code smell suggesting I should try to break out a helper function, but ... it does happen.

-jJ
On 7/11/2020 10:36 AM, Jim J. Jewett wrote:
Glenn Linderman wrote:
On 7/10/2020 3:15 AM, Gustavo Carneiro wrote: ...
Therefore, I posit that the style of try...except indentation only works where the number of cases is small. But for the case of pattern matching, I expect the number of cases to be matched to be a lot higher than exception handling cases. Having cases to be matched be indented is, IMHO, a nice visual cue to help the reader understand where the pattern matching block ends. Actually, the current if elseif elseif elseif else, used now because Python has no switch/match/case, has exactly the same issue as you describe as a problem with try if there were more cases... and if often has more cases, just like match will. True
So your concern seems nebulous. You may have wished for extra indentation.... but it is simple to get more indentation: use 8 spaces instead of 4. So if you really wanted it, you could have had it. Not so true.
It is true, the way I meant it, but not the way you (mis-)interpreted it.
```
if a: ...
elif b: ...
elif c: ...
else:
```
is not valid. In practice, I interpret wanting to indent at a place that doesn't require it as a code smell suggesting I should try to break out a helper function, but ... it does happen.
```
if a: ...
elif b: ...
elif c: ...
else: ...
```

is perfectly valid.
I don't think anyone has asked for more indentation in that sense; it has only even come up in a suggestion of less. (2-space indents for case instead of 4) People do disagree about whether or not case statements (and possibly an else statement) should be indented more than 0 spaces compared to match.
One small question about this part of the PEP:
For the most commonly-matched built-in types (bool, bytearray, bytes, dict, float, frozenset, int, list, set, str, and tuple), a single positional sub-pattern is allowed to be passed to the call. Rather than being matched against any particular attribute on the subject, it is instead matched against the subject itself.
Correct me if I'm wrong, but I don't think the PEP currently gives us a way of enabling this behavior for classes not on this list. If so, would it be worth adding a way? It would help remove a special case and could come in handy when doing things like creating my own custom data structures, for example. After all, if `case dict(x)` makes x match the entire dict, it would be nice if I could make `case MyCustomMapping(x)` behave in the same way to keep the usage consistent.

We could maybe let classes opt-in to this behavior if they define `__match_args__ = None`? Not sure if adding the extra "is None" check when doing the match will introduce too much overhead though.

-- Michael

On Wed, Jul 8, 2020 at 8:06 AM Guido van Rossum <guido@python.org> wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching. As authors we welcome Daniel F Moisset in our midst. Daniel wrote a lot of the new text in this version, which introduces the subject matter much more gently than the first version did. He also convinced us to drop the `__match__` protocol for now: the proposal stands quite well without that kind of extensibility, and postponing it will allow us to design it at a later time when we have more experience with how `match` is being used.
That said, the new version does not differ dramatically in what we propose. Apart from dropping `__match__` we’re dropping the leading dot to mark named constants, without a replacement, and everything else looks like we’re digging in our heels. Why is that? Given the firestorm of feedback we received and the numerous proposals (still coming) for alternative syntax, it seems a bad tactic not to give up something more substantial in order to get this proposal passed. Let me explain.
Language design is not like politics. It’s not like mathematics either, but I don’t think this situation is at all similar to negotiating a higher minimum wage in exchange for a lower pension, where you can definitely argue about exactly how much lower/higher you’re willing to go. So I don’t think it’s right to propose making the feature a little bit uglier just to get it accepted.
Frankly, 90% of the issue is about what among the authors we’ve dubbed the “load/store” problem (although Tobias never tires to explain that the “load” part is really “load-and-compare”). There’s a considerable section devoted to this topic in the PEP, but I’d like to give it another try here.
In case you’ve been avoiding python-dev lately, the problem is this. Pattern matching lets you capture values from the subject, similar to sequence unpacking, so that you can write for example

```
x = range(4)
match x:
    case (a, b, *rest):
        print(f"first={a}, second={b}, rest={rest}")  # 0, 1, [2, 3]
```

Here the `case` line captures the contents of the subject `x` in three variables named `a`, `b` and `rest`. This is easy to understand by pretending that a pattern (i.e., what follows `case`) is like the LHS of an assignment.
However, in order to make pattern matching more useful and versatile, the pattern matching syntax also allows using literals instead of capture variables. This is really handy when you want to distinguish different cases based on some value, for example

```
match t:
    case ("rect", real, imag):
        return complex(real, imag)
    case ("polar", r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

You might not even notice anything funny here if I didn’t point out that `"rect"` and `"polar"` are literals -- it’s really quite natural for patterns to support this once you think about it.
The problem that everybody’s been concerned about is that Python programmers, like C programmers before them, aren’t too keen to have literals like this all over their code, and would rather give names to the literals, for example

```
USE_POLAR = "polar"
USE_RECT = "rect"
```

Now we would like to be able to replace those literals with the corresponding names throughout our code and have everything work like before:

```
match t:
    case (USE_RECT, real, imag):
        return complex(real, imag)
    case (USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

Alas, the compiler doesn’t know that we want `USE_RECT` to be a constant value to be matched while we intend `real` and `imag` to be variables to be given the corresponding values captured from the subject. So various clever ways have been proposed to distinguish the two cases.
This discussion is not new to the authors: before we ever published the first version of the PEP we vigorously debated this (it is Issue 1 in our tracker!), and other languages before us have also had to come to grips with it. Even many statically compiled languages! The reason is that for reasons of usability it’s usually deemed important that their equivalent of `case` auto-declare the captured variables, and variable declarations may hide (override) like-named variables in outer scopes.
Scala, for example, uses several different rules: first, capture variable names must start with a lowercase letter (so it would handle the above example as intended); next, capture variables cannot be dotted names (like `mod.var`); finally, you can enclose any variable in backticks to force the compiler to see it as a load instead of a store. Elixir uses another form of markup for loads: `x` is a capture variable, but `^x` loads and compares the value of `x`.
There are a number of dead ends when looking for a solution that works for Python. Checking at runtime whether a name is defined or not is one of these: there are numerous reasons why this could be confusing, not the least of which being that the `match` may be executed in a loop and the variable may already be bound by a previous iteration. (True, this has to do with the scope we’ve adopted for capture variables. But believe me, giving each case clause its own scope is quite horrible by itself, and there are other action-at-a-distance effects that are equally bad.)
It’s been proposed to explicitly state the names of the variables bound in a header of the `match` statement; but this doesn’t scale when the number of cases becomes larger, and requires users to do bookkeeping the compiler should be able to do. We’re really looking for a solution that tells you when you’re looking at an individual `case` which variables are captured and which are used for load-and-compare.
Marking up the capture variables with some sigil (e.g. `$x` or `x?`) or other markup (e.g. backticks or `<x>`) makes this common case ugly and inconsistent: it’s unpleasant to see for example

```
case %x, %y:
    print(x, y)
```

No other language we’ve surveyed uses special markup for capture variables; some use special markup for load-and-compare, so we’ve explored this. In fact, in version 1 of the PEP our long-debated solution was to use a leading dot. This was however booed off the field, so for version 2 we reconsidered. In the end nothing struck our fancy (if `.x` is unacceptable, it’s unclear why `^x` would be any better), and we chose a simpler rule: named constants are only recognized when referenced via some namespace, such as `mod.var` or `Color.RED`.
We believe it’s acceptable that things looking like `mod.var` are never considered capture variables -- the common use cases for `match` are such that one would almost never want to capture into a different namespace. (Just like you very rarely see `for self.i in …` and never `except E as scope.var` -- the latter is illegal syntax and sets a precedent.)
One author would dearly have seen Scala’s uppercase rule adopted, but in the end was convinced by the others that this was a bad idea, both because there’s no precedent in Python’s syntax, and because many human languages simply don’t make the distinction between lowercase and uppercase in their writing systems.
So what should you do if you have a local variable (say, a function argument) that you want to use as a value in a pattern? One solution is to capture the value in another variable and use a guard to compare that variable to the argument:

```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            # Match
```

If this really is a deal-breaker after all other issues have been settled, we could go back to considering some special markup for load-and-compare of simple names (even though we expect this case to be very rare). But there’s no pressing need to decide to do this now -- we can always add new markup for this purpose in a future version, as long as we continue to support dotted names without markup, since that *is* a commonly needed case.
There’s one other issue where in the end we could be convinced to compromise: whether to add an `else` clause in addition to `case _`. In fact, we probably would already have added it, except for one detail: it’s unclear whether the `else` should be aligned with `case` or `match`. If we are to add this we would have to ask the Steering Council to decide for us, as the authors deadlocked on this question.
Regarding the syntax for wildcards and OR patterns, the PEP explains why `_` and `|` are the best choices here: no other language surveyed uses anything but `_` for wildcards, and the vast majority uses `|` for OR patterns. A similar argument applies to class patterns.
If you've made it so far, here are the links to check out, with an open mind. As a reminder, the introductory sections (Abstract, Overview, and Rationale and Goals) have been entirely rewritten and also serve as introduction and tutorial.
- PEP 622: https://www.python.org/dev/peps/pep-0622/ - Playground: https://mybinder.org/v2/gh/gvanrossum/patma/master?urlpath=lab/tree/playgrou...
--
--Guido van Rossum (python.org/~guido)
On Fri, Jul 10, 2020 at 9:54 AM Michael Lee <michael.lee.0x2a@gmail.com> wrote:
Hi Michael,

There is a way to do this. A class could do this:

```
class C:
    __match_args__ = ["__self__"]

    @property
    def __self__(self):
        return self
```

(You can use any name for `__self__`.) I realize this isn't particularly pretty, but we feel it's better not to add a custom `__match__` protocol at this time: the design space for that is itself quite large, and we realized that almost all "easy" applications could be had without it, while the "complicated" applications were all trying to get the `__match__` protocol to do different things.

Also, beware that if your class does this, it is stuck with this form -- if you replace `["__self__"]` with some other set of arguments, user code that is matching against your class will presumably break.

--
--Guido van Rossum (python.org/~guido)
On 7/8/20 8:02 AM, Guido van Rossum wrote:
Regarding the syntax for wildcards and OR patterns, the PEP explains why `_` and `|` are the best choices here: no other language surveyed uses anything but `_` for wildcards, and the vast majority uses `|` for OR patterns. A similar argument applies to class patterns.
In that case, I'd like to make a specific pitch for "don't make '_' special". (I'm going to spell it '_' as it seems to be easier to read this way; ignore the quotes.)

IIUC '_' is special in two ways:

1) we permit it to be used more than once in a single pattern, and
2) if it matches, it isn't bound.

If we forego these two exceptions, '_' can go back to behaving like any other identifier. It becomes an idiom rather than a special case.

Drilling down on what we'd need to change:

To address 1), allow using a name multiple times in a single pattern. 622 v2 already says:

    For the moment, we decided to make repeated use of names within the same pattern an error; we can always relax this restriction later without affecting backwards compatibility.

If we relax it now, then we don't need '_' to be special in this way. All in all this part seems surprisingly uncontentious.

To address 2), bind '_' when it's used as a name in a pattern. This adds an extra reference and an extra store. That by itself seems harmless. The existing implementation has optimizations here. If that's important, we could achieve the same result with a little dataflow analysis to optimize away the dead store. We could even special-case optimizing away dead stores *only* to '_' and *only* in match/case statements and all would be forgiven.

Folks point out that I18N code frequently uses a global function named '_'. The collision of these two uses is unfortunate, but I think it's survivable. I certainly don't think this collision means we should special-case this one identifier in this one context in the *language* specification. Consider:

* There's no installed base of I18N code using pattern matching, because it's a new (proposed!) syntax. Therefore, any I18N code that wants to use match/case statements will be new code, and so can be written with this (admittedly likely!) collision in mind. I18N code could address this in several ways, for example:
  o Mandate use of an alternate name for "don't care" match patterns in I18N code, perhaps '__' (two underscores). This approach seems best.
  o Use a different name for the '_' function in scopes where you're using match/case, e.g. 'gettext'.
  o Since most Python code lives inside functions, I18N code could use '_' in its match/case statements, then "del _" after the match statement. '_' would revert back to finding the global function. (This wouldn't work for code at module scope for obvious reasons. One *could* simply rebind '_', but I doubt people want to consider this approach in the first place.)
* As the PEP mentions, '_' is already a Python idiom for "I don't care about this value", e.g. "basename, _, extension = filename.partition('.')". I18N has already survived contact with this idiom.
* Similarly, '_' has a special meaning in the Python REPL. Admittedly, folks don't do a lot of I18N work in the REPL, so this isn't a problem in practice. I'm just re-making the previous point: I18N programmers already cope with other idiomatic uses of '_'.
* Static code analyzers could detect if users run afoul of this collision. "Warning: match/case using _ in module using _ for gettext" etc.

One consideration: if you *do* use '_' multiple times in a single pattern, and you *do* refer to its value afterwards, what value should it get? Consider that Python already permits multiple assignments in a single expression:

```
(x:="first", x:="middle", x:="last")
```

After this expression is evaluated, x has been bound to the value "last". I could live with "it keeps the rightmost". I could also live with "the result is implementation-defined". I suspect it doesn't matter much, because the point of the idiom is that people don't care about the value.

In keeping with this change, I additionally propose removing '*_' as a special token. '*_' would behave like any other '*identifier', binding the value to the unpacked sequence. Alternately, we could keep the special token but change it to '*' so it mirrors Python function declaration syntax. I don't have a strong opinion about this second alternative.

Cheers,
/arry
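[The rebinding behavior described here can be checked directly; CPython evaluates the tuple elements left to right, so the name ends up bound to the rightmost value:]

```python
# Multiple walrus assignments in a single expression: each := rebinds x,
# and the tuple records every intermediate value.
t = (x := "first", x := "middle", x := "last")
# t == ("first", "middle", "last"); x == "last"
```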
On 12/07/2020 11:38, Larry Hastings wrote:
+1 to everything
One consideration: if you /do/ use '_' multiple times in a single pattern, and you /do/ refer to its value afterwards, what value should it get? Consider that Python already permits multiple assignments in a single expression:
(x:="first", x:="middle", x:="last")
After this expression is evaluated, x has been bound to the value "last". I could live with "it keeps the rightmost". I could also live with "the result is implementation-defined". I suspect it doesn't matter much, because the point of the idiom is that people don't care about the value.
I'd expect it to bind to the last one. If that's in any way problematic, in order to prevent oblivious misuse, referencing an identifier that was bound more than once should raise an exception. But as you say, it doesn't really matter.
On Sun, 12 Jul 2020 at 10:47, Larry Hastings <larry@hastings.org> wrote:
In that case, I'd like to make a specific pitch for "don't make '_' special". (I'm going to spell it '_' as it seems to be easier to read this way; ignore the quotes.)
Overall, this sounds mostly reasonable. I'm cutting nearly everything here, because I don't have anything to add.
One consideration: if you do use '_' multiple times in a single pattern, and you do refer to its value afterwards, what value should it get? Consider that Python already permits multiple assignments in a single expression:
(x:="first", x:="middle", x:="last")
After this expression is evaluated, x has been bound to the value "last". I could live with "it keeps the rightmost". I could also live with "the result is implementation-defined". I suspect it doesn't matter much, because the point of the idiom is that people don't care about the value.
The problem for me is specifically with variables *other* than `_` - precisely because `_` has connotations of "don't care". If I see

```
match expr:
    case Point(x, x):  # what is x here?
```

I would very strongly expect that to mean that the two components of Point were equal, and x was set to the common value. That's not what Python does, so this would be a fairly easy mistake to make.

For what it's worth, it looks like Rust uses the same rule as the PEP - multiple occurrences of the same variable are not allowed, but _ is a wildcard that *can* be used multiple times, but isn't bound.

Paul
Hey Larry, just to clarify on a single point you make: On Sun, 12 Jul 2020 at 10:48, Larry Hastings <larry@hastings.org> wrote:
[ snip ] To address 2), bind '_' when it's used as a name in a pattern.
This adds an extra reference and an extra store. That by itself seems harmless.
This is not always just a store. for patterns like `[a, *_, b]` vs `[a, *ignore_me, b]`, the current semantics mean that the matching process has to make 2 calls to `__getitem__` on the match subject. The second case (which would be equivalent to "remove special meaning on _") will have to actually create a new list and copy most of the original one which can be arbitrarily long, so this turns an O(1) operation into O(n).
The existing implementation has optimizations here. If that's important, we could achieve the same result with a little dataflow analysis to optimize away the dead store. We could even special-case optimizing away dead stores *only* to '_' and *only* in match/case statements and all would be forgiven.
This might work, although it's quite different from what Python does in general (are you supposed to see the value of `_` in a debugger? or in `locals()`?)

Cheers,
D.
On 7/12/20 9:04 AM, Daniel Moisset wrote:
The existing implementation has optimizations here. If that's important, we could achieve the same result with a little dataflow analysis to optimize away the dead store. We could even special-case optimizing away dead stores /only/ to '_' and /only/ in match/case statements and all would be forgiven.
This might work, although it's quite different to what python does in general (are you supposed to see the value of `_` in a debugger? or in `locals()`? )
All excellent points.

The debugger question is easier to answer. Debuggers for compiled code have dealt with this for years; I'm unsure of the exact wording but gdb prints something like "<value optimized out>".

As for locals(), my first thought was "suppress the optimization in the presence of a locals() call". I dimly recall a precedent where the presence of locals() in a function body affected code generation, though sadly it escapes me at the moment**. Anyway, that seems like a nasty hack, and it only handles one method of extracting a locals dict--there's several more, including sys._getframe and inspect.getframeinfo. And then the user could rebind those and we wouldn't notice. This seems like a non-starter.

Having thought about it some, I propose it'd be acceptable to do dead store optimization if-and-only-if optimizations are explicitly enabled, e.g. with "-O". Allowing explicitly-enabled optimizations to observably affect runtime behavior does have some precedent, e.g. "-OO" which breaks doctest, docopt, etc. It'd be a shame if the existence of locals() et al meant Python could never ever perform dead store optimization.

Your other (elided) point is correct too, about sequence matching for a sequence we don't care about not being as cheap and simple as a store and an extra reference.

Cheers,
/arry

** Or maybe I'm confused and thinking of something else entirely. Maybe it was "import * inside a function body disables fast locals in Python 2"? But that doesn't seem to be true either.
On Sun, Jul 12, 2020 at 12:12 PM Larry Hastings <larry@hastings.org> wrote:
Having thought about it some, I propose it'd be acceptable to do dead store optimization if-and-only-if optimizations are explicitly enabled, e.g. with "-O". Allowing explicitly-enabled optimizations to observably affect runtime behavior does have some precedent, e.g. "-OO" which breaks doctest, docopt, etc. It'd be a shame if the existence of locals() et al meant Python could never ever perform dead store optimization.
Assuming you're still talking about how to implement wildcards, it really sounds like you're willing to add a lot of complexity just to have a "consistent" treatment of `_`. But why would you care so much about that consistency?

When I write `for x, _, _ in pts` the main point is not that I can write `print(_)` and get the z coordinate. The main point is that I am not interested in the y or the z coordinates (and showing this to the reader up front). The value assigned to `_` is uninteresting (even in a debug session, unless you're debugging Python itself). Using the same character in patterns makes intuitive sense to anyone who is familiar with this convention in Python. Furthermore it also makes sense to anyone who is familiar with patterns in other languages: *all* the languages with structural pattern matching that we surveyed use `_` -- C#, Elixir, Erlang, Scala, Rust, F#, Haskell, Mathematica, OCaml, Ruby, and Swift. (That's a much stronger precedent than the use of `?` in shell and regular expressions IMO. :-)

The need for a wildcard pattern has already been explained -- we really want to disallow `Point(x, y, y)` but we really need to allow `Point(z, _, _)`. Generating code to assign the value to `_` seems odd given the clear intent to *ignore* the value.

Using `?` as the wildcard has mostly disadvantages: it requires changes to the tokenizer, it could conflict with other future uses of `?` (it's been proposed for type annotations as a shorter version of Optional, and there's PEP 505, which I think isn't quite dead yet), and Python users have no pre-existing intuition for its meaning.

A note about i18n: it would be unfortunate if we had to teach users they couldn't use `_` as a wildcard in patterns in code that also uses `_` as part of the i18n stack (`from gettext import gettext as _` -- see gettext stdlib docs). This is a known limitation on the `for x, _, _ in ...` idiom, which I've seen people work around by writing things like `for x, __, __ in ...`.
But for patterns (because the pattern code generation needs to know about wildcards) we can't easily use that workaround. However, the solution of never assigning to `_` (by definition, rather than through dead store optimization) solves this case as well. So can we please lay this one to rest?

--
--Guido van Rossum (python.org/~guido)
On 2020-07-12 23:20, Guido van Rossum wrote: [snip]
The need for a wildcard pattern has already been explained -- we really want to disallow `Point(x, y, y)` but we really need to allow `Point(z, _, _)`. Generating code to assign the value to `_` seems odd given the clear intent to *ignore* the value.
Using `?` as the wildcard has mostly disadvantages: it requires changes to the tokenizer, it could conflict with other future uses of `?` (it's been proposed for type annotations as a shorter version of Optional, and there's PEP 505, which I think isn't quite dead yet), and Python users have no pre-existing intuition for its meaning.
FWIW, I don't think this use of '?' would conflict with the other suggested uses because this use would be initial in an expression, whereas the other uses would be non-initial. [snip]
Guido van Rossum writes:
[several reasons why not binding _ is a no-op<wink/>, and]
A note about i18n: [...]. So can we please lay this one to rest?
Yes, please! I was just about to ask about that. I could not believe that in July 2020 people were ignoring I18N, especially the fact that I18N workers are mostly language specialists, not programmers, and are quite dependent on programmers and UI/UX workers following "the usual conventions". Yes, mostly the message catalogs are produced by software that can be taught to understand other conventions. But often translators *do* need to read source to get context for the (usually) English "key" into the message catalogs. Steve
On 13/07/2020 00:20, Guido van Rossum wrote:
The need for a wildcard pattern has already been explained -- we really want to disallow `Point(x, y, y)` but we really need to allow `Point(z, _, _)`. Generating code to assign the value to `_` seems odd given the clear intent to *ignore* the value.
Would it be impossible for the parser to interpret Point(x, y, y) as "the second and third arguments of Point must have the same value in order to match. Bind that value to y"? Since the value has to be the same, it doesn't matter whether y binds to the first or the second (or the nth) instance of it. That said, in time and with all the arguments brought to the table, I personally came to accept special-casing _, although I don't especially like it. From my point of view, the biggest issue to solve is the load vs. store decision and its syntax.
On Mon, 13 Jul 2020 at 09:30, Federico Salerno <salernof11@gmail.com> wrote:
On 13/07/2020 00:20, Guido van Rossum wrote:
The need for a wildcard pattern has already been explained -- we really want to disallow `Point(x, y, y)` but we really need to allow `Point(z, _, _)`. Generating code to assign the value to `_` seems odd given the clear intent to *ignore* the value.
Would it be impossible for the parser to interpret Point(x, y, y) as "the second and third arguments of Point must have the same value in order to match. Bind that value to y"? Since the value has to be the same, it doesn't matter whether y binds to the first or the second (or the nth) instance of it.
No, that would not be impossible but fraught with problems. This is discussed in the PEP: https://www.python.org/dev/peps/pep-0622/#algebraic-matching-of-repeated-nam...
On 13/07/2020 12:28, Henk-Jaap Wagenaar wrote:
No, that would not be impossible but fraught with problems. This is discussed in the PEP: https://www.python.org/dev/peps/pep-0622/#algebraic-matching-of-repeated-nam...
The PEP goes into no detail of what these problems (or "number of subtleties") are, but it does mention how
we decided to make repeated use of names within the same pattern an error; we can always relax this restriction later without affecting backwards compatibility

which does hint at the fact that the problems are not insurmountable hurdles.
All in all not an urgent feature, but it would be nice to have from the get-go if there is agreement and it is doable.

@PEP authors: Incidentally, I am eager to start contributing and see this PEP advance—if there's anything I can do, including possibly non-code "dirty work", please let me know. I was thinking of collecting all the objections and points of contention of the PEP and the current progress on each in one place so that the pros and cons can be evaluated more clearly. Would that be useful?
On Mon, Jul 13, 2020 at 10:07 Federico Salerno <salernof11@gmail.com> wrote:
On 13/07/2020 12:28, Henk-Jaap Wagenaar wrote:
No, that would not be impossible but fraught with problems. This is discussed in the PEP:
https://www.python.org/dev/peps/pep-0622/#algebraic-matching-of-repeated-nam...
The PEP goes into no detail of what these problems (or "number of subtleties") are, but it does mention how
we decided to make repeated use of names within the same pattern an error; we can always relax this restriction later without affecting backwards compatibility
which does hint at the fact that the problems are not insurmountable hurdles.
All in all not an urgent feature, but it would be nice to have from the get-go if there is agreement and it is doable.
I find it debatable that we should have this at all, since there are other interpretations possible, and honestly I doubt that it’s a common use case. That’s why we’re holding off.
@PEP authors: Incidentally, I am eager to start contributing and see this PEP advance—if there's anything I can do, including possibly non-code "dirty work" please let me know. I was thinking of collecting all the objections and points of contention of the PEP and the current progress on each in one place so that the pros and cons can be evaluated more clearly. Would that be useful?
Thanks for offering, but we’ve got that under control. —Guido
-- --Guido (mobile)
On 13/07/2020 19:17, Guido van Rossum wrote:
I find it debatable that we should have this at all, since there are other interpretations possible, and honestly I doubt that it’s a common use case. That’s why we’re holding off.
Fair enough. All it would do is save code in a guard condition, after all.
Thanks for offering, but we’ve got that under control.

I wasn't trying to imply otherwise. :)
On 12/07/2020 23:20, Guido van Rossum wrote:
So can we please lay this one to rest?
Sure. One small thing before we leave it; I've decided I don't care about the special cases of not using _. to lead class names, but forbidding **_ in mapping patterns seems unnecessary. I know it's redundant, but I can imagine using it for emphasis. I can't think of anywhere else the language forbids something just because it isn't needed, though I didn't get a lot of sleep last night and I could well be missing something obvious :-)

Can I use pattern matching to pull byte strings apart? I thought I could, but trying it out in the playground didn't work at all. :-(

-- Rhodri James *-* Kynesim Ltd
On Mon, Jul 13, 2020 at 04:35 Rhodri James <rhodri@kynesim.co.uk> wrote:
On 12/07/2020 23:20, Guido van Rossum wrote:
So can we please lay this one to rest?
Sure. One small thing before we leave it; I've decided I don't care about the special cases of not using _. to lead class names, but forbidding **_ in mapping patterns seems unnecessary. I know it's redundant, but I can imagine using it for emphasis. I can't think of anywhere else the language forbids something just because it isn't needed, though I didn't get a lot of sleep last night and I could well be missing something obvious :-)
I’d rather not. And the argument about disallowing obviously redundant syntax seems weak. My worry about allowing this is that it’ll be cargo culted and we’ll see it used not for emphasis (of what? The obvious?) but because people think it’s needed. And that’s just clutter.

Can I use pattern matching to pull byte strings apart? I thought I could, but trying it out in the playground didn't work at all. :-(
It’s explicitly forbidden by the PEP, because we don’t want str or bytes to accidentally match sequence patterns. You could do ‘match list(b):’ if you really wanted to, but I think there are better tools for parsing bytes or strings. —Guido -- --Guido (mobile)
On 13/07/2020 15:33, Guido van Rossum wrote:
On Mon, Jul 13, 2020 at 04:35 Rhodri James <rhodri@kynesim.co.uk> wrote:
[Re: forbidding **_ in mapping patterns]
I’d rather not. And the argument about disallowing obviously redundant syntax seems weak. My worry about allowing this is that it’ll be cargo culted and we’ll see it used not for emphasis (of what? The obvious?) but because people think it’s needed. And that’s just clutter.
Fair enough. I'd likely use it to remind myself of cases when there will always be more keys in the mapping, but a comment will do just as well.
Can I use pattern matching to pull byte strings apart? I thought I could, but trying it out in the playground didn't work at all. :-(
It’s explicitly forbidden by the PEP, because we don’t want str or bytes to accidentally match sequence patterns. You could do ‘match list(b):’ if you really wanted to, but I think there are better tools for parsing bytes or strings.
:-( As an embedded engineer, pulling apart network protocols was the first use I thought of for matching. Ah well. -- Rhodri James *-* Kynesim Ltd
On 7/12/20 3:20 PM, Guido van Rossum wrote:
On Sun, Jul 12, 2020 at 12:12 PM Larry Hastings <larry@hastings.org <mailto:larry@hastings.org>> wrote:
Having thought about it some, I propose it'd be acceptable to do dead store optimization if-and-only-if optimizations are explicitly enabled, e.g. with "-O". Allowing explicitly-enabled optimizations to observably affect runtime behavior does have some precedent, e.g. "-OO" which breaks doctest, docopt, etc. It'd be a shame if the existence of locals() et al meant Python could never ever perform dead store optimization.
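For context on what "dead store optimization" would change: CPython currently performs no dead store elimination, which is exactly why `locals()` can observe such stores today. A minimal illustration (the variable name is made up):

```python
def f():
    unused = 1   # a dead store: the value is never read afterwards
    return locals()

print(f())  # {'unused': 1} -- the store survives, so locals() sees it
```

Under Larry's proposal, `-O` could legitimately drop the store, and `f()` would then return `{}`.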
Assuming you're still talking about how to implement wildcards, it really sounds like you're willing to add a lot of complexity just to have a "consistent" treatment of `_`. But why would you care so much about that consistency?
I'm a fan of the Zen guidance here: "special cases aren't special enough to break the rules". More on this topic in a moment--rather than reorder paragraphs, let me return to this below.
Using the same character in patterns makes intuitive sense to anyone who is familiar with this convention in Python. Furthermore, it also makes sense to anyone who is familiar with patterns in other languages: *all* languages with structural pattern matching that we found use `_` -- C#, Elixir, Erlang, Scala, Rust, F#, Haskell, Mathematica, OCaml, Ruby, and Swift. (That's a much stronger precedent than the use of `?` in shell and regular expressions IMO. :-)
Python hasn't been afraid to go its own way syntactically in the past. Consider the conditional (ternary) operator. Most languages I've encountered with a conditional operator just copy C's syntax, with '?' and ':' (PHP, C#, Java). Some languages don't need a conditional operator, as their existing flow control already works just fine (FORTH, Rust). Python's syntax for the conditional operator was neither, and was AFAIK unique--but this syntax was judged the most Pythonic, so it won. Similarly, AFAIK Python's "None" is unique. Most other languages I've seen use the word "null", albeit with varying capitalization.

So I'm unconcerned about Python using a different token for the wildcard pattern. Python already doesn't look like other languages, and Python's proposed syntax for pattern matching isn't exactly like the pattern matching syntax of other languages. I don't understand why it's so important that it look like other languages in this one specific respect.

As for leveraging the convention of using '_' for values you don't care about in Python--that's actually why I /don't/ like it as the wildcard pattern. To date, everyone who uses '_' understands it's just an identifier, no different from any other identifier. I imagine I18N programmers avoid this convention for exactly that reason--there's nothing special about '_', so they need to take care not to overwrite or occlude it with a don't-care value.

However, if I understand PEP 622 correctly, the places you use '_' as the wildcard pattern are also places where you could put an identifier. But in this one context, '_' doesn't behave like the other identifiers, even though in every other context in Python it still does. This is the "special case" that "breaks the rules" I alluded to above. Consistency with the longstanding semantics of '_', and consistency with other identifiers, is much more important to me than consistency with other languages for the pattern matching wildcard token.
Using `?` as the wildcard has mostly disadvantages: it requires changes to the tokenizer, it could conflict with other future uses of `?` (it's been proposed for type annotations as a shorter version of Optional, and there's PEP 505, which I think isn't quite dead yet), and Python users have no pre-existing intuition for its meaning.
One reason I prefer '?' for the wildcard pattern is precisely /because/ users have no pre-existing intuition as to its meaning. Unlike '_', the user would have no preconceived notion about its semantics to unlearn. Also, it doesn't behave like an identifier, and accordingly it doesn't look like an identifier. This strikes me as harmonious.

Is changing the tokenizer to support '?' as a token a big deal? You mention two other existing proposals to use it as a token--surely this is a bridge we'll have to cross sooner or later.

My goal in starting this discussion was to see if we could find a compromise everyone could live with. People who want to use '_' for the wildcard pattern could do so, and people who didn't like '_' having a special meaning in this one context would be appeased. The message I'm getting is "this compromise won't work". Okay, fair enough. I don't plan to pursue it any further.

Cheers, //arry/
Larry Hastings wrote:
As for leveraging the convention of using '_' for values you don't care about in Python--that's actually why I /don't/ like it as the wildcard pattern. To date, everyone who uses '_' understands it's just an identifier, no different from any other identifier.
Not quite... I understand it more like a file in /tmp: I don't use it for anything I will want later, just in case.
However, if I understand PEP 622 correctly, the places you use '_' as the wildcard pattern are also places where you could put an identifier. But in this one context, '_' doesn't behave like the other identifiers, even though in every other context in Python it still does. This is the "special case" that "breaks the rules" I alluded to above. Consistency with the longstanding semantics of '_', and consistency with other identifiers, is much more important to me than consistency with other languages for the pattern matching wildcard token.
If a normal variable name is re-used, I would expect it to have the same meaning. I know that "case x, x:" as shorthand for "case x, __x if x == __x:" has been postponed, but it could still happen later, and it would be a problem if that ever became legal without requiring the two bindings to match. I do NOT assume that they will match if the variable happens to be _, though I suppose others might. -jJ
On 07/14/2020 09:22 AM, Jim J. Jewett wrote:
Larry Hastings wrote:
As for leveraging the convention of using '_' for values you don't care about in Python--that's actually why I /don't/ like it as the wildcard pattern. To date, everyone who uses '_' understands it's just an identifier, no different from any other identifier.
However, if I understand PEP 622 correctly, the places you use '_' as the wildcard pattern are also places where you could put an identifier. But in this one context, '_' doesn't behave like the other identifiers, even though in every other context in Python it still does. This is the "special case" that "breaks the rules" I alluded to above. Consistency with the longstanding semantics of '_', and consistency with other identifiers, is much more important to me than consistency with other languages for the pattern matching wildcard token.
Looking at other languages for inspiration is great, but like Larry I think we should make sure our constructs fit with Python, not with them.
I know that "case x, x:" as shorthand for "case x, __x if x == __x:" has been postponed, but it could still happen later, and it would be a problem if that ever became legal without requiring the two bindings to match. I do NOT assume that they will match if the variable happens to be _, though I suppose others might.
If we use `?` instead of `_`, then repeated `?` won't be a problem, and repeated `_` should be disallowed. Since `_` is a normal variable name, the requirement for their values to match (when that is finally implemented) would make sense, and shouldn't be a burden to remember given that the "don't care" symbol is a `?`. -- ~Ethan~
On 13/07/20 7:12 am, Larry Hastings wrote:
I dimly recall a precedent where the presence of locals() in a function body affected code generation,
The presence of exec used to do that, which is why it was a statement rather than a function. But I don't think locals() ever did -- how would the compiler know that it was calling the builtin locals function and not something else? -- Greg
On Sun, Jul 12, 2020 at 7:36 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
On 13/07/20 7:12 am, Larry Hastings wrote:
I dimly recall a precedent where the presence of locals() in a function body affected code generation,
The presence of exec used to do that, which is why it was a statement rather than a function. But I don't think locals() ever did -- how would the compiler know that it was calling the builtin locals function and not something else?
super() does something similar:
>>> class A:
...     def super_locals(self):
...         super
...         return locals()
...     def superless_locals(self):
...         return locals()
...
>>> A().super_locals()
{'self': <__main__.A object at 0x000001FF53BCE6D8>, '__class__': <class '__main__.A'>}
>>> A().superless_locals()
{'self': <__main__.A object at 0x000001FF53BCE7B8>}
The compiler changes what local variables exist if there is a read from a variable named 'super', in order to support zero-argument super() calls. It presumably could do the same sort of thing for locals(). I don't think this is a good idea, since locals() is a debugging tool, and changing reality based on the presence of debugging calls may make life more difficult for the user. -- Devin
On 13/07/20 3:28 pm, Devin Jeanpierre wrote:
The compiler changes what local variables exist if there is a read from a variable named 'super',
That's fairly harmless if there's a false positive. But accidentally getting all your locals de-optimised would be annoying. -- Greg
Hello everyone,

I'm sorry if my proposition has already been made, or even withdrawn, but I think that capture variables shouldn't be as implicit as they are now. I didn't see any mention of capture variable patterns in the rejected ideas. So here is my idea:

I've looked at the PEP very quickly, jumping to the examples to get a taste and an idea of what was going on here. I saw a new kind of control structure based on structural pattern matching (patterns based on classes or compositions of classes, to make it short). A very good idea, emphasized by Tobias Kohn ("Another take on PEP 622"), is that pattern matching eases the writing of code based on matching such structures, and that capturing values stored inside these structures at match time really eases the writing of the code associated with each match.

But... looking at the examples, it wasn't very obvious that some variables were capturing variables and some others were matching ones. I then read in detail some rules about how to tell what is a captured variable. But I'm not sure everybody will do this...

The Zen of Python tells us that "explicit is better than implicit". I know this is not a rule written in stone, but I think it applies very well here. Guido said:
We’re really looking for a solution that tells you when you’re looking at an individual case which variables are captured and which are used for load-and-compare.
Marking up the capture variables with some sigil (e.g. $x or x?) or other markup (e.g. backticks or <x>) makes this common case ugly and inconsistent: it’s unpleasant to see for example
case %x, %y: print(x, y)
Guido talks about a "sigil", which seems to be a meaningless mark only there to help the parser understand what the developer was writing. I propose that this "sigil" be the assignment mark, "=". Look:

z = 42
match pt:
    case x=, y=, z:
        print(x, y, "z == 42")

Or this one:

def make_point_3d(pt):
    match pt:
        case (x=, y=):
            return Point3d(x, y, 0)
        case (x=, y=, z=):
            return Point3d(x, y, z)
        case Point2d(x=, y=):
            return Point3d(x, y, 0)
        case Point3d(_, _, _):
            return pt
        case _:
            raise TypeError("not a point we support")

On the need to be explicit:

Simple case blocks will perhaps be a bit longer to write, but will not be harder to read, since they stay in the "simple case blocks" family. More complex cases will be harder to write, but the explicit markup will help readers understand what will be captured and where, and what will be looked-up-and-matched, using already known rules: looked-up-and-matched expressions will be computed as usual, then compared with the match term, and captured expressions will be computed to an l-value (which is much more restrictive than arbitrary expressions).

Moreover, explicitly introducing a difference between "capture" and "look-and-match" will help newcomers understand what the point of a match is without having to look at a PEP or other normative document. Remember that code has to be readable, because it will be read much more often than it is written. The reader has to understand quickly, but not in detail, what will happen. Being explicit relieves them of the task of concentrating on this point.

Also remember that Python has to be taught, and that everything implicit in the code has to be made explicit when teaching. And the longer you spend teaching microdetails like the difference between "capture" and "look-and-match", the less your audience will be inclined to listen to you.

On the drawback of being explicit:

Adding a mark to every captured variable can and will be annoying. More annoying than not adding it. It's obvious.
But we don't expect to have more than a handful of captured variables per case. If there are more, the case is perhaps too complex, not to say too complicated. Using a carefully chosen mark can alleviate this drawback; using an already accepted and commonly used symbol will surely help.

I know that Python will diverge from other languages on this point. Yes, but Python has already diverged from other languages, and the community sees that as being for the better. Example: the conditional expression, aka the "ternary operator", aka "x = blabla if plop else bla". And I'd be a bit embarrassed to have to explain that "captured variables" look like simple expressions "but are not" because that's how things are written in other functional languages. I'm not sure it will convince anybody who isn't already familiar with pattern matching in functional languages (which is a big topic by itself).

On the syntax:

Using the "=" character is well known, easy to find on a keyboard, and already carries the semantics of "putting a value in a variable". So this mark is not a random sigil, but the way we write assignments, with an l-value, an "=" character, and an r-value. The r-value is given by the match and is omitted here. And even if only a few people know what "l-value" means, everybody knows what is allowed on the left side of an "=".

Moreover, this kind of syntax already exists in the Python world, in default function parameter values:

def make_point(x=0, y=0):
    return x, y

Here the r-value is present. The l-value still has a defined semantics that is easy to learn and understand without having to read the Python semantics book or any PEPs. And it is still the same semantics of "putting a value in a variable". And see how there is "x=" in the definition, but just "x" in the body of the function. Like in the case block.
That's why I think it will add value to be explicit about captured variables, and that choosing a meaningful mark can clarify many implicit, and hard to understand, patterns. Emmanuel
On Fri, 17 Jul 2020 at 12:26, <emmanuel.coirier@caissedesdepots.fr> wrote:
Hello everyone,
I'm sorry if my proposition has already been made, or even withdrawn, but I think that capture variables shouldn't be as implicit as they are now. I didn't see any mention of capture variable patterns in the rejected ideas. So here is my idea:

I've looked at the PEP very quickly, jumping to the examples to get a taste and an idea of what was going on here. I saw a new kind of control structure based on structural pattern matching (patterns based on classes or compositions of classes, to make it short). A very good idea, emphasized by Tobias Kohn ("Another take on PEP 622"), is that pattern matching eases the writing of code based on matching such structures, and that capturing values stored inside these structures at match time really eases the writing of the code associated with each match.

But... looking at the examples, it wasn't very obvious that some variables were capturing variables and some others were matching ones. I then read in detail some rules about how to tell what is a captured variable. But I'm not sure everybody will do this...

The Zen of Python tells us that "explicit is better than implicit". I know this is not a rule written in stone, but I think it applies very well here.
Guido said :
We’re really looking for a solution that tells you when you’re looking at an individual case which variables are captured and which are used for load-and-compare.
Marking up the capture variables with some sigil (e.g. $x or x?) or other markup (e.g. backticks or <x>) makes this common case ugly and inconsistent: it’s unpleasant to see for example
case %x, %y: print(x, y)
Guido talks about a "sigil", which seems to be a meaningless mark only there to help the parser understand what the developer was writing.

I propose that this "sigil" be the assignment mark, "=". Look:

z = 42
match pt:
    case x=, y=, z:
        print(x, y, "z == 42")

Or this one:

def make_point_3d(pt):
    match pt:
        case (x=, y=):
            return Point3d(x, y, 0)
        case (x=, y=, z=):
            return Point3d(x, y, z)
        case Point2d(x=, y=):
            return Point3d(x, y, 0)
        case Point3d(_, _, _):
            return pt
        case _:
            raise TypeError("not a point we support")
I kind of agree it is nicer to be more explicit, but somehow x= looks ugly. It occurred to me (and, again, apologies if this has already been mentioned) that we might use the `as` keyword here. The example above would become:

def make_point_3d(pt):
    match pt:
        case (as x, as y):
            return Point3d(x, y, 0)
        case (as x, as y, as z):
            return Point3d(x, y, z)
        case Point2d(as x, as y):
            return Point3d(x, y, 0)
        case Point3d(_, _, _):
            return pt
        case _:
            raise TypeError("not a point we support")

If having "as x" as a standalone expression without anything to the left of "as" causes confusion, we could instead mandate the use of _ thus:

case (_ as x, _ as y):
    return Point3d(x, y, 0)

On the need to be explicit:
Simple case blocks will perhaps be a bit longer to write, but will not be harder to read, since they stay in the "simple case blocks" family.

More complex cases will be harder to write, but the explicit markup will help readers understand what will be captured and where, and what will be looked-up-and-matched, using already known rules: looked-up-and-matched expressions will be computed as usual, then compared with the match term, and captured expressions will be computed to an l-value (which is much more restrictive than arbitrary expressions).

Moreover, explicitly introducing a difference between "capture" and "look-and-match" will help newcomers understand what the point of a match is without having to look at a PEP or other normative document.

Remember that code has to be readable, because it will be read much more often than it is written. The reader has to understand quickly, but not in detail, what will happen. Being explicit relieves them of the task of concentrating on this point.

Also remember that Python has to be taught, and that everything implicit in the code has to be made explicit when teaching. And the longer you spend teaching microdetails like the difference between "capture" and "look-and-match", the less your audience will be inclined to listen to you.

On the drawback of being explicit:

Adding a mark to every captured variable can and will be annoying. More annoying than not adding it. It's obvious. But we don't expect to have more than a handful of captured variables per case. If there are more, the case is perhaps too complex, not to say too complicated.

Using a carefully chosen mark can alleviate this drawback. Using an already accepted and commonly used symbol will surely help.

I know that Python will diverge from other languages on this point. Yes, but Python has already diverged from other languages, and the community sees that as being for the better. Example: the conditional expression, aka the "ternary operator", aka "x = blabla if plop else bla".

And I'd be a bit embarrassed to have to explain that "captured variables" look like simple expressions "but are not" because that's how things are written in other functional languages. I'm not sure it will convince anybody who isn't already familiar with pattern matching in functional languages (which is a big topic by itself).
On the syntax:

Using the "=" character is well known, easy to find on a keyboard, and already carries the semantics of "putting a value in a variable".

So this mark is not a random sigil, but the way we write assignments, with an l-value, an "=" character, and an r-value. The r-value is given by the match and is omitted here.

And even if only a few people know what "l-value" means, everybody knows what is allowed on the left side of an "=".

Moreover, this kind of syntax already exists in the Python world, in default function parameter values:

def make_point(x=0, y=0):
    return x, y

Here the r-value is present. The l-value still has a defined semantics that is easy to learn and understand without having to read the Python semantics book or any PEPs. And it is still the same semantics of "putting a value in a variable".

And see how there is "x=" in the definition, but just "x" in the body of the function. Like in the case block.

That's why I think it will add value to be explicit about captured variables, and that choosing a meaningful mark can clarify many implicit, and hard to understand, patterns.
Emmanuel
-- Gustavo J. A. M. Carneiro Gambit Research "The universe is always one step beyond logic." -- Frank Herbert
On 7/17/20 8:26 AM, Gustavo Carneiro wrote:
I kind of agree it is nicer to be more explicit. But somehow x= looks ugly. It occurred to me (and, again, apologies if already been mentioned), we might use the `as` keyword here.
The problem with any kind of sigil/keyword is that it becomes line noise -- we would have to train ourselves to ignore them in order to see the structure and variables we are actually interested in. Once we become adept at ignoring them, we will again have difficulties when debugging as we won't easily see them. Besides which, the problem is solved: - namespace.var is a lookup - var is a store -- ~Ethan~
On Fri, 17 Jul 2020, Ethan Furman wrote:
The problem with any kind of sigil/keyword is that it becomes line noise -- we would have to train ourselves to ignore them in order to see the structure and variables we are actually interested in. Once we become adept at ignoring them, we will again have difficulties when debugging as we won't easily see them.
Besides which, the problem is solved:
- namespace.var is a lookup - var is a store
Regardless of how hard this is being pushed, I beg to disagree. Matching is easy to think of as an extension of destructuring assignment, and could easily be that, if we just don't introduce incompatible rules. Everything currently allowed in an assignment is a store, so it follows that the markup for lookups must be something that is not currently allowed in an assignment. Besides literal constants, my preference is for ``` == x ```, with the obvious opening for future extension. /Paul
Ethan Furman wrote:
The problem with any kind of sigil/keyword is that it becomes line noise -- we would have to train ourselves to ignore them in order to see the structure and variables we are actually interested in. Once we become
Every syntax element can become noise once we're used to it. This is how Groovy was built from Java: they removed everything that could be removed while still being "understandable" by the compiler. The result is a language that is counter-intuitive for people who don't do Groovy every day... Can I write a thing like this? Seems to work... And with that? Works too, but I don't know if it produces the same effect...

We can also think of syntax elements as structural elements, like pillars, helping thought to take shape while reading the code. Pillars are constraints for people in a building (they block your view, you have to walk around them, ...), but they help build bigger constructions, and we're all used to them.

In this slightly modified example from the PEP:

match entree[-1]:
    case Sides.SPAM:
        response = "Have you got anything without Spam?"
    case "lettuce":
        response = "blablabla"
    case side:
        response = f"Well, could I have their Spam instead of the {side} then?"
    case 1542 | "plop":
        response = "blablabla2"

it's difficult for someone not mastering this feature to see immediately that "side" will get its value changed and that the last case will never match.

match entree[-1]:
    case Sides.SPAM:
        response = "Have you got anything without Spam?"
    case "lettuce":
        response = "blablabla"
    case side=:
        response = f"Well, could I have their Spam instead of the {side} then?"
    case 1542 | "plop":
        response = "blablabla2"

Here we immediately see that the first two cases don't work in the same way as the third, because there is "something more". It may even indicate that the last case is useless...
adept at ignoring them, we will again have difficulties when debugging as we won't easily see them. Besides which, the problem is solved:
namespace.var is a lookup var is a store
These rules can't be deduced from a few examples, or from experience from other languages. You have to explicitly learn them. Since newcomers won't propably learn them first (if at all), they won't understand how it works, and they will propably introduce bugs hard to debug. They'll think it's a kind of "swith case" new construct, and will use it that way, completly ignoring the "capturing" property that shows in some cases and not in others. match entree[-1]: case Sides.SPAM: # newcomers will understand that entree[-1] == Sides.SPAM and write the code they need SPAM = "spam" match entree[-1]: case SPAM: # newcomers will understand that entree[-1] == "spam" and write the code they need # ignoring that now, in the following code, SPAM == anything but "spam" # introducing a bug anywhere in the following code where SPAM is expected to hold the # initial value Only a unit test case that test that SPAM has changed can detect this kind of bug. Generally speaking, unit test cases don't test values of "constants" before and after a test case. So it won't even help. Here, we can argue that match is not a "switch case" like syntax, but newcomers from C, Java, Javascript, whatever WILL use it like a "switch case", and WILL read code where it will be used like that. Even if it's not the main use case, it will be used for that, because of 50 years of history of C that we can't ignore. Adding a "=" or something else will at least ring a bell. We can argue that constants should be namespaced, but is it a general way of doing ? People can write "from module import SPAM" or "import module; module.SPAM". This is equivalent, but in one case, it may introduce a bug. Do not forget that Python will be used by many more newcomers, new learners, new developers, data scientists, people with unknow backgrounds, and perhaps few, or no experience in programming. IMHO Python strength is that it's syntax is easy to learn because it is easy to deduce. 
The some rules that are counter-intuitive like the "else" clause for the loops can't cause any harm if misused because their misuse is detected immediatly, and we can avoid writing them (and most people do). On the other hand, "capturing" variables mixed with "match" variables is counter-intuitive unless you explicitly learn the rules. You can't deduce it (there rules don't exist anywhere else). This feature is central of the PEP and will be used, and will introduce subtle bugs when misused. That's why I consider the rules you stated is not the right way for this feature, and that we should be explicit.
On Sat, Jul 18, 2020 at 3:29 AM <emmanuel.coirier@caissedesdepots.fr> wrote:
adept at ignoring them, we will again have difficulties when debugging as we won't easily see them. Besides which, the problem is solved:

    namespace.var is a lookup
    var is a store

These rules can't be deduced from a few examples, or from experience with other languages. You have to learn them explicitly. Since newcomers probably won't learn them first (if at all), they won't understand how it works, and they will probably introduce bugs that are hard to debug. They'll think it's a kind of new "switch case" construct, and will use it that way, completely ignoring the "capturing" property that shows up in some cases and not in others.

```
match entree[-1]:
    case Sides.SPAM:
        # newcomers will understand that entree[-1] == Sides.SPAM
        # and write the code they need
```

```
SPAM = "spam"
match entree[-1]:
    case SPAM:
        # newcomers will understand that entree[-1] == "spam"
        # and write the code they need, ignoring that now, in the
        # following code, SPAM == anything but "spam" -- introducing
        # a bug anywhere SPAM is expected to hold the initial value
```
If a constant's actually constant, as in `SPAM: Final = "spam"`, then it'll throw an error. Likewise, the first time it does something totally unexpected, like inserting something into what they thought held a match pattern, it'll break their initial assumptions and hopefully get them to read the documentation and form a more accurate mental model. As long as

    namespace.var is a lookup
    var is a store

is big, bold, and front & center in the docs, I think everyone will catch on very quickly and wrap their vars in a class, even if they never use it for more than a glorified switch-case. Designing an entire feature around what someone who's never encountered it before thinks it might do doesn't seem useful, since anyone could bring any number of assumptions.

-Em
Emily Bowman wrote:
SPAM: Final = "spam" then it'll throw an error. Likewise, the first time it does something totally unexpected like insert something into what they thought held a match pattern, it'll break their initial assumptions and hopefully get them to read the documentation, to form a more accurate mental model.
Currently, the following example works without any error or warning on Guido's current build. I'm aware that it is not the final version, but so far I haven't seen anything guarding Final-annotated values from being overwritten at runtime (either by an assignment or by a case clause).

```
from typing import Final

FIVE_VALUE: Final = 5

a = (7, 8)
match a:
    case (FIVE_VALUE, 8):
        print("in five value clause")
    case _:
        print("in default clause")

print(f"Value of FIVE_VALUE: {FIVE_VALUE}")
```

But I concede that overwriting names that are Final could at least produce some warnings.
As long as

    namespace.var is a lookup
    var is a store

is big, bold, and front & center in the docs, I think everyone will catch on very quickly and wrap their vars in a class, even if they never use it for more than a glorified switch-case.
My point is a bit deeper. I consider these rules a bit clumsy. I've understood why they have been designed that way, but they don't look Pythonic -- as if the scaffolding had been left in place.
Designing an entire feature around what someone who's never encountered it before thinks it might do doesn't seem useful, since anyone could bring any number of assumptions.
I'm sorry to disagree. Apple has built its brand on the fact that you didn't need the doc to successfully use their products. I don't think that all features of the language have to be that obvious, but the first look by some random dev should help them catch the thing and avoid such a pitfall. -- Emmanuel
On 7/18/2020 6:23 AM, emmanuel.coirier@caissedesdepots.fr wrote:
Ethan Furman wrote:
The problem with any kind of sigil/keyword is that it becomes line noise -- we would have to train ourselves to ignore them in order to see the structure and variables we are actually interested in. Once we become
[snip much]
On the other hand, "capturing" variables mixed with "match" variables are counter-intuitive unless you explicitly learn the rules. You can't deduce them (these rules don't exist anywhere else). This feature is central to the PEP and will be used, and will introduce subtle bugs when misused.

That's why I consider the rules you stated not the right way for this feature, and that we should be explicit.
It seems to me that whether one expects simple names in case headers to be sources or targets depends on how one analogizes the match code in case headers.

If one sees it as analogous to imperative elif conditions, where names are value sources, then one likely expects that. If one sees match code as analogous to target or parameter lists, where names declare binding targets, then one likely expects that behavior instead. Both analogies are inexact because match code needs to have both sources and targets. Different people will have different preferences and expectations.

I happen to prefer the parameter-list analogy because conditions are executable expressions while match code is not, and is by intention partly to mostly declarative, with the implementation in logic and expressions left to the compiler.

-- Terry Jan Reedy
On 18/07/2020 11:23, emmanuel.coirier@caissedesdepots.fr wrote:
Ethan Furman wrote:
The problem with any kind of sigil/keyword is that it becomes line noise -- we would have to train ourselves to ignore them in order to see the structure and variables we are actually interested in. Once we become

Every syntax element can become noise once we're used to it. This is how Groovy is built from Java: they removed everything that can be removed while staying "understandable" by the compiler. The result is a language that is counter-intuitive for people who don't do Groovy every day... Can I write a thing like this? Seems to work... And with that? Works too, but I don't know if it produces the same effect...

We can also think of syntax elements as structural elements, like pillars, helping the thought to elaborate while reading the code. Pillars are constraints for people in a building (they block your view, you have to walk around them, ...), but they help build bigger constructions, and we're all used to them.

In this slightly modified example from the PEP:

```
match entree[-1]:
    case Sides.SPAM:
        response = "Have you got anything without Spam?"
    case "letuce":
        response = "blablabla"
    case side:
        response = f"Well, could I have their Spam instead of the {side} then?"
    case 1542 | "plop":
        response = "blablabla2"
```

It's difficult for someone not mastering this feature to see immediately that "side" will get its value changed and that the last case will never match.
+1
```
match entree[-1]:
    case Sides.SPAM:
        response = "Have you got anything without Spam?"
    case "letuce":
        response = "blablabla"
    case side=:
        response = f"Well, could I have their Spam instead of the {side} then?"
    case 1542 | "plop":
        response = "blablabla2"
```

Here we immediately see that the first two cases don't work the same way as the third, because there is "something more". It may even indicate that the last case is useless...
adept at ignoring them, we will again have difficulties when debugging as we won't easily see them. Besides which, the problem is solved:

    namespace.var is a lookup
    var is a store

These rules can't be deduced from a few examples, or from experience with other languages. You have to learn them explicitly.
+1
Since newcomers probably won't learn them first (if at all), they won't understand how it works, and they will probably introduce bugs that are hard to debug. They'll think it's a kind of new "switch case" construct and will use it that way, completely ignoring the "capturing" property that shows up in some cases and not in others.

```
match entree[-1]:
    case Sides.SPAM:
        # newcomers will understand that entree[-1] == Sides.SPAM
        # and write the code they need
```

```
SPAM = "spam"
match entree[-1]:
    case SPAM:
        # newcomers will understand that entree[-1] == "spam"
        # and write the code they need, ignoring that now, in the
        # following code, SPAM == anything but "spam" -- introducing
        # a bug anywhere SPAM is expected to hold the initial value
```

Only a unit test that checks that SPAM has changed can detect this kind of bug. Generally speaking, unit tests don't check the values of "constants" before and after a test case, so that won't help either.
Here, we can argue that match is not a "switch case"-like syntax, but newcomers from C, Java, JavaScript, whatever WILL use it like a "switch case", and WILL read code where it is used like that. Even if it's not the main use case, it will be used for that, because of 50 years of C history that we can't ignore. Adding a "=" or something else will at least ring a bell.

We can argue that constants should be namespaced, but is that the general way of doing things? People can write "from module import SPAM" or "import module; module.SPAM". These are equivalent, but in one case it may introduce a bug.

Do not forget that Python will be used by many more newcomers, new learners, new developers, data scientists, people with unknown backgrounds, and perhaps little or no experience in programming. IMHO Python's strength is that its syntax is easy to learn because it is easy to deduce. Some rules that are counter-intuitive, like the "else" clause on loops, can't cause any harm if misused because their misuse is detected immediately, and we can avoid writing them (and most people do).
On the other hand, "capturing" variables mixed with "match" variables are counter-intuitive unless you explicitly learn the rules. You can't deduce them (these rules don't exist anywhere else). This feature is central to the PEP and will be used, and will introduce subtle bugs when misused.
That's why I consider the rules you stated not the right way for this feature, and that we should be explicit.

+1. Explicit is better than implicit.

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HQLSHN4I...
Code of Conduct: http://python.org/psf/codeofconduct/
On Fri, 17 Jul 2020 at 12:28, Gustavo Carneiro <gjcarneiro@gmail.com> wrote:
On Fri, 17 Jul 2020 at 12:26, <emmanuel.coirier@caissedesdepots.fr> wrote:
Hello everyone, (...) But... looking at the examples, it wasn't very obvious that some variables were capturing variables and some others were matching ones. I then read in detail some rules about how to discover what a captured variable is. But I'm not sure everybody will do this...
Zen of Python tells us that "explicit is better than implicit". I know this is not a rule written in the stone, but I think here, it applies very well.
I also dislike the idea of undotted names being assigned with no extra visual clues, and the scenario described by Emmanuel in the other e-mail about people changing variables when they think they are making a match (essentially introducing the _same_ problem that `if (SPAM = 0):` had in C, which Python used to justify assignment not being an expression for over 20 years).

So, adding to the bikeshed color possibilities, alongside the "x=, y=" in this first e-mail or the "_ as x, _ as y" from Gustavo, I present the possibility of making the walrus mandatory for capture. Maybe it is a bit "too much typing" (the walrus will require 5-6 keystrokes with the surrounding spaces), but I think the final look can be pleasantly intuitive:

```
match my_point:
    case (x := _, y := _) | Point2d(x := _, y := _):
        return Point3d(x, y, 0)
```
Guido said :
We’re really looking for a solution that tells you when you’re looking at an individual case which variables are captured and which are used for load-and-compare.
Marking up the capture variables with some sigil (e.g. $x or x?) or other markup (e.g. backticks or <x>) makes this common case ugly and inconsistent: it’s unpleasant to see for example
```
case %x, %y:
    print(x, y)
```
Guido talks about a "sigil", which seems to be a meaningless mark only there to help the parser understand what the dev was writing.
I propose that this "sigil" be the assignment mark: "=". Look:

```
z = 42
match pt:
    case x=, y=, z:
        print(x, y, "z == 42")
```
(...) Gustavo Carneiro <gjcarneiro@gmail.com> wrote:
I kind of agree it is nicer to be more explicit. But somehow x= looks ugly. It occurred to me (and, again, apologies if this has already been mentioned) that we might use the `as` keyword here.
The example above would become:
```
def make_point_3d(pt):
    match pt:
        case (as x, as y):
            return Point3d(x, y, 0)
        case (as x, as y, as z):
            return Point3d(x, y, z)
        case Point2d(as x, as y):
            return Point3d(x, y, 0)
        case Point3d(_, _, _):
            return pt
        case _:
            raise TypeError("not a point we support")
```
If having "as x" as a standalone expression without anything to the left of "as" causes confusion, we could instead mandate the use of _ thus:
```
case (_ as x, _ as y):
    return Point3d(x, y, 0)
```
On 7/17/2020 7:23 AM, emmanuel.coirier@caissedesdepots.fr wrote:
Hello everyone,
I'm sorry if my proposition has already been made, or even withdrawn, but I think that capture variables shouldn't be as implicit as they are now.
I've looked at the PEP very quickly, jumping to the examples to get a taste and an idea of what was going on here. I saw a new kind of control structure based on structural pattern matching (patterns based on classes or compositions of classes, to make it short). A very good point, emphasized by Tobias Kohn ("Another take on PEP 622"), is that pattern matching eases the writing of code based on matching such structures, and that capturing values stored inside these structures at match time really eases the writing of the code associated with the match.
A major point of Kohn's post is that 'case' is analogous to 'def' and match lists are analogous to parameter lists. In parameter lists, untagged simple names ('parameter names') are binding targets. Therefore, untagged simple names in match lists -- let us call them 'match names' -- should be also. I elaborated on this in my response to Tobias. -- Terry Jan Reedy
Terry Reedy wrote:
A major point of Kohn's post is that 'case' is analogous to 'def' and match lists are analogous to parameter lists. In parameter lists,
I'm sorry to disagree, but match lists share very few things in common with today's parameter lists, and introduce a full new concept of "matching" vs "binding/capturing" that doesn't exist in function definitions.
untagged simple names ('parameter names') are binding targets. Therefore, untagged simple names in match lists, let us call them 'match names' should be also. I elaborated on this in my response to Tobias.
This approach, for me, seems to come from functional languages where pattern matching is a thing. The proposed "match" clause tends to mimic this approach, and it can be a good thing. But Python's function definition has not been inspired by functional programming from the ground up, and I think it would be an error to reason this way, because people not used to pattern matching in functional programming won't understand anything (remember that list comprehensions are a big thing for many learners). That's why I think reasoning from such a theoretical point of view will lead many Python developers to a dead end.
On Sat, Jul 18, 2020 at 09:25:45AM -0000, emmanuel.coirier@caissedesdepots.fr wrote:
This approach, for me, seems to come from functional languages where pattern matching is a thing. The proposed "match" clause tends to mimic this approach, and it can be a good thing. But Python's function definition has not been inspired by functional programming from the ground up, and I think it would be an error to reason this way, because people not used to pattern matching in functional programming won't understand anything (remember that list comprehensions are a big thing for many learners).
It is true that beginners sometimes struggle a bit to grok comprehension syntax. I know I did. And yet, despite that, comprehensions have turned out to be one of the most powerful and popular features of Python, sometimes *too* popular. It is sometimes hard to convince both beginners and even experienced devs that comprehensions are not the only tool in their toolbox, and not every problem is a nail.

You say: "people not used to pattern matching in functional programming won't understand anything", but people using Haskell weren't born knowing the language. They had to learn it.

It's been sometimes said that functional programmers are smarter, elite programmers a level above the average OOP or procedural programmer, but that's mostly said by functional programmers :-) and I'm not entirely sure that it's true. In any case, I don't think that any (actual or imaginary) gap between the ability of the average Haskell programmer and the average Python programmer is so great that we should dismiss pattern matching as beyond the grasp of Python coders.

In any case, functional languages like Haskell, F# and ML are not the only languages with pattern matching. Non-FP languages like C#, Swift, Rust and Scala have it, and even Java has an extension providing pattern matching: http://tom.loria.fr/wiki/index.php/Main_Page

-- Steven
Steven D'Aprano wrote: [...]
In any case, functional languages like Haskell, F# and ML are not the only languages with pattern matching. Non-FP languages like C#, Swift, Rust and Scala have it, and even Java has an extension providing pattern matching: http://tom.loria.fr/wiki/index.php/Main_Page
I'm not against pattern matching at all. I think it's a very nice feature, but one of its behaviors, variable capturing, should be made more explicit, following the rules of the Zen of Python.
It's been sometimes said that functional programmers are smarter, elite programmers a level above the average OOP or procedural programmer, but that's mostly said by functional programmers :-)
I would say there are fewer of them :-) -- Emmanuel
On Sat, 18 Jul 2020 at 14:12, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Jul 18, 2020 at 09:25:45AM -0000, emmanuel.coirier@caissedesdepots.fr wrote:
This approach, for me, seems to come from functional languages where pattern matching is a thing. The proposed "match" clause tends to mimic this approach, and it can be a good thing. But Python's function definition has not been inspired by functional programming from the ground up, and I think it would be an error to reason this way, because people not used to pattern matching in functional programming won't understand anything (remember that list comprehensions are a big thing for many learners).
It is true that beginners sometimes struggle a bit to grok comprehension syntax. I know I did.
And yet, despite that, comprehensions have turned out to be one of the most powerful and popular features of Python, sometimes *too* popular. It is sometimes hard to convince both beginners and even experienced devs that comprehensions are not the only tool in their toolbox, and not every problem is a nail.
You say: "people not used to pattern matching in functional programming won't understand anything", but people using Haskell weren't born knowing the language. They had to learn it.
It's been sometimes said that functional programmers are smarter, elite programmers a level above the average OOP or procedural programmer, but that's mostly said by functional programmers :-) and I'm not entirely sure that it's true. In any case, I don't think that any (actual or imaginary) gap between the ability of the average Haskell programmer and the average Python programmer is so great that we should dismiss pattern matching as beyond the grasp of Python coders.
In any case, functional languages like Haskell, F# and ML are not the only languages with pattern matching. Non-FP languages like C#, Swift, Rust and Scala have it, and even Java has an extension providing pattern matching:
You do a nice job arguing that matching is a nice feature to have -- and I guess we are past this point. But I don't see anything in the above pointing to using an undifferentiated name by itself in the match/case construct being better than trying to find a way to differentiate it, with significant gains in readability and "learnability".

Yes, people aren't born knowing Haskell, but then, one of the strong points of Python is (or used to be) it _not looking_ like Haskell.

Having a differentiation sign for assignment would also allow matching against values in variables to just work in a very intuitive way, just as it would have happened with the ".variable_name" in the first version.

(I've written another e-mail on the thread, but since this scatters around: my current bikeshed color is to _require_ the walrus op for assignments, as in: `case (x := _, y := _): ...`)

js -><-
I'm very new to this mailing list so I'm not sure it's my place to email, but I'd like to weigh in and maybe it will be useful. If not, you can always ignore ;)

I think adding the walrus operator is trying to solve a problem that doesn't exist. Compare the example from the PEP:

```
def make_point_3d(pt):
    match pt:
        case (x, y):
            return Point3d(x, y, 0)
        case (x, y, z):
            return Point3d(x, y, z)
        case Point2d(x, y):
            return Point3d(x, y, 0)
        case Point3d(_, _, _):
            return pt
        case _:
            raise TypeError("not a point we support")
```

To the one with the walrus:

```
def make_point_3d(pt):
    match pt:
        case (x := _, y := _):
            return Point3d(x, y, 0)
        case (x := _, y := _, z := _):
            return Point3d(x, y, z)
        case Point2d(x := _, y := _):
            return Point3d(x, y, 0)
        case Point3d(_, _, _):
            return pt
        case _:
            raise TypeError("not a point we support")
```

It's a lot more typing, it's a lot more ugly, and I'd argue it's not any more explicit than the earlier one. We still have all the same variables, except now we have to follow them with a ritualistic ":= _" to capture them. Normally we use the underscore to discard or hide something (at least that's how I've always used it), and suddenly it is used when we want to keep the thing it stands for?!

Also, I understand Python doesn't look like Haskell or Rust or whatever, but you also have people coming from those languages to Python, and people going to those languages from Python. Having a different syntax from what literally everybody else does will lead to a lot of confusion. I think the default option should be to have it like the current proposal (and everybody else), and to update it only if there is a good reason to do so. "We don't want to look like the rest" should not be an argument. I think Python not looking like anything else is a result of Python's readability and simplicity goals, not because the goal was to look different.
Finally, I asked an actual Python newbie (our trainee) for his opinion, and he said he didn't think the walrus example was any more useful. Of course, N=1, not an experiment, doesn't measure mistakes in practice, etc. But let's make sure it's an actual problem before we go complicate the syntax.

Again, first time mailing here and I don't know if it's my place (can I even mail into this list?), but I hope the perspective is of some use.

Rik

P.S. I never had issues with list comprehensions, because it's basically how you write down sets in mathematics (which is what I studied).
Welcome to python-dev, Rik! Of course you can email to this list. On 30/07/2020 14:30, Rik de Kort via Python-Dev wrote:
I think adding the Walrus operator is trying to solve a problem that doesn't exist. Compare the example from the PEP:
[snip]
case (x, y, z):
[snip]
To the one without:
[snip]
case (x := _, y := _, z := _): It's a lot more typing, it's a lot more ugly, and I'd argue it's not any more explicit than the earlier one.
The debate is still going on as to whether "capture" variables should be marked, and if so whether with the walrus operator or in some other way, and "var := _" wouldn't be my personal preference. However,

```
case (x := _, y := _, z := _):
```

*is* more explicit, because it explicitly says that x, y and z are variables that should capture (i.e. be bound to) whatever values are found. Contrast this with, say,

```
case (x := _, y := _, z):
```

which says that z contains a value to be *matched*, and, if such a match is found, x and y should capture the relevant values.

Best wishes
Rob Cliffe
Hi Rob, thank you! :)

I think I understand the point, but I still don't agree with it. I find it hard to come up with a concrete use case where you would want to name a parameter without specifying it. Suppose we want

```
case Status(user, n_messages, replies, unicode := _)
```

Then it might be a little useful to type the non-captured arguments explicitly because it's easier to remember the signature that way. Alternatively, if you want to capture an argument like this and you have more than a few positional arguments, you should probably just match on a keyword argument (or refactor your code so your APIs are simpler). Also, what would we type if we wanted to capture a keyword argument? Something like this?

```
case Status(user, n_messages, replies, unicode=unicode := _)
```

Surely that must be a horrible joke! (N.B. I suppose this is an argument against this specific syntax rather than against capturing.)

Another potential use I came up with is checking the number of arguments (when it's a lot, so typing all the underscores becomes hard to count), like:

```
match pt:
    case (a, b, c, d, e, f, g, h):
        manage_len_8(pt)
    case (a, b, c, d, e, f, g, h, i, j, k):
        manage_len_11(pt)
```

But in that case why not use an if-else, like so:

```
if len(pt) == 8:
    manage_len_8(pt)
elif len(pt) == 11:
    manage_len_11(pt)
```

There must be use cases I haven't thought of, but I think they will all fall prey to similar objections as the above two. I'm open to being proven wrong, though!

The thing about explicitness is: sure, it is better than implicitness. But beautiful is also better than ugly, and simple better than complex, etc. etc. I think these mean nothing without specific use cases to pin down what they actually mean for this thing in this context. I think `case (x := _, y := _, z)` is exactly as explicit as `case (x, y, _)` (it names x and y explicitly), with the added drawbacks of:

- Confusing the meaning of "_", which (at least in my mind) means "discard".
- Deviating from other languages with pattern matching (which, presumably, also have bikeshedded on this point), increasing the surprise level for people who are either coming to Python from there, or going from Python to there. - Requiring extra (weird-looking) syntax for the default case of capturing variables. Again, maybe I'm just having trouble finding good use cases (either that, or I have no intuition for programming :P). Let me know if you have some! Rik P.S. If I'm out of line or violating etiquette with anything, please let me know. I'm open to help.
On 7/30/20 8:35 AM, Rob Cliffe via Python-Dev wrote:
The debate is still going on as to whether "capture" variables should be marked... I don't think the PEP authors are debating it any more. Quite frankly, I wish they would present to the SC and get accepted so we can get Pattern Matching added to 3.10. :)
-- ~Ethan~
PEP 622 is already on the SC’s agenda for review. -Barry
On Aug 5, 2020, at 09:47, Ethan Furman <ethan@stoneleaf.us> wrote:
On 7/30/20 8:35 AM, Rob Cliffe via Python-Dev wrote:
The debate is still going on as to whether "capture" variables should be marked... I don't think the PEP authors are debating it any more. Quite frankly, I wish they would present to the SC and get accepted so we can get Pattern Matching added to 3.10. :)
-- ~Ethan~ _______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PGEVEI2W... Code of Conduct: http://python.org/psf/codeofconduct/
Hi Barry, How long do we have to present objections to PEP 622? I don't feel that the PEP gives adequate prominence to the objections so far raised, and there are more issues I would like to bring up. Cheers, Mark. On 05/08/2020 5:58 pm, Barry Warsaw wrote:
PEP 622 is already on the SC’s agenda for review.
-Barry
On Thu, Aug 6, 2020 at 3:46 AM Mark Shannon <mark@hotpy.org> wrote:
Hi Barry,
How long do we have to present objections to PEP 622?
We haven't discussed a timeline among ourselves yet (unless it was discussed at the last meeting, which I missed 😁).
I don't feel that the PEP gives adequate prominence to the objections so far raised, and there are more issues I would like to bring up.
I don't think we would want to keep pushing the date out every time someone has more to say, as that would mean this would never end. 😉 But I doubt we will be making a decision next week, so if you can get any comments in between now and the 17th, you will probably get them in before the earliest date at which we would, very optimistically, make a decision. -Brett
Just taking a ride on the thread here: I made a quick talk on the proposed feature for a local group, and in the process I refactored a "real world" class I have in a project, which features a complicated __init__ due to having lots of different, optional ways to be initialized.

I can tell I liked what could be done: reducing roughly 60 lines of code packed with "isinstance" calls, "if/elif" blocks, and temporary, intermediate state variables into 25 lines, including 10 case clauses that are very straightforward to read.

Sorry for whoever would like an example differing much from the "point2d" examples in the PEP, but the class in question IS geometry related, and is a Rectangle. I am not yet testing (neither in the 'normal' if/else version) _invalid_ arguments: there are a lot of ways to pass conflicting arguments to __init__, and the if/elif logic to handle those properly is not in place. The match/case version for handling these invalid combinations would be very straightforward, on the other hand.

(All said, I think I still miss a way to mark variables that are assigned in the case clauses, just for the record :-) )

Enough cheap talk -- links are here:

tests: https://github.com/jsbueno/terminedia/blob/fa5ac012a7b93a2abe26ff6ca41dbd5f5...

Branch comparison for the match/case version: https://github.com/jsbueno/terminedia/compare/patma

You will notice one "real world" pattern that was needed there: as the case clauses had to be aware of values spread across different keyword parameters, I had to prepend a "packing" of all function arguments into a mapping to match against. If I would not care for the function signature, I could just take "**kwargs" and match against that.

On Thu, 6 Aug 2020 at 14:09, Brett Cannon <brett@python.org> wrote:
On 07Aug2020 2133, Joao S. O. Bueno wrote:
Enough cheaptalk - links are here:
tests: https://github.com/jsbueno/terminedia/blob/fa5ac012a7b93a2abe26ff6ca41dbd5f5...
Branch comparison for the match/case version: https://github.com/jsbueno/terminedia/compare/patma
I haven't been following this thread too closely, but that looks pretty nice to me. Not obvious enough for me to write my own just from reading an example, and I'd hesitate before trying to modify it at all, but I can at least read the pre- and post-conditions more easily than in the original.
(all said, I think I still miss a way to mark variables that are assigned in the case clauses, just for the record :-) )
Yeah, the implicit variable assignments seem like the most confusing bit (based solely on looking at just one example). I think I'd be happy enough just knowing that "kw" matches the pattern, and then manually extracting individual values from it. (But I guess for that we'd only need a fancy "if_match(kw, 'expression')" function... hmm...) Cheers, Steve
On Fri, 7 Aug 2020 17:33:04 -0300 "Joao S. O. Bueno" <jsbueno@python.org.br> wrote:
Branch comparison for the match/case version: https://github.com/jsbueno/terminedia/compare/patma
For me, your example is bonkers from the start. Anyone who thinks `Rect(left_or_corner1=None, top_or_corner2=None, right=None, bottom=None, *, width_height=None, width=None, height=None, center=None)` is a reasonable constructor signature for a rectangle class needs to be convinced otherwise, rather than be allowed to find convenient implementation shortcuts for that signature.

I'll note that there's no checking for superfluous / exclusive arguments, so apparently I can pass all those arguments at once and the constructor will happily do what it likes, ignoring half the arguments I have without telling me? WTH? (And of course, there's no docstring anywhere, so I can only /presume/ this is supposed to be a rectangle class, based on its name.)

And as a matter of fact, while the pattern matching implementation is definitely shorter, it's still unreadable. I don't know about you, but if I see this during a code review:

```
match kw:
    case {"c1": (c1:=(_, _)), "c2": (c2:=(_, _))}:
        pass
    case {"c1": Rect(c1, c2)}:
        pass
    case {"c1": Number(), "c2": Number(), "right": Number(), "bottom": Number()}:
        c1, c2 = (kw["c1"], kw["c2"]), (right, bottom)
    case {"c1": (_, _, _, _)}:
        c1, c2 = kw["c1"][:2], kw["c1"][2:]
    case {"c1": (c1:=(_, _)), "width_height": (_, _)}:
        c2 = c1 + V2(width_height)
    case {"c1": (c1:=(_, _)), "width": Number(), "height": Number()}:
        c2 = c1 + V2(width, height)
    case {"c1": (c1:=(_, _)), "c2": None, "right": Number(), "bottom": Number()}:
        c2 = bottom, right
    case {"c1": None, "right": Number(), "bottom": Number(), "center": (_, _)}:
        c1 = 0, 0
        c2 = bottom, right
    case {"c1": (c2:=(_, _)), "c2": None}:
        c1 = 0, 0
```

then I will ask the author to think twice about the fact that they're inflicting pain on their fellow contributors by committing write-only code (in addition to the toxic API design).

So, if the aim of the example was to prove that bad APIs could be implemented more tersely using pattern matching, then congratulations; otherwise I'm not impressed at all.
Regards Antoine.
On Wed, 5 Aug 2020 09:47:30 -0700 Ethan Furman <ethan@stoneleaf.us> wrote:
On 7/30/20 8:35 AM, Rob Cliffe via Python-Dev wrote:
The debate is still going on as to whether "capture" variables should be marked... I don't think the PEP authors are debating it any more.
That would be a pity. Readability and ease of understanding should trump conciseness. Regards Antoine.
On Wed, Aug 5, 2020 at 9:48 AM Ethan Furman <ethan@stoneleaf.us> wrote:
I don't think the PEP authors are debating it any more. Quite frankly, I wish they would present to the SC and get accepted so we can get Pattern Matching added to 3.10. :)
We did, a few weeks ago, and the SC is already reviewing it. That's all I know. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Sat, Jul 18, 2020 at 3:46 AM Terry Reedy <tjreedy@udel.edu> wrote:
A major point of Kohn's post is that 'case' is analogous to 'def' and match lists are analogous to parameter lists. In parameter lists, untagged simple names ('parameter names') are binding targets. Therefore, untagged simple names in match lists (let us call them 'match names') should be also. I elaborated on this in my response to Tobias.
There are indeed analogous aspects, although not in the most straightforward/obvious ways. Still, perhaps even more so than there is an analogy with assignment targets. This is related to one of my concerns regarding PEP 622.

It may be tempting to see pattern matching as a form of assignment. However, that is quite a stretch, both conceptually and as a future direction. There is no way these 'match expressions' could be allowed in regular assignments: the way names are treated just needs to be different. And allowing them in walrus assignments doesn't make much sense either.

Conceptually, it is strange to call this match operation an assignment. Most of the added power comes from checking that the object has a certain structure or contents, and in many cases, that is the only thing it does! As a (not always) handy side product, it is also able to assign things to specified targets. Even then, the whole pattern is not assigned to; only parts of it are.

In mathematics, assignment (definition) and re-assignment is often denoted with the same sign as equality/identity, because it is usually clear from the context which one is in question. Usually, however, it matters which one is in question. Therefore, as we well know, we have = for assignment, == for equality, and := to emphasize assignment. Matching is closer to ==, or almost :==. So, in many ways, it is the assignment that is special, not the matching. The assignment is also the main thing that differentiates this from the traditional switch–case construct, which the proposed syntax certainly resembles.

—Koos
I've built the reference implementation and I'm experimenting with the new syntax in the edgedb codebase. It seems to have plenty of places where pattern matching adds clarity. I'll see if I find particularly interesting examples of that to share. So far I'm +1 on the proposal, and I like the second iteration of it. Except that I'm really sad to see the __match__ protocol gone. Quoting the PEP:
One drawback of this protocol is that the arguments to __match__ would be expensive to construct, and could not be pre-computed due to the fact that, because of the way names are bound, there are no real constants in Python.
While it's not possible to precompute the arguments ahead of time, it certainly should be possible to cache them, similarly to how I implemented the global names lookup cache in CPython. That should alleviate this particular performance consideration entirely. Having __match__ would allow some really interesting use cases. For example, for binary protocol parsers it would be possible to replicate the Erlang approach, e.g.:

```
match buffer:
    case Frame(char('X'), len := UInt32(),
               flags := Bits(0, 1, flag1, flag2, 1, 1))
```

would match a Frame of message type 'X', capture its length, and extract two bit flags. This perhaps isn't the greatest example of how a full matching protocol could be used, but it's something that I personally wanted to implement. Yury
On Fri, Jul 17, 2020 at 1:45 PM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I've built the reference implementation and I'm experimenting with the new syntax in the edgedb codebase. It seems to have plenty of places where pattern matching adds clarity. I'll see if I find particularly interesting examples of that to share.
So far I'm +1 on the proposal, and I like the second iteration of it. Except that I'm really sad to see the __match__ protocol gone.
It will be back, just not in 3.10. We need more experience with how match/case are actually used to design the right `__match__` protocol.
Quoting the PEP:
One drawback of this protocol is that the arguments to __match__ would be expensive to construct, and could not be pre-computed due to the fact that, because of the way names are bound, there are no real constants in Python.
Note: That's not referring to the `__match__` protocol from version 1 of the PEP, but to a hypothetical (and IMO sub-optimal) `__match__` protocol that was discussed among the authors prior to settling on the protocol from version 1.
While it's not possible to precompute the arguments ahead of time, it certainly should be possible to cache them similarly to how I implemented global names lookup cache in CPython. That should alleviate this particular performance consideration entirely.
Where's that global names lookup cache? I seem to have missed its introduction. (Unless you meant PEP 567?)
Having __match__ would allow some really interesting use cases. For example, for binary protocol parsers it would be possible to replicate erlang approach, e.g.:
match buffer: case Frame(char('X'), len := UInt32(), flags := Bits(0, 1, flag1, flag2, 1, 1))
would match a Frame of message type 'X', capture its length, and extract two bit flags. This perhaps isn't the greatest example of how a full matching protocol could be used, but it's something that I personally wanted to implement.
I see, you'd want the *types* of the arguments to be passed into `Frame.__match__`. That's interesting, although I have a feeling that if I had a real use case like this I'd probably be able to come up with a better DSL for specifying messages than this. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Fri, Jul 17, 2020 at 3:54 PM Guido van Rossum <guido@python.org> wrote:
On Fri, Jul 17, 2020 at 1:45 PM Yury Selivanov <yselivanov.ml@gmail.com> wrote:
I've built the reference implementation and I'm experimenting with the new syntax in the edgedb codebase. It seems to have plenty of places where pattern matching adds clarity. I'll see if I find particularly interesting examples of that to share.
So far I'm +1 on the proposal, and I like the second iteration of it. Except that I'm really sad to see the __match__ protocol gone.
It will be back, just not in 3.10. We need more experience with how match/case are actually used to design the right `__match__` protocol.
Makes sense.
Quoting the PEP:
One drawback of this protocol is that the arguments to __match__ would be expensive to construct, and could not be pre-computed due to the fact that, because of the way names are bound, there are no real constants in Python.
Note: That's not referring to the `__match__` protocol from version 1 of the PEP, but to a hypothetical (and IMO sub-optimal) `__match__` protocol that was discussed among the authors prior to settling on the protocol from version 1.
While it's not possible to precompute the arguments ahead of time, it certainly should be possible to cache them similarly to how I implemented global names lookup cache in CPython. That should alleviate this particular performance consideration entirely.
Where's that global names lookup cache? I seem to have missed its introduction. (Unless you meant PEP 567?)
Here are the related bpos of where Inada-san and I worked on this: https://bugs.python.org/issue28158 https://bugs.python.org/issue26219
Having __match__ would allow some really interesting use cases. For example, for binary protocol parsers it would be possible to replicate erlang approach, e.g.:
match buffer: case Frame(char('X'), len := UInt32(), flags := Bits(0, 1, flag1, flag2, 1, 1))
would match a Frame of message type 'X', capture its length, and extract two bit flags. This perhaps isn't the greatest example of how a full matching protocol could be used, but it's something that I personally wanted to implement.
I see, you'd want the *types* of the arguments to be passed into `Frame.__match__`. That's interesting, although I have a feeling that if I had a real use case like this I'd probably be able to come up with a better DSL for specifying messages than this.
Yeah, it's an open question if this is a good idea or not. FWIW here's a relevant quick erlang tutorial: https://dev.to/l1x/matching-binary-patterns-11kh that shows what it looks like in erlang (albeit the syntax is completely alien to Python). Yury
I'm still only intermittently keeping up on python-dev, but my main concern with the first iteration remains in this version, which is that it doesn't even *mention* that the proposed name binding syntax inherently conflicts with the existing assignment statement lvalue syntax in two areas:

* dotted names (binds an attribute in assignment, looks up a constraint value in a match case)
* underscore targets (binds in assignment, wildcard match without binding in a match case)

The latter could potentially be made internally consistent in the future by redefining "_" and "__" as soft keywords that don't get bound via normal assignment statements either (requiring that they be set via namespace dict modification instead, for use cases like i18n). https://www.python.org/dev/peps/pep-0622/#use-some-other-token-as-wildcard presents a reasonable rationale for the usage, so its only flaw is failing to mention the inconsistency.

The former syntactic conflict presents a bigger problem, though, as it means that we'd be irrevocably committed to having two different lvalue syntaxes for the rest of Python's future as a language. https://www.python.org/dev/peps/pep-0622/#alternatives-for-constant-value-pa... is nominally about this problem, but it doesn't even *mention* the single biggest benefit of putting a common prefix on value constraints: it leaves the door open to unifying the lvalue syntax again in the future, by keeping the proposed match case syntax a strict superset of the existing assignment target syntax rather than partially conflicting with it.

More incidentally, the latest write-up also leaves out "?" as a suggested constraint value prefix, when that's the single character prefix that best implies the question "Does the runtime value at this position equal the result of this value constraint expression?" without having any other existing semantic implications in Python.

Cheers, Nick.

P.S. I feel I should mention that the other reason I like "?"
as a potential prefix for value constraints is that if we require it for all value constraint expressions (both literals and name lookups), I believe it could offer a way to unblock the None-aware expressions PEP by reframing that PEP as a shorthand for particular case matches. None coalescence ("a ?? b") for example:

```
match a:
    case ?None:
        _expr_result = b
    case _match:
        _expr_result = _match
```

Or a None-severing attribute lookup ("a?.b"):

```
_match_expr = a
match _match_expr:
    case ?None:
        _expr_result = _match_expr
    case _match:
        _expr_result = _match.b
```

Since these operations would be defined in terms of *equality* (as per PEP 622), rather than identity, it would also allow other sentinels to benefit from the None-aware shorthand by defining themselves as being equal to None.

On Thu., 9 Jul. 2020, 1:07 am Guido van Rossum, <guido@python.org> wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching. As authors we welcome Daniel F Moisset in our midst. Daniel wrote a lot of the new text in this version, which introduces the subject matter much more gently than the first version did. He also convinced us to drop the `__match__` protocol for now: the proposal stands quite well without that kind of extensibility, and postponing it will allow us to design it at a later time when we have more experience with how `match` is being used.
That said, the new version does not differ dramatically in what we propose. Apart from dropping `__match__` we’re dropping the leading dot to mark named constants, without a replacement, and everything else looks like we’re digging in our heels. Why is that? Given the firestorm of feedback we received and the numerous proposals (still coming) for alternative syntax, it seems a bad tactic not to give up something more substantial in order to get this proposal passed. Let me explain.
Language design is not like politics. It’s not like mathematics either, but I don’t think this situation is at all similar to negotiating a higher minimum wage in exchange for a lower pension, where you can definitely argue about exactly how much lower/higher you’re willing to go. So I don’t think it’s right to propose making the feature a little bit uglier just to get it accepted.
Frankly, 90% of the issue is about what among the authors we’ve dubbed the “load/store” problem (although Tobias never tires to explain that the “load” part is really “load-and-compare”). There’s a considerable section devoted to this topic in the PEP, but I’d like to give it another try here.
In case you’ve been avoiding python-dev lately, the problem is this. Pattern matching lets you capture values from the subject, similar to sequence unpacking, so that you can write for example

```
x = range(4)
match x:
    case (a, b, *rest):
        print(f"first={a}, second={b}, rest={rest}")  # 0, 1, [2, 3]
```

Here the `case` line captures the contents of the subject `x` in three variables named `a`, `b` and `rest`. This is easy to understand by pretending that a pattern (i.e., what follows `case`) is like the LHS of an assignment.
However, in order to make pattern matching more useful and versatile, the pattern matching syntax also allows using literals instead of capture variables. This is really handy when you want to distinguish different cases based on some value, for example

```
match t:
    case ("rect", real, imag):
        return complex(real, imag)
    case ("polar", r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

You might not even notice anything funny here if I didn’t point out that `"rect"` and `"polar"` are literals -- it’s really quite natural for patterns to support this once you think about it.
The problem that everybody’s been concerned about is that Python programmers, like C programmers before them, aren’t too keen to have literals like this all over their code, and would rather give names to the literals, for example

```
USE_POLAR = "polar"
USE_RECT = "rect"
```

Now we would like to be able to replace those literals with the corresponding names throughout our code and have everything work like before:

```
match t:
    case (USE_RECT, real, imag):
        return complex(real, imag)
    case (USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

Alas, the compiler doesn’t know that we want `USE_RECT` to be a constant value to be matched while we intend `real` and `imag` to be variables to be given the corresponding values captured from the subject. So various clever ways have been proposed to distinguish the two cases.
This discussion is not new to the authors: before we ever published the first version of the PEP we vigorously debated this (it is Issue 1 in our tracker!), and other languages before us have also had to come to grips with it. Even many statically compiled languages! The reason is that for reasons of usability it’s usually deemed important that their equivalent of `case` auto-declare the captured variables, and variable declarations may hide (override) like-named variables in outer scopes.
Scala, for example, uses several different rules: first, capture variable names must start with a lowercase letter (so it would handle the above example as intended); next, capture variables cannot be dotted names (like `mod.var`); finally, you can enclose any variable in backticks to force the compiler to see it as a load instead of a store. Elixir uses another form of markup for loads: `x` is a capture variable, but `^x` loads and compares the value of `x`.
There are a number of dead ends when looking for a solution that works for Python. Checking at runtime whether a name is defined or not is one of these: there are numerous reasons why this could be confusing, not the least of which being that the `match` may be executed in a loop and the variable may already be bound by a previous iteration. (True, this has to do with the scope we’ve adopted for capture variables. But believe me, giving each case clause its own scope is quite horrible by itself, and there are other action-at-a-distance effects that are equally bad.)
It’s been proposed to explicitly state the names of the variables bound in a header of the `match` statement; but this doesn’t scale when the number of cases becomes larger, and requires users to do bookkeeping the compiler should be able to do. We’re really looking for a solution that tells you when you’re looking at an individual `case` which variables are captured and which are used for load-and-compare.
Marking up the capture variables with some sigil (e.g. `$x` or `x?`) or other markup (e.g. backticks or `<x>`) makes this common case ugly and inconsistent: it’s unpleasant to see for example

```
case %x, %y:
    print(x, y)
```

No other language we’ve surveyed uses special markup for capture variables; some use special markup for load-and-compare, so we’ve explored this. In fact, in version 1 of the PEP our long-debated solution was to use a leading dot. This was however booed off the field, so for version 2 we reconsidered. In the end nothing struck our fancy (if `.x` is unacceptable, it’s unclear why `^x` would be any better), and we chose a simpler rule: named constants are only recognized when referenced via some namespace, such as `mod.var` or `Color.RED`.
We believe it’s acceptable that things looking like `mod.var` are never considered capture variables -- the common use cases for `match` are such that one would almost never want to capture into a different namespace. (Just like you very rarely see `for self.i in …` and never `except E as scope.var` -- the latter is illegal syntax and sets a precedent.)
One author would dearly have seen Scala’s uppercase rule adopted, but in the end was convinced by the others that this was a bad idea, both because there’s no precedent in Python’s syntax, and because many human languages simply don’t make the distinction between lowercase and uppercase in their writing systems.
So what should you do if you have a local variable (say, a function argument) that you want to use as a value in a pattern? One solution is to capture the value in another variable and use a guard to compare that variable to the argument:
```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            # Match
            ...
```
If this really is a deal-breaker after all other issues have been settled, we could go back to considering some special markup for load-and-compare of simple names (even though we expect this case to be very rare). But there’s no pressing need to decide to do this now -- we can always add new markup for this purpose in a future version, as long as we continue to support dotted names without markup, since that *is* a commonly needed case.
There’s one other issue where in the end we could be convinced to compromise: whether to add an `else` clause in addition to `case _`. In fact, we probably would already have added it, except for one detail: it’s unclear whether the `else` should be aligned with `case` or `match`. If we are to add this we would have to ask the Steering Council to decide for us, as the authors deadlocked on this question.
Regarding the syntax for wildcards and OR patterns, the PEP explains why `_` and `|` are the best choices here: no other language surveyed uses anything but `_` for wildcards, and the vast majority uses `|` for OR patterns. A similar argument applies to class patterns.
If you've made it this far, here are the links to check out, with an open mind. As a reminder, the introductory sections (Abstract, Overview, and Rationale and Goals) have been entirely rewritten and also serve as introduction and tutorial.
- PEP 622: https://www.python.org/dev/peps/pep-0622/
- Playground: https://mybinder.org/v2/gh/gvanrossum/patma/master?urlpath=lab/tree/playgrou...
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LOXEATGF...
Code of Conduct: http://python.org/psf/codeofconduct/
On Wed, Jul 29, 2020 at 4:34 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
I'm still only intermittently keeping up on python-dev, but my main concern with the first iteration remains in this version, which is that it doesn't even *mention* that the proposed name binding syntax inherently conflicts with the existing assignment statement lvalue syntax in two areas:
I don't see why the PEP would be required to mention this. You make it sound like it's a bad thing (a "conflict"), whereas the PEP authors' position is that it is irrelevant.
* dotted names (binds an attribute in assignment, looks up a constraint value in a match case) * underscore targets (binds in assignment, wildcard match without binding in a match case)
The latter could potentially be made internally consistent in the future by redefining "_" and "__" as soft keywords that don't get bound via normal assignment statements either (requiring that they be set via namespace dict modification instead for use cases like i18n). https://www.python.org/dev/peps/pep-0622/#use-some-other-token-as-wildcard presents a reasonable rationale for the usage, so its only flaw is failing to mention the inconsistency.
That solution is outside the scope of the PEP -- it would be a big backward incompatibility with little payoff. Your repeated mention of consistency makes me want to quote PEP 8 (quoting Emerson, though I didn't even know who that was when I put it in my style guide :-): "A foolish consistency is the hobgoblin of little minds."
The former syntactic conflict presents a bigger problem, though, as it means that we'd be irrevocably committed to having two different lvalue syntaxes for the rest of Python's future as a language.
Things become much less "conflict-y" if you stop seeing patterns as lvalues. They just aren't, and any argument based on the idea that they are is inherently flawed. (Also note that the concept of lvalue isn't even defined in Python. There are a variety of assignment targets with different syntactic constraints depending on context, and several other syntactic constructs that bind names.)
https://www.python.org/dev/peps/pep-0622/#alternatives-for-constant-value-pa... is nominally about this problem, but it doesn't even *mention* the single biggest benefit of putting a common prefix on value constraints: it leaves the door open to unifying the lvalue syntax again in the future by keeping the proposed match case syntax a strict superset of the existing assignment target syntax, rather than partially conflicting with it.
That's because the PEP authors disagree with you that this goal is worthy of pursuit, and hence we don't care about this benefit at all.
More incidentally, the latest write-up also leaves out "?" as a suggested constraint value prefix, when that's the single character prefix that best implies the question "Does the runtime value at this position equal the result of this value constraint expression?" without having any other existing semantic implications in Python.
In the discussion pretty much all non-alphanumeric ASCII characters have been proposed by various people as sigils to mark either capture patterns or value patterns. We didn't think it was necessary to enumerate all proposed characters and write up reasons why we reject them, since the reasons for rejection are pretty much always the same -- it looks strange, and there's no need for sigils at all. Honestly, it doesn't help the case for `?` that it's been proposed as a mark for both capture patterns and value patterns (by different people, obviously :-).
Cheers, Nick.
P.S. I feel I should mention that the other reason I like "?" as a potential prefix for value constraints is that if we require it for all value constraint expressions (both literals and name lookups) I believe it could offer a way to unblock the None-aware expressions PEP by reframing that PEP as a shorthand for particular case matches.
None coalescence ("a ?? b") for example:
```
match a:
    case ?None:
        _expr_result = b
    case _match:
        _expr_result = _match
```
Or a None-severing attribute lookup ("a?.b"):
```
_match_expr = a
match _match_expr:
    case ?None:
        _expr_result = _match_expr
    case _match:
        _expr_result = _match.b
```
Since these operations would be defined in terms of *equality* (as per PEP 622), rather than identity, it would also allow other sentinels to benefit from the None-aware shorthand by defining themselves as being equal to None.
This sounds like a huge stretch. Trying to forge a connection between two separate uses of the same character sounds like arguing that the `*` in `a * b` and the `*` in `*args` are really the same operator. I am actually rather in favor of PEP 505, but that doesn't make any difference for how I see marking value patterns in PEP 622. --Guido
On Fri., 31 Jul. 2020, 3:14 am Guido van Rossum, <guido@python.org> wrote:
On Wed, Jul 29, 2020 at 4:34 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
I'm still only intermittently keeping up on python-dev, but my main concern with the first iteration remains in this version, which is that it doesn't even *mention* that the proposed name binding syntax inherently conflicts with the existing assignment statement lvalue syntax in two areas:
I don't see why the PEP would be required to mention this. You make it sound like it's a bad thing (a "conflict"), whereas the PEP authors' position is that it is irrelevant.
* dotted names (binds an attribute in assignment, looks up a constraint value in a match case) * underscore targets (binds in assignment, wildcard match without binding in a match case)
The latter could potentially be made internally consistent in the future by redefining "_" and "__" as soft keywords that don't get bound via normal assignment statements either (requiring that they be set via namespace dict modification instead for use cases like i18n). https://www.python.org/dev/peps/pep-0622/#use-some-other-token-as-wildcard presents a reasonable rationale for the usage, so its only flaw is failing to mention the inconsistency.
That solution is outside the scope of the PEP -- it would be a big backward incompatibility with little payoff. Your repeated mention of consistency makes me want to quote PEP 8 (quoting Emerson, though I didn't even know who that was when I put it in my style guide :-): "A foolish consistency is the hobgoblin of little minds."
I don't really like that future possibility either - I think it would be much better for PEP 622 to let "_" be a binding throwaway variable as normal, and allow a bare "?" as the "match any expression without binding it" marker. But unlike the reuse of attribute assignment syntax for a different purpose, it's a conflict that I don't think matters very much (as it's incredibly rare to care whether binding "_" actually creates a reference or not, so having it bind sometimes and not others isn't likely to present any major barriers to learning).
The former syntactic conflict presents a bigger problem, though, as it
means that we'd be irrevocably committed to having two different lvalue syntaxes for the rest of Python's future as a language.
Things become much less "conflict-y" if you stop seeing patterns as lvalues. They just aren't, and any argument based on the idea that they are is inherently flawed.
That's conceding my point, though: aside from iterable unpacking, the PEP isn't currently even trying to keep pattern matching syntax consistent with assignment target syntax, as the PEP authors haven't even considered the idea of pursuing a higher level of consistency as a design goal.

A section titled "Match patterns are not assignment targets" that explains that even though match patterns bind names and do iterable unpacking the same way assignment targets do, it is nevertheless incorrect for a reader to think of them as assignment targets would address my concern (it wouldn't convince me that it is a good idea to view the design problem that way, but I would accept that the argument had been heard and addressed in a context where the future PEP delegate will necessarily see it).

(Also note that the concept of lvalue isn't even defined in Python. There are a variety of assignment targets with different syntactic constraints depending on context, and several other syntactic constructs that bind names.)
Right, I just use "lvalue" as a shorthand for "syntax that can bind a name". All the others are strict subsets of the full assignment target syntax, though, mostly falling into either "simple names only" or "simple names and iterable unpacking, but no attributes or subscripts". I'll use "name binding context" for the full set of those below.

This PEP is the first time it has been proposed to accept valid assignment target syntax in a name binding context, but have it mean something different. The fact that the PEP doesn't even acknowledge that this is a potential problem is the biggest part of the problem. If the problem was acknowledged, and addressed, then readers could evaluate the merits of the PEP authors' arguments against it.

As it is, unless the reader identifies the conflict on their own, they may not realise what is bugging them about it, and make the same mistake I initially did and believe it's the binding syntax that's inconsistent (when that's actually entirely consistent with for loops, for example), rather than the constraint lookup syntax (which has never previously been allowed in a name binding context).

We know the PEP authors don't see patterns as assignment targets beyond sharing the iterable unpacking syntax, but my concern is for everyone *else* that is either learning pattern matching as an existing Python developer, or learning Python in general after match statements are added, and is trying to figure out why these two snippets do the same thing:

```
x = y

match y:
    case x:
        pass
```

and so do these:

```
a, b = x, y

match (x, y):
    case a, b:
        pass
```

while the second snippet here throws NameError if the attribute doesn't exist yet, or silently does nothing if it does:

```
x.a = y

match y:
    case x.a:
        pass
```

If that consistently threw SyntaxError instead (which is what I am suggesting it should do in the absence of a leading "?" on the constraint expression), then it would be similar to the many other places where we allow binding simple names and iterable unpacking, but not attributes or subscripts.

As it is, it's the tipping point where learners will be forced to realise that there is a semantic inconsistency between pattern matching syntax and assignment target syntax.

If the PEP authors are deliberately championing that inconsistency, then it will be up to the PEP delegate to decide if it is a deal breaker or not. But right now, the PEP isn't even acknowledging that this is a significant design decision that the PEP authors have made.

Cheers, Nick.
Trust me, the PEP authors are well aware. If we hadn't been from the outset, a hundred different proposals to "deal" with this would have made us so. And many of those proposals actually made it into the list of rejected ideas. Moreover, we rewrote a huge portion of the PEP from scratch as a result (everything from the Abstract up to the entire Rationale and Goals section).

Apart from your insistence that we "acknowledge" an "inconsistency", your counter-proposal is not so different from the others. Let's agree to disagree on the best syntax for patterns.

On Fri, Jul 31, 2020 at 5:21 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
[...]
On Sat., 1 Aug. 2020, 10:55 am Guido van Rossum, <guido@python.org> wrote:
Trust me, the PEP authors are well aware. If we hadn't been from the outset, a hundred different proposals to "deal" with this would have made us so. And many of those proposals actually made it into the list of rejected ideas. Moreover, we rewrote a huge portion of the PEP from scratch as a result (everything from the Abstract up to the entire Rationale and Goals section).
Apart from your insistence that we "acknowledge" an "inconsistency", your counter-proposal is not so different from the others.
Right, there are several ways the PEP could be adjusted so that assignment target syntax and pattern matching syntax had consistent semantics whenever they share syntax, just as other name binding syntaxes are already strict subsets of the full assignment target syntax. I personally like "Use '?' as an explicit constraint expression prefix", but it's far from being the only possibility.

But if we don't even agree that common syntax in a name binding context should either always mean the same thing, or else be a syntax error, then we're not going to agree that there's a problem to be solved in the first place.

Let's agree to disagree on the best syntax for patterns
I think our disagreement is more fundamental than that, as I believe there should be a common metasyntax for imperative name binding (i.e. everything except function parameters) that all actual name binding contexts allow a subset of, while the PEP authors feel it's OK to treat pattern matching as a completely new design entity that only incidentally shares some common syntax with assignment targets.

Prior to PEP 622, the apparent design constraint that I had inferred was implicitly met by the fact that all the imperative name binding operations accept a subset of the full assignment target syntax, so it's never actually come up before whether this is a real design goal for the language, or just a quirk of history.

PEP 622 is forcing that question to be answered explicitly, as accepting it in its current form would mean telling me, and everyone else that had inferred a similar design concept, that we need to adjust our thinking.

I'd obviously prefer it if the PEP chose a different syntax that avoided the semantic conflict with assignment for dotted names, but in the absence of that, I'd settle for the explicit statement that we're wrong and inferred a design principle that never actually existed.

Cheers, Nick.
That's a strawman argument. I am done arguing about this.

On Fri, Jul 31, 2020 at 7:47 PM Nick Coghlan <ncoghlan@gmail.com> wrote:
[...]
On 30/07/2020 00:34, Nick Coghlan wrote:
the proposed name binding syntax inherently conflicts with the existing assignment statement lvalue syntax in two areas:
* dotted names (binds an attribute in assignment, looks up a constraint value in a match case) * underscore targets (binds in assignment, wildcard match without binding in a match case)
The former syntactic conflict presents a bigger problem, though, as it means that we'd be irrevocably committed to having two different lvalue syntaxes for the rest of Python's future as a language.
+1
On 08/07/2020 16:02, Guido van Rossum wrote:

Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching. [...]

After all the discussion on the issue, I still cannot stop thinking that there needs to be a visual distinction between "capture" and "match" variables. Having rules ("plain names are capture, dotted names are match") is one more thing to be learnt. One more bit of mystery when (a near newbie is) reading code.

Possible compromise: *two* sigils - one for capture, one for match. Both would be optional, only required when the default is not what is wanted, but could be added regardless if the author felt it added clarity.
Adding this feature would be a giant quality of life improvement for me and I really hope it succeeds. So I have been trying to keep up on the debate in this and related threads. While I do want this feature, I agree with a lot of the issues people are raising.

First, I agree that _ should not be the wildcard symbol. Or rather, the hobgoblins in my mind think that if _ is to be the wildcard symbol it would be more consistent with assignment if it was bound to the last value it was matched with (as should other repeated identifiers), e.g.:
```
match pt:
    case (x, _, _):
        assert _ == pt[2]
```
I understand the authors rationalize the decision based on conventions with the gettext module. I find these arguments very unconvincing. It's like saying the identifier np should be forbidden from appearing in cases because it's frequently used as the name of numpy. If there is to be a wildcard symbol (that does not bind and is repeatable) it should not be a valid identifier.

Second, the distinction between a capture and a constant value pattern should be more explicit. I don't have any great insight into the color of the shed beyond the numerous suggestions already made (name=, ?name, etc...), but it seems quite unintuitive to me that I can't capture into a namespace nor match a constant without a namespace. It is also unclear to me why it would be so terrible to add a new token or abuse an existing one.
Honestly, it doesn't help the case for `?` that it's been proposed as a mark for both capture patterns and value patterns (by different people, obviously :-).
I agree that most of the proposed sheds don't necessarily make it intuitively clear what is a capture variable vs what is a constant. However, they do give the programmer the ability to choose. For example, if I want to modify the motivating example from the PEP slightly to copy attributes from one point to another, I can't express it concisely:
```
def update_point_3d(p: Point3d, pt):
    match pt:
        case (x, y):
            p.x, p.y = x, y
        case Point2d(x, y):
            p.x, p.y = x, y
        ...
```
(Okay, I could have just called the original make_point_3d and unpacked the results, but it would require the creation of an unnecessary temporary.)

However, if the capture was explicit and any valid target could be used as a capture variable, then I could express this cleanly:
```
def update_point_3d(p: Point3d, pt):
    match pt:
        case (p.x=, p.y=):
            pass
        case Point2d(p.x=, p.y=):
            pass
        ...
```
- Caleb Donovick
On 7/30/20 4:31 PM, Caleb Donovick wrote:
However if the capture was explicit and any valid target could be used as a capture variable then I could express this cleanly:
```
def update_point_3d(p: Point3d, pt):
    match pt:
        case (p.x=, p.y=):
            pass
        case Point2d(p.x=, p.y=):
            pass
        ...
```
I like this proposal, using = to explicitly specify when capturing. I see it as a big improvement over the current PEP, where dotted names are never assigned to and non-dotted names usually are.

It also leads directly to an alternate proposal for the wildcard pattern: a "=" not prefaced with an lvalue. This has the benefit of having no conflict with i18n, etc.

Explicit is better than implicit,

/arry
Hi Caleb,

I will only answer the second part, as the wildcard issue has been brought up and discussed time and again, and the `np` analogue is quite a stretch and far-fetched, really. One thing that stood out a bit to me, as I feel I have seen it a couple of times, is the question of intuition, so I will add a few more general thoughts on that...
[...] but it seems quite unintuitive to me [...]
[...] don't necessarily make it intuitively clear [...]
Intuition (or lack thereof) has already been brought forward as an argument a couple of times. I would just like to briefly point out that there is no such thing as universal intuition in the field of programming. We all have different training, skills, preferences and experiences, which make up what we call 'intuition'. But what is intuitive is usually something completely different to a C-programmer than to a Haskell- or Lisp-programmer, say. And since pattern matching is really a new feature to be introduced to Python, a feature that can be seen in different lights, there is no 'Python-programmer intuition' that would apply in this case.

As for beginners, virtually every part of programming is unintuitive at first. Even something innocuous-looking like assignment is often reason for confusion, because `3 + 4 = x` would probably be more 'intuitive'. But there is good reason with regards to the bigger picture to stick to `x = 3 + 4`. A Python programmer (at any level) not familiar with pattern matching will most likely not understand all the subtlety of the syntax---but this is also true of features like `async` or the `/` in parameters, say. I would argue, though, that the clear choice of keywords allows anyone to quickly look pattern matching up and get informed on what it does.

So, we do not need to come up with something that is entirely 'intuitive' and 'self-evident'. But by sticking to common conventions like `_` as wildcard, we can help quickly build the correct intuition. In your examples, for instance, it is perfectly obvious to me that you cannot directly assign to attributes, and it would in fact look very weird to my eyes if you could.
Your use case is quite similar to initialisers, and you are arguing that you would like to be able to write:
```
class Point:
    def __init__(self, self.x, self.y):
        pass
```
rather than the more verbose:
```
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
```
I do not think that this would be a good idea for either parameters or patterns. After all, pattern matching is *not* assignment, even though it is related to it, of course. Kind regards, Tobias Quoting Caleb Donovick <donovick@cs.stanford.edu>:
Adding this feature would be a giant quality-of-life improvement for me and I really hope it succeeds, so I have been trying to keep up with the debate in this and related threads.
While I do want this feature, I agree with a lot of the issues people are raising.
First, I agree that `_` should not be the wildcard symbol. Or rather, the hobgoblins in my mind think that if `_` is to be the wildcard symbol, it would be more consistent with assignment for it to be bound to the last value it was matched with (as should other repeated identifiers), e.g.
```
match pt:
    case (x, _, _):
        assert _ == pt[2]
```
I understand the authors rationalize the decision based on conventions with the gettext module. I find these arguments very unconvincing. It's like saying the identifier `np` should be forbidden from appearing in cases because it's frequently used as the name of numpy. If there is to be a wildcard symbol (one that does not bind and is repeatable), it should not be a valid identifier.
Second, the distinction between a capture pattern and a constant value pattern should be more explicit. I don't have any great insight into the color of the shed beyond the numerous suggestions already made (`name=`, `?name`, etc.), but it seems quite unintuitive to me that I can't capture into a namespace nor match a constant without a namespace. It is also unclear to me why it would be so terrible to add a new token or abuse an existing one.
> Honestly, it doesn't help the case for `?` that it's been proposed as a mark for both capture patterns and value patterns (by different people, obviously :-).
I agree that most of the proposed sheds don't necessarily make it intuitively clear what is a capture variable vs what is a constant. However they do give the programmer the ability to choose.
For example, if I want to modify the motivating example from the PEP slightly to copy attributes from one point to another, I can't express it concisely:
```
def update_point_3d(p: Point3d, pt):
    match pt:
        case (x, y):
            p.x, p.y = x, y
        case Point2d(x, y):
            p.x, p.y = x, y
        ...
```
(Okay, I could have just called the original `make_point_3d` and unpacked the results, but that would require the creation of an unnecessary temporary.)
However, if the capture were explicit and any valid assignment target could be used as a capture variable, then I could express this cleanly:
```
def update_point_3d(p: Point3d, pt):
    match pt:
        case (p.x=, p.y=):
            pass
        case Point2d(p.x=, p.y=):
            pass
        ...
```
- Caleb Donovick
On 7/31/20 12:36 AM, Tobias Kohn wrote:
And since pattern matching is really a new feature to be introduced to Python, a feature that can be seen in different lights, there is no 'Python-Programmer intuition' that would apply in this case.
It's not fair to say "intuition doesn't apply because it's new syntax". There are plenty of examples of intuition serving a Python programmer well when encountering new syntax. A Python programmer's intuition is informed by existing syntax and conventions in the language. When they see a new construct, its similarity to existing constructs can make understanding the new syntax quite intuitive indeed.

Take for example list comprehensions. Python 1 programmers hadn't seen

a = [x for x in y]

But they knew what square brackets meant in that context: it meant "creates a new list". And they knew what "for x in y" meant: that meant iteration. Understanding those two separate concepts, a Python 1 programmer would be well on their way to guessing what the new syntax meant--and they'd likely be right. And once they understood list comprehensions, the first time they saw generator expressions and set and dict comprehensions they'd surely intuit what those did immediately.

The non-intuitiveness of PEP 622, as I see it, is that it repurposes what looks like existing Python syntax but frequently has wholly different semantics. For example, a "class pattern" looks like it's calling a function--perhaps instantiating an object?--but the actual semantics and behavior is very different. Similarly, a "mapping pattern" looks like it's instantiating a dict, but it does something very different, and has unfamiliar and seemingly arbitrary rules about what is permitted, e.g. you can't use full expressions or undotted identifiers when defining a key. Add the "capture pattern" to both of these, and a Python programmer's intuition about what this syntax traditionally does will be of little help when encountering a PEP 622 match statement for the first time.

Cheers,

/arry
+10. See https://stackoverflow.com/questions/36825925/expressions-with-true-and-is-tr... for concrete evidence where another semantically inconsistent operator overloading caused trouble and what Stroustrup has to say on the matter. On 31.07.2020 13:42, Larry Hastings wrote:
[snip]
_______________________________________________ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Q5KULD7E... Code of Conduct: http://python.org/psf/codeofconduct/ -- Regards, Ivan
1. Semantic operator overloading in generic contexts is very different from this use case. It's surrounded by a clear context.
2. Python programmer intuition varies across Python programmers, and I would find it hella unintuitive if I had to explicitly capture every variable. I just want to write down what the thing looks like and have the interpreter figure out the correct bindings. Extra binding syntax will get in the way rather than be helpful.

Python Dev <python-dev@python.org> wrote:
[snip]
On 31/07/2020 17:24, Rik de Kort via Python-Dev wrote:
1. Semantic operator overloading in generic contexts is very different from this use case. It's surrounded by a clear context. 2. Python programmer intuition varies across python programmers, and I would find it hella unintuitive if I had to explicitly capture every variable. I just want to write down what the thing looks like and have the interpreter figure out the correct bindings. Extra binding syntax will get in the way rather than be helpful.
Until you want to do something slightly different, and the interpreter's choice is not what you want.
Hi Larry,

You are right that just dismissing intuition is wrong. I should have been more careful with my wording or explained it better, and I would like to apologise if my response came across as too strong in this regard.

The actual problem that I see is that we have different cultures/intuitions fundamentally clashing here. In particular, so many programmers welcome pattern matching as an "extended switch statement" and therefore find it strange that names are binding and not expressions for comparison. Others argue that it is at odds with current assignment statements, say, and question why dotted names are _not_ binding. What all groups seem to have in common, though, is that they refer to _their_ understanding and interpretation of the new match statement as 'consistent' or 'intuitive'---naturally pointing out where we as PEP authors went wrong with our design.

But here is the catch: at least in the Python world, pattern matching as proposed by this PEP is an unprecedented and new way of approaching a common problem. It is not simply an extension of something already there. Even worse: while designing the PEP we found that no matter from which angle you approach it, you will run into seeming 'inconsistencies' (which is to say that pattern matching cannot be reduced to a 'linear' extension of existing features in a meaningful way): there is always something that goes fundamentally beyond what is already there in Python. That's why I argue that arguments based on what is 'intuitive' or 'consistent' just do not make sense _in this case_. I think the discussion on this mailing list, with its often contradictory views, proposals, and counter-proposals, more than makes my point.

As for your argument that it looks like calling a function or creating an object: I tried to explain a little while ago that you'd be well advised to rather approach it as something similar to a function _definition_.
After all, the part after `def` in `def foo(a, b):` also looks like a function call! But nobody seems to mind this similarity in syntax there. And the target in `(a, b) = c` looks like a tuple constructor, although it is actually the exact opposite.

Finally, I completely agree that intuition is informed by experience and serves us very well. The first part of this, however, is also to say that intuition is a malleable thing! And experience from other programming languages who took the leap to having pattern matching shows that it quickly becomes a quite intuitive and easy to use feature.

Cheers,
Tobias

P.S. Please excuse my late reply; I am currently on vacation.

Quoting Larry Hastings <larry@hastings.org>:
[snip]
On Tue, Aug 4, 2020 at 1:37 PM Tobias Kohn <kohnt@tobiaskohn.ch> wrote:
And experience from other programming languages who took the leap to having pattern matching shows that it quickly becomes a quite intuitive and easy to use feature.
The languages I know about that have pattern matching had it from the start as a core feature. I am curious to learn about languages that adopted pattern matching later in their evolution. Cheers, Luciano
-- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Technical Principal at ThoughtWorks | Twitter: @ramalhoorg
Off the top of my head, a recent and fairly mainstream example: C# added it in 8.0. https://docs.microsoft.com/en-us/archive/msdn-magazine/2019/may/csharp-8-0-p... On Wed, Aug 5, 2020 at 3:33 PM Luciano Ramalho <luciano@ramalho.org> wrote:
[snip]
It's interesting to consider how C# did it. For example, at the same time they added pattern matching, they also added "discards", which are (undeclared-only?) variables whose name starts with '_' and whose value is never retained. I'm not sure, but I believe the language previously permitted (and still permits) conventional variables that start with '_'. My guess is that that's now discouraged, and new code is encouraged to only use identifiers starting with '_' as discards.

And, a minor correction: C# added pattern matching (and discards) in version 7, though they did extend the syntax in version 8.

Cheers,

/arry

On 8/5/20 2:04 PM, Robert White wrote:
[snip]
On Wed, Aug 5, 2020 at 8:14 PM Larry Hastings <larry@hastings.org> wrote:
It's interesting to consider how C# did it. For example, at the same time they added pattern matching, they also added "discards", which are (undeclared-only?) variables whose name starts with '_' and whose value is never retained. I'm not sure, but I believe the language previously permitted (and still permits) conventional variables that started with '_'. My guess is that that's now discouraged, and new code is encouraged to only use identifiers starting with '_' as discards.
And, a minor correction: C# added pattern matching (and discards) in version 7, though they did extend the syntax in version 8.
Yes, that was my goal when I asked about pattern matching added to a language after its initial design: maybe we could learn something about how to adopt this feature gradually instead of all at once.
Cheers,
/arry
On 8/5/20 2:04 PM, Robert White wrote:
Off the top of my head for recently happened and fairly mainstream language: C# added it in 8.0 https://docs.microsoft.com/en-us/archive/msdn-magazine/2019/may/csharp-8-0-p...
On Wed, Aug 5, 2020 at 3:33 PM Luciano Ramalho <luciano@ramalho.org> wrote:
On Tue, Aug 4, 2020 at 1:37 PM Tobias Kohn <kohnt@tobiaskohn.ch> wrote:
And experience from other programming languages that took the leap to having pattern matching shows that it quickly becomes a quite intuitive and easy-to-use feature.
The languages I know about that have pattern matching had it from the start as a core feature.
I am curious to learn about languages that adopted pattern matching later in their evolution.
Cheers,
Luciano
Cheers, Tobias
P.S. Please excuse my late reply; I am currently on vacation.
Quoting Larry Hastings <larry@hastings.org>:
On 7/31/20 12:36 AM, Tobias Kohn wrote:
And since pattern matching is really a new feature to be introduced to Python, a feature that can be seen in different lights, there is no 'Python-Programmer intuition' that would apply in this case.
It's not fair to say "intuition doesn't apply because it's new syntax". There are plenty of examples of intuition serving a Python programmer well when encountering new syntax. A Python programmer's intuition is informed by existing syntax and conventions in the language. When they see a new construct, its similarity to existing constructs can make understanding the new syntax quite intuitive indeed.
Take for example list comprehensions. Python 1 programmers hadn't seen
a = [x for x in y]
But they knew what square brackets meant in that context: "creates a new list". And they knew what "for x in y" meant: iteration. Understanding those two separate concepts, a Python 1 programmer would be well on their way to guessing what the new syntax meant--and they'd likely be right. And once they understood list comprehensions, the first time they saw generator expressions and set and dict comprehensions they'd surely intuit what those did immediately.
The non-intuitiveness of PEP 622, as I see it, is that it repurposes what looks like existing Python syntax but frequently has wholly different semantics. For example, a "class pattern" looks like it's calling a function--perhaps instantiating an object?--but the actual semantics and behavior are very different. Similarly, a "mapping pattern" looks like it's instantiating a dict, but it does something very different, and has unfamiliar and seemingly arbitrary rules about what is permitted, e.g. you can't use full expressions or undotted identifiers when defining a key. Add the "capture pattern" to both of these, and a Python programmer's intuition about what this syntax traditionally does will be of little help when encountering a PEP 622 match statement for the first time.
Cheers,
/arry
-- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Technical Principal at ThoughtWorks | Twitter: @ramalhoorg
JavaScript doesn't have it yet, but there is an active proposal for it in the standardization committee: https://github.com/tc39/proposal-pattern-matching
`np` analogue is quite a stretch and far-fetched, really.
I don't disagree. But `_` is a valid identifier so it shouldn't be special. The solution is incredibly simple: allow repeated identifiers just like in assignment so there is no need for a special wildcard symbol.
...<INTUITION ARGUMENT>...
I disagree but this is philosophical discussion which I would rather not go down.
...<POINT ABOUT FUNCTION PARAMETERS>...
That's reasonable, although I would argue that a match is more like a for loop (which allows arbitrary assignment) than a function definition. I do understand your point though.

However, I still think the inability to match against a constant not in a namespace is very annoying and could be overcome with some explicit syntax. It's reasonable for this syntax to only allow NAME to be bound (many places in the grammar do this: def, class, as, ...) but I haven't seen a satisfactory reason why there *shouldn't* be support for NAME constants; just reasons for why it's tricky.

You (or one of the authors) argue against `.constant` as a matching syntax by saying: "..., it introduces strange-looking new syntax without making the pattern syntax any more expressive." but this is obviously false. It clearly does make the syntax more expressive: it allows one to express something naturally without needing to create an auxiliary structure (or else the match syntax doesn't make the language more expressive either). Yes, namespaces are a great idea, but as a consenting adult I should be free to have constants that are not in a namespace. My constant may come in any number of forms that are not conducive to simply "wrapping it in a namespace"; for example, it may be used in other modules (so wrapping it would require external changes) or it might be a closure variable and so would require some explicit wrapping step.

Further, you argue elsewhere that we shouldn't worry about a syntax being strange because it's new, so of course it's going to be different. Yet you invoke this reasoning as a way to reject solutions to the NAME constant issue. Pick one. (Preferably the version that includes match syntax and some strange new syntax for explicit constants, because I super want the match syntax.)

Caleb Donovick

On Fri, Jul 31, 2020 at 12:40 AM Tobias Kohn <kohnt@tobiaskohn.ch> wrote:
Hi Caleb,
I will only answer to the second part, as the wildcard issue has been brought up and discussed time and again, and the `np` analogue is quite a stretch and far-fetched, really.
One thing that stood out a bit to me as I feel to have seen it a couple of times is the question of intuition, so I will add a few more general thoughts to that...
[...] but it seems quite unintuitive to me [...]
[...] don't necessarily make it intuitively clear [...]
Intuition (or lack thereof) has already been brought forward as an argument a couple of times. I would just like to briefly point out that there is no such thing as universal intuition in the field of programming. We all have different training, skills, preferences and experiences, which make up what we call 'intuition'. But what is intuitive is usually something completely different to C-programmer than to a Haskell- or Lisp-Programmer, say. And since pattern matching is really a new feature to be introduced to Python, a feature that can be seen in different lights, there is no 'Python-Programmer intuition' that would apply in this case.
As for beginners, virtually every part of programming is unintuitive at first. Even something innocuous-looking like assignment is often a reason for confusion, because `3 + 4 = x` would probably be more 'intuitive'. But there is good reason, with regard to the bigger picture, to stick to `x = 3 + 4`.
A Python programmer (at any level) not familiar with pattern matching will most likely not understand all the subtlety of the syntax---but this is also true of features like `async` or the `/` in parameters, say. I would argue, though, that the clear choice of keywords allows anyone to quickly look pattern matching up and get informed on what it does. So, we do not need to come up with something that is entirely 'intuitive' and 'self-evident'. But by sticking to common conventions like `_` as wildcard, we can help quickly build the correct intuition.
In your examples, for instance, it is perfectly obvious to me that you cannot directly assign to attributes and it would in fact look very weird to my eyes if you could. Your use case is quite similar to initialisers and you are arguing that you would like being able to write:
```
class Point:
    def __init__(self, self.x, self.y):
        pass
```
rather than the more verbose:
```
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
```
I do not think that this would be a good idea for either parameters or patterns. After all, pattern matching is **not** assignment, even though it is related to it, of course.
Kind regards, Tobias
Quoting Caleb Donovick <donovick@cs.stanford.edu>:
Adding this feature would be a giant quality of life improvement for me and I really hope it succeeds. So I have been trying to keep up on the debate in this and related thread.
While I do want this feature, I agree with a lot of the issues people are raising.
First I agree that _ should not be the wildcard symbol. Or rather the hobgoblins in my mind think that if _ is to be the wildcard symbol it would be more consistent with assignment if it was bound to the last value it was matched with (as should other repeated identifiers) e.g.,
```
match pt:
    case (x, _, _):
        assert _ == pt[2]
```
I understand the authors rationalize the decision based on conventions with the gettext module. I find these arguments very unconvincing. It's like saying the identifier np should be forbidden from appearing in cases because it's frequently used as the name of numpy. If there is to be a wildcard symbol (that does not bind and is repeatable) it should not be a valid identifier.
Second, the distinction between a capture and constant value pattern should be more explicit. I don't have any great insight into the color of the shed beyond the numerous suggestions already made (name=, ?name, etc...), but it seems quite unintuitive to me that I can't capture into a namespace nor match a constant without a namespace. It is also unclear to me why it would be so terrible to add a new token or abuse an existing one.
Honestly, it doesn't help the case for `?` that it's been proposed as a mark for both capture patterns and value patterns (by different people, obviously :-).
I agree that most of the proposed sheds don't necessarily make it intuitively clear what is a capture variable vs what is a constant. However they do give the programmer the ability to choose.
For example if I want to modify the motivating example from the PEP slightly to copy attributes from one point to another I can't express it concisely:
```
def update_point_3d(p: Point3d, pt):
    match pt:
        case (x, y):
            p.x, p.y = x, y
        case Point2d(x, y):
            p.x, p.y = x, y
        ...
```
(Okay I could have just called the original make_point_3d and unpacked the results but it would require the creation of an unnecessary temporary.)
However if the capture was explicit and any valid target could be used as a capture variable then I could express this cleanly:
```
def update_point_3d(p: Point3d, pt):
    match pt:
        case (p.x=, p.y=):
            pass
        case Point2d(p.x=, p.y=):
            pass
        ...
```
- Caleb Donovick
Your point about wanting a way to use an unqualified name as a value pattern is not unreasonable, and as you may recall we had an elegant solution in version 1 of the PEP: a leading dot. However that was booed away by the critics, and there has been no consensus (not even close) on what to do instead.

Any solution that involves special markup (like bringing back the leading dot, or backticks, or a question mark, or any other sigil) can easily be added in a future version of Python.

There is one solution that I personally find acceptable but which found little support from the other PEP authors. It is a rule also adopted by Scala. This is to make it so that any identifier starting with a capital letter (possibly preceded by one or more underscores) is a value pattern. I note that in Scala, too, this is different in patterns than elsewhere in the language: Scala, like Python, allows identifiers starting with a capital letter to be assigned in other contexts -- just not in patterns. It also uses roughly the same *conventions* for naming things as PEP 8 (classes Capitalized, constants UPPERCASE, variables and methods lowercase). I also note that Scala allows backticks as another way to force interpretation as a value pattern (though apparently it's not used much for this purpose).

Finally I note that some human languages don't distinguish between lowercase and uppercase (IIUC the CJK languages fall in this category). I don't know what conventions users writing Python using identifiers in their native language use to distinguish between constants and variables, but I do know that they still use the Latin alphabet for keywords, builtins, standard library names, and many 3rd party library names. This is why I gave my proposed rule as "starting with a capital letter" and not as "not starting with a lowercase letter", so that `case こんにちは:` will bind the name こんにちは instead of looking up that name; these seem the more useful semantics.
If a Japanese user wanted to look up that name they could write `HELLO = こんにちは` followed by `case HELLO:`. (Of course, Latin-using users can do the same thing if they have a name starting with a lowercase letter that they want to use as a value pattern.)

Unfortunately we cannot leave the Capitalized rule to a future version of Python, since the PEP as written interprets `case HELLO:` as a capture pattern. (A compromise would be to disallow Capitalized identifiers altogether, leaving the door open for a decision either way in the future. But in that case I'd rather press for just instituting the rule now.)

About `_` enough has been written already.
-- Guido van Rossum (python.org/~guido) *Pronouns: he/him*
On 2020-08-03 05:21, Guido van Rossum wrote:
[snip] A thought occurred to me. By default, the current rules of the PEP could apply, but why not allow prefixing with "as" for a capture and "is" for a value? Yes, I know, comparison of the values is not by identity, but "is" is a short keyword that already exists and matches up with "as". (After looking back through the thread it looks like Rob Cliffe has already had the same idea.)
On 03/08/2020 17:37, MRAB wrote:
What about 'match'? Not as short, but fairly intuitive:
```
case (x, y, match Z):
    print(f'A point whose z-coordinate equals {Z}')
```