
I'm still only intermittently keeping up on python-dev, but my main concern with the first iteration remains in this version, which is that it doesn't even *mention* that the proposed name binding syntax inherently conflicts with the existing assignment statement lvalue syntax in two areas:

* dotted names (binds an attribute in assignment, looks up a constraint value in a match case)
* underscore targets (binds in assignment, wildcard match without binding in a match case)

The latter could potentially be made internally consistent in the future by redefining "_" and "__" as soft keywords that don't get bound via normal assignment statements either (requiring that they be set via namespace dict modification instead for use cases like i18n). https://www.python.org/dev/peps/pep-0622/#use-some-other-token-as-wildcard presents a reasonable rationale for the usage, so its only flaw is failing to mention the inconsistency.

The former syntactic conflict presents a bigger problem, though, as it means that we'd be irrevocably committed to having two different lvalue syntaxes for the rest of Python's future as a language.

https://www.python.org/dev/peps/pep-0622/#alternatives-for-constant-value-pa... is nominally about this problem, but it doesn't even *mention* the single biggest benefit of putting a common prefix on value constraints: it leaves the door open to unifying the lvalue syntax again in the future by keeping the proposed match case syntax a strict superset of the existing assignment target syntax, rather than partially conflicting with it.

More incidentally, the latest write-up also leaves out "?" as a suggested constraint value prefix, when that's the single character prefix that best implies the question "Does the runtime value at this position equal the result of this value constraint expression?" without having any other existing semantic implications in Python.

Cheers,
Nick.

P.S. I feel I should mention that the other reason I like "?" as a potential prefix for value constraints is that if we require it for all value constraint expressions (both literals and name lookups) I believe it could offer a way to unblock the None-aware expressions PEP by reframing that PEP as a shorthand for particular case matches.

None coalescence ("a ?? b") for example:

    match a:
        case ?None:
            _expr_result = b
        case _match:
            _expr_result = _match

Or a None-severing attribute lookup ("a?.b"):

    _match_expr = a
    match _match_expr:
        case ?None:
            _expr_result = _match_expr
        case _match:
            _expr_result = _match.b

Since these operations would be defined in terms of *equality* (as per PEP 622), rather than identity, it would also allow other sentinels to benefit from the None-aware shorthand by defining themselves as being equal to None.

On Thu., 9 Jul. 2020, 1:07 am Guido van Rossum, <guido@python.org> wrote:
Today I’m happy (and a little trepidatious) to announce the next version of PEP 622, Pattern Matching. As authors we welcome Daniel F Moisset in our midst. Daniel wrote a lot of the new text in this version, which introduces the subject matter much more gently than the first version did. He also convinced us to drop the `__match__` protocol for now: the proposal stands quite well without that kind of extensibility, and postponing it will allow us to design it at a later time when we have more experience with how `match` is being used.
That said, the new version does not differ dramatically in what we propose. Apart from dropping `__match__` we’re dropping the leading dot to mark named constants, without a replacement, and everything else looks like we’re digging in our heels. Why is that? Given the firestorm of feedback we received and the numerous proposals (still coming) for alternative syntax, it seems a bad tactic not to give up something more substantial in order to get this proposal passed. Let me explain.
Language design is not like politics. It’s not like mathematics either, but I don’t think this situation is at all similar to negotiating a higher minimum wage in exchange for a lower pension, where you can definitely argue about exactly how much lower/higher you’re willing to go. So I don’t think it’s right to propose making the feature a little bit uglier just to get it accepted.
Frankly, 90% of the issue is about what among the authors we’ve dubbed the “load/store” problem (although Tobias never tires of explaining that the “load” part is really “load-and-compare”). There’s a considerable section devoted to this topic in the PEP, but I’d like to give it another try here.
In case you’ve been avoiding python-dev lately, the problem is this. Pattern matching lets you capture values from the subject, similar to sequence unpacking, so that you can write for example

```
x = range(4)
match x:
    case (a, b, *rest):
        print(f"first={a}, second={b}, rest={rest}")  # 0, 1, [2, 3]
```

Here the `case` line captures the contents of the subject `x` in three variables named `a`, `b` and `rest`. This is easy to understand by pretending that a pattern (i.e., what follows `case`) is like the LHS of an assignment.
However, in order to make pattern matching more useful and versatile, the pattern matching syntax also allows using literals instead of capture variables. This is really handy when you want to distinguish different cases based on some value, for example

```
match t:
    case ("rect", real, imag):
        return complex(real, imag)
    case ("polar", r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

You might not even notice anything funny here if I didn’t point out that `"rect"` and `"polar"` are literals -- it’s really quite natural for patterns to support this once you think about it.
The problem that everybody’s been concerned about is that Python programmers, like C programmers before them, aren’t too keen to have literals like this all over their code, and would rather give names to the literals, for example

```
USE_POLAR = "polar"
USE_RECT = "rect"
```

Now we would like to be able to replace those literals with the corresponding names throughout our code and have everything work like before:

```
match t:
    case (USE_RECT, real, imag):
        return complex(real, imag)
    case (USE_POLAR, r, phi):
        return complex(r * cos(phi), r * sin(phi))
```

Alas, the compiler doesn’t know that we want `USE_RECT` to be a constant value to be matched while we intend `real` and `imag` to be variables to be given the corresponding values captured from the subject. So various clever ways have been proposed to distinguish the two cases.
This discussion is not new to the authors: before we ever published the first version of the PEP we vigorously debated this (it is Issue 1 in our tracker!), and other languages before us have also had to come to grips with it. Even many statically compiled languages! The reason is usability: it’s usually deemed important that their equivalent of `case` auto-declare the captured variables, and variable declarations may hide (override) like-named variables in outer scopes.
Scala, for example, uses several different rules: first, capture variable names must start with a lowercase letter (so it would handle the above example as intended); next, capture variables cannot be dotted names (like `mod.var`); finally, you can enclose any variable in backticks to force the compiler to see it as a load instead of a store. Elixir uses another form of markup for loads: `x` is a capture variable, but `^x` loads and compares the value of `x`.
There are a number of dead ends when looking for a solution that works for Python. Checking at runtime whether a name is defined or not is one of these: there are numerous reasons why this could be confusing, not the least of which being that the `match` may be executed in a loop and the variable may already be bound by a previous iteration. (True, this has to do with the scope we’ve adopted for capture variables. But believe me, giving each case clause its own scope is quite horrible by itself, and there are other action-at-a-distance effects that are equally bad.)
It’s been proposed to explicitly state the names of the variables bound in a header of the `match` statement; but this doesn’t scale when the number of cases becomes larger, and requires users to do bookkeeping the compiler should be able to do. We’re really looking for a solution that tells you when you’re looking at an individual `case` which variables are captured and which are used for load-and-compare.
Marking up the capture variables with some sigil (e.g. `$x` or `x?`) or other markup (e.g. backticks or `<x>`) makes this common case ugly and inconsistent: it’s unpleasant to see for example

```
case %x, %y:
    print(x, y)
```

No other language we’ve surveyed uses special markup for capture variables; some use special markup for load-and-compare, so we’ve explored this. In fact, in version 1 of the PEP our long-debated solution was to use a leading dot. This was however booed off the field, so for version 2 we reconsidered. In the end nothing struck our fancy (if `.x` is unacceptable, it’s unclear why `^x` would be any better), and we chose a simpler rule: named constants are only recognized when referenced via some namespace, such as `mod.var` or `Color.RED`.
We believe it’s acceptable that things looking like `mod.var` are never considered capture variables -- the common use cases for `match` are such that one would almost never want to capture into a different namespace. (Just like you very rarely see `for self.i in …` and never `except E as scope.var` -- the latter is illegal syntax and sets a precedent.)
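The precedent can be checked directly. This is a small sketch in which `Scope`, `scope` and `is_valid_syntax` are hypothetical names; it only asks the parser whether each form is accepted:

```python
import ast

class Scope:
    # Hypothetical object whose attribute is used as a loop target.
    pass

scope = Scope()
for scope.i in range(3):  # legal: an attribute is a valid `for` target
    pass

def is_valid_syntax(source):
    # Returns True if the source parses, False if it raises SyntaxError.
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True

for_dotted = "for scope.i in range(3):\n    pass\n"
except_dotted = "try:\n    pass\nexcept Exception as scope.var:\n    pass\n"

print(is_valid_syntax(for_dotted))     # the `for` form parses
print(is_valid_syntax(except_dotted))  # the `except ... as` form does not
```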
One author would dearly have liked to see Scala’s uppercase rule adopted, but in the end was convinced by the others that this was a bad idea, both because there’s no precedent in Python’s syntax, and because many human languages simply don’t make the distinction between lowercase and uppercase in their writing systems.
So what should you do if you have a local variable (say, a function argument) that you want to use as a value in a pattern? One solution is to capture the value in another variable and use a guard to compare that variable to the argument:

```
def foo(x, spam):
    match x:
        case Point(p, q, context=c) if c == spam:
            # Match
```

If this really is a deal-breaker after all other issues have been settled, we could go back to considering some special markup for load-and-compare of simple names (even though we expect this case to be very rare). But there’s no pressing need to decide to do this now -- we can always add new markup for this purpose in a future version, as long as we continue to support dotted names without markup, since that *is* a commonly needed case.
There’s one other issue where in the end we could be convinced to compromise: whether to add an `else` clause in addition to `case _`. In fact, we probably would already have added it, except for one detail: it’s unclear whether the `else` should be aligned with `case` or `match`. If we are to add this we would have to ask the Steering Council to decide for us, as the authors deadlocked on this question.
Regarding the syntax for wildcards and OR patterns, the PEP explains why `_` and `|` are the best choices here: no other language surveyed uses anything but `_` for wildcards, and the vast majority uses `|` for OR patterns. A similar argument applies to class patterns.
If you've made it this far, here are the links to check out, with an open mind. As a reminder, the introductory sections (Abstract, Overview, and Rationale and Goals) have been entirely rewritten and also serve as introduction and tutorial.
- PEP 622: https://www.python.org/dev/peps/pep-0622/
- Playground: https://mybinder.org/v2/gh/gvanrossum/patma/master?urlpath=lab/tree/playgrou...
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>