[Python-Dev] Re: PEP 642: Constraint Pattern Syntax for Structural Pattern Matching

2 Nov 2020


On Mon, Nov 2, 2020 at 1:14 PM Eric V. Smith wrote:
...
On 11/2/2020 9:31 AM, Thomas Wouters wrote:
On Sat, Oct 31, 2020 at 12:25 PM Steven D'Aprano wrote:
...
I really don't get why so many people are hung up over this minuscule
issue of giving `_` special meaning inside match statements. IMO,
consistency with other languages' pattern matching is more useful than
the ability to capture using `_` as a variable name.
Allow me to explain, then: structured pattern matching is (even by
admission of PEPs 634-636) an extension of iterable unpacking. The use of
'_' as a wildcard pattern is a sharp break in that extension. In the
structured pattern matching proposal, '_' is special syntax (and not in any
way less so than '?') but *only* in cases in match statements, not in
iterable unpacking. It *already* isn't consistent with '_' in other
languages, and we can't fix that without breaking uses of _ for gettext,
not to mention other situations where existing code uses '_' as something other
than an assign-only variable.
Using '_' in structured pattern matching means any use of '_' becomes an
extra burden -- you have to know whether it's a name or not based on the
surrounding context. It makes all uses of '_' harder to parse, and it makes
it easier to mistake one situation for another. Perhaps not terribly easy,
but since there is _no_ confusion now, it's by definition *easier*. The use
of something else, like '?', leaves existing uses of '_' unambiguous, and
allows structured pattern matching and iterable unpacking to be thought of
the same. It reduces the complexity of the language because it no longer
uses the same syntax for disparate things.
All good points.
What I don't understand is why '_' is treated any differently than any
named capture pattern. It seems to me that using:
case x:    # a capture_pattern
is the same as:
case _:  # the wildcard_pattern
They both always match (I'm ignoring the binding thing here, it's coming
up). I realize PEP 635 gives the rationale for separating this so that it
can enforce that "case x, x:" can be made invalid, likening it to duplicate
function parameters. The PEP focuses on the differences between that and
tuple unpacking. But I think that if the semantics were the same as tuple
unpacking (allowed duplicates, and binding to the last one) then the whole
"_ as wildcard" arguments would just go away, and "_" would be treated just
as it is elsewhere in Python. For me, this would address Thomas' point
above and reduce the cognitive load of having a special rule.
But I'm probably missing some other nuance to the whole discussion, which
will no doubt now be pointed out to me.
Eric
That's not an unreasonable characterization. But we feel that `case x, x`
can easily be misunderstood as "a tuple of two equal values" and we want to
be able to call that out as an error. Hence the need for recognizing the
wildcard in the parser, since `case x, _, _` *is* important. Hence the need
to standardize it (i.e., not leave it to be *just* a convention). Using _
seems the most commonly used convention for a "throwaway" target (although we
know some organizations have different conventions), *and* it matches the
wildcard notation in most other languages, which looks like a win-win to
me. Finally, not assigning a value to _ is kind of important in the context
of i18n, where _("string") is the common convention for tagging
translatable strings.

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...