On Mon, Nov 2, 2020 at 1:14 PM Eric V. Smith email@example.com wrote:
On 11/2/2020 9:31 AM, Thomas Wouters wrote:
On Sat, Oct 31, 2020 at 12:25 PM Steven D'Aprano firstname.lastname@example.org wrote:
I really don't get why so many people are hung up over this minuscule issue of giving `_` special meaning inside match statements. IMO, consistency with other languages' pattern matching is more useful than the ability to capture using `_` as a variable name.
Allow me to explain, then: structured pattern matching is (even by admission of PEPs 634-636) an extension of iterable unpacking. The use of '_' as a wildcard pattern is a sharp break in that extension. In the structured pattern matching proposal, '_' is special syntax (and not in any way less so than '?') but *only* in cases in match statements, not in iterable unpacking. It *already* isn't consistent with '_' in other languages, and we can't fix that without breaking uses of _ for gettext, not to mention other situations where existing code uses '_' as something other than an assign-only variable.
Using '_' in structured pattern matching means any use of '_' becomes an extra burden -- you have to know whether it's a name or not based on the surrounding context. It makes all uses of '_' harder to parse, and it makes it easier to mistake one situation for another. Perhaps not terribly easy, but since there is _no_ confusion now, it's by definition *easier*. The use of something else, like '?', leaves existing uses of '_' unambiguous, and allows structured pattern matching and iterable unpacking to be thought of in the same way. It reduces the complexity of the language because it no longer uses the same syntax for disparate things.
All good points.
What I don't understand is why '_' is treated any differently than any named capture pattern. It seems to me that using:
case x: # a capture_pattern
is the same as:
case _: # the wildcard_pattern
They both always match (I'm ignoring the binding thing here, it's coming up). I realize PEP 635 gives the rationale for separating this so that it can enforce that "case x, x:" can be made invalid, likening it to duplicate function parameters. The PEP focuses on the differences between that and tuple unpacking. But I think that if the semantics were the same as tuple unpacking (allowed duplicates, and binding to the last one) then the whole "_ as wildcard" arguments would just go away, and "_" would be treated just as it is elsewhere in Python. For me, this would address Thomas' point above and reduce the cognitive load of having a special rule.
But I'm probably missing some other nuance to the whole discussion, which will no doubt now be pointed out to me.
That's not an unreasonable characterization. But we feel that `case x, x` can easily be misunderstood as "a tuple of two equal values" and we want to be able to call that out as an error. Hence the need for recognizing the wildcard in the parser, since `case x, _, _` *is* important. Hence the need to standardize it (i.e., not leave it to be *just* a convention). Using _ seems the most commonly used convention for a "throwaway" target (although we know some organizations have different conventions), *and* it matches the wildcard notation in most other languages, which looks like a win-win to me. Finally, not assigning a value to _ is kind of important in the context of i18n, where _("string") is the common convention for tagging translatable strings.