Regarding the syntax for wildcards and OR patterns, the PEP explains
why `_` and `|` are the best choices here: no other language surveyed
uses anything but `_` for wildcards, and the vast majority uses `|`
for OR patterns. A similar argument applies to class patterns.
In that case, I'd like to make a specific pitch for "don't make
'_' special". (I'm going to spell it '_' as it seems to be easier
to read this way; ignore the quotes.)
IIUC '_' is special in two ways:
1) we permit it to be used more than once in a single pattern, and
2) if it matches, it isn't bound.
If we forgo these two exceptions, '_' can go back to behaving
like any other identifier. It becomes an idiom rather than a
special case.
Drilling down on what we'd need to change:
To address 1), allow using a name multiple times in a single pattern.
PEP 622 v2 already says:
    "For the moment, we decided to make repeated use of names within
    the same pattern an error; we can always relax this restriction
    later without affecting backwards compatibility."
If we relax it now, then we don't need '_' to be special in this
way. All in all this part seems surprisingly uncontentious.
To address 2), bind '_' when it's used as a name in a pattern.
This adds an extra reference and an extra store. That by itself
seems harmless.
The existing implementation has optimizations here. If that's
important, we could achieve the same result with a little dataflow
analysis to optimize away the dead store. We could even
special-case optimizing away dead stores only to '_' and only
in match/case statements and all would be forgiven.
Folks point out that I18N code frequently uses a global function
named '_'. The collision of these two uses is unfortunate, but I
think it's survivable. I certainly don't think this collision means
we should special-case this one identifier in this one context in
the language specification.
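To make the collision concrete, here is a small sketch. It uses today's
tuple unpacking as a stand-in for binding '_' in a pattern, and it
assumes no translation catalog is installed, so gettext returns its
argument unchanged. The function names are hypothetical.

```python
import gettext

_ = gettext.gettext  # the conventional I18N alias


def greet(name):
    # '_' resolves to the global gettext alias here.
    return _("Hello, ") + name


def greet_pair(pair):
    # Using '_' as a throwaway makes it a local, shadowing the alias.
    _, name = pair
    try:
        return _("Hello, ") + name
    except TypeError:
        # '_' now holds pair[0], which is not callable.
        return None
```

The same hazard already exists anywhere '_' is used as a throwaway
target in a function that also calls the I18N '_'.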
One consideration: if you do use '_' multiple times in a single
pattern, and you do refer to its value afterwards, what value should
it get? Consider that Python already permits multiple assignments in
a single expression:
(x:="first", x:="middle", x:="last")
After this expression is evaluated, x has been bound to the value
"last". I could live with "it keeps the rightmost". I could also
live with "the result is implementation-defined". I suspect it
doesn't matter much, because the point of the idiom is that people
don't care about the value.
In keeping with this change, I additionally propose removing '*_'
as a special token. '*_' would behave like any other
'*identifier', binding the name to the unpacked sequence.
Alternatively, we could keep the special token but change it to '*'
so it mirrors Python function declaration syntax. I don't have a
strong opinion about this second alternative.
Cheers,
/arry