On 2020-11-15 at 19:11:15 +1100, Steven D'Aprano <steve@pearwood.info> wrote:
On Sat, Nov 14, 2020 at 10:17:34PM -0800, Guido van Rossum wrote:
It’s a usability issue; mappings are used quite differently than sequences. Compare to class patterns rather than sequence patterns.
I'm keeping an open mind on this question, but I think David is right to raise it. I think that most people are going to see this dict matching as "ignoring errors by default" and going against the Zen of Python, and I expect that we'll be answering questions about it for years to come.
"Why did my match statement match the wrong case?"
Naively, I too would expect that dicts should only match if the keys match with no left overs, and I would like to see the choice to ignore left overs justified in the PEP.
It would be good if the PEP gave a survey of the practical experience of other languages with pattern matching:
- are there languages which require an exact match, with no left over keys? what issues, if any, do users have with that choice?
- which languages ignore extra keys? do users of those languages consider this feature a bug, a wart, or a feature?
In Erlang, "mappings" tend to be generalized collections of (key, value) pairs in which the keys are not nailed down at design time, or the keys evolve over time (think about adding a new field to an existing message). Pattern matching ignores extra keys, so that old code can continue to handle the messages it knows how to handle and simply ignore data it doesn't know about (yes, you have to think carefully about extending messages in this way, but it has worked well over decades). This is definitely a feature.

Also in Erlang, "records" are very similar to Python's named tuples. Pattern matching on records also ignores extra keys, so that I can match records that meet certain criteria and not have to list every attribute in every pattern.

IMO, ignoring extra keys allows for extensibility when you don't always have control over which versions of which code are actually running (which is the case in the typical distributed system). Not ignoring extra keys may work better inside a monolithic application where all the data comes from within or is already parsed/decoded.

IMO, this is going to come down to your use case (I'm *shocked*). If I receive HTML/XML/JSON/TCP/whatever messages, and I want to use pattern matching to decode or dispatch on the message type (e.g., login, logout, attack, connect), then *not* having to write **rest on every pattern reduces clutter. But if I have to handle 2D points separately from 3D points, then stricter matching (i.e., not ignoring extra keys) relieves me of having to think about which case is more or less specific, and may be easier for beginners to use. (IMO, making things easier for beginners is only a means to an end. If making things easier for beginners makes things harder for experts, then don't do it. But I'm not in charge around here.)

As an analogy, when you write a command line utility, do you accept or reject extraneous command line arguments? Is "spam --version ham eggs" the same as "spam --version"? I'm going to guess that it depends on your personality and your background and not anything else inside the utility (note that your choice of command line parser also depends on your personality and your background...). IOW, the answer to the question of ignoring extra keys is going to aggravate half the users and half the use cases no matter what.