My thoughts on Pattern Matching.
Apologies that this is a long email, but I want to make sure I get my points across and it's difficult to do it in a short email. I touched on some of these things in a blog post I wrote (https://pyrandom.blogspot.com/2020/11/my-view-of-python-steering-council.htm...) but I wanted to make the main points in detail in a way that made it easy for people to provide feedback if they wanted, so it's an email instead of a blog post.

We haven't broadly announced it, but the current SC has decided not to make the final decision <https://github.com/python/steering-council/issues/39#issuecomment-720677712> on Structural Pattern Matching (PEPs 634/635/636) and the additional/competing PEPs (640 and 642), because it's so close to the election of the next SC. Instead, we're going to make our decision and then leave it as a strong recommendation for the 2021 SC, as they will actually be in charge for the release of the next version of Python. The 2021 SC will then have to make the final decision. I understand this may be disappointing to some, and I'm sorry, but even after the SC election there's more than five months before the final decision has to be made for the feature to make it into Python 3.10. I hope the next SC doesn't take that long, but there's no rush to get it in before they are seated.

I also imagine this means the Structural Pattern Matching proposal may be a voting issue for some, and since I will be running for the next SC, I want to make it clear where I stand and why. Note that this is *my* point of view, not the SC's. I don't think any of the SC have made up their mind yet, and we all approach this differently. We're continuing the discussion between us to come to our decision-cum-recommendation. I would also like to hear (in private or public, I don't mind) whether people even remotely agree with me, because my mind can still be changed on some of these issues -- especially if it turns out I'm alone in my concerns.
First of all, I am honestly excited about pattern matching. The impact on existing code or the examples in the PEPs may not feel impressive, but I can see how it would change how we think about certain problems, how we design APIs around them, much like for example decorators and the `with` statement have in the past. Like both of those changes, the pattern matching design gives us convenient inversion of control. I don't usually write code using isinstance(), but I realise switching on types is something people want to do and sometimes might even be the best thing to do. The Structural Pattern Matching proposal solves this by allowing the types to decide what they match against in a way that's easier to use and *much* more purpose-driven than the existing hammers of `__instancecheck__` and `__subclasscheck__`.

My concerns about the proposal all stem from the differences with the rest of the language. Some are very minor -- for example, I wish there was a better way to deal with the indentation of 'match' and 'case', but all of the proposed alternatives are clearly worse to me, so the double indentation is fine. I'm satisfied all my minor concerns have been addressed in the Structural Pattern Matching proposal, one way or another.

I also have some small but significant concerns that keep nagging at me, but I'm not sure how much of a big deal they will be in practice. I don't know whether these will be a problem, but I'm worried they _might turn out to be_, and that in hindsight we should've made a different choice. Ideally, we make decisions that allow us to correct the mistake (like removing u'' strings in Python 3) rather than make it hard to ever change (like indexing bytes producing integers). Those concerns are (although I may have forgotten one or two minor ones):

- Mapping patterns ignoring extra items. In the proposal, unpacking "into" a dict automatically ignores extra items. There is syntax readily available (and explicitly disallowed) for ignoring extra items.
There are also ways to check for extra keys, like a guard on the case, but if the programmer doesn't think about adding it, extra items are just silently ignored. I'm just not sure that the don't-care-about-extra-items use-case is common enough to warrant silently doing the wrong thing, especially since it's so very easy to write the code for that use-case.

- The mixture of assignment, evaluation and matching in case clauses. I keep worrying that it will be difficult to see what's being assigned to, what's being called, and what's being matched against. I think the current proposal has enough safe-guards to make it hard to *write* the wrong thing, but readability still counts. The competing proposals for solving this issue do not seem like improvements to me, however, for reasons I'll get to below.

- The use of `|` instead of `or`, which falls in the same boat as the previous point: to my eyes `or` makes it much easier to parse the more complex cases in the examples. For the simple case of 'case 0|1:' versus 'case 0 or 1:' it's not as big a deal, but it's not like `|` is somehow better, either.

My primary hesitation, however, is much wider than those issues. It's the incongruity between pattern matching and the rest of the language. And it's not just because it makes it harder to teach, or harder to read or maintain, or requires some kind of mental mode switch as you go through the code -- I *am* worried about those things, but to me they are not the main problem.

The primary reason I care about the integration with the rest of Python is because it limits the future expansion of the language. Pattern matching can be (and frequently is) seen to be an expansion of iterable unpacking -- or, depending on your point of view, iterable unpacking is a basic, limited form of pattern matching. Making the two _unnecessarily_ different makes it harder to unify them, and makes it harder to unify _future_ changes to either of them.
For example, if pattern matching indeed takes off and we start seeing APIs and conventions designed around it, we may want to consider (or reconsider) adding mapping-unpacking to iterable unpacking as well: `{'spam': spam_var, 'ham': ham_var} = ...`. Or we may want simplified typed unpacking if we find it's very common to write short match expressions, and instead write `Viking(spam, ham) = ...`. We won't know which of those -- or of things we can't imagine right now -- will make sense until after pattern matching is accepted and has seen plenty of use. We've seen that with `with`, with decorators, with generators, with generator-style coroutines, etc. And it also goes the other way: there may be changes to other parts of Python that we want to make work with pattern matching, even the bits of pattern matching that aren't exactly like the rest of Python -- we may end up expanding the assignment syntax to add some feature, and then find we want that feature in match cases, which aren't exactly like other assignment targets.

To me, that's the big reason to make sure any new proposal lines up well with the rest of Python.

In the Structural Pattern Matching proposal, as I see it, the main incongruities are the big expansion of assignment (which is part and parcel of pattern matching) with its subtle differences between stores and loads, and the special semantics for `_`. It doesn't matter that we can live with those differences; what matters is what's going to happen in the long run with those differences. If we want to copy some of the semantics and syntax from pattern matching to other parts of Python, can we? We can do that with the expanded assignment targets to some extent, perhaps even to the fullest sensible extent. We can't really do that with the special semantics of `_`. It would mean some significant confusion around the use of `_` for i18n, even if we figured out a way of handling the conflict.

This is why I proposed PEP 640. By using '?'
-- or whatever else that isn't a valid identifier -- we avoid an insurmountable problem in the future, and by applying it to iterable unpacking we're already closing the gap. The entire cost of this proposal is using something other than what other languages use, which is hardly unique in Python. But failing that, I want to see _some_ way towards closing the gap between pattern matching and existing assignment semantics. If `_` becomes special in pattern matching, we should have a long-term plan for making it special everywhere, or we're left with a gap we can only ever widen, not close.

It's also why I'm not in favour of PEP 642 and other proposals for solving some of the problems in the Structural Pattern Matching proposal (sigils, etc): it widens the gap instead of closing it. It creates more special syntax that will be harder to apply to other parts of Python. I think we need something that fits well with the rest of Python, not a separate sub-language.

Like I said, I haven't made up my mind either way, and I don't want to turn this into the single issue on which to vote _for_ me -- the SC has a lot of other, important stuff on its plate, and you can disagree with me on any of them! -- but I want to be entirely clear on where I stand on this issue, lest people _regret_ voting for me because of it.

--
Thomas Wouters <thomas@python.org>

Hi! I'm an email virus! Think twice before sending your email to help me spread!
On Fri, Nov 6, 2020 at 7:05 AM Thomas Wouters <thomas@python.org> wrote:
The primary reason I care about the integration with the rest of Python is because it limits the future expansion of the language.
I did not think as deeply as you have done on this subject here. My exposure to pattern matching was in Scala, and I didn't observe that this feature was considered a limitation on future expansion of the language or even on usage in the ecosystem.

Also, for the examples that you mentioned, I thought those would be _extreme cases_ (?) of a developer writing some really hard-to-comprehend code? In the most common cases I came across in Scala, pattern matching was used mostly in routing logic based on a decision, seldom in complex assignments. But my exposure is limited.

If you augment your arguments by sharing some examples of how code bases evolved in languages that already support pattern matching, it might help us understand further.

Thanks for sharing the context of this email with SC standings.

Thank you,
Senthil
On 7/11/20 4:03 am, Thomas Wouters wrote:
It's also why I'm not in favour of PEP 642 and other proposals for solving some of the problems in the Structural Pattern Matching proposal (sigils, etc): it widens the gap instead of closing it.
Does that mean you're against *any* proposal that involves sigils, or just PEP 642 in particular?

Also, I'm very confused about why you're against PEP 642. It seems to do a good job of meeting your stated goals -- syntax in common between unpacking and matching has the same meaning, and the way is left open for making them more like each other in the future. Can you elaborate on what you don't like about it?

--
Greg
On Sat., 7 Nov. 2020, 9:56 am Greg Ewing, <greg.ewing@canterbury.ac.nz> wrote:
On 7/11/20 4:03 am, Thomas Wouters wrote:
It's also why I'm not in favour of PEP 642 and other proposals for solving some of the problems in the Structural Pattern Matching proposal (sigils, etc): it widens the gap instead of closing it.
Does that mean you're against *any* proposal that involves sigils, or just PEP 642 in particular?
Also, I'm very confused about why you're against PEP 642. It seems to do a good job of meeting your stated goals -- syntax in common between unpacking and matching has the same meaning, and the way is left open for making them more like each other in the future. Can you elaborate on what you don't like about it?
It seems worth noting that many of Thomas's reservations align with my own about my proposal in PEP 642 (both the original version I published last week and the updated one I just published today).

Certainly my *goal* is to address those key concerns (since I share them), but it's an open question whether or not I've actually achieved that (especially now I've conceded the point that keeping match patterns readable is going to require *some* flavour of syntactic shorthand that will never work in regular assignment targets - while PEP 642 proposes defining that in terms of a more explicit syntax that *could* be added to assignment targets, the shorthand forms would still be forever inconsistent).

Cheers,
Nick.

P.S. FWIW, I'll also note that I do have a strong pro-"|" opinion on MatchOr patterns (I think trying to read "or" in that position would degenerate into keyword soup, whereas the vertical bar stands out nicely), and have been burned by enough restrictive JSON parsers that collapse when the sender adds a new key to an object to be strongly pro "ignore extra mapping keys by default" in mapping patterns. However, I don't think those kinds of questions are anywhere near as fundamental as the one about whether or not potential syntactic consistency with assignment targets should even be a design goal in the first place.
Hi Thomas,

Thank you very much for your carefully worded and thoughtful email. I feel, however, that many of your concerns are based on an idealised picture of a future Python language that will never actually materialise.

As I understand it your main point is that the concept of patterns might---or even should---be retro-fitted to general assignments. Just as we have borrowed from and expanded on the idea of iterable unpacking in assignments, so should assignments then pick up the concepts introduced in pattern matching. Indeed, assignments of the form ``Viking(spam, ham) = ...`` are not only appealing but can be found in various languages with pattern matching. So, why would we not want to have consistent concepts and symmetric (orthogonal) structures across all of Python?

Unfortunately, such consistency or symmetry comes at a high cost---too high as far as I am concerned.

One of the simplest patterns is without doubt the literal pattern that essentially only matches itself (e.g., ``case 123:`` or ``case 'abc':``). Any future unification of patterns and assignments would therefore necessarily allow for statements such as::

    1 = x

This is essentially equivalent to ``assert(x == 1)``. Indeed, _Elixir_ [1] uses such syntax for exactly this semantics. However, knowing how much novice programmers struggle with understanding that assignments are right-to-left (i.e. ``x = 1`` and not ``1 = x``), including such syntax would immediately raise the learning curve significantly. In short, a very common students' mistake will take them to error messages and concepts they could not possibly understand without a basic comprehension of pattern matching.

And this is probably where it becomes most obvious how our views of pattern matching differ. The pattern matching as we propose it in PEPs 634/635/636 is guarded by a keyword needed to activate these features. Unless you start your statement with ``match my_data:``, you can easily pretend as if pattern matching did not exist: it will not, cannot affect your code. This encapsulation is intentional and by design.

As far as I am aware, those languages that support syntax like ``Viking(spam, ham) = ...`` only allow this in combination with variable _declaration_, i.e. you actually have to write ``let Viking(spam, ham) = `` or ``var Viking(spam, ham) = ...`` or similar. Without such a marker, this syntax quickly descends into unreadable gibberish. As noted in the original section of rejected ideas in PEP 622, we had originally considered adding 'one-off pattern matching': pattern matching with only a single case that must succeed, very much like normal assignments do. But our approach was always guarded by a keyword, be that ``case`` or ``if``---in line with the ``var`` or ``let`` found in other languages. Even in that case, patterns would not leak into the language outside pattern matching.

Finally, there is already a necessary inconsistency between iterable unpacking and pattern matching. By their very nature, patterns express a _possible_ structure, whereas iterable unpacking imposes a _necessary_ structure. So, when dealing with iterators, it is safe to 'unpack' the iterator in iterable unpacking. If the expected and actual structures differ, it is an error, anyway. In pattern matching, however, we have to be more conservative and careful, exploring options rather than certainties. Hence, even if all other concerns were wiped away, the closest we could come to an entirely symmetric and consistent language is one with some subtle differences and thus prone to bugs and errors.

PEPs 634/635/636 are the result of a long and careful design process where we sought to appeal to the intuition of the Python programmer as much as possible, without betraying the differences and new concepts that pattern matching necessarily introduces. Readability was always one of our main concerns and we believe that having a clear context where patterns occur greatly helps writing readable and consistent code.

So, long story short, I am afraid I would question the very premise upon which your concerns are founded: that it would ever be a good idea to expand patterns to assignments in general. Although such a unification would in principle be possible, it would rob Python of one of its greatest and strongest assets: its simplicity and learnability.

Kind regards,
Tobias

[1] https://elixir-lang.org/getting-started/pattern-matching.html
On Mon, Nov 9, 2020 at 10:40 PM Tobias Kohn <kohnt@tobiaskohn.ch> wrote:
One of the simplest patterns is without doubt the literal pattern that essential only matches itself (e.g., ``case 123:`` or ``case 'abc':``). Any future unification of patterns and assignments would therefore necessarily allow for statement such as::
1 = x
I don't think that's true. There isn't anything that "necessarily" *has* to happen; practicality beats purity. Thomas' statement is more about the fact that there are so *many* concepts fully contained within the world of pattern matching which do not apply outside of it. He didn't seem to imply to me that *all* concepts had to be brought out to the general language, just that there were ones that seem to make sense to pull out for assignment unpacking.

-Brett
On Wed., 11 Nov. 2020, 8:10 am Brett Cannon, <brett@python.org> wrote:
On Mon, Nov 9, 2020 at 10:40 PM Tobias Kohn <kohnt@tobiaskohn.ch> wrote:
One of the simplest patterns is without doubt the literal pattern that essential only matches itself (e.g., ``case 123:`` or ``case 'abc':``). Any future unification of patterns and assignments would therefore necessarily allow for statement such as::
1 = x
I don't think that's true. There isn't anything that "necessarily" *has* to happen; practicality beats purity. Thomas' statement is more about the fact that there are so *many* concepts fully contained within the world of pattern matching which do not apply outside of it. He didn't seem to imply to me that *all* concepts had to be brought out to the general language, just that there were ones that seem to make sense to pull out for assignment unpacking.
And this is the view that PEP 642 takes as well: that we want to preserve the *option* of lifting aspects of pattern matching out to be more general, rather than committing up front to it always being an all-or-nothing proposition.

To my mind, the flavour of assignment statement that I think would be the most plausible follow-up to matching against multiple patterns would take the form "try pattern = subject", and even accepting PEP 634 exactly as written wouldn't lock us out of that specific option. However the inferred constraint patterns would give me even more pause there than they do in the match statement proposals, so it would be nice to have the freedom to disallow them without reducing the overall expressiveness of the construct.

There are limits to how much immediate complexity we'd want to incur to preserve that optionality (YAGNI and all that), but the way PEP 634 is currently defined creates *complications* that leak all the way from the parser through to the code generator, rather than building a cleaner conceptual abstraction layer at the AST level.

Cheers,
Nick.
participants (6)
- Brett Cannon
- Greg Ewing
- Nick Coghlan
- Senthil Kumaran
- Thomas Wouters
- Tobias Kohn