Take as an example a function designed to process a tree of nodes similar to that which might be output by a JSON parser. There are 4 types of node:
The function transforms a tree of nodes, beginning at the root node, and proceeding recursively through each child node in turn. The result is a Python object, with the following transformation applied to each node type:
->
Python str
->
Python float
->
Python list
->
Python dict
I have implemented this function using 3 different approaches:
isinstance
checks against the node typeHere is the implementation using the visitor pattern:
from typing import List, Tuple
class NodeVisitor:
def visit_string_node(self, node: StringNode):
pass
def visit_number_node(self, node: NumberNode):
pass
def visit_list_node(self, node: ListNode):
pass
def visit_dict_node(self, node: DictNode):
pass
class Node:
def visit(visitor: NodeVisitor):
raise NotImplementedError()
class StringNode(Node):
value: str
def visit(self, visitor: NodeVisitor):
visitor.visit_string_node(self)
class NumberNode(Node):
value: str
def visit(self, visitor: NodeVisitor):
visitor.visit_number_node(self)
class ListNode(Node):
children: List[Node]
def visit(self, visitor: NodeVisitor):
visitor.visit_list_node(self)
class DictNode(Node):
children: List[Tuple[str, Node]]
def visit(self, visitor: NodeVisitor):
visitor.visit_dict_node(self)
class Processor(NodeVisitor):
def process(root_node: Node):
return root_node.visit(self)
def visit_string_node(self, node: StringNode):
return node.value
def visit_number_node(self, node: NumberNode):
return float(node.value)
def visit_list_node(self, node: ListNode):
return [child_node.visit(self) for child_node in node.children]
def visit_dict_node(self, node: DictNode):
return {key: child_node.visit(self) for key, child_node in node.children}
def process(root_node: Node):
processor = Processor()
return processor.process(root_node)
Here is the implementation using isinstance
checks against the node
type:
from typing import List, Tuple
class Node:
pass
class StringNode(Node):
value: str
class NumberNode(Node):
value: str
class ListNode(Node):
children: List[Node]
class DictNode(Node):
children: List[Tuple[str, Node]]
def process(root_node: Node):
def process_node(node: Node):
if isinstance(node, StringNode):
return node.value
elif isinstance(node, NumberNode):
return float(node.value)
elif isinstance(node, ListNode):
return [process_node(child_node) for child_node in node.children]
elif isinstance(node, DictNode):
return {key: process_node(child_node) for key, child_node in node.children}
else:
raise Exception('Unexpected node')
return process_node(root_node)
Finally here is the implementation using pattern matching:
from typing import List, Tuple
class Node:
pass
class StringNode(Node):
value: str
class NumberNode(Node):
value: str
class ListNode(Node):
children: List[Node]
class DictNode(Node):
children: List[Tuple[str, Node]]
def process(root_node: Node):
def process_node(node: Node):
match node:
case StringNode(value=str_value):
return str_value
case NumberNode(value=number_value):
return float(number_value)
case ListNode(children=child_nodes):
return [process_node(child_node) for child_node in child_nodes]
case DictNode(children=child_nodes):
return {key: process_node(child_node) for key, child_node in child_nodes}
case _:
raise Exception('Unexpected node')
return process_node(root_node)
Here are the lengths of the different implementations:
->
37 linesisinstance
checks ->
36 lines->
69 linesThe visitor pattern implementation is by far the most verbose solution, weighing in at
almost twice the length of the alternative implementations due to the large amount of
boilerplate that is necessary to achieve double dispatch. The pattern matching and
isinstance
check implementations are very similar in length for this trivial
example.
In each implementation, there are 2 operations performed on each node.
The visitor pattern and isinstance
check implementations separate these 2
operations, whereas the pattern matching approach combines the operations together. I
believe that it is the declarative nature of pattern matching, where the operations of
determining the type of the node and destructuring the node are combined into a single
clause, which allows pattern matching to express a concise solution to the problem. In
this trivial example, the advantage of pattern matching over the alternative of using a
sequence of if
-elif
-else
statements is not as
obvious as it would be when compared to a more complex example, where a sub-tree of nodes
might be matched based on their type and be destructured in a single clause.
I have seen elsewhere an argument that pattern matching should not be accepted into Python as it introduces a pseudo-DSL that is separate from the rest of the language. I agree that pattern matching might be viewed as a pseudo-DSL, but I believe that it is a good thing, if it allows the solution to certain classes of problems to be expressed in a concise manner. People often raise similar objections to operator overloading in other languages, whereas the presence of operator overloading in Python allows mathematical expressions involving custom numeric types such as vectors to be expressed in a natural way. Furthermore, Python has a regular expression module which implements it's own DSL for the purpose of matching string patterns. Regular expressions, in a similar way to pattern matching, allow string patterns to be expressed in a concise and declarative manner.
I really hope that the Steering Council accepts pattern matching into Python. I think that it allows for processing of heterogeneous graphs of objects using recursion in a concise, declarative manner. I would like to thank the authors of the Structural Pattern Matching PEP for their hard work in designing this feature and developing an implementation of it. I believe that it will be a wonderful addition to the language that I am very much looking forward to using.
On 11/23/20 8:15 AM, Brian Coleman wrote:
def process(root_node: Node): def process_node(node: Node): if isinstance(node, StringNode): return node.value elif isinstance(node, NumberNode): return float(node.value) elif isinstance(node, ListNode): return [process_node(child_node) for child_node in node.children] elif isinstance(node, DictNode): return {key: process_node(child_node) for key, child_node in node.children} else: raise Exception('Unexpected node')
You don't need the "else". And you can change all your "elif"s into "if"s too. Now your "isinstance" version is 35 lines. Shorter than the pattern matching version, roughly the same speed, works in current Python, eminently readable to anybody conversant in current Python. A very reasonable solution to the problem.
There should be one--and preferably only one--obvious way to do it,
//arry/
On 11/23/20 11:06 AM, Larry Hastings wrote:
On 11/23/20 8:15 AM, Brian Coleman wrote:
def process(root_node: Node): def process_node(node: Node): if isinstance(node, StringNode): return node.value elif isinstance(node, NumberNode): return float(node.value) elif isinstance(node, ListNode): return [process_node(child_node) for child_node in node.children] elif isinstance(node, DictNode): return {key: process_node(child_node) for key, child_node in node.children} else: raise Exception('Unexpected node')
You don't need the "else".
Without the "else" errors will pass silently.
And you can change all your "elif"s into "if"s too.
On average, doubling your comparisons and therefore becoming slower.
Now your "isinstance" version is 35 lines. Shorter than the pattern matching version
But without error checking.
roughly the same speed,
See above comment about becoming slower.
works in current Python, eminently readable to anybody conversant in current Python. A very reasonable solution to the problem.
Agreed. But (subjectively, of course) not as elegant as the match version, and deteriorates rapidly as complexity increases.
There should be one--and preferably only one--obvious way to do it,
But that is not a hard and fast rule or we wouldn't have decorators (zero lines saved), the "with" statement (two lines saved), and four different ways spanning three different protocols to write an iterator [1].
-- ~Ethan~
[1] https://stackoverflow.com/questions/19151/build-a-basic-python-iterator/7542...
On Mon, Nov 23, 2020, 2:58 PM Ethan Furman
if isinstance(node, StringNode):
return node.value
elif isinstance(node, NumberNode):
return float(node.value)
elif isinstance(node, ListNode):
return [process_node(child_node) for child_node in
node.children] elif isinstance(node, DictNode): return {key: process_node(child_node) for key, child_node in node.children} else: raise Exception('Unexpected node')
You don't need the "else".
Without the "else" errors will pass silently.
Not in the particular example. It can just be a bunch of ifs, each of which returns something. The last line of the function can be raise.
Obviously, epic and else are very important in general. I think I wouldn't use any for that particular code though.
On Tue, Nov 24, 2020 at 7:00 AM Ethan Furman ethan@stoneleaf.us wrote: >
On 11/23/20 11:06 AM, Larry Hastings wrote:
On 11/23/20 8:15 AM, Brian Coleman wrote:
def process(root_node: Node): def process_node(node: Node): if isinstance(node, StringNode): return node.value elif isinstance(node, NumberNode): return float(node.value) elif isinstance(node, ListNode): return [process_node(child_node) for child_node in node.children] elif isinstance(node, DictNode): return {key: process_node(child_node) for key, child_node in node.children} else: raise Exception('Unexpected node')
You don't need the "else".
Without the "else" errors will pass silently.
And you can change all your "elif"s into "if"s too.
On average, doubling your comparisons and therefore becoming slower.
I think the implication is that, since all the successful if statements go into returns, you can leave off the "else" and just put the "raise" immediately. It's a special case that applies ONLY to situations where the pattern matching is the entire purpose of the function.
ChrisA
On 11/23/20 12:05 PM, Chris Angelico wrote:
On Tue, Nov 24, 2020 at 7:00 AM Ethan Furman ethan@stoneleaf.us wrote: >
On 11/23/20 11:06 AM, Larry Hastings wrote:
On 11/23/20 8:15 AM, Brian Coleman wrote:
def process(root_node: Node): def process_node(node: Node): if isinstance(node, StringNode): return node.value elif isinstance(node, NumberNode): return float(node.value) elif isinstance(node, ListNode): return [process_node(child_node) for child_node in node.children] elif isinstance(node, DictNode): return {key: process_node(child_node) for key, child_node in node.children} else: raise Exception('Unexpected node')
You don't need the "else".
Without the "else" errors will pass silently.
And you can change all your "elif"s into "if"s too.
On average, doubling your comparisons and therefore becoming slower.
I think the implication is that, since all the successful if statements go into returns, you can leave off the "else" and just put the "raise" immediately. It's a special case that applies ONLY to situations where the pattern matching is the entire purpose of the function.
Ah, of course. Thank you for correction.
-- ~Ethan~
Larry Hastings wrote:
On 11/23/20 8:15 AM, Brian Coleman wrote:
def process(root_node: Node): def process_node(node: Node): if isinstance(node, StringNode): return node.value elif isinstance(node, NumberNode): return float(node.value) elif isinstance(node, ListNode): return [process_node(child_node) for child_node in node.children] elif isinstance(node, DictNode): return {key: process_node(child_node) for key, child_node in node.children} else: raise Exception('Unexpected node') You don't need the "else". And you can change all your "elif"s into "if"s too. Now your "isinstance" version is 35 lines. Shorter than the pattern matching version, roughly the same speed, works in current Python, eminently readable to anybody conversant in current Python. A very reasonable solution to the problem. There should be one--and preferably only one--obvious way to do it, //arry/
It was not my intention to suggest that the solution implemented in the shortest number of lines was superior.
I think that one of the advantages of the pattern matching implementation over the
sequence of isinstance
checks is that there is no need to worry about whether
using if
, elif
or else
is the best approach. There
is a single elegant and natural way to express the solution with pattern matching.
I have a little bit of skepticism about the pattern matching syntax, for similar reasons to those Larry expresses, and that Steve Dower mentioned on Discourse.
Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this:
m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val) elif m.case("[a, 5, 6, b]"): process_two_free_vars(*m.values) elif m.case("PairNone(a, b)"): a, b = m.values process_pair(a, b) elif m.case("DictNode"): foo = {key, process_node(child_node) for key, child_node in m.values.items()}
I don't disagree that the pattern mini-language looks nice as syntax. But there's nothing about that mini-language that couldn't be put in a library (with the caveat that patterns would need to be quoted in some way).
-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
David Mertz wrote:
I have a little bit of skepticism about the pattern matching syntax, for similar reasons to those Larry expresses, and that Steve Dower mentioned on Discourse. Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this: m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val) elif m.case("[a, 5, 6, b]"): process_two_free_vars(*m.values) elif m.case("PairNone(a, b)"): a, b = m.values process_pair(a, b) elif m.case("DictNode"): foo = {key, process_node(child_node) for key, child_node in m.values.items()} I don't disagree that the pattern mini-language looks nice as syntax. But there's nothing about that mini-language that couldn't be put in a library (with the caveat that patterns would need to be quoted in some way).
What you are proposing here is very similar in effect to executing pattern
matching statements using eval
. What is the advantage of implementing the
pattern matching functionality in a library if the user interface for that functionality
has the same pitfalls as eval
?
On Mon, Nov 23, 2020 at 9:02 PM Brian Coleman brianfcoleman@gmail.com wrote:
Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this:
m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val.s) elif m.case("[a, 5, 6, b]"): process_two_free_vars(m.val.a, m.val.b)
What you are proposing here is very similar in effect to executing pattern
matching statements using eval
. What is the advantage of implementing the
pattern matching functionality in a library if the user interface for that
functionality has the same pitfalls as eval
?
I don't understand the similarity with eval
that you are
suggesting.
My hypothetical library would store the value passed as initializer to
Matcher
, and provide a method .case()
. That method would need
to do
some moderately complicated parsing of the pattern passed into it, but
using parsing techniques, not any eval() call. Btw. I modified my above
strawman just slightly to what might be a bit better API.
If there are any free variables in the pattern, they would be provided by
the Matcher object. For example, they could be attributes of the property
m.val
. Or I suppose we could make them attributes of the Matcher object
itself, e.g. m.a
and m.b
, but I think name conflicts with e.g.
m.case
. Anyway, it's a strawman not an API design.
If the pattern looked kinda like a class constructor (i.e. exactly the same
as PEP 634 rules), the .case()
method would do an isinstance()
check
before doing its other stuff. If the pattern looks like a list, it would
scan the freevars inside it and match the constant values. And so on.
The actual return value from .m.case(...)
would be True or False (or at
least something truthy or falsy). However, it MIGHT create some new
attributes (or keys, or something else) on the Matcher object if the
pattern actually matched.
I agree the above is slightly less readable than PEP 635 syntax, but it seems to have the entire power of the proposal without changing Python syntax.
-- The dead increasingly dominate and strangle both the living and the not-yet born. Vampiric capital and undead corporate persons abuse the lives and control the thoughts of homo faber. Ideas, once born, become abortifacients against new conceptions.
David Mertz wrote:
On Mon, Nov 23, 2020 at 9:02 PM Brian Coleman brianfcoleman@gmail.com wrote:
Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this: m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val.s) elif m.case("[a, 5, 6, b]"): process_two_free_vars(m.val.a, m.val.b) What you are proposing here is very similar in effect to executing pattern matching statements using eval. What is the advantage of implementing the pattern matching functionality in a library if the user interface for that functionality has the same pitfalls as eval? I don't understand the similarity with eval that you are suggesting. My hypothetical library would store the value passed as initializer to Matcher, and provide a method .case(). That method would need to do some moderately complicated parsing of the pattern passed into it, but using parsing techniques, not any eval() call. Btw. I modified my above strawman just slightly to what might be a bit better API. If there are any free variables in the pattern, they would be provided by the Matcher object. For example, they could be attributes of the property m.val. Or I suppose we could make them attributes of the Matcher object itself, e.g. m.a and m.b, but I think name conflicts with e.g. m.case. Anyway, it's a strawman not an API design. If the pattern looked kinda like a class constructor (i.e. exactly the same as PEP 634 rules), the .case() method would do an isinstance() check before doing its other stuff. If the pattern looks like a list, it would scan the freevars inside it and match the constant values. And so on. The actual return value from .m.case(...) would be True or False (or at least something truthy or falsy). However, it MIGHT create some new attributes (or keys, or something else) on the Matcher object if the pattern actually matched. I agree the above is slightly less readable than PEP 635 syntax, but it seems to have the entire power of the proposal without changing Python syntax.
To be more precise, the similarity that I see to eval
is that
syntax errors in the patterns that are passed to the case
method cannot be
detected at compile time and instead will be detected at runtime. Building the syntax into
the language gives the advantage of producing a syntax error at compile time. It also
makes it easier for linters and type checkers to validate the pattern matching clauses if
the syntax is built into the language.
David Mertz wrote:
On Mon, Nov 23, 2020 at 9:02 PM Brian Coleman brianfcoleman@gmail.com wrote:
Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this: m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val.s) elif m.case("[a, 5, 6, b]"): process_two_free_vars(m.val.a, m.val.b) What you are proposing here is very similar in effect to executing pattern matching statements using eval. What is the advantage of implementing the pattern matching functionality in a library if the user interface for that functionality has the same pitfalls as eval? I don't understand the similarity with eval that you are suggesting. My hypothetical library would store the value passed as initializer to Matcher, and provide a method .case(). That method would need to do some moderately complicated parsing of the pattern passed into it, but using parsing techniques, not any eval() call. Btw. I modified my above strawman just slightly to what might be a bit better API. If there are any free variables in the pattern, they would be provided by the Matcher object. For example, they could be attributes of the property m.val. Or I suppose we could make them attributes of the Matcher object itself, e.g. m.a and m.b, but I think name conflicts with e.g. m.case. Anyway, it's a strawman not an API design. If the pattern looked kinda like a class constructor (i.e. exactly the same as PEP 634 rules), the .case() method would do an isinstance() check before doing its other stuff. If the pattern looks like a list, it would scan the freevars inside it and match the constant values. And so on. The actual return value from .m.case(...) would be True or False (or at least something truthy or falsy). However, it MIGHT create some new attributes (or keys, or something else) on the Matcher object if the pattern actually matched. I agree the above is slightly less readable than PEP 635 syntax, but it seems to have the entire power of the proposal without changing Python syntax.
Because syntax errors in the patterns passed to the case
method are detected at runtime, rather than at compile time, it is necessary to ensure
that you have code coverage of every call to a case
method to be confident
that there are no syntax errors in the patterns.
By comparison, if the pattern matching syntax is built into the language, the compiler
will detect syntax errors in any and all patterns in case
clauses. I think
that this is a major advantage as compared to implementing pattern matching via a
library.
On 11/23/2020 3:44 PM, David Mertz wrote:
I have a little bit of skepticism about the pattern matching syntax, for similar reasons to those Larry expresses, and that Steve Dower mentioned on Discourse.
Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this:
m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val) elif m.case("[a, 5, 6, b]"): process_two_free_vars(*m.values) elif m.case("PairNone(a, b)"): a, b = m.values process_pair(a, b) elif m.case("DictNode"): foo = {key, process_node(child_node) for key, child_node in m.values.items()}
I don't disagree that the pattern mini-language looks nice as syntax. But there's nothing about that mini-language that couldn't be put in a library (with the caveat that patterns would need to be quoted in some way).
I just commented on Steve's post over on Discourse. The problem with this is that the called function (m.case, here) needs to have access to the caller's namespace in order to resolve the expressions, such as StringNode and PairNone. This is one of the reasons f-strings weren't implemented as a function, and is also the source of many headaches with string type annotations.
My conclusion is that if you want something that operates on DSLs (especially ones that can't be evaluated as expressions), the compiler is going to need to know about it somehow so it can help you with it. I wish there were a general-purpose mechanism for this. Maybe it's PEP 638, although I haven't really investigated it much, and pattern matching might be a bad fit for it.
Eric
On Mon, Nov 23, 2020, 4:32 PM Eric V. Smith
I just commented on Steve's post over on Discourse. The problem with this is that the called function (m.case, here) needs to have access to the caller's namespace in order to resolve the expressions, such as StringNode and PairNone. This is one of the reasons f-strings weren't implemented as a function, and is also the source of many headaches with string type annotations.
Is this really true though? Yes, it would require magic of the sort
zero-argument super()
does. But can't the Matcher constructor capture the
locals one step up the stack on instance creation? That's largely why my
strawman version is slightly different from Steve's strawman.
I'd put the question this way: assuming Matcher can be written (with a bit of stack magic), and assuming that the strings inside m.case() calls are exactly the same mini-languague PEP 634 proposes, would you want a syntax change?
That's not rhetorical, I'm of mixed feeling myself.
On 11/23/20 1:49 PM, David Mertz wrote:
On Mon, Nov 23, 2020, 4:32 PM Eric V. Smith
I just commented on Steve's post over on Discourse. The problem with this is that the called function (m.case, here) needs to have access to the caller's namespace in order to resolve the expressions, such as StringNode and PairNone. This is one of the reasons f-strings weren't implemented as a function, and is also the source of many headaches with string type annotations.
Is this really true though? Yes, it would require magic of the sort zero-argument
super()
does. But can't the Matcher
constructor capture the locals one step up the stack on instance creation? That's largely
why my strawman version is
slightly different from Steve's strawman.
I'd put the question this way: assuming Matcher can be written (with a bit of stack magic), and assuming that the strings inside m.case() calls are exactly the same mini-languague PEP 634 proposes, would you want a syntax change?
That's not rhetorical, I'm of mixed feeling myself.
I would, yes. Writing Python code in strings to be processed by another function/library is a pain:
All in all, it feels very inelegant to me.
-- ~Ethan~
On 11/23/2020 4:49 PM, David Mertz wrote:
On Mon, Nov 23, 2020, 4:32 PM Eric V. Smith
I just commented on Steve's post over on Discourse. The problem
with this is that the called function (m.case, here) needs to have
access to the caller's namespace in order to resolve the
expressions, such as StringNode and PairNone. This is one of the
reasons f-strings weren't implemented as a function, and is also
the source of many headaches with string type annotations.
Is this really true though? Yes, it would require magic of the sort
zero-argument super()
does. But can't the Matcher constructor
capture the locals one step up the stack on instance creation? That's
largely why my strawman version is slightly different from Steve's
strawman.
Beyond the issue of stack access being frowned on, there are some practical problems. One that's given in PEP 498 is closures accessing variables that aren't otherwise referenced:
def outer(x): ... def inner(): ... return 'x={x}'.format_map(locals()) ... return inner ... outer(42)() Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 3, in inner KeyError: 'x'
Whereas an f-string in that scenario does work:
def outer(x): ... def inner(): ... return f'x={x}' ... return inner ... outer(42)() 'x=42'
I'd put the question this way: assuming Matcher can be written (with a bit of stack magic), and assuming that the strings inside m.case() calls are exactly the same mini-languague PEP 634 proposes, would you want a syntax change?
No, I wouldn't!
Something that gets brought up every now and again is: what if there were a way to pass an AST to a function, such that it received the AST instead of the evaluated value for a parameter. Let's say that backticks (`) meant "compute the AST of the enclosed expression", and that was passed it to the function. (I always choose backticks for this example because we all know it isn't going to happen.)
Then you write your original example using backticks instead of quotes, and m.case would get an AST it could inspect. It would probably still need some help in evaluating the AST nodes in the caller's context, but at least it would be theoretically possible to do so with the compiler's assistance.
Another option would be the function itself saying "give me an AST instead of evaluating a particular parameter". But that's all but impossible, since the compiler couldn't look at the called function to know it wants to do that.
I think we're in python-ideas land now.
Eric
>
That's not rhetorical, I'm of mixed feeling myself.
Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZQBAYYR5... Code of Conduct: http://python.org/psf/codeofconduct/
>
I'd put the question this way: assuming Matcher can be written (with a bit of stack magic), and assuming that the strings inside m.case() calls are exactly the same mini-languague PEP 634 proposes, would you want a syntax change?
No, I wouldn't!
Is one of us mixing up negations? Are you saying you do NOT support PEP 634 (i.e. syntax change)?
A reasonable enough opinion that several core developers hold. Just trying to understand.
On 11/23/2020 5:44 PM, David Mertz wrote: >
I'd
put the question this way: assuming Matcher can be written
(with a bit of stack magic), and assuming that the strings inside
m.case() calls are exactly the same mini-languague PEP 634
proposes, would you want a syntax change?
No, I wouldn't!
Is one of us mixing up negations? Are you saying you do NOT support PEP 634 (i.e. syntax change)?
A reasonable enough opinion that several core developers hold. Just trying to understand.
Sorry I wasn't clear. I wouldn't want a syntax change specific to matching if it could be done with a library. I just don't think it can be done with a library without other language changes. But I think those other language changes could be used in ways outside of just matching.
Eric
Hi Eric,
On 23/11/2020 9:32 pm, Eric V. Smith wrote:
On 11/23/2020 3:44 PM, David Mertz wrote:
I have a little bit of skepticism about the pattern matching syntax, for similar reasons to those Larry expresses, and that Steve Dower mentioned on Discourse.
Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this:
m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val) elif m.case("[a, 5, 6, b]"): process_two_free_vars(*m.values) elif m.case("PairNone(a, b)"): a, b = m.values process_pair(a, b) elif m.case("DictNode"): foo = {key, process_node(child_node) for key, child_node in m.values.items()}
I don't disagree that the pattern mini-language looks nice as syntax.
But there's nothing about that mini-language that couldn't be put in a
library (with the caveat that patterns would need to be quoted in some
way).
I just commented on Steve's post over on Discourse. The problem with this is that the called function (m.case, here) needs to have access to the caller's namespace in order to resolve the expressions, such as StringNode and PairNone. This is one of the reasons f-strings weren't implemented as a function, and is also the source of many headaches with string type annotations.
My conclusion is that if you want something that operates on DSLs (especially ones that can't be evaluated as expressions), the compiler is going to need to know about it somehow so it can help you with it. I wish there were a general-purpose mechanism for this. Maybe it's PEP 638, although I haven't really investigated it much, and pattern matching might be a bad fit for it.
Hygienic macros (PEP 638) solve two problems with a string based library (in my obviously biased opinion).
if m.case("StringNode(s)"): the pattern is just a string.
case!(StringNode(s)): the pattern is validated Python syntax.
Cheers, Mark.
Eric
Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-leave@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CA6FJTEC... Code of Conduct: http://python.org/psf/codeofconduct/
On Mon, Nov 23, 2020 at 08:44:21PM +0000, David Mertz wrote:
Basically, I agree matching/destructuring is a powerful idea. But I also wonder how much genuinely better it is than a library that does not require a language change. For example, I could create a library to allow this:
m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val) elif m.case("[a, 5, 6, b]"): process_two_free_vars(*m.values) elif m.case("PairNone(a, b)"): a, b = m.values process_pair(a, b) elif m.case("DictNode"): foo = {key, process_node(child_node) for key, child_node in m.values.items()}
Look at all those strings. So much for syntax highlighting and compile-time syntax checking. Who needs them anyway?
# Perfectly legal syntax.
if m.case("StringNode[s)"):
...
elif m.case("[a, 5 6, b"):
...
It's a good thing we already have comprehensions, because if we didn't
have them, people would argue that they aren't necessary, we can just
write a comprehension
function that takes the loop expression as a
string:
# [(a + int(''.join(b)))*c) for a, b, c in items]
comprehension("(a + int(''.join(b)))*c)", items]
Imagine how silly we would be to dedicate actual syntax to comprehensions, when we can write a library to do it. It's all this syntactic sugar (comprehensions, decorators, f-strings, async, with statements, etc) that is killing Python.
wink
I don't disagree that the pattern mini-language looks nice as syntax. But there's nothing about that mini-language that couldn't be put in a library (with the caveat that patterns would need to be quoted in some way).
One thing that we get with pattern matching syntax is the absense of certain misfeatures.
# Oops, I forgot to use `elif`, now I have fall-through semantics
# which is widely agreed to be a bad idea.
m = Matcher([4, 5, 6, 7])
if m.case("[a, 5, 6, b]"):
print("first case")
if m.case("[4, a, b, 7]"):
print("second case")
As the library author, I hope you are prepared for many, many bug
reports about why people's code behaves differently if they use if
compared to elif
.
Another misfeature: the ability to scatter your pattern matching cases all over the place.
m = Matcher(expression)
do_this()
do_that()
if m.case(something):
process("this case")
do_another_thing()
class Spam:
# snip five pages of code
if m.case(something_else):
print("Did you forget we were inside a match pseudo-block?")
Objection: "But coders won't do that!"
No, sensible coders won't do that. With pattern matching syntax, even the other sort of coder can't do that.
Objection: "But they could put five pages of code inside a case block too."
True, but only if it is conditional to that specific case. You can't mix unconditional code and cases. And the extra indentation will hint that something odd is going on.
Another misfeature: the ability to modify the value being matched in the middle of the pattern matching pseudo-block.
m = Matcher(something)
if m.case(spam):
process("case 1")
m = Matcher(another_thing)
if m.case(eggs):
process("case 2")
People can write obfuscated, confusing, poor-quality code with anything, but syntax can limit their opportunities to do so.
-- Steve
On 24/11/20 9:44 am, David Mertz wrote:
m = Matcher(arbitrary_expression) if m.case("StringNode(s)"): process_string(m.val) elif m.case("[a, 5, 6, b]"): process_two_free_vars(*m.values) elif m.case("PairNone(a, b)"): a, b = m.values process_pair(a, b) elif m.case("DictNode"): foo = {key, process_node(child_node) for key, child_node in m.values.items()}
How does the Matcher object know the meanings of "StringNode", "PairNode", etc.? Presumably they're bound to classes in the namespace of the caller, but the Matcher doesn't know anything about that namespace.
-- Greg
On Mon, 23 Nov 2020 16:15:12 -0000 "Brian Coleman" brianfcoleman@gmail.com wrote:
Furthermore, Python has a regular expression module which implements it's own DSL for the purpose of matching string patterns. Regular expressions, in a similar way to pattern matching, allow string patterns to be expressed in a concise and declarative manner.
Uh, without regular expressions, a lot of functions would be massively more complicated and annoying to write.
However, your example shows that pattern matching barely makes common code shorter (admittedly, on this _one_ example, but that's the one you chose ;-)).
While I agree that regular expressions are far less Pythonic than the proposed variant of pattern matching, they also have a tremendously better cost/benefit ratio, IMHO.
Regards
Antoine.
Antoine Pitrou wrote:
On Mon, 23 Nov 2020 16:15:12 -0000 "Brian Coleman" brianfcoleman@gmail.com wrote:
Furthermore, Python has a regular expression module which implements it's own DSL for the purpose of matching string patterns. Regular expressions, in a similar way to pattern matching, allow string patterns to be expressed in a concise and declarative manner. Uh, without regular expressions, a lot of functions would be massively more complicated and annoying to write. However, your example shows that pattern matching barely makes common code shorter (admittedly, on this _one_ example, but that's the one you chose ;-)). While I agree that regular expressions are far less Pythonic than the proposed variant of pattern matching, they also have a tremendously better cost/benefit ratio, IMHO. Regards Antoine.
In my opinion, the object oriented solution to this problem is to use the
visitor pattern. The solution using the visitor pattern is almost twice the length of the
other solutions. Pattern matching is certainly no worse than the solution using a chain of
isinstance
checks in this case.
However as the patterns to be checked against a candidate object become more complex, I believe that the pattern matching solution will retain the same level of elegance and obviousness that it possesses in this example, whereas the solution involving a chain of comparisons will quickly degrade in terms of obviousness.
On 24/11/20 9:31 am, Brian Coleman wrote:
In my opinion, the object oriented solution to this problem is to use the visitor pattern.
Which is a good thing only if you believe that OO solutions are always better than non-OO ones.
IMO, the visitor pattern is a workaround for when your language doesn't permit introspection on types. It's unnecessary in Python and better avoided.
-- Greg
On Mon, Nov 23, 2020 at 8:20 AM Brian Coleman brianfcoleman@gmail.com wrote:
Take as an example a function designed to process a tree of nodes similar to that which might be output by a JSON parser. There are 4 types of node:
The function transforms a tree of nodes, beginning at the root node, and proceeding recursively through each child node in turn. The result is a Python object, with the following transformation applied to each node type:
->
Python str
->
Python float
->
Python list
->
Python dict
I have implemented this function using 3 different approaches:
isinstance
checks against the node typeI've always thought that the alternative to a "switch case" construct in Python (and I suppose most OO languages) is subclassing and method overriding. I guess that's what the "visitor pattern" is here, but it seems to be adding a bunch of unnecessary bolierplate.
So: given that you have a special "Node" object anyway, the thing to do is to have those Node object know how to unpack themselves. Then the top "traverse the tree" function becomes a single method or attribute access:
tree = node_tree.value
here's what the Nodes look like in this example:
class Node: def __init__(self, val): self._value = val
@property
def value(self):
return self._value
class StringNode(Node): pass
class NumberNode(Node): pass
class ListNode(Node): @property def value(self): return [item.value for item in self._value]
class DictNode(Node): @property def value(self): return {k: item.value for k, item in self._value.items()}
Of course, this requires that you have control of the Node objects, rather than getting them from some other library -- but that seems to be what all the examples here are anyway.
If you do need to parse out a tree of object that are not "special" already, then you need to do some type of pattern matching / isinstance checking. In this case, I wrote a function that builds up a tree of Nodes from arbitrary Python objects:
def make_nodes_from_obj(obj): if isinstance(obj, str): return StringNode(obj) if isinstance(obj, Real): return NumberNode(obj) if isinstance(obj, Sequence): return ListNode([make_nodes_from_obj(item) for item in obj]) if isinstance(obj, Mapping): return DictNode({k: make_nodes_from_obj(item) for k, item in obj.items()})
And that could benefit from pattern matching, I suppose, though it's not very compelling to me.
And in "real world" code, I've done just this -- building a system for saving / restoring dataclasses to/from JSON. In that case, each of the dataclasses knows how to save itself and build itself from JSON-compatible python objects (numbers, dicts, strings, lists) -- so again, no need for pattern matching there either. And what I really like about the approach of putting all the logic in the "nodes" is that I can make new types of nodes without having to touch the code at the "top" that visits those nodes.
In short -- I'm still looking for a more compelling example :-)
-CHB
--
Christopher Barker, Ph.D. Oceanographer
Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception
Chris.Barker@noaa.gov
On Mon, 18 Jan 2021 12:41:45 -0800 Chris Barker via Python-Dev python-dev@python.org wrote:
And in "real world" code, I've done just this -- building a system for saving / restoring dataclasses to/from JSON. In that case, each of the dataclasses knows how to save itself and build itself from JSON-compatible python objects (numbers, dicts, strings, lists) -- so again, no need for pattern matching there either. And what I really like about the approach of putting all the logic in the "nodes" is that I can make new types of nodes without having to touch the code at the "top" that visits those nodes.
Note that another approach is the little-used
functools.singledispatch
.
Regards
Antoine.