Take as an example a function designed to process a tree of nodes similar to that which might be output by a JSON parser. There are 4 types of node:
- A node representing JSON strings
- A node representing JSON numbers
- A node representing JSON arrays
- A node representing JSON dictionaries
The function transforms a tree of nodes, beginning at the root node, and proceeding recursively through each child node in turn. The result is a Python object, with the following transformation applied to each node type:
- A JSON string `->` Python `str`
- A JSON number `->` Python `float`
- A JSON array `->` Python `list`
- A JSON dictionary `->` Python `dict`
I have implemented this function using 3 different approaches:
- The visitor pattern
- `isinstance` checks against the node type
- Pattern matching
Here is the implementation using the visitor pattern:
```
from typing import List, Tuple
class NodeVisitor:
def visit_string_node(self, node: StringNode):
pass
def visit_number_node(self, node: NumberNode):
pass
def visit_list_node(self, node: ListNode):
pass
def visit_dict_node(self, node: DictNode):
pass
class Node:
def visit(visitor: NodeVisitor):
raise NotImplementedError()
class StringNode(Node):
value: str
def visit(self, visitor: NodeVisitor):
visitor.visit_string_node(self)
class NumberNode(Node):
value: str
def visit(self, visitor: NodeVisitor):
visitor.visit_number_node(self)
class ListNode(Node):
children: List[Node]
def visit(self, visitor: NodeVisitor):
visitor.visit_list_node(self)
class DictNode(Node):
children: List[Tuple[str, Node]]
def visit(self, visitor: NodeVisitor):
visitor.visit_dict_node(self)
class Processor(NodeVisitor):
def process(root_node: Node):
return root_node.visit(self)
def visit_string_node(self, node: StringNode):
return node.value
def visit_number_node(self, node: NumberNode):
return float(node.value)
def visit_list_node(self, node: ListNode):
return [child_node.visit(self) for child_node in node.children]
def visit_dict_node(self, node: DictNode):
return {key: child_node.visit(self) for key, child_node in node.children}
def process(root_node: Node):
processor = Processor()
return processor.process(root_node)
```
Here is the implementation using `isinstance` checks against the node type:
```
from typing import List, Tuple
class Node:
pass
class StringNode(Node):
value: str
class NumberNode(Node):
value: str
class ListNode(Node):
children: List[Node]
class DictNode(Node):
children: List[Tuple[str, Node]]
def process(root_node: Node):
def process_node(node: Node):
if isinstance(node, StringNode):
return node.value
elif isinstance(node, NumberNode):
return float(node.value)
elif isinstance(node, ListNode):
return [process_node(child_node) for child_node in node.children]
elif isinstance(node, DictNode):
return {key: process_node(child_node) for key, child_node in node.children}
else:
raise Exception('Unexpected node')
return process_node(root_node)
```
Finally here is the implementation using pattern matching:
```
from typing import List, Tuple
class Node:
pass
class StringNode(Node):
value: str
class NumberNode(Node):
value: str
class ListNode(Node):
children: List[Node]
class DictNode(Node):
children: List[Tuple[str, Node]]
def process(root_node: Node):
def process_node(node: Node):
match node:
case StringNode(value=str_value):
return str_value
case NumberNode(value=number_value):
return float(number_value)
case ListNode(children=child_nodes):
return [process_node(child_node) for child_node in child_nodes]
case DictNode(children=child_nodes):
return {key: process_node(child_node) for key, child_node in child_nodes}
case _:
raise Exception('Unexpected node')
return process_node(root_node)
```
Here are the lengths of the different implementations:
- Pattern matching `->` 37 lines
- `isinstance` checks `->` 36 lines
- The visitor pattern `->` 69 lines
The visitor pattern implementation is by far the most verbose solution, weighing in at almost twice the length of the alternative implementations due to the large amount of boilerplate that is necessary to achieve double dispatch. The pattern matching and `isinstance` check implementations are very similar in length for this trivial example.
In each implementation, there are 2 operations performed on each node.
- Determine the type of the node
- Destructure the node to extract the desired data
The visitor pattern and `isinstance` check implementations separate these 2 operations, whereas the pattern matching approach combines the operations together. I believe that it is the declarative nature of pattern matching, where the operations of determining the type of the node and destructuring the node are combined into a single clause, which allows pattern matching to express a concise solution to the problem. In this trivial example, the advantage of pattern matching over the alternative of using a sequence of `if`-`elif`-`else` statements is not as obvious as it would be when compared to a more complex example, where a sub-tree of nodes might be matched based on their type and be destructured in a single clause.
I have seen elsewhere an argument that pattern matching should not be accepted into Python as it introduces a pseudo-DSL that is separate from the rest of the language. I agree that pattern matching might be viewed as a pseudo-DSL, but I believe that it is a good thing, if it allows the solution to certain classes of problems to be expressed in a concise manner. People often raise similar objections to operator overloading in other languages, whereas the presence of operator overloading in Python allows mathematical expressions involving custom numeric types such as vectors to be expressed in a natural way. Furthermore, Python has a regular expression module which implements it's own DSL for the purpose of matching string patterns. Regular expressions, in a similar way to pattern matching, allow string patterns to be expressed in a concise and declarative manner.
I really hope that the Steering Council accepts pattern matching into Python. I think that it allows for processing of heterogeneous graphs of objects using recursion in a concise, declarative manner. I would like to thank the authors of the Structural Pattern Matching PEP for their hard work in designing this feature and developing an implementation of it. I believe that it will be a wonderful addition to the language that I am very much looking forward to using.