I've taken a look through PEP 622 and I've been thinking about how it
could be used with sympy.
In principle case/match and destructuring should be useful for sympy
because sympy has a class Basic which defines a common structure for
~1000 subclasses. There are a lot of places where it is necessary to
dispatch on the type of some object, including places that are
performance sensitive, so those would seem like good candidates for
case/match. However the PEP doesn't quite work as I had hoped because it
only handles positional arguments indirectly and it does not seem to
directly handle types with variadic positional args.
The objects I refer to in sympy represent mathematical expressions e.g.:
>>> from sympy import *
>>> x, y = symbols('x, y')
>>> expr = x**2 + 2*x*y
>>> expr
x**2 + 2*x*y
You can see the structure of the object explicitly using sympy's srepr function:
>>> print(srepr(expr))
Add(Pow(Symbol('x'), Integer(2)), Mul(Integer(2), Symbol('x'), Symbol('y')))
There are a bunch of classes there (Add, Pow, Symbol, Mul, Integer)
but these are a tiny subset of the possibilities. The key feature of
Basic instances is that they have an .args attribute which can be used
to rebuild the object like:
>>> expr.args
(x**2, 2*x*y)
>>> type(expr)
<class 'sympy.core.add.Add'>
>>> type(expr)(*expr.args)
x**2 + 2*x*y
>>> type(expr)(*expr.args) == expr
True
This is known as the func-args invariant in sympy and is used to
destructure and rebuild the expression tree in different ways e.g. for
performing a substitution:
>>> expr.subs(x, 5)
10*y + 25
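To make the func-args invariant concrete, here is a minimal sketch of a
substitution written purely in terms of type(expr) and expr.args. The Node
class and the substitute function are hypothetical stand-ins invented for
this illustration, not sympy's implementation, and unlike sympy nothing is
evaluated or simplified, so the result stays as an unevaluated tree.

```python
class Node:
    """Hypothetical stand-in for sympy's Basic: positional args only."""

    def __init__(self, *args):
        self.args = args

    def __eq__(self, other):
        return type(self) is type(other) and self.args == other.args

    def __hash__(self):
        return hash((type(self), self.args))

    def __repr__(self):
        inner = ", ".join(repr(a) for a in self.args)
        return f"{type(self).__name__}({inner})"


class Symbol(Node):
    pass


class Add(Node):
    pass


class Mul(Node):
    pass


class Pow(Node):
    pass


def substitute(expr, old, new):
    """Replace every occurrence of old in expr by new."""
    if expr == old:
        return new
    if not isinstance(expr, Node):
        # Plain Python leaves (ints, strings) are returned unchanged.
        return expr
    # Destructure with .args and rebuild with the same type.
    new_args = [substitute(arg, old, new) for arg in expr.args]
    return type(expr)(*new_args)


x, y = Symbol("x"), Symbol("y")
expr = Add(Pow(x, 2), Mul(2, x, y))
result = substitute(expr, x, 5)
```

The only operations used on a node are type(expr) and expr.args, which is
why the invariant matters so much for generic tree-walking code.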
All Basic classes are strictly constructed using positional-only
arguments and not keyword arguments. In the PEP it seems that we can
handle positional arguments when their number is fixed by the type.
For example a simplified version of Pow could be:
    class Pow:

        def __init__(self, base, exp):
            self.args = (base, exp)

        __match_args__ = ("base", "exp")

        @property
        def base(self):
            return self.args[0]

        @property
        def exp(self):
            return self.args[1]
Then I could match Pow in case/match with
    obj = Pow(Symbol('x'), Integer(4))

    match obj:
        case Pow(base, exp):
            # do stuff with base, exp
It seems awkward and inefficient though to go through __match_args__
and the base and exp property-methods to match the positional
arguments when they are already available as a tuple in obj.args. Note
that performance is a concern: just dispatching on isinstance() has a
measurable overhead in sympy code which is almost always CPU-bound.
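As I read the PEP, the indirection looks roughly like the sketch below: a
type check, a lookup of the attribute names in __match_args__, and then one
attribute access per position, each going through a property. The two helper
functions are hypothetical names made up for this comparison and only
approximate what the interpreter would do for "case Pow(base, exp)".

```python
class Pow:
    """Simplified Pow, as in the example above."""

    __match_args__ = ("base", "exp")

    def __init__(self, base, exp):
        self.args = (base, exp)

    @property
    def base(self):
        return self.args[0]

    @property
    def exp(self):
        return self.args[1]


def match_pow_by_hand(obj):
    """Approximate the steps behind `case Pow(base, exp)`."""
    if not isinstance(obj, Pow):           # type check
        return None
    names = Pow.__match_args__             # positions -> attribute names
    base = getattr(obj, names[0])          # property call for position 0
    exp = getattr(obj, names[1])           # property call for position 1
    return base, exp


def match_pow_via_args(obj):
    """The shortcut one would like: unpack the tuple that already exists."""
    if not isinstance(obj, Pow) or len(obj.args) != 2:
        return None
    return obj.args


obj = Pow("x", 4)
```

Both helpers return the same pair, but the first needs two property calls
to recover values that are already sitting together in obj.args.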
The main problem though is with variadic positional arguments. For
example sympy has a symbolic Tuple class which is much like a regular
python tuple except that it takes multiple positional args rather than
a single iterable arg:
    class Tuple:

        def __init__(self, *args):
            self.args = args
So now how do I match a 2-Tuple of two integers? I can't use
__match_args__ because that's a class attribute and different
instances have different numbers of args. It seems I can do this:
    obj = Tuple(2, 4)

    match obj:
        case Tuple(args=(2, 4)):
That's awkward though because it doesn't match the constructor syntax
which strictly uses positional-only args. It also doesn't scale well
with nesting:
    obj = Tuple(Tuple(1, 2), Tuple(3, 4))

    match obj:
        case Tuple(args=(Tuple(args=(1, 2)), Tuple(args=(3, 4)))):
            # handle ((1, 2), (3, 4)) case
Another option would be to fake a single positional argument for
matching purposes:
    class Tuple:

        __match_args__ = ("args",)

        def __init__(self, *args):
            self.args = args

    match obj:
        case Tuple((Tuple((1, 2)), Tuple((3, 4)))):
This requires an extra level of brackets for each node and also
doesn't match the actual constructor syntax: evaluating that pattern
in sympy turns each Tuple into a 1-Tuple containing another Tuple of
the args:
>>> t = Tuple((Tuple((1, 2)), Tuple((3, 4))))
>>> print(srepr(t))
Tuple(Tuple(Tuple(Tuple(Integer(1), Integer(2))),
Tuple(Tuple(Integer(3), Integer(4)))))
I've used Tuple in the examples above but the same applies to all
variadic Basic classes: Add, Mul, And, Or, FiniteSet, Union,
Intersection, ProductSet, ...
From a first glance at the proposal I thought I could do matches like this:
    match obj:
        case Add(Mul(x, y), Mul(z, t)) if y == t:
        case Add(*terms):
        case Mul(coeff, *factors):
        case And(Or(A, B), Or(C, D)) if B == D:
        case Union(Interval(x1, y1), Interval(x2, y2)) if y1 == x2:
        case Union(Interval(x, y), FiniteSet(*p)) | Union(FiniteSet(*p), Interval(x, y)):
        case Union(*sets):
To anyone who knows the sympy codebase each of those patterns would look
quite natural because they resemble the constructors for the corresponding
objects (as intended in the PEP). It seems instead that many of these
patterns would need to use args= so it becomes:
    match obj:
        case Add(args=(Mul(args=(x, y)), Mul(args=(z, t)))) if y == t:
        case Add(args=terms):
        case Mul(args=(coeff, *factors)):
        case And(args=(Or(args=(A, B)), Or(args=(C, D)))) if B == D:
        case Union(args=(Interval(x1, y1), Interval(x2, y2))) if y1 == x2:
        case Union(args=(Interval(x, y), FiniteSet(args=p))) |
                Union(args=(FiniteSet(args=p), Interval(x, y))):
        case Union(args=sets):
Each of these looks less natural as they don't match the constructors
and the syntax gets messier with nesting.
Oscar