[Python-ideas] Pattern matching
Anthony Towns
aj at erisian.com.au
Wed Apr 8 07:12:43 CEST 2015
On 8 April 2015 at 05:25, Szymon Pyżalski <szymon at pythonista.net> wrote:
> Any object can be a pattern. If an object has a special method (called
> ``__pattern__`` or ``__match__``) defined then this method will govern
> how this object behaves as pattern. If not - pattern matching works
> the same as ``__eq__``
>
> The ``__pattern__`` method can return ``False`` for no match. For a
> match it can return ``True`` (simplest patterns) a sequence or a mapping
>
Always returning a sequence or None would seem simpler to deal with,
wouldn't it?
I'm imaginging the default implementations would be something like:
class object:
def __match__(self, item):
if item == self:
return [item]
else:
return None
class type:
def __match__(cls, item):
if isinstance(item, cls):
return [item]
else:
return None
class tuple:
def __match__(self, item):
if isinstance(item, tuple):
if all( si.__match__(ii) for si,ii in zip(self, item) ):
return item
return None
Syntax for pattern matching
> - -------------------------------
> The syntax could look something like this::
>
> for object:
> pattern1: statement1
>
Is the two level indent actually a win? It saves re-evaluating the
expression, but you could just as easly have:
object = f(x, y+w, g(z/3 for z in x))
match object ~~ pattern1 as a,b,c:
statements...
ormatch object ~~ pattern2 as a,d:
statements...
but in that case you're coming pretty close to an 'if' block that's able
to set locals:
object = ...
if object ~~ pattern1 as a,b,c:
...
elif object ~~ pattern2 as a,d:
...
(Using ~~ as a placeholder for the syntax for "matches against")
for point:
> (Number, Number) as x, y:
> print_point(x, y)
> {'x': Number, 'y': Number}:
> print_point(x, y)
> re.compile(
>
> '(?P<x>[0-9]+)-(?P<y>[0-9])+'):
> # Or re.compile('([0-9]+)-([0-9])+') as x, y:
> print_point(int(x), int(y))
> ...:
> raise TypeError('Expected something else')
>
My understanding of the way Haskell does name assignment in case matching
is that it lets you put placeholders in a constructor. So where you might
say
x = MyX(a=1, b=2)
to create an object 'x', you could then say something along the lines of:
x ~~ MyX(a=foo, b=bar)
to pull the values (1, 2) back out (as foo and bar). Doing exactly that
doesn't make sense for python because reversing a constructor doesn't make
much sense. If you could make that work, you'd get the python version of
Haskell pattern matching looking like:
re_pt = re.compile('(?P<x>[0-9]+)-(?P<y>[0-9])+')
case point of:
(Number(x), Number(y)): ...
('x': Number(x), 'y': Number(y)): ...
re_pt(x=x, y=y): ..
where you're at least mentioning the 'x' and 'y' variables that are
getting filled in.
Okay, probably horrible idea:
Matching syntax looks like:
case point is:
(x:=Number, y:=Number): ...
a:=x is the same as calling assigning_match(locals(), x, "a") run in the
local scope.
a,b:=x is the same as calling assigning_match(locals(), x, ("a", "b")).
'case obj is: \n pattern[0]: stmt[0] \n pattern[1]: stmt[1]' does:
for i in range(2):
try:
pattern[i].__match__(obj)
except NoMatch:
continue
stmt[i]
break
__match__() raises NoMatch on failure, or returns the match:
class object:
def __match__(self, item):
if self == item:
return item
raise NoMatch
class type:
def __match__(self, item):
if isinstance(item, self):
return item
raise NoMatch
class tuple:
def __match__(self, item):
if isinstance(item, tuple):
if all( s_i.__match__(i_i) for s_i, i_i in zip(self, item) ):
return item
raise NoMatch
class dict:
def __match__(self, item):
if isinstance(item, dict):
if item.keys() == self.keys():
if all(s_v.__match__(item[s_k]) for s_k,s_v in self.items()):
return item
raise NoMatch
class SRE_Pattern:
def __match__(self, item):
m = self.match(item)
if m is None:
raise NoMatch
return m.groups()
assigning_match works something like:
class assigning_match(object):
def __init__(self, l, matchobj, varname):
assert isinstance(varname, str) or isinstance(varname, tuple)
self.l = l
self.m = matchobj
self.n = varname
def __match__(self, item):
r = self.m.__match__(item)
if isinstance(self.n, str):
self.l[self.n] = other
else:
for n, r_i in zip(self.n, r):
self.l[n] = r_i
return r
Then the example looks like:
case point is:
(x:=Number, y:=Number):
...
{"x": x:=Number, "y": y:=Number}:
...
x,y:=re_pt:
...
Being able to pull out and name something that's a deep part of an object
seems like it could be useful; otherwise it's not adding much over if/elif.
Though maybe it'd be interesting to have a "deconstructor" protocol for
that instead, ie:
Foo(a,b,key=c) = foo
is equivalent to something like:
x = Foo.__deconstruct__(foo)
a = x[0]
b = x[1]
c = x["key"]
I think it might even extend sanely to unpacking subobjects:
Foo(Bar(a), b) = foo
becomes:
x = Foo.__deconstruct__(foo)
Bar(a) = x[0]
b = x[1]
becomes:
x = Foo.__deconstruct__(foo)
x2 = Bar.__deconstruct__(x[0])
a = x2[0]
b = x[1]
Using "_" to accept and ignore any value would work fine there. If you
allowed literals instead of an identifier, you could treat:
Foo(0, 99, key="x") = foo
as:
x = Foo.__deconstruct__(foo)
assert 0 == x[0]
assert 99 == x[1]
assert "x" = x["key"]
That'd be enough to use it for matching I think:
case foo is:
Foo(a,b): ...
would become:
try:
Foo(a,b) = foo
except:
continue
...
break
But it might be confusing figuring out whether:
b = 0
case foo is:
Foo(a,b): print(a)
should result in value checking:
if foo ~~ Foo(a,0): print(a)
or value assignment:
del b
if foo ~~ Foo(a,b): print(a)
Eh, I don't know. Maybe something there is interesting though.
Cheers,
aj
--
Anthony Towns <aj at erisian.com.au>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150408/d5eee073/attachment-0001.html>
More information about the Python-ideas
mailing list