[Python-ideas] Pattern matching

Anthony Towns aj at erisian.com.au
Wed Apr 8 07:12:43 CEST 2015


On 8 April 2015 at 05:25, Szymon Pyżalski <szymon at pythonista.net> wrote:

> Any object can be a pattern. If an object has a special method (called
> ``__pattern__`` or ``__match__``) defined then this method will govern
> how this object behaves as pattern. If not - pattern matching works
> the same as ``__eq__``
>


> The ``__pattern__`` method can return ``False`` for no match. For a
> match it can return ``True`` (simplest patterns) a sequence or a mapping
>

​Always returning a sequence or None would seem simpler to deal with,
wouldn't it?

I'm imaginging the default implementations would be something like:

  class object:
    def __match__(self, item):
      if item == self​:
        return [item]
      else:
        return None

  class type:
    def __match__(cls, item):
      if isinstance(item, cls):
        return [item]
      else:
        return None

  class tuple:
    def __match__(self, item):
      if isinstance(item, tuple):
        if all( si.__match__(ii) for si,ii in zip(self, item) ):
          return item

      return None

Syntax for pattern matching
> - -------------------------------
> The syntax could look something like this::
>
> for object:
>     pattern1: statement1
>

Is the two level indent actually a win? It saves re-evaluating the
expression, but you could just as easly have:

  object = f(x, y+w, g(z/3 for z in x))
  match object ~~ pattern1 as a,b,c:
    statements...
  ormatch object ~~ pattern2 as a,d:
    statements...

​but in that case you're coming pretty close to an 'if' block that's able
to set locals:

  object = ...
  if object ~~ pattern1 as a,b,c:
    ...
  elif object ~~ pattern2 as a,d:
    ...

(Using ~~ as a placeholder for the syntax for "matches against")

    for point:
>         (Number, Number) as x, y:
>             print_point(x, y)
>         {'x': Number, 'y': Number}:
>             print_point(x, y)
>         re.compile(
> ​​
> '(?P<x>[0-9]+)-(?P<y>[0-9])+'):
>         # Or re.compile('([0-9]+)-([0-9])+') as x, y:
>             print_point(int(x), int(y))
>         ...:
>             raise TypeError('Expected something else')
>

My understanding of the way Haskell does name assignment in case matching
is that it lets you put placeholders in a constructor. So where you might
say

  x = MyX(a=1, b=2)

to create an object 'x', you could then say something along the lines of:

  x ~~ MyX(a=foo, b=bar)

to pull the values (1, 2) back out (as foo and bar). Doing exactly that
doesn't make sense for python because reversing a constructor doesn't make
much sense. If you could make that work, you'd get the python version of
Haskell pattern matching looking like:

 re_pt = re.compile(​'(?P<x>[0-9]+)-(?P<y>[0-9])+')
 case point of:
   (Number(x), Number(y)): ...
   ('x': Number(x), 'y': Number(y)): ...
   re_pt(x=x, y=y): ..

​where you're at least mentioning the 'x' and 'y' variables that are
getting filled in.

Okay, probably horrible idea:

Matching syntax looks like:

  case point is:
    (x:=Number, y:=Number): ...

a:=x is the same as calling assigning_match(locals(), x, "a") run in the
local scope.
a,b:=x is the same as calling assigning_match(locals(), x, ("a", "b")).

'case obj is: \n pattern[0]: stmt[0] \n pattern[1]: stmt[1]' does:

   for i in range(2):
     try:
       pattern[i].__match__(obj)
     except NoMatch:
       continue
     stmt[i]
     break


__match__() raises NoMatch on failure, or returns the match:

  class object:
    def __match__(self, item):
      if self == item:
        return item
      raise NoMatch

  class type:
    def __match__(self, item):
      if isinstance(item, self):
        return item
      raise NoMatch

  class tuple:
    def __match__(self, item):
      if isinstance(item, tuple):
        if all( s_i.__match__(i_i) for s_i, i_i in zip(self, item) ):
            return item
      raise NoMatch

  class dict:
    def __match__(self, item):
      if isinstance(item, dict):
        if item.keys() == self.keys():
          if all(s_v.__match__(item[s_k]) for s_k,s_v in self.items()):
            return item
      raise NoMatch

​  class SRE_Pattern:
    def __match__(self, item):
       m = self.match(it​em)
       if m is None:
           raise NoMatch
       return m.groups()

​assigning_match works something like:​

  class assigning_match(object):
    def __init__(self, l, matchobj, varname):
      assert isinstance(varname, str) or isinstance(varname, tuple)

      self.l = l
      self.m = matchobj
      self.n = varname

    def __match__(self, item):
      r = self.m.__match__(item)
      if isinstance(self.n, str):
          self.l[self.n] = other
      else:
          for n, r_i in zip(self.n, r):
              self.l[n] = r_i
      return r

Then the example looks like:

  case point is:
    (x:=Number, y:=Number):
        ...
    {"x": x:=Number, "y": y:=Number}:
        ...
    x,y:=re_pt:
        ...

Being able to pull out and name something that's a deep part of an object
seems like it could be useful; otherwise it's not adding much over if/elif.

Though maybe it'd be interesting to have a "deconstructor" protocol for
that instead, ie:

    Foo(a,b,key=c) = foo

is equivalent to something like:

    x = Foo.__deconstruct__(foo)
    a = x[0]
    b = x[1]
    c = x["key"]

I think it might even extend sanely to unpacking subobjects:

    Foo(Bar(a), b) = foo

becomes:

    x = Foo.__deconstruct__(foo)
    Bar(a) = x[0]
    b = x[1]

becomes:

    x = Foo.__deconstruct__(foo)
    x2 = Bar.__deconstruct__(x[0])
    a = x2[0]
    b = x[1]

Using "_" to accept and ignore any value would work fine there. If you
allowed literals instead of an identifier, you could treat:

    Foo(0, 99, key="x") = foo

as:

    x = Foo.__deconstruct__(foo)
    assert 0 == x[0]
    assert 99 == x[1]
    assert "x" = x["key"]

That'd be enough to use it for matching I think:

    case foo is:
       Foo(a,b): ...

would become:

    try:
         Foo(a,b) = foo
    except:
         continue
     ...
     break

But it might be confusing figuring out whether:

    b = 0
    case foo is:
        Foo(a,b): print(a)

should result in value checking:

    if foo ~~ Foo(a,0): print(a)

or value assignment:

    del b
    if foo ~~ Foo(a,b): print(a)

Eh, I don't know. Maybe something there is interesting though.

Cheers,
aj

-- 
Anthony Towns <aj at erisian.com.au>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20150408/d5eee073/attachment-0001.html>


More information about the Python-ideas mailing list