[Python-ideas] Pattern matching

Chris Angelico rosuav at gmail.com
Tue Apr 7 21:52:23 CEST 2015


On Wed, Apr 8, 2015 at 5:25 AM, Szymon Pyżalski <szymon at pythonista.net> wrote:
> Syntax for pattern matching
> -------------------------------
>
> The syntax could look something like this::
>
> for object:
>     pattern1: statement1
>     pattern2:
>         statement2
>         statement3
>     pattern3 as var1, var2:
>         statement4
>         statement5
>
> For the simplest cases this works simply by calling the appropriate
> statement block. If the ``__pattern__`` method returns a mapping, then
> the values from it will be merged into the local namespace. If it
> returns a sequence and there is the ``as`` clause, the values will be
> assigned to variables specified.

So, tell me if I'm understanding you correctly: The advantage over a
basic if/elif/else tree is that it can assign stuff as well as giving
a true/false result? Because if all you want is the true/false,
there's not a lot of advantage over what currently exists.

The sequence and "as" clause syntax is reasonable, but I do not like
the idea that a mapping's keys would quietly become local name
bindings. It'd be like "from ... import *" inside a function -
suddenly _any_ name could become local, without any way for the
compiler to know. Also - although this is really just bikeshedding -
not a fan of the use of 'for' here. I'm imagining code something like
this:

for value in gen(): # this one iterates
    for value: # this one doesn't
        int: yield ("(%d)" if value<0 else "%d") % value
        str: yield repr(value)
        datetime.datetime: yield value.strftime(timefmt)
        ...: yield str(value)

Come to think of it, not a fan of ellipsis here either; "else" seems
better. But that's minor bikeshedding too.

The key is to show how this is materially better than an if/elif tree
*and* better than a dictionary lookup. (Obviously isinstance checks
are superior to dict lookups based on type(value), but I don't know
how often people actually write code like the above.)
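
To make the comparison concrete, here is a rough sketch of the two
existing alternatives for the example above, written as plain functions
rather than a generator. The names format_isinstance, format_dict,
FORMATTERS, TIMEFMT and MyInt are made up for illustration; nothing
here is part of the proposal.

```python
import datetime

TIMEFMT = "%Y-%m-%d"  # hypothetical format string, stands in for timefmt


def format_isinstance(value):
    """if/elif tree: isinstance also accepts subclasses."""
    if isinstance(value, int):
        return ("(%d)" if value < 0 else "%d") % value
    elif isinstance(value, str):
        return repr(value)
    elif isinstance(value, datetime.datetime):
        return value.strftime(TIMEFMT)
    else:
        return str(value)


# Dictionary lookup keyed on the exact type of the value.
FORMATTERS = {
    int: lambda v: ("(%d)" if v < 0 else "%d") % v,
    str: repr,
    datetime.datetime: lambda v: v.strftime(TIMEFMT),
}


def format_dict(value):
    """dict lookup: only exact types hit, subclasses fall through to str."""
    return FORMATTERS.get(type(value), str)(value)


class MyInt(int):
    """Hypothetical int subclass used to show where the two differ."""


print(format_isinstance(MyInt(-3)))  # takes the int branch
print(format_dict(MyInt(-3)))        # misses the int entry
```

The two agree on exact types and diverge on subclasses, which is the
sense in which the isinstance version is the better baseline for any
new syntax to beat.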

> Besides the above syntax the patterns could be also used to register
> functions for dispatching. I don't know how precisely it would
> integrate with PEP 484, but I agree that any valid type hint should
> also be a valid pattern.
>
> Existing objects as patterns
> ----------------------------------
>
> * Types should match their instances
> * Tuples should match sequences that have the same length and whose
>   elements match against the elements of a tuple (subpatterns)
> * Dictionaries should match mappings that contain all the keys in the
>   pattern and the values match the subpatterns under these keys
> * Regular expressions should match strings. For named groups they
>   should populate the local namespace.
> * The ``Ellipsis`` object can be used as a match-all pattern.

This is where the type hints could save you a lot of trouble. They
already define a sloppy form of isinstance checking, so you don't need
to do all the work of defining tuples, dicts, etc - all you have to
say is "typing.py type hint objects match any object which they would
accept", and let the details be handled elsewhere. That'd get you
Union types and such, as well as what you're talking about above.
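
As a very loose sketch of what "match any object which they would
accept" might mean, assuming a typing module that provides get_origin
and get_args (Python 3.8 or later), something like the helper below
would cover plain classes, Union, and fixed-length Tuple hints. The
function name matches is hypothetical, and this is nowhere near a full
implementation of type hint semantics.

```python
import typing
from numbers import Number


def matches(hint, obj):
    """Return True if obj looks acceptable for hint (sketch only)."""
    origin = typing.get_origin(hint)
    if origin is typing.Union:
        # Union[A, B] accepts anything that any member accepts.
        return any(matches(arg, obj) for arg in typing.get_args(hint))
    if origin is tuple:
        # Only fixed-length Tuple[A, B] is handled, not Tuple[A, ...].
        args = typing.get_args(hint)
        return (
            isinstance(obj, tuple)
            and len(obj) == len(args)
            and all(matches(a, o) for a, o in zip(args, obj))
        )
    if origin is not None:
        # Other generics: check only the container type, not its contents.
        return isinstance(obj, origin)
    return isinstance(obj, hint)


print(matches(typing.Union[int, str], "spam"))
print(matches(typing.Tuple[Number, Number], (1, 2.5)))
print(matches(typing.Tuple[Number, Number], (1, "x")))
```

The point is only that the tuple rule from the list above falls out of
the hint object itself, and Union comes along for free, without the
pattern proposal having to define either.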

>     for point:
>         (Number, Number) as x, y:
>             print_point(x, y)
>         {'x': Number, 'y': Number}:
>             print_point(x, y)
>         re.compile('(?P<x>[0-9]+)-(?P<y>[0-9])+'):
>         # Or re.compile('([0-9]+)-([0-9])+') as x, y:
>             print_point(int(x), int(y))
>         ...:
>             raise TypeError('Expected something else')

I'm not sure how to rescue the "regex with named groups" concept, but
this code really doesn't look good. Using "as" wouldn't work reliably,
since dicts are iterable (without a useful order). Something along the
lines of:

x, y from re.compile('(?P<x>[0-9]+)-(?P<y>[0-9])+'):

or maybe switch the arguments (to be more like "from ... import x,
y")? But whatever it is, I'd prefer to be explicit: This has to return
a dictionary containing keys 'x' and 'y' (what if there are others?
Ignore them?), and then they will be bound to identical names.
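
For reference, this is roughly what the explicit version costs in
today's Python, as a sketch. The helper name extract and the constant
POINT are made up. Note one assumption: the pattern below puts the
second + inside the group, on the guess that y is meant to capture
all of its digits rather than only the last one.

```python
import re

# Assumed intent: both groups capture one or more digits.
POINT = re.compile(r'(?P<x>[0-9]+)-(?P<y>[0-9]+)')


def extract(pattern, text, names):
    """Return the requested named groups as a tuple, or None on no match.

    Groups not listed in names are ignored; a name the pattern does not
    define raises KeyError rather than being bound silently.
    """
    m = pattern.fullmatch(text)
    if m is None:
        return None
    groups = m.groupdict()
    return tuple(groups[name] for name in names)


result = extract(POINT, "12-34", ("x", "y"))
if result is not None:
    x, y = result  # the names being bound are spelled out right here
    print(int(x), int(y))
```

The names that become local are visible at the point of assignment,
which is the property I'd want any new syntax to keep.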

All in all, an interesting idea, but one that's going to require a
good bit of getting-the-head-around, so it wants some strong use
cases. Got some real-world example code that would benefit from this?

ChrisA

