[Python-ideas] Match statement brainstorm

Franklin Lee leewangzhong+python at gmail.com
Fri May 20 06:48:13 EDT 2016

I think there should be different syntaxes for matching equality and
binding patterns, and definitely different syntax for singular and
plural cases.

Let's try this:
- Equality:
    `case 5:`
- Conditional:
    `case if predicate(obj):`
- Pattern-matching:
    `case as [a, b, *_]:`
    `case as Point(x, y):`

Slightly-more-explicit checks, instead of simply `case 5:`.
- `case == 5:`
- `case is _sentinel:`
- `case is None:`
- `case < 6:`
- `case in (1,2,3):`
- `case in range(2, 5):`
- `case in int:`

The above uses an implicit object and is very different from how
Python usually works. Though I think it's mentally consistent (as in,
no contradictions), it's not an extrapolation of current syntax, and
might go against "explicit self". It isn't necessary, though: we could
require `case 5:` and `case if obj == 5:` instead. I prefer `case == 5:`
and `case is 5:`, though, even if that's not how other languages do it.

The last one treats a type as a collection of its instances, which I
just like. (I also like `SubClass < BaseClass`.)

On Thu, May 19, 2016 at 4:20 AM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> If we want this feature, it seems like we will need to
> explicitly mark either names to be bound or expressions to
> be evaluated and matched against.

Possible pre-fix markers:
        case as Point(zero, =y):
        case as Point(zero, !y):
        case as Point(zero, $y):
        case as Point(zero, &y):
        case as Point(zero, ^y):
        case as Point(zero, :y):

Possible post-fix markers:
        case as Point(zero, y=):
        case as Point(zero, y!):
        case as Point(zero, y:):

I think the colon won't be ambiguous in dict displays (it's only a
marker if one side of the colon is missing), but it might complicate
the parser. It'd be ambiguous in slicing, but I don't see
slice-matching being a thing.

(My favorite is probably `&y`, for a bad reason: it's very C-like.
That'd confuse C programmers when `&` doesn't work anywhere else and
`*` won't work at all.)

> If you wanted the second case to only match the tuple,
> you could write
>       case tuple(a, b, *_):

Slightly naughty. The tuple constructor only takes one argument.
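
To spell that out: as a call, `tuple(a, b)` fails, so a `tuple(a, b, *_)` pattern would not mirror how tuples are actually constructed. A tiny check (the helper name `call_fails` is made up for this example):

```python
def call_fails(*args):
    """Return True if calling tuple(*args) raises TypeError."""
    try:
        tuple(*args)
    except TypeError:
        return True
    return False
```

`call_fails(1, 2)` is true, while `call_fails([1, 2])` is false because a single iterable argument is accepted.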

On Thu, May 19, 2016 at 12:15 AM, Guido van Rossum <guido at python.org> wrote:
> A few things that might be interesting to explore:

I'll try out my syntax ideas on these.

> - match by value or set of values (like those PEPs)

    `case == 1:`
    `case in (1, 2, 3):`

> - match by type (isinstance() checks)

    `case in int:`
    `case if isinstance(obj, int):`
    `case if type(obj) == tuple:`

> - match on tuple structure, including nesting and * unpacking
> (essentially, try a series of destructuring assignments until one
> works)

    `case in tuple as (first, *middle, last):`
    `case if isinstance(obj, tuple) as (first, *middle, last):`
    `case if type(obj) == tuple as (first, *middle, last):`
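
A rough sketch of what the first form could expand to, assuming the match system signals failure with an exception. `FailedMatch` and the function name are hypothetical and defined here only so the example runs.

```python
class FailedMatch(Exception):
    """Hypothetical signal that a pattern did not match."""


def match_first_middle_last(obj):
    """Roughly `case in tuple as (first, *middle, last):`."""
    if not isinstance(obj, tuple):      # the `in tuple` part
        raise FailedMatch
    try:
        first, *middle, last = obj      # the `as (...)` part
    except ValueError:                  # fewer than two items
        raise FailedMatch from None
    return first, middle, last
```

Trying "a series of destructuring assignments until one works" would then amount to calling several such functions in order and moving on whenever one raises `FailedMatch`.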

> - match on dict structure? (extension of destructuring to dicts)

    I think it makes no sense to bind the keys to names, because
they'd just be chosen arbitrarily (unless it's an OrderedDict), so
let's say keys are evaluated and values are (by default) names to
bind.

    Let `**_` mean "other items".

    `case as {'foo': x, 'bar': y, **_}:`
    `case as {key0: val0, key1: val1}: # Binds val0 and val1.`
    `case as {'foo': foo_val, var_holding_bar: bar_val, **_}:`
        ^ Ew. I'd like names-to-bind to require special syntax.
    `case as dict(foo=foo_val, **{var_holding_bar: bar_val}, **_):`

    If we had an OrderedDict syntax, binding keys makes more sense.
            case as [k0: v0, **_, klast: vlast]:

    P.S.: If `**_` is allowed in matching, it should be allowed in
unpacking assignment.
            {'foo': x, 'bar': y, **_} = d
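
    Under that reading (keys evaluated, values bound), dict matching can be approximated today with a helper. This is a sketch only: `match_dict`, `FailedMatch`, and the `allow_extra` flag (standing in for `**_`) are invented for illustration.

```python
from collections.abc import Mapping


class FailedMatch(Exception):
    """Hypothetical signal that a pattern did not match."""


def match_dict(d, keys, allow_extra=False):
    """Return the values stored under `keys`, in the order given.

    `allow_extra=True` plays the role of a trailing `**_` in the pattern.
    """
    if not isinstance(d, Mapping):
        raise FailedMatch
    if any(key not in d for key in keys):
        raise FailedMatch               # a required key is missing
    if not allow_extra and len(d) != len(set(keys)):
        raise FailedMatch               # leftover items, but no `**_`
    return tuple(d[key] for key in keys)
```

    With this, `x, y = match_dict(d, ['foo', 'bar'], allow_extra=True)` approximates `{'foo': x, 'bar': y, **_} = d`.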

> - match on instance variables or attributes by name?

    One of the following:
    `case as object(foo=x, bar=y):`
    `case as Object(foo=x, bar=y):`
    `case as SpecialAttrMatchingThing(foo=x, bar=y):`

    `SpecialAttrMatchingThing` objects would be special in the match system.
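
    Whatever the spelling, the underlying operation is presumably a series of attribute lookups that fails if any attribute is missing. A minimal sketch, with `match_attrs` and `FailedMatch` as made-up names:

```python
from types import SimpleNamespace


class FailedMatch(Exception):
    """Hypothetical signal that a pattern did not match."""


_MISSING = object()  # distinguishes "no such attribute" from a None value


def match_attrs(obj, *names):
    """Return the values of the named attributes, or raise FailedMatch."""
    values = []
    for name in names:
        value = getattr(obj, name, _MISSING)
        if value is _MISSING:
            raise FailedMatch
        values.append(value)
    return tuple(values)


# Roughly `case as object(foo=x, bar=y):`
x, y = match_attrs(SimpleNamespace(foo=1, bar=2, baz=3), "foo", "bar")
```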

> - match on generalized condition (predicate)?

    `case <= 4:`
    `case if is_something(the_obj):`
    `case as Point(x, y) if x == y:`
    `case if len(the_obj) < 5 as [first, second, *_] if isinstance(first, int):`
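
    The last form chains three steps: a leading condition, a destructuring, and a trailing condition that can use the newly bound names. One possible expansion is sketched below, assuming `the_obj` is a sized sequence and that `as [...]` means plain unpacking with no type check (the function name is made up):

```python
def match_case(the_obj):
    """Return (first, second) on a match, or None if any step fails."""
    if not len(the_obj) < 5:            # leading `if len(the_obj) < 5`
        return None
    try:
        first, second, *_ = the_obj     # `as [first, second, *_]`
    except ValueError:                  # fewer than two items
        return None
    if not isinstance(first, int):      # trailing `if isinstance(first, int)`
        return None
    return first, second
```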

== Matching user classes ==

What about using pickle dunders for this? In particular,
`MyClass.__getnewargs__` and `MyClass.__getnewargs_ex__`. Pickling is
kind of related to pattern matching. Note that the classes don't have
to be immutable to be matchable.

When matching `p` to `Point(x, y)`, the match system calls
`Point.__getnewargs__(p)` (NOT `p.__getnewargs__`, thus allowing for
some subclass matching). The result is matched against a two-tuple,
and finally bound.

        def match_constructor(obj, cls):
            if not isinstance(obj, cls):
                raise FailedMatch
            try:
                m = cls.__getnewargs_ex__
            except AttributeError:
                pass
            else:
                args, kwargs = m(obj) # Even for subclasses.
                return args, kwargs
            try:
                m = cls.__getnewargs__
            except AttributeError:
                raise FailedMatch
            args = m(obj)
            return args, {}

        # Back in the case:
        try:
            # `[] = kwargs` only succeeds when kwargs is empty.
            (x, y), [] = match_constructor(p, Point)
        except ValueError:
            raise FailedMatch
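
Here is a self-contained version that can be run to see the subclass behavior. `FailedMatch`, `Point`, `Point3D`, and `match_point` are illustrative definitions, not part of any existing API; only `__getnewargs__` and `__getnewargs_ex__` are real pickle hooks.

```python
class FailedMatch(Exception):
    """Hypothetical signal that a pattern did not match."""


def match_constructor(obj, cls):
    """Recover (args, kwargs) for `obj` using the pickle hooks of `cls`."""
    if not isinstance(obj, cls):
        raise FailedMatch
    getter = getattr(cls, "__getnewargs_ex__", None)
    if getter is not None:
        args, kwargs = getter(obj)      # looked up on cls, not on obj
        return args, kwargs
    getter = getattr(cls, "__getnewargs__", None)
    if getter is None:
        raise FailedMatch
    return getter(obj), {}


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __getnewargs__(self):
        return (self.x, self.y)


class Point3D(Point):
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z

    def __getnewargs__(self):
        return (self.x, self.y, self.z)


def match_point(p):
    """Roughly `case as Point(x, y):`."""
    try:
        (x, y), [] = match_constructor(p, Point)
    except ValueError:                  # wrong arity, or non-empty kwargs
        raise FailedMatch from None
    return x, y
```

Because the hook is looked up on `Point` rather than on the instance, `match_point(Point3D(1, 2, 3))` yields `(1, 2)` even though `Point3D` overrides `__getnewargs__`.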

The problem is that these functions return (args, kwargs), and
optional args and pos_or_kw_params mess things up.

        Point(x, y)
        Point(x, y=y)
        # How can the match system know that all of the following are
        # valid patterns?
        Something(x, y)
        Something(*args, **kwargs)

Solution 1: Additional args to the pickle methods (or make a new
method which could sometimes be used for pickling):
    - nargs: Number of args. An int, or an integer range (for `*args`).
A `range` can't be unbounded, so use `slice(min_nargs, None)`,
`(min_nargs, ...)`, or `(min_nargs, None)`.
    - kws: Keywords. To represent **kwargs, either add another arg or
pass `...` or `None` as a keyword.

Solution 2: After receiving (args, kwargs) from the class, inspect the
signature of the class constructor and just figure out how it works.
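
A sketch of how Solution 2 might work, using `inspect.signature` to normalize both the object's (args, kwargs) and the pattern's arguments to parameter names, so that `Point(x, y)` and `Point(x, y=y)` end up equivalent. The `Point` class and both helper functions are assumptions made for the example, and `*args`/`**kwargs` parameters are not given any special treatment here.

```python
import inspect


class Point:
    def __init__(self, x, y=0):
        self.x, self.y = x, y

    def __getnewargs_ex__(self):
        # Deliberately mixed: x positionally, y by keyword.
        return (self.x,), {"y": self.y}


def constructor_arguments(obj, cls):
    """Map each constructor parameter name to the value `obj` reports."""
    args, kwargs = cls.__getnewargs_ex__(obj)
    bound = inspect.signature(cls).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


def pattern_arguments(cls, *args, **kwargs):
    """Normalize a pattern's sub-patterns to parameter names the same way."""
    bound = inspect.signature(cls).bind_partial(*args, **kwargs)
    return dict(bound.arguments)


# Strings stand in for the names a pattern would bind.
positional = pattern_arguments(Point, "x", "y")    # Point(x, y)
keyword = pattern_arguments(Point, "x", y="y")     # Point(x, y=y)
values = constructor_arguments(Point(1, 2), Point)
```

Both patterns normalize to the same name-to-subpattern mapping, which can then be paired with `values` key by key.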

== Additional Resources ==

* MacroPy implements case classes, which are similar to what Haskell
uses for pattern matching on constructors. I'm not sure that it lends
insight to the `Point(x, y)` case, because MacroPy doesn't have
pattern matching, but maybe someone else would.

* "Pattern matching in Python" (2009)
    An attempt at making a matching thing.

* PEP 275 -- Switching on Multiple Values

* PEP 3103 -- A Switch/Case Statement
