[Python-ideas] Pattern Matching Syntax
Ed Kellett
e+python-ideas at kellett.im
Thu May 3 18:34:58 EDT 2018

On 2018-05-03 20:17, Chris Angelico wrote:
>> def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
>>     return match unit:
>>         x if x in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
>>         'months' => timedelta(days=30 * amount)
>>         'years' => timedelta(days=365 * amount)
>>         'cal_years' => now - now.replace(year=now.year - amount)
>
> And then this comes down to the same as all the other comparisons -
> the "x if x" gets duplicated. So maybe it would be best to describe
> this thus:
>
>     match <expr> :
>         <expr> | (<comp_op> <expr>) => <expr>
>
> If it's just an expression, it's equivalent to a comp_op of '=='. The
> result of evaluating the match expression is then used as the left
> operand for ALL the comparisons. So you could write your example as:
>
>     return match unit:
>         in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
>         'months' => timedelta(days=30 * amount)
>         'years' => timedelta(days=365 * amount)
>         'cal_years' => now - now.replace(year=now.year - amount)
>
> Then there's room to expand that to a comma-separated list of values,
> which would pattern-match a tuple.

I believe there are some problems with this approach. That example uses no
destructuring at all, which is why syntax that supports destructuring looks
clumsy there. In general, if you want to support something like:
    match spec:
        (None, const) => const
        (env, fmt) if env => fmt.format(**env)
then I think something like the 'if' syntax is essential for guards.
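
For comparison, here is a rough sketch of what those two entries might do when written as ordinary Python. The function name render is hypothetical, and so are two details the proposal does not spell out: that a literal None in a pattern is compared by identity, and that a ValueError is raised when no entry matches.

```python
def render(spec):
    # Hypothetical hand-written equivalent of the two match entries above.
    first, second = spec              # destructure the 2-tuple
    if first is None:                 # (None, const) => const
        return second
    if first:                         # (env, fmt) if env => fmt.format(**env)
        return second.format(**first)
    # Assumption: an unmatched value is an error.
    raise ValueError(f"no match for {spec!r}")
```

The guard on the second entry is what the "if" buys: the tuple shape alone cannot express "env is non-empty".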
One could also imagine cases where it'd be useful to guard on more
involved properties of things:
    match number_ish:
        x:str if x.lower().startswith('0x') => int(x[2:], 16)
        x:str => int(x)
        x => x  # yolo
(I know base=0 exists, but let's imagine we're implementing base=0, or
something).
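
Spelled out as today's Python, that match might look roughly like the sketch below. It assumes that a pattern such as x:str means an isinstance check, which the proposal implies but does not state; the name to_number is made up for illustration.

```python
def to_number(number_ish):
    # x:str if x.lower().startswith('0x') => int(x[2:], 16)
    if isinstance(number_ish, str) and number_ish.lower().startswith('0x'):
        return int(number_ish[2:], 16)
    # x:str => int(x)
    if isinstance(number_ish, str):
        return int(number_ish)
    # x => x
    return number_ish
```

Note that the isinstance test has to be repeated for each string case, which is the kind of duplication the match form avoids.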
I'm usually against naming things, and deeply resent having to name the
x in [x for x in ... if ...] and similar constructs. But in this
specific case, where destructuring is kind of the point, I don't think
there's much value in compromising that to avoid a name.

I'd suggest something like this instead:

    return match unit:
        _ in {'days', 'hours', 'weeks'} => timedelta(**{unit: amount})
        ...
So a match entry would be one of:

- A pattern. See below
- A pattern followed by "if" <expr>, e.g.:
      (False, x) if len(x) >= 7
- A comparison where the left-hand side is a pattern, e.g.:
      _ in {'days', 'hours', 'weeks'}

Where a pattern is one of:

- A display of patterns, e.g.:
      {'key': v, 'ignore': _}
  I think *x and **x should be allowed here.
- A comma-separated list of patterns, making a tuple
- A pattern enclosed in parentheses
- A literal (that is not a formatted string literal, for sanity)
- A name
- A name with a type annotation
To give a not-at-all-motivating but hopefully illustrative example:
    return match x:
        (0, _) => None
        (n, x) if n < 32 => ', '.join([x] * n)
        x:str if len(x) <= 5 => x
        x:str => x[:2] + '...'
        n:Integral < 32 => '!' * n
Where:

    (0, 'blorp') would match the first case, yielding None
    (3, 'hello') would match the second case, yielding "hello, hello, hello"
    'frogs' would match the third case, yielding "frogs"
    'frogs!' would match the fourth case, yielding "fr..."
    3 would match the fifth case, yielding '!!!'
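
The five cases can be approximated in plain Python as follows. This is only a sketch under assumptions: tuple patterns are taken to match 2-tuples only, annotated names are taken to mean isinstance checks, and an unmatched value raises ValueError. None of these is fixed by the proposal, and the name illustrate is hypothetical.

```python
from numbers import Integral


def illustrate(x):
    # Assumption: tuple patterns only match actual 2-tuples.
    if isinstance(x, tuple) and len(x) == 2:
        n, item = x
        if n == 0:                      # (0, _) => None
            return None
        if n < 32:                      # (n, x) if n < 32 => ', '.join([x] * n)
            return ', '.join([item] * n)
    if isinstance(x, str):
        if len(x) <= 5:                 # x:str if len(x) <= 5 => x
            return x
        return x[:2] + '...'            # x:str => x[:2] + '...'
    if isinstance(x, Integral) and x < 32:
        return '!' * x                  # n:Integral < 32 => '!' * n
    # Assumption: falling off the end of the entries is an error.
    raise ValueError(f"no match for {x!r}")
```

A tuple whose first element is 32 or more falls through every entry here, just as it would in the match form.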
I think the matching process would mostly be intuitive, but one detail
that might raise some questions: (x, x) could be allowed, and it'd make
a lot of sense for that to match only (1, 1), (2, 2), ('hi', 'hi'), etc.
But that'd make the _ convention less useful unless it became more than
a convention.
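
To make the repeated-name question concrete, here is a toy binder for a flat tuple of names. It is a sketch of one possible rule, not something the proposal specifies: a name seen twice must be bound to equal values, and the wildcard parameter (a made-up knob) decides whether _ is exempt from that rule.

```python
def match_names(names, values, wildcard='_'):
    """Bind names to values; a repeated name must see equal values.

    Returns a dict of bindings, or None if the match fails.
    """
    if len(names) != len(values):
        return None
    bindings = {}
    for name, value in zip(names, values):
        if name == wildcard:
            continue                    # wildcard: match anything, bind nothing
        if name in bindings and bindings[name] != value:
            return None                 # same name, different values
        bindings[name] = value
    return bindings
```

With the default wildcard, match_names(('_', '_'), (1, 2)) succeeds; with wildcard=None, _ is an ordinary name and the same call fails, which is the loss of usefulness described above.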
All in all, I like this idea, but I think it might be a bit too heavy to
get into Python. It has the feel of requiring quite a lot of new things.