Hi Mark,

Reading that spec will take some time. Can you please summarize the differences in English, in a way that is about as precise as PEP 634? I have some comments inline below as well.

On Sat, Mar 27, 2021 at 10:16 AM Mark Shannon <mark@hotpy.org> wrote:

Hi Oscar,

Thanks for the feedback.

On 27/03/2021 4:19 pm, Oscar Benjamin wrote:
> On Sat, 27 Mar 2021 at 13:40, Mark Shannon <mark@hotpy.org> wrote:
>>
>
> Hi Mark,
>
> Thanks for putting this together.
>
>> As the 3.10 beta is not so far away, I've cut PEP 653 down to the
>> minimum needed for 3.10. The extensions will have to wait for 3.11.
>>
>> The essence of the PEP is now that:
>>
>> 1. The semantics of pattern matching, although basically unchanged, are
>> more precisely defined.
>>
>> 2. The __match_kind__ special attribute will be used to determine which
>> patterns to match, rather than relying on the collections.abc module.
>>
>> Everything else has been removed or deferred.
>
> It would take me some time to compare exactly how this differs from
> the current state after PEP 634 but I certainly prefer the
> object-model based approach. It does seem that there are a lot of
> permutations of how matching works but I guess that's just trying to
> tie up all the different cases introduced in PEP 634.

It would be simpler if this were purely an informational PEP that proposes no new features -- then we wouldn't have to rush.

You could then propose the new __match_kind__ attribute in a separate PEP, written more in the style of PEP 634, without pseudo code.

I find it difficult to wrap my head around the semantics of __match_kind__ because it really represents a few independent flags (with some constraints) but all the text is written using explicit, hard-to-read bitwise and/or operations. Let me give it a try.

- Let's call the four flag bits by short names: SEQUENCE, MAPPING, DEFAULT, SELF.

SEQUENCE and MAPPING are for use when an instance of a class appears in the subject position (i.e., for `match x`, we look for these bits in `type(x).__match_kind__`). Neither of these is set by default. At most one of them should be set.

- If SEQUENCE is set, the subject is treated like a sequence (this is set for list, tuple and other sequences, but not for str, bytes and bytearray).

- Similarly, MAPPING means the subject should be treated as a mapping, and is set for dict and other mapping types.

The DEFAULT and SELF flags are for use when a class is used in a class pattern (i.e., for `case cls(...)` we look for these bits in `cls.__match_kind__`). At most one of these should be set. DEFAULT is set on class `object` and anything that doesn't explicitly clear it.

- If DEFAULT is set, semantics of PEP 634 apply except for the special behavior enabled by the SELF flag.

- If SELF is set, `case cls(x)` binds the subject to x, and no other forms of `case cls(...)` are allowed.

- If neither DEFAULT nor SELF is set, `case cls(...)` does not take arguments at all.
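
To make my reading concrete, here is a rough sketch of the four flags as I described them above. The class name `MatchKind`, the bit values, and the two helper functions are made up for illustration and are not taken from the PEP; raising TypeError for conflicting flags is also just an assumption on my part.

```python
from enum import IntFlag


class MatchKind(IntFlag):
    # Hypothetical names and bit values, for illustration only.
    SEQUENCE = 1
    MAPPING = 2
    DEFAULT = 4
    SELF = 8


def subject_kind(kind):
    """Classify a subject from type(subject).__match_kind__ (hypothetical helper)."""
    if kind & MatchKind.SEQUENCE and kind & MatchKind.MAPPING:
        # Assumption: conflicting flags are treated as an error.
        raise TypeError("at most one of SEQUENCE and MAPPING should be set")
    if kind & MatchKind.SEQUENCE:
        return "sequence"
    if kind & MatchKind.MAPPING:
        return "mapping"
    return "other"


def class_pattern_kind(kind):
    """Classify `case cls(...)` from cls.__match_kind__ (hypothetical helper)."""
    if kind & MatchKind.DEFAULT and kind & MatchKind.SELF:
        raise TypeError("at most one of DEFAULT and SELF should be set")
    if kind & MatchKind.DEFAULT:
        return "default"       # PEP 634 class pattern semantics
    if kind & MatchKind.SELF:
        return "self"          # `case cls(x)` binds the subject to x
    return "no-arguments"      # `case cls()` only


# A list-like type: a sequence as a subject, default behavior as a class pattern.
list_like = MatchKind.SEQUENCE | MatchKind.DEFAULT
print(subject_kind(list_like), class_pattern_kind(list_like))
```

The two helpers look at disjoint pairs of bits, which is the point: one pair only matters for subjects and the other only for class patterns.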

Please correct any misunderstandings I expressed here! (And please include some kind of summary like this in your PEP.)

Also, I think that we should probably separate this out in two separate flag sets, one for subjects and one for class patterns -- it is pretty confusing to merge the flag sets into a single value when their applicability (subject or class pattern) is so different.

>> The PEP now has only the slightest changes to semantics, which should be
>> undetectable in normal use. For those corner cases where there is a
>> difference, it is to make pattern matching more robust.
>
> Maybe I misunderstood but it looks to me as if this (PEP 653) changes
> the behaviour of a mapping pattern in relation to extra keys. In PEP
> 634 extra keys in the target are ignored e.g.:
>
> obj = {'a': 1, 'b': 2}
> match(obj):
>     case {'a': 1}:
>         # matches obj because key 'b' is ignored
>
> In PEP 634 the use of **rest is optional if it is desired to catch the
> other keys but does not affect matching. Here in PEP 653 there is the
> pseudocode:
>
> # A pattern not including a double-star pattern:
> if $kind & MATCH_MAPPING == 0:
>     FAIL
> if $value.keys() != $KEYWORD_PATTERNS.keys():
>     FAIL

I missed that when updating the PEP, thanks for pointing it out.
It should be the same as for a double-star pattern:

if not $value.keys() >= $KEYWORD_PATTERNS.keys():
    FAIL

I'll update the PEP.
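
For illustration, the difference between the two checks can be reproduced with ordinary dict key views. This is only a sketch: the function names are hypothetical, and the plain dicts stand in for the pseudocode's `$value` and `$KEYWORD_PATTERNS`.

```python
def keys_match_exactly(value, keyword_patterns):
    # The check quoted above: fails whenever the subject has extra keys.
    return value.keys() == keyword_patterns.keys()


def keys_match_superset(value, keyword_patterns):
    # The corrected check: the subject only needs to contain the pattern's keys.
    return value.keys() >= keyword_patterns.keys()


value = {'a': 1, 'b': 2}
keyword_patterns = {'a': 1}

print(keys_match_exactly(value, keyword_patterns))   # False
print(keys_match_superset(value, keyword_patterns))  # True
```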

>
> My reading of that is that all keys would need to be matched unless
> **rest is used to absorb the others.
>
> Is that an intended difference?
>
> Personally I prefer extra keys not to be ignored by default so to me
> that seems an improvement. If intentional then it should be listed as
> another semantic difference though.

I don't have a strong enough opinion either way.
I can see advantages to both ways of doing it.

Let's not change this. We carefully discussed and chose this behavior (ignore extra mapping keys, but don't ignore extra sequence items) for PEP 634 based on usability.

>
>> E.g. With PEP 653, pattern matching will work in the collections.abc
>> module. With PEP 634 it does not.
>
> As I understood it this proposes that match obj: should use the class
> attribute type(obj).__match_kind__ to indicate whether the object
> being matched should be considered a sequence or a mapping or
> something else rather than using isinstance(obj, Sequence) and
> isinstance(obj, Mapping). Is there a corner case here where an object
> can be both a Sequence and a Mapping? (How does PEP 634 handle that?)

If you define a class as a subclass of both collections.abc.Sequence and
collections.abc.Mapping, then PEP 634 will treat it as both sequence and
mapping, meaning it has to try every pattern. That prevents the
important (IMO) optimization of checking the kind only once.

Classes that are both mappings and sequences are ill-conceived. Let's not compromise semantics or optimizability to support these. (IOW I agree with Mark here.)

Cheers,
Mark.

>
> Not using the Sequence and Mapping ABCs is good IMO. I'm not aware of
> other core language features requiring the use of ABCs. In SymPy we
> have specifically avoided them because they slow down isinstance
> checking (this is measurable in the time taken to run the whole test
> suite). Using the ABCs in PEP 634 seems surprising given that the
> original pattern matching PEP actually listed the performance impact
> of isinstance checks as part of the opening motivation. Maybe the ABCs
> can be made faster but either way using them like this seems not in
> keeping with the rest of the language.

I am fine with changing this one aspect of PEP 634. IIRC having separate SEQUENCE and MAPPING flags just for matching didn't occur to us during the design, and we strongly preferred some kind of type-based check over checking the presence of a specific attribute like `key`.

--Guido van Rossum (python.org/~guido)

Pronouns: he/him (why is my pronoun here?)