Re: PEP 622 version 2 (Structural Pattern Matching)
Much of the discussion seems to focus on how to distinguish between a variable as a provider of a value and a variable as receiver of a matched value.

In normal Python syntax a variable in an expression provides a value, please let’s keep that unchanged.

So it seems to me we should explicitly mark a variable to receive a matched value. I have seen ‘?’ suggested as a prefix to do this, ‘\’ would also do fine.

This would solve the single variable issue, too: case foo: matches the value of ‘foo’, while case \foo: matches anything and stores it in ‘foo’.

This would also mean case Point(x=\x, y=\y): should be used to obtain x and y from the Point instance.
On Thu, Jul 9, 2020 at 1:42 PM Eric Nieuwland <eric.nieuwland@gmail.com> wrote:
Much of the discussion seems to focus on how to distinguish between a variable as a provider of a value and a variable as receiver of a matched value.
In normal Python syntax a variable in an expression provides a value, please let’s keep that unchanged.
For patterns, these are no different than parameters for a function (either a lambda expression or with `def`); or target assignments in unpacking assignments. So just like I wouldn't wonder where `a` and `b` materialized in the parameters for the function definition below

    def sum2(a, b):
        return a + b

I think it will be straightforward to understand this in the context of a `case` using a capture pattern:

    match x:
        case (a, b):
            return a + b
        ...

(This commonality between cases and function definitions is further used in Scala for example, but I don't see that approach for defining an idea of partial functions -- not like functools.partial functions! -- as being that useful in Python.)
So it seems to me we should explicitly mark a variable to receive a matched value. I have seen ‘?’ suggested as a prefix to do this, ‘\’ would also do fine.
This would solve the single variable issue, too: case foo: matches the value of ‘foo’, while case \foo: matches anything and stores it in ‘foo’.
Explicit namespacing (if a constant) or using a guard (if a variable) seems to be the right solution, as Ethan demonstrated earlier. No need for . or ^ or \ or ... to disambiguate.

Also it seems to me that structural pattern matching will build on two common usages of namespaces for constants:

1. Constants used from other modules are almost always used in the module namespace. Eg, socket.AF_UNIX or signal.SIGTERM.
2. New code often tends to use constants defined within an Enum namespace. Hopefully we will see more of this convention in usage.

(Very much an aside: Interestingly with the socket module we see both used - it defines its constants with IntEnum and exports them traditionally. The namespace specifics it uses with IntEnum._convert_ to make this happen -- strictly speaking EnumMeta._convert, not documented, and a bit hard to follow -- might be possibly debatable, but it works out quite well in practice in providing backwards compatibility while continuing to work with a C source of these constants.)
This would also mean case Point(x=\x, y=\y): should be used to obtain x and y from the Point instance.
On 10 Jul 2020, at 01:51, Jim Baker <jim.baker@python.org> wrote:
On Thu, Jul 9, 2020 at 1:42 PM Eric Nieuwland <eric.nieuwland@gmail.com> wrote:
Much of the discussion seems to focus on how to distinguish between a variable as a provider of a value and a variable as receiver of a matched value.
In normal Python syntax a variable in an expression provides a value, please let’s keep that unchanged.
For patterns, these are no different than parameters for a function (either a lambda expression or with `def`); or target assignments in unpacking assignments. So just like I wouldn't wonder where `a` and `b` materialized in the parameters for the function definition below
    def sum2(a, b):
        return a + b
I think it will be straightforward to understand this in the context of a `case` using a capture pattern:
    match x:
        case (a, b):
            return a + b
        ...
(This commonality between cases and function definitions is further used in Scala for example, but I don't see that approach for defining an idea of partial functions -- not like functools.partial functions! -- as being that useful in Python.)
So it seems to me we should explicitly mark a variable to receive a matched value. I have seen ‘?’ suggested as a prefix to do this, ‘\’ would also do fine.
This would solve the single variable issue, too: case foo: matches the value of ‘foo’, while case \foo: matches anything and stores it in ‘foo’.
Explicit namespacing (if a constant) or using a guard (if a variable) seems to be the right solution, as Ethan demonstrated earlier. No need for . or ^ or \ or ... to disambiguate. Also it seems to me that structural pattern matching will build on two common usages of namespaces for constants:
1. Constants used from other modules are almost always used in the module namespace. Eg, socket.AF_UNIX or signal.SIGTERM.
2. New code often tends to use constants defined within an Enum namespace. Hopefully we will see more of this convention in usage.
(Very much an aside: Interestingly with the socket module we see both used - it defines its constants with IntEnum and exports them traditionally. The namespace specifics it uses with IntEnum._convert_ to make this happen -- strictly speaking EnumMeta._convert, not documented, and a bit hard to follow -- might be possibly debatable, but it works out quite well in practice in providing backwards compatibility while continuing to work with a C source of these constants.)
This would also mean case Point(x=\x, y=\y): should be used to obtain x and y from the Point instance.
This approach makes deeper nesting of the structure much more cumbersome, I think. How to match Polygon(Point(x0,y0), Point(x1, y1), Point(x2, y2)) based on its structure? And Polygon(Point(x0,y0), p1, Point(x2, y2))?
On Fri, Jul 10, 2020, 9:16 AM Eric Nieuwland <eric.nieuwland@gmail.com> wrote:
On 10 Jul 2020, at 01:51, Jim Baker <jim.baker@python.org> wrote:
... Explicit namespacing (if a constant) or using a guard (if a variable) seems to be the right solution, as Ethan demonstrated earlier. No need for . or ^ or \ or ... to disambiguate. Also it seems to me that structural pattern matching will build on two common usages of namespaces for constants:
1. Constants used from other modules are almost always used in the module namespace. Eg, socket.AF_UNIX or signal.SIGTERM.
2. New code often tends to use constants defined within an Enum namespace. Hopefully we will see more of this convention in usage.
(Very much an aside: Interestingly with the socket module we see both used - it defines its constants with IntEnum and exports them traditionally. The namespace specifics it uses with IntEnum._convert_ to make this happen -- strictly speaking EnumMeta._convert, not documented, and a bit hard to follow -- might be possibly debatable, but it works out quite well in practice in providing backwards compatibility while continuing to work with a C source of these constants.)
This would also mean case Point(x=\x, y=\y): should be used to obtain x and y from the Point instance.
This approach makes deeper nesting of the structure much more cumbersome, I think.
How to match Polygon(Point(x0,y0), Point(x1, y1), Point(x2, y2)) based on its structure? And Polygon(Point(x0,y0), p1, Point(x2, y2))?
I'm just trying to describe what v2 of the PEP is trying to do and how it then corresponds to a reasonable usage model. Sorry for any confusion.

So in your scenario above, Polygon and Point are used as class patterns (https://www.python.org/dev/peps/pep-0622/#class-patterns). Consequently they are treated accordingly and have that nice structural pattern matching quality!

Earlier I was discussing constant patterns (https://www.python.org/dev/peps/pep-0622/#constant-value-patterns), which require they be namespaced in some way (a qualified name as it is described in the PEP).

- Jim
On 10 Jul 2020, at 18:28, Jim Baker <jim.baker@python.org> wrote:
On Fri, Jul 10, 2020, 9:16 AM Eric Nieuwland <eric.nieuwland@gmail.com> wrote:
On 10 Jul 2020, at 01:51, Jim Baker <jim.baker@python.org> wrote:
... Explicit namespacing (if a constant) or using a guard (if a variable) seems to be the right solution, as Ethan demonstrated earlier. No need for . or ^ or \ or ... to disambiguate. Also it seems to me that structural pattern matching will build on two common usages of namespaces for constants:
1. Constants used from other modules are almost always used in the module namespace. Eg, socket.AF_UNIX or signal.SIGTERM.
2. New code often tends to use constants defined within an Enum namespace. Hopefully we will see more of this convention in usage.
(Very much an aside: Interestingly with the socket module we see both used - it defines its constants with IntEnum and exports them traditionally. The namespace specifics it uses with IntEnum._convert_ to make this happen -- strictly speaking EnumMeta._convert, not documented, and a bit hard to follow -- might be possibly debatable, but it works out quite well in practice in providing backwards compatibility while continuing to work with a C source of these constants.)
This would also mean case Point(x=\x, y=\y): should be used to obtain x and y from the Point instance.
This approach makes deeper nesting of the structure much more cumbersome, I think.
How to match Polygon(Point(x0,y0), Point(x1, y1), Point(x2, y2)) based on its structure? And Polygon(Point(x0,y0), p1, Point(x2, y2))?
I'm just trying to describe what v2 of the PEP is trying to do and how it then corresponds to a reasonable usage model. Sorry for any confusion.
Yes, I understood. Thank you for that. No apology needed.
So in your scenario above, Polygon and Point are used as class patterns (https://www.python.org/dev/peps/pep-0622/#class-patterns). Consequently they are treated accordingly and have that nice structural pattern matching quality!
What I meant to say is that as I read the current PEP text there would be a confusing difference between

    match poly:
        case Polygon(Point(x0, y0), Point(x1, y1), Point(x2, y2)):
            ...

and

    p0 = Point(x0, y0)
    p1 = Point(x1, y1)
    p2 = Point(x2, y2)
    match poly:
        case Polygon(p0, p1, p2):
            ...

This would be especially clumsy if I need to match parts in a deep structure. It would require me to either write the whole construction as part of the ‘match’ or use ‘match’ nested to drill down to the parts I need.
Earlier I was discussing constant patterns (https://www.python.org/dev/peps/pep-0622/#constant-value-patterns), which require they be namespaced in some way (a qualified name as it is described in the PEP).
Indeed. My point is this would be - as far as I know - the first time you need to create a namespace to use the value of an already known variable. This only to allow assignment to variables which I find counterintuitive and which IMHO leads to clumsy constructions, as shown above.

So I hope the new and special thing here (i.e. assign matched parts of the structure to variables) will not interfere with how we read expressions in Python. A special indicator for the special use case to me seems far easier to understand and to teach.

--eric
On 11 Jul 2020, at 21:03, Eric Nieuwland <eric.nieuwland@gmail.com> wrote:
What I meant to say is that as I read the current PEP text there would be a confusing difference between
    match poly:
        case Polygon(Point(x0, y0), Point(x1, y1), Point(x2, y2)):
            ...
and
    p0 = Point(x0, y0)
    p1 = Point(x1, y1)
    p2 = Point(x2, y2)
    match poly:
        case Polygon(p0, p1, p2):
            ...
This would be especially clumsy if I need to match parts in a deep structure. It would require me to either write the whole construction as part of the ‘match’ or use ‘match’ nested to drill down to the parts I need.
Just after I hit ‘send’ it dawned on me it might be preferable to make that

    match poly:
        p0 = Point(x0, y0)
        p1 = Point(x1, y1)
        p2 = Point(x2, y2)
    case Polygon(p0, p1, p2):
        …

so the part preceded by ‘match’ is the preparation phase for matching.

This could also resolve the discussion on indentation of the ‘case’ parts and the placement of the default matching:

    match <expression> [as <var>]:
        <preparation statements>
    case <pattern> [<guard>]:
        <statements>
    …
    [else:
        <statements>]

Within the preparation statements it would then be allowed to use undefined variables as receivers of matched parts.
On 12/07/20 7:13 am, Eric Nieuwland wrote:
    match poly:
        p0 = Point(x0, y0)
        p1 = Point(x1, y1)
        p2 = Point(x2, y2)
    case Polygon(p0, p1, p2):
        …
Interesting idea, but what happens if you *don't* need any setup? Do you have to write

    match poly:
        pass
    case ...

?

--
Greg
On 11/07/2020 20:13, Eric Nieuwland wrote:
Just after I hit ‘send’ it dawned on me it might be preferable to make that
    match poly:
        p0 = Point(x0, y0)
        p1 = Point(x1, y1)
        p2 = Point(x2, y2)
    case Polygon(p0, p1, p2):
        …
so the part preceded by ‘match’ is the preparation phase for matching.
Are you intending p0, p1 and p2 to be subpatterns rather than object instantiations? That makes me a little twitchy; the difference between what you wrote and:

    match poly:
        p0 = Point(x0, y0)
        p1 = Point(x1, y1)
    case Polygon(p0, p1, p2):
        ...

is very easy to miss.

--
Rhodri James *-* Kynesim Ltd
Just had another thought about marking assignment targets.

The PEP currently forbids repeating bound names in a pattern to avoid raising expectations that

    case Point(x, x):

would match only if the two arguments were equal.

But if assignment targets were marked, we could write this as

    case Point(?x, x):

and it would work as expected.

--
Greg
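The restriction on repeated names can be checked with a small sketch, assuming an interpreter that implements the match statement as it later shipped in Python 3.10. The snippet is only compiled, never run, so `Point` and `point` do not need to exist; `source` and `rejected` are names invented for this example.

```python
# Source text containing a pattern that binds the same name twice.
source = """
match point:
    case Point(x, x):
        pass
"""

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    # The compiler refuses the pattern before any matching takes place.
    rejected = True
```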
On Sun, Jul 12, 2020 at 10:30 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Just had another thought about marking assignment targets.
The PEP currently forbids repeating bound names in a pattern to avoid raising expectations that
case Point(x, x):
would match only if the two arguments were equal.
But if assignment targets were marked, we could write this as
case Point(?x, x):
and it would work as expected.
Hang on. Matching happens before assignment, so this should use the previous value of x for the matching. At least, that's my understanding. If you do something like:

    case Point(x, 2):

it won't assign x unless the second coordinate is 2, right?

ChrisA
On Sat, Jul 11, 2020 at 5:58 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Jul 12, 2020 at 10:30 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Just had another thought about marking assignment targets.
The PEP currently forbids repeating bound names in a pattern to avoid raising expectations that
case Point(x, x):
would match only if the two arguments were equal.
But if assignment targets were marked, we could write this as
case Point(?x, x):
and it would work as expected.
Hang on. Matching happens before assignment, so this should use the previous value of x for the matching. At least, that's my understanding. If you do something like:
case Point(x, 2):
it won't assign x unless the second coordinate is 2, right?
Good catch. That's actually undefined -- we want to let the optimizer have some leeway in how to generate the best code for matching. See https://www.python.org/dev/peps/pep-0622/#performance-considerations

Currently it doesn't optimize all that much -- it just processes patterns from left to right:
```
>>> match Point(3, 3):
...   case Point(x, 42): pass
...
>>> print(x)
3
```
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Sun, Jul 12, 2020 at 11:04 AM Guido van Rossum <guido@python.org> wrote:
On Sat, Jul 11, 2020 at 5:58 PM Chris Angelico <rosuav@gmail.com> wrote:
On Sun, Jul 12, 2020 at 10:30 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Just had another thought about marking assignment targets.
The PEP currently forbids repeating bound names in a pattern to avoid raising expectations that
case Point(x, x):
would match only if the two arguments were equal.
But if assignment targets were marked, we could write this as
case Point(?x, x):
and it would work as expected.
Hang on. Matching happens before assignment, so this should use the previous value of x for the matching. At least, that's my understanding. If you do something like:
case Point(x, 2):
it won't assign x unless the second coordinate is 2, right?
Good catch. That's actually undefined -- we want to let the optimizer have some leeway in how to generate the best code for matching. See https://www.python.org/dev/peps/pep-0622/#performance-considerations
Currently it doesn't optimize all that much -- it just processes patterns from left to right:
```
>>> match Point(3, 3):
...   case Point(x, 42): pass
...
>>> print(x)
3
```
Ah, okay. My "obvious" intuition was that this wouldn't assign, and Greg's equally "obvious" intuition was that it would. I think that disagreement should be a strike against the "Point(?x, x)" notation - I can't be the only person who would misinterpret it.

ChrisA
On 12/07/2020 02:04, Guido van Rossum wrote:
On Sat, Jul 11, 2020 at 5:58 PM Chris Angelico <rosuav@gmail.com> wrote:
Hang on. Matching happens before assignment, so this should use the previous value of x for the matching. At least, that's my understanding. If you do something like:
case Point(x, 2):
it won't assign x unless the second coordinate is 2, right?
Good catch. That's actually undefined -- we want to let the optimizer have some leeway in how to generate the best code for matching. See https://www.python.org/dev/peps/pep-0622/#performance-considerations
Currently it doesn't optimize all that much -- it just processes patterns from left to right:
```
>>> match Point(3, 3):
...   case Point(x, 42): pass
...
>>> print(x)
3
```
If I've understood this correctly, my immediate reaction is one of horror. I'd assumed that a case that failed to match would have no effect.

Rob Cliffe
On 2020-07-12 01:32, Chris Angelico wrote:
On Sun, Jul 12, 2020 at 10:30 AM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Just had another thought about marking assignment targets.
The PEP currently forbids repeating bound names in a pattern to avoid raising expectations that
case Point(x, x):
would match only if the two arguments were equal.
But if assignment targets were marked, we could write this as
case Point(?x, x):
and it would work as expected.
Hang on. Matching happens before assignment, so this should use the previous value of x for the matching. At least, that's my understanding. If you do something like:
case Point(x, 2):
it won't assign x unless the second coordinate is 2, right?
Presumably the assumption is that it would use a local dict for binding, falling back to the actual dict if necessary for lookup, and then update the actual dict if the match is successful. That way, unsuccessful matches won't pollute the actual dict.
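The scheme described above can be sketched in plain Python. This is a toy matcher over tuples under stated assumptions, not how any real implementation works: `Capture`, `Load` and `try_match` are hypothetical names, with `Capture` standing in for a marked target such as `?x` and `Load` for a plain name whose value is compared.

```python
from collections import ChainMap


class Capture:
    """Hypothetical marker: bind the matched item to `name`."""
    def __init__(self, name):
        self.name = name


class Load:
    """Hypothetical marker: compare the item with the current value of `name`."""
    def __init__(self, name):
        self.name = name


def try_match(pattern, subject, namespace):
    """Match a tuple `subject` against `pattern`, committing bindings only on success."""
    if len(pattern) != len(subject):
        return False
    pending = {}                           # local dict for tentative bindings
    scope = ChainMap(pending, namespace)   # lookups fall back to the actual dict
    for want, got in zip(pattern, subject):
        if isinstance(want, Capture):
            pending[want.name] = got
        elif isinstance(want, Load):
            if want.name not in scope or scope[want.name] != got:
                return False               # failure: `pending` is simply dropped
        elif want != got:
            return False
    namespace.update(pending)              # success: publish the bindings
    return True
```

With this approach a failed match such as `(Capture("x"), 42)` against `(3, 3)` leaves `namespace` untouched, while `(Capture("x"), Load("x"))` against `(3, 3)` succeeds because the lookup sees the tentative binding first.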
On Sat, Jul 11, 2020 at 5:28 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Just had another thought about marking assignment targets.
The PEP currently forbids repeating bound names in a pattern to avoid raising expectations that
case Point(x, x):
would match only if the two arguments were equal.
But if assignment targets were marked, we could write this as
case Point(?x, x):
and it would work as expected.
Yes. And if instead we marked name loads, e.g. with `^name`, we could write it as
```
case Point(x, ^x):
```
In fact, Elixir's "pin" operator is `^` and works this way.

I don't find it very intuitive that in order to write "it should be the same x twice" you have to spell it differently -- it's more a clever trick (that surely would become a hacker's idiom if we allowed it).

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>
participants (8)
- Chris Angelico
- Eric Nieuwland
- Greg Ewing
- Guido van Rossum
- Jim Baker
- MRAB
- Rhodri James
- Rob Cliffe