Pattern Matching Syntax
Hi Everyone,

Never posted in here before, so I hope that I'm not violating any particular procedure for intros or something.

Over time, there have been various switch or match statement proposals; some have gotten as far as PEPs:

2001 Nov - https://www.python.org/dev/peps/pep-0275/
2006 Jun - https://www.python.org/dev/peps/pep-3103/
2014 Apr - https://groups.google.com/d/msg/python-ideas/J5O562NKQMY/DrMHwncrmIIJ
2016 May - https://groups.google.com/d/msg/python-ideas/aninkpPpEAw/wCQ1IH5mAQAJ

However, I don't see that the conversation was ever really resolved, so I'd like to restart the conversation on some kind of pattern matching syntax in Python.

The main objections I've seen are in the following buckets:

- One--and Preferably Only One--Obvious Way. Basically, we have if/elif and that's all we need, so this is syntactical sugar bloat. I'd submit that there are specific cases where this kind of syntax would be the obviously correct way to do something.
- Specific Syntax Objections. There have been several specific objections that usually come down to "unreadable" or "ugly", which are subjective statements that don't really offer any good way to continue the discussion in a productive manner.

I cannot handle all syntax objections ahead of time, but I can handle the "only way" objection. At a high level, pattern matching provides similar syntactical sugar to list comprehensions. We could argue that they are unnecessary since we have for loops. But more importantly, pattern matching is powerful for what it restricts you to. More specifically:

- Assignment. Many of the implementations offer the ability to immediately assign the value from the matching pattern. However, assignment is prevented in the middle of all of the patterns, which is possible in if/elif.
- No Fall Through. Once a pattern is matched, there's no way to break out and try another branch. This prevents having to look at multiple cases to figure out how something resolved. If/elif can have this happen, of course, but even more confusingly, sometimes breaks will be mixed with returns or other control flow, which makes it hard to figure out how large if/elifs are resolved.
- Automatic Unpacking. Some implementations offer the ability to unpack a dictionary equivalent automatically into keys, or to select ranges of values like slicing. Compared to if/elif, this is tremendously more DRY than asking "does the key exist?" and then "what is that key's value?"
- Guards. Oftentimes you can embed another check to go along with the simple pattern matching. This is absolutely possible with if/elif, but crucially, in these implementations the guard generally runs after the pattern check. Again, this keeps code DRY and improves readability.

I figured maybe a good way to continue the discussion is to offer a straw-man example syntax:

    # Simple pattern matching
    x = 1

    number = match x:
        1 => "one"
        2 => "two"
        3 => "three"
        10 => "ten"
        _ => "anything"

    print(number)  # one

    # Final pattern that matches anything
    x = 3

    number = match x:
        1 => "one"
        2 => "two"
        _ => "anything"

    print(number)  # anything

    # Pattern matching without any match returns None
    number = match x:
        1 => "one"
        2 => "two"

    print(number)  # None

    # Pattern matching with guards
    x = 'three'

    number = match x:
        1 => "one"
        y if y is str => f'The string is {y}'
        _ => "anything"

    print(number)  # The string is three

    # Pattern matching with multiple values
    x = 1

    number = match x:
        1, 2, 3, 4 => "one to four"
        _ => "anything"

    print(number)  # one to four

    # Pattern matching with types
    x = 1.

    number = match x:
        x:int => f'{x} is a int'
        x:float => f'{x} is a float'
        x:str => f'{x} is a string'

    print(number)  # 1.0 is a float

    # Supports destructuring dicts
    x = {'foo': 1}

    number = match x:
        {'foo': 1} => "foo is 1"
        _ => "anything"

    print(number)  # foo is 1

    # Supports binding with destructuring dicts
    x = {'foo': 1, 'bar': 2}

    number = match x:
        {'foo': y} => f'got foo {y}'
        {'bar': z} => f'got bar {z}'
        {'foo': y, 'bar': z} => f'got foo {y} and bar {z}'
        _ => "anything"

    print(number)  # got foo 1 and bar 2

    # Supports destructuring other types too
    class Point():
        def __init__(self, x, y):
            self.x = x
            self.y = y

    point = Point(1,2)

    number = match point:
        Point(x,y) => f'point has an x of {x} and y of {y}'
        _ => "anything"

    print(number)  # point has an x of 1 and y of 2

As a continued defense of this specific syntax choice, let's see how two other languages with this feature handle it. I'm going to try to offer examples that are as similar as possible.

Scala https://docs.scala-lang.org/tour/pattern-matching.html

    val x: Int = 1

    def makeMatch(x: Any) = x match {
      case 1 => "one"
      case 2 => "two"
      case _ => "anything"
    }

    val number = makeMatch(x)

Rust https://doc.rust-lang.org/1.5.0/book/match.html

    let x = 1;

    let number = match x {
        1 => "one",
        2 => "two",
        _ => "anything",
    };

And for the sake of completeness, here are other languages with similar syntax features and their associated documentation:

F# https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-ma...
Elixir https://elixir-lang.org/getting-started/case-cond-and-if.html
Clojure https://github.com/clojure/core.match/wiki/Basic-usage
JavaScript (ES2018?) https://github.com/tc39/proposal-pattern-matching
Haskell https://en.wikibooks.org/wiki/Haskell/Pattern_matching
Swift https://developer.apple.com/library/content/documentation/Swift/Conceptual/S...
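For comparison, here is a minimal sketch of how the dict-binding example above might be spelled in Python as it exists today, using explicit key checks. The function name `describe` is made up for illustration, and testing the most specific pattern first is an assumption, since the straw-man syntax does not say which case wins when several could match.

```python
def describe(x):
    """Emulate the dict-binding example with explicit key checks (hypothetical helper)."""
    if isinstance(x, dict):
        # Assumption: the most specific pattern is tried first.
        if 'foo' in x and 'bar' in x:
            return f"got foo {x['foo']} and bar {x['bar']}"
        if 'foo' in x:
            return f"got foo {x['foo']}"
        if 'bar' in x:
            return f"got bar {x['bar']}"
    # Plays the role of the `_` catch-all case.
    return "anything"


print(describe({'foo': 1, 'bar': 2}))  # got foo 1 and bar 2
print(describe({'bar': 2}))            # got bar 2
print(describe(3))                     # anything
```

Each key is named twice here, once to test for it and once to read it, which is the repetition the "Automatic Unpacking" point refers to.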
What about instead of

    number = match x:
        1 => "one"
        2 => "two"
        3 => "three"
        10 => "ten"
        _ => "anything"

    number = {
        1 => "one"
        2 => "two"
        3 => "three"
        10 => "ten"
    }.get(x, "anything")

No magic syntax with blocks starting inside an assignment, just to start with.

On 3 May 2018 at 09:41, Robert Roskam <raiderrobert@gmail.com> wrote:
[...]
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
(sorry - my above message is about using a dictionary - the weird "=>" tokens should be just plain ":" - the point is that Python already has syntax to do what is asked)

On 3 May 2018 at 10:15, Joao S. O. Bueno <jsbueno@python.org.br> wrote:
[...]
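As a sketch of the dictionary approach with plain ":" separators, the snippet below runs in current Python. The helper name `number_name` is an assumption for illustration; the mapping and the "anything" default come from the example above.

```python
def number_name(x):
    """Look up a name for x, falling back to "anything" like the `_` case."""
    names = {1: "one", 2: "two", 3: "three", 10: "ten"}
    return names.get(x, "anything")


print(number_name(1))  # one
print(number_name(7))  # anything
```

This covers equality tests on hashable values only. An unhashable subject such as a dict would raise TypeError in the lookup, and type checks, guards, and destructuring need something other than `.get()`.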
Hey Joao,

Thanks for providing me feedback on this idea!

For the simple example that you selected, yes, you absolutely can do this as it stands atm. However, the examples I provided further along aren't as easily accomplished, nor is something like this:

    x = -1

    result = match x:
        x:int if x > 0 => 'greater than 0'
        x:int, x:float if x == 0 => 'equal to 0'
        x:int if x < 0 => 'less than 0'

    print(result)  # 'less than 0'

Accomplishing the above with just a dictionary would not be the current Pythonic solution, imo; you'd do it with if/elif:

    x = -1
    result = None

    if type(x) is int and x > 0:
        result = 'greater than 0'
    elif (type(x) is int or type(x) is float) and x == 0:
        result = 'equal to 0'
    elif type(x) is int and x < 0:
        result = 'less than 0'

    print(result)  # 'less than 0'

So yes, Python has the syntax to handle these problems. However, my point is that there's value in the kind of feature I'm proposing, and the value is stated in the above proposal.

If the specific syntax choice `=>` offends the sensibilities, I simply chose mine because a good number of other languages already use `=>`. I've also considered the following:

    ->
    to
    then
    case

And I think it would be good to hear if anyone has a specific preference for those others.

On Thu, May 3, 2018 at 9:16 AM Joao S. O. Bueno <jsbueno@python.org.br> wrote:
[...]
On Fri, May 4, 2018 at 12:29 AM, Robert Roskam <raiderrobert@gmail.com> wrote:
> Hey Joao,
>
> Thanks for providing me feedback on this idea!
>
> For the simple example that you selected, yes, you absolutely can do this as it stands atm. However, the examples I provided further along aren't as easily accomplished, nor is something like this:
>
>     x = -1
>
>     result = match x:
>         x:int if x > 0 => 'greater than 0'
>         x:int, x:float if x == 0 => 'equal to 0'
>         x:int if x < 0 => 'less than 0'
>
>     print(result)  # 'less than 0'
>
> Accomplishing the above with just a dictionary would not be the current Pythonic solution, imo; you'd do it with if/elif:

Correct. So the best way to 'sell' this idea is NOT the simple examples, as they're going to be just as simple with a dictionary.

I'd like to see a complete definition of the valid comparisons. Some of your examples are based on equality ("5 => ..." matches the number 5, presumably whether it's an int or a float) and have no variable, others use annotation-like syntax and types to presumably do an isinstance check, and then there are some conditions that I'm not sure about. What are all the options, and how would each one be written? How do you combine different options? Is the 'if x == 0' part a modifier to a previous comparison, or is it a separate chained comparison? How does this work?

ChrisA
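The questions above can be made concrete with expressions that already exist in Python. The sketch below is only one possible reading of the three kinds of test that appear in the examples (equality, type, guard); the function names are hypothetical, and nothing here is claimed to be the proposal's actual semantics.

```python
def matches_literal(value, literal):
    # Equality test: 5 == 5.0 is True, so an int literal also matches a float.
    return value == literal


def matches_type(value, typ):
    # isinstance-style test, one possible reading of the "x:int" patterns.
    return isinstance(value, typ)


def matches_with_guard(value, typ, guard):
    # Type test first; the guard only runs if the type test passed.
    return isinstance(value, typ) and guard(value)


print(matches_literal(5.0, 5))                         # True
print(matches_type(5.0, int))                          # False
print(matches_with_guard(-1, int, lambda v: v < 0))    # True
print(matches_with_guard("a", int, lambda v: v < 0))   # False
```

In the last call the guard is never evaluated, which matters because `"a" < 0` would raise TypeError. Whether the guard modifies the preceding test or stands alone is exactly the kind of detail a full definition would need to pin down.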
On 5/3/2018 9:16 AM, Joao S. O. Bueno wrote:
> (sorry - my above message is about using a dictionary - the weird "=>" tokens should be just plain ":" - the point is that Python already has syntax to do what is asked)
>
> On 3 May 2018 at 10:15, Joao S. O. Bueno <jsbueno@python.org.br> wrote:
>> What about instead of
>>
>>     number = match x:
>>         1 => "one"
>>         2 => "two"
>>         3 => "three"
>>         10 => "ten"
>>         _ => "anything"
>>
>>     number = {
>>         1 => "one"
>>         2 => "two"
>>         3 => "three"
>>         10 => "ten"
>>     }.get(x, "anything")
>>
>> No magic syntax with blocks starting inside an assignment, just to start with.

This was my initial response until I read the further examples that cannot be done with a dict.

--
Terry Jan Reedy
On Thu, May 3, 2018 at 2:41 PM, Robert Roskam <raiderrobert@gmail.com> wrote:
> And for the sake of completeness, here are other languages with similar syntax features and their associated documentation [...]

Still for the sake of completeness, and without any judgement from me at this point, a couple more, which are more related to Python:

Coconut: http://coconut.readthedocs.io/en/master/DOCS.html#match
Mochi: https://github.com/i2y/mochi#pattern-matching

S.

--
Stefane Fermigier - http://fermigier.com/ - http://twitter.com/sfermigier - http://linkedin.com/in/sfermigier
Founder & CEO, Abilian - Enterprise Social Software - http://www.abilian.com/
Chairman, Free&OSS Group @ Systematic Cluster - http://www.gt-logiciel-libre.org/
Co-Chairman, National Council for Free & Open Source Software (CNLL) - http://cnll.fr/
Founder & Organiser, PyParis & PyData Paris - http://pyparis.org/ & http://pydata.fr/
"Stéfane" == Stéfane Fermigier <sf@fermigier.com> writes:
Stéfane> On Thu, May 3, 2018 at 2:41 PM, Robert Roskam <raiderrobert@gmail.com>
Stéfane> wrote:
>> And for the sake of completeness, here are other languages with similar
>> syntax features and their associated documentation [...]

Stéfane> Still for the sake of completeness, and without any judgement from me at
Stéfane> this point, a couple more, which are more related to Python:

Stéfane> Coconut: http://coconut.readthedocs.io/en/master/DOCS.html#match
Stéfane> Mochi: https://github.com/i2y/mochi#pattern-matching

There's also macropy http://macropy3.readthedocs.io/en/latest/pattern.html

--
Alberto Berti - Information Technology Consultant

"gutta cavat lapidem"
> # Pattern matching with guards
> x = 'three'
>
> number = match x:
>     1 => "one"
>     y if y is str => f'The string is {y}'
>     _ => "anything"
>
> print(number)  # The string is three

I think you meant to use isinstance(y, str)? This looks like an incomplete ternary as well, missing the else clause, so it wouldn't be a valid expression either way. And a NameError: y is never defined anywhere. Since there's no name bound to the variable being matched, would that mean a new keyword? Also, in the rest you seem to be testing "x == {matchidentifier}", but here it suddenly looks like a boolean True would trigger a match? And if so, would other truthy values also trigger one, making the entire construct rather... limited?

It looks like you're attempting to suggest at least 3 new language syntax elements here. 4 if I count the type matching of "x:int", which you could sell as type annotation, if those hadn't been explicitly optional and ignored when actually binding the names.

And almost every other example can be solved with a dict and .get(). The remainder uses a dict as the match, and still works with if/elif perfectly fine.

Also, I'd say "but other people do it" isn't a valid reason for implementation. There are plenty of people doing stupid things; that doesn't mean it's a good idea to do it too. If the idea can't stand on its own, it's not worth it.

From the syntax corner, it also doesn't really look like Python to me.

(my apologies if I sound a bit hostile. I've attempted 3 rewrites to get that out. I only really tried to look at the syntax with what I suppose is its intended meaning here.)

2018-05-03 15:36 GMT+02:00 Alberto Berti <alberto@metapensiero.it>:
"Stéfane" == Stéfane Fermigier <sf@fermigier.com> writes:
[...]
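Two of the objections above can be checked directly in current Python: `y is str` compares identity with the type object rather than testing the type of `y`, and a conditional expression without `else` does not compile. This is a small sketch; the file name "<guard>" passed to `compile` is arbitrary.

```python
y = 'three'

# Identity with the type object itself: a string instance is never `str`.
identity_check = y is str
# Type test, which is presumably what the guard was meant to express.
type_check = isinstance(y, str)

# A conditional expression needs an else branch in today's grammar.
try:
    compile("y if isinstance(y, str)", "<guard>", "eval")
    compiles = True
except SyntaxError:
    compiles = False

print(identity_check, type_check, compiles)  # False True False
```

Both results describe Python's existing expression grammar; they say nothing about how a new pattern syntax would be parsed.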
On 2018-05-03 15:02, Jacco van Dorp wrote:
> # Pattern matching with guards
> x = 'three'
>
> number = match x:
>     1 => "one"
>     y if y is str => f'The string is {y}'
>     _ => "anything"
>
> print(number)  # The string is three
>
> I think you meant to use isinstance(y, str)? This looks like an incomplete ternary as well, missing the else clause, so it wouldn't be a valid expression either way. And a NameError: y is never defined anywhere. Since there's no name bound to the variable being matched, would that mean a new keyword? Also, in the rest you seem to be testing "x == {matchidentifier}", but here it suddenly looks like a boolean True would trigger a match? And if so, would other truthy values also trigger one, making the entire construct rather... limited?
>
> It looks like you're attempting to suggest at least 3 new language syntax elements here. 4 if I count the type matching of "x:int", which you could sell as type annotation, if those hadn't been explicitly optional and ignored when actually binding the names.

It's a proposal for new syntax.

I suspect that you're trying to read the left-hand side of the match cases as Python expressions. They're a distinct thing: unbound names like 'y' are an essential component of any non-trivial destructuring pattern match, as opposed to an error in an expression.

I believe the intention in the example you quoted is syntax something like:

    <match-case> ::= <pattern>
                   | <pattern> "if" <expression>

where the expression is a guard expression evaluated in the context of the matched pattern.

IOW, it could be written like this, too:

    number = match x:
        1 if True => "one"
        y if isinstance(y, str) => f'The string is {y}'
        _ if True => "anything"

I do see a lot of room for bikeshedding around the specific spelling. I'm going to try to resist the temptation ;)
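The grammar above can be imitated in plain Python to show the intended order of evaluation: the pattern is tried first and may bind names, and the guard is then evaluated with those bindings. This is a rough sketch, not the proposal's implementation; `match_first`, `literal`, and `capture` are hypothetical helpers invented for the illustration.

```python
def literal(expected):
    """Pattern that matches by equality and binds nothing."""
    return lambda value: {} if value == expected else None


def capture(name):
    """Pattern that matches anything and binds the value to `name`."""
    return lambda value: {name: value}


def match_first(value, cases):
    """Return the result of the first case whose pattern and guard both succeed."""
    for pattern, guard, result in cases:
        bindings = pattern(value)
        if bindings is None:
            continue  # pattern failed, so the guard is never evaluated
        if guard(**bindings):
            return result(**bindings)
    return None  # no case matched, mirroring the "returns None" example


cases = [
    (literal(1), lambda: True, lambda: "one"),
    (capture("y"), lambda y: isinstance(y, str), lambda y: f"The string is {y}"),
    (capture("_"), lambda _: True, lambda _: "anything"),
]

print(match_first("three", cases))  # The string is three
print(match_first(1, cases))        # one
print(match_first(2.5, cases))      # anything
```

The name `y` only exists inside the guard and result callables, which is the sense in which an unbound name on the left-hand side is a binding site rather than an expression.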
And almost every other example can be solved with a dict and .get(). The remainder uses a dict as match, and still work on if/elif perfectly fine.
How about this?

    def hyperop(n, a, b):
        return match (n, a, b):
            (0, _, b) => b + 1
            (1, a, 0) => a
            (2, _, 0) => 0
            (_, _, 0) => 1
            (n, a, b) => hyperop(n-1, a, hyperop(n, a, b-1))

versus:

    def hyperop(n, a, b):
        if n == 0:
            return b + 1
        if n == 1 and b == 0:
            return a
        if n == 2 and b == 0:
            return 0
        if b == 0:
            return 1
        return hyperop(n-1, a, hyperop(n, a, b-1))

Of course the latter *works* (sort of: implementing tetration by recursively adding ones never really works) but it's excessively verbose, it's easy to make mistakes when writing it out, and at least in my opinion it's harder to see what it does. It also raises some small but annoying questions, like "do I use if or elif?", "do I use an if for the last case too?", "do I nest the b == 0 cases?", and, if we had used an if for the last case, what we do if we get to the end anyway.
Also, I'd say "but other people do it" isn't a valid reason for implementation. There are plenty of people doing stupid things; that doesn't mean it's a good idea to do it too. If the idea can't stand on its own, it's not worth it.
From the syntax corner, it also doesn't really look like Python to me.
I agree, but I'm sure someone can come up with something prettier.
On 03/05/18 18:18, Ed Kellett wrote:
It's a proposal for new syntax.
I snipped the rest because fundamentally you have failed to explain your new syntax in any clear way. You've given examples of varying levels of complexity but failed to explain what any of them should actually do in words. It wasn't even obvious from your introduction that you were talking about match *expressions* rather than switch statements. Sorry, but this is too unclear to comment on at the moment. -- Rhodri James *-* Kynesim Ltd
On 2018-05-03 18:53, Rhodri James wrote:
On 03/05/18 18:18, Ed Kellett wrote:
It's a proposal for new syntax.
I snipped the rest because fundamentally you have failed to explain your new syntax in any clear way. You've given examples of varying levels of complexity but failed to explain what any of them should actually do in words. It wasn't even obvious from your introduction that you were talking about match *expressions* rather than switch statements.
Sorry, but this is too unclear to comment on at the moment.
It's not my new syntax.
On Fri, May 4, 2018 at 3:18 AM, Ed Kellett <e+python-ideas@kellett.im> wrote:
I believe the intention in the example you quoted is syntax something like:
<match-case> ::= <pattern> | <pattern> "if" <expression>
where the expression is a guard expression evaluated in the context of the matched pattern.
IOW, it could be written like this, too:
    number = match x:
        1 if True => "one"
        y if isinstance(y, str) => f'The string is {y}'
        _ if True => "anything"
I do see a lot of room for bikeshedding around the specific spelling. I'm going to try to resist the temptation ;)
Okay, let me try to tease apart your example.

1) A literal matches anything that compares equal to that value.
2) A name matches anything at all, and binds it to that name.
2a) An underscore matches anything at all. It's just a name, and follows a common convention.
3) "if cond" modifies the prior match; if the condition evaluates as falsey, the match does not match.
4) As evidenced below, a comma-separated list of comparisons matches a tuple with as many elements, and each element must match.

Ultimately, this has to be a series of conditions, so this is effectively a syntax for an elif tree as an expression.

For another example, here's a way to use inequalities to pick a numeric formatting:

    display = match number:
        x if x < 1e3: f"{number}"
        x if x < 1e6: f"{number/1e3} thousand"
        x if x < 1e9: f"** {number/1e6} million **"
        x if x < 1e12: f"an incredible {number/1e9} billion"
        _: "way WAY too many"

I guarantee you that people are going to ask for this to be spelled simply "< 1e3" instead of having the "x if x" part. :)
How about this?
    def hyperop(n, a, b):
        return match (n, a, b):
            (0, _, b) => b + 1
            (1, a, 0) => a
            (2, _, 0) => 0
            (_, _, 0) => 1
            (n, a, b) => hyperop(n-1, a, hyperop(n, a, b-1))

versus:

    def hyperop(n, a, b):
        if n == 0:
            return b + 1
        if n == 1 and b == 0:
            return a
        if n == 2 and b == 0:
            return 0
        if b == 0:
            return 1
        return hyperop(n-1, a, hyperop(n, a, b-1))
I have no idea what this is actually doing, and it looks like a port of Haskell code. I'd want to rewrite it as a 'while' loop with maybe one level of recursion in it, instead of two. (Zero would be better, but I think that's not possible. Maybe?) Is this something that you do a lot of? Is the tuple (n, a, b) meaningful as a whole, or are the three values independently of interest? Not sure how this is useful without a lot more context. ChrisA
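For comparison, the numeric-formatting example earlier in this message is "an elif tree as an expression", and that tree can be written today as an ordinary function. This is a sketch only; the name `format_number` is an assumption made for illustration and does not appear in the thread.

```python
def format_number(number):
    """Pick a display format by magnitude, mirroring the guarded match arms."""
    if number < 1e3:
        return f"{number}"
    elif number < 1e6:
        return f"{number/1e3} thousand"
    elif number < 1e9:
        return f"** {number/1e6} million **"
    elif number < 1e12:
        return f"an incredible {number/1e9} billion"
    else:
        # Corresponds to the final `_` arm.
        return "way WAY too many"


print(format_number(2500))
```

The conditions are tried in order and the first true one wins, which is the same "no fall through" behaviour the proposal asks for.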
Calculating the Ackermann function as Knuth up-arrows really has little practical use. The first few values are well known, the rest won't be calculated before the heat death of the universe.
Hey Chris,

So I started extremely generally with my syntax, but it seems like I should provide a lot more examples of real use. Examples are hard. Here's my hastily put together example from an existing piece of production code:

    # Existing Production Code
    from datetime import timedelta, date
    from django.utils import timezone

    def convert_time_to_timedelta(unit:str, amount:int, now:date):
        if unit in ['days', 'hours', 'weeks']:
            return timedelta(**{unit: amount})
        elif unit == 'months':
            return timedelta(days=30 * amount)
        elif unit == 'years':
            return timedelta(days=365 * amount)
        elif unit == 'cal_years':
            return now - now.replace(year=now.year - amount)

    # New Syntax for same problem
    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return match unit:
            'days', 'hours', 'weeks' => timedelta(**{unit: amount})
            'months' => timedelta(days=30 * amount)
            'years' => timedelta(days=365 * amount)
            'cal_years' => now - now.replace(year=now.year - amount)
On Fri, May 4, 2018 at 4:36 AM, Robert Roskam <raiderrobert@gmail.com> wrote:
Hey Chris,
So I started extremely generally with my syntax, but it seems like I should provide a lot more examples of real use. Examples are hard. Here's my hastily put together example from an existing piece of production code:
# New Syntax for same problem
    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return match unit:
            'days', 'hours', 'weeks' => timedelta(**{unit: amount})
            'months' => timedelta(days=30 * amount)
            'years' => timedelta(days=365 * amount)
            'cal_years' => now - now.replace(year=now.year - amount)
Okay, here we may have a problem. You're expecting a comma separated set of values to indicate "any of these", but elsewhere, pattern matching against a list of values is making an assertion about a tuple. So if you have any pattern matching that isn't based on equality, you're going to need to clearly stipulate how your syntax works. If you are NOT going to support tuple pattern matching (but only dict), you'll need to make this VERY clear, because people are going to expect it. ChrisA
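The two readings contrasted above already exist as separate operations in current Python, which may help show why one comma-separated spelling would be ambiguous. A small sketch; the variable names and the helper `matches_pair` are illustrative only and are not part of any proposal in the thread.

```python
units = ('days', 'hours', 'weeks')

# Reading 1: "any of these" is a membership test against the subject.
unit = 'days'
any_of = unit in units  # 'days' is one of the listed alternatives


# Reading 2: a tuple pattern asserts the subject's shape and its elements.
def matches_pair(subject, first):
    """Structural check: a 2-tuple whose first element equals `first`."""
    return (
        isinstance(subject, tuple)
        and len(subject) == 2
        and subject[0] == first
    )


shape_ok = matches_pair(('days', 3), 'days')   # a 2-tuple starting with 'days'
shape_bad = matches_pair('days', 'days')       # a str is not a 2-tuple
```

A syntax that spells both with bare commas would need some other marker to tell them apart.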
Hey Chris,

Thanks for bringing that up! Before submitting this, I actually had the syntax for multiple matches for one arm being separated by or. And frankly I just didn't like how that looked for more than 3 items:

    '1' or '2' or '3' or '4' or '5'

vs

    '1', '2', '3', '4', '5'

But you're right. The syntax should be for tuples instead. Here's my revised syntax, using a guard instead for the moment:

    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return match unit:
            x if x in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
            'months' => timedelta(days=30 * amount)
            'years' => timedelta(days=365 * amount)
            'cal_years' => now - now.replace(year=now.year - amount)
On Fri, May 4, 2018 at 5:02 AM, Robert Roskam <raiderrobert@gmail.com> wrote:
Hey Chris,
Thanks for bringing that up! Before submitting this, I actually had the syntax for multiple matches for one arm being separated by or. And frankly I just didn't like how that looked for more than 3 items:
'1' or '2' or '3' or '4' or '5' vs '1', '2', '3', '4', '5'
But you're right. The syntax should be for tuples instead.
Agreed.
Here's my revised syntax, using a guard instead for the moment:
    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return match unit:
            x if x in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
            'months' => timedelta(days=30 * amount)
            'years' => timedelta(days=365 * amount)
            'cal_years' => now - now.replace(year=now.year - amount)
And then this comes down to the same as all the other comparisons - the "x if x" gets duplicated. So maybe it would be best to describe this thus:

    match <expr> :
        <expr> | (<comp_op> <expr>) => <expr>

If it's just an expression, it's equivalent to a comp_op of '=='. The result of evaluating the match expression is then used as the left operand for ALL the comparisons. So you could write your example as:

    return match unit:
        in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
        'months' => timedelta(days=30 * amount)
        'years' => timedelta(days=365 * amount)
        'cal_years' => now - now.replace(year=now.year - amount)

Then there's room to expand that to a comma-separated list of values, which would pattern-match a tuple.

ChrisA
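The "subject is the left operand of every comparison" idea can be approximated in current Python with a table of comparison functions. This is a rough sketch, not the proposed semantics: `match_with_ops`, `is_in` and `convert` are hypothetical names, and the 'cal_years' arm is left out to keep the example short.

```python
import operator
from datetime import timedelta


def is_in(value, container):
    """`value in container`, keeping the subject as the left operand."""
    return value in container


def match_with_ops(subject, arms):
    """Return the result of the first arm whose comparison holds.

    Each arm is (comp_op, operand, thunk). Only the matching arm's thunk
    is called, so results are evaluated lazily.
    """
    for comp_op, operand, thunk in arms:
        if comp_op(subject, operand):
            return thunk()
    return None  # no arm matched


def convert(unit, amount):
    return match_with_ops(unit, [
        (is_in, ('days', 'hours', 'weeks'), lambda: timedelta(**{unit: amount})),
        (operator.eq, 'months', lambda: timedelta(days=30 * amount)),
        (operator.eq, 'years', lambda: timedelta(days=365 * amount)),
    ])


print(convert('hours', 2))
```

A custom `is_in` is used because `operator.contains(a, b)` tests `b in a`, which would put the subject on the wrong side.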
On 2018-05-03 20:17, Chris Angelico wrote:
    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return match unit:
            x if x in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
            'months' => timedelta(days=30 * amount)
            'years' => timedelta(days=365 * amount)
            'cal_years' => now - now.replace(year=now.year - amount)
And then this comes down to the same as all the other comparisons - the "x if x" gets duplicated. So maybe it would be best to describe this thus:
match <expr> : <expr> | (<comp_op> <expr>) => <expr>
If it's just an expression, it's equivalent to a comp_op of '=='. The result of evaluating the match expression is then used as the left operand for ALL the comparisons. So you could write your example as:
    return match unit:
        in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
        'months' => timedelta(days=30 * amount)
        'years' => timedelta(days=365 * amount)
        'cal_years' => now - now.replace(year=now.year - amount)
Then there's room to expand that to a comma-separated list of values, which would pattern-match a tuple.
I believe there are some problems with this approach. That case uses no destructuring at all, so the syntax that supports destructuring looks clumsy. In general, if you want to support something like:

    match spec:
        (None, const) => const
        (env, fmt) if env => fmt.format(**env)

then I think something like the 'if' syntax is essential for guards.

One could also imagine cases where it'd be useful to guard on more involved properties of things:

    match number_ish:
        x:str if x.lower().startswith('0x') => int(x[2:], 16)
        x:str => int(x)
        x => x #yolo

(I know base=0 exists, but let's imagine we're implementing base=0, or something).

I'm usually against naming things, and deeply resent having to name the x in [x for x in ... if ...] and similar constructs. But in this specific case, where destructuring is kind of the point, I don't think there's much value in compromising that to avoid a name.

I'd suggest something like this instead:

    return match unit:
        _ in {'days', 'hours', 'weeks'} => timedelta(**{unit: amount})
        ...

So a match entry would be one of:

- A pattern. See below
- A pattern followed by "if" <expr>, e.g.:
  (False, x) if len(x) >= 7
- A comparison where the left-hand side is a pattern, e.g.:
  _ in {'days', 'hours', 'weeks'}

Where a pattern is one of:

- A display of patterns, e.g.:
  {'key': v, 'ignore': _}
  I think *x and **x should be allowed here.
- A comma-separated list of patterns, making a tuple
- A pattern enclosed in parentheses
- A literal (that is not a formatted string literal, for sanity)
- A name
- A name with a type annotation

To give a not-at-all-motivating but hopefully illustrative example:

    return match x:
        (0, _) => None
        (n, x) if n < 32 => ', '.join([x] * n)
        x:str if len(x) <= 5 => x
        x:str => x[:2] + '...'
        n:Integral < 32 => '!' * n

Where:

- (0, 'blorp') would match the first case, yielding None
- (3, 'hello') would match the second case, yielding "hello, hello, hello"
- 'frogs' would match the third case, yielding "frogs"
- 'frogs!' would match the fourth case, yielding "fr..."
- 3 would match the fifth case, yielding '!!!'

I think the matching process would mostly be intuitive, but one detail that might raise some questions: (x, x) could be allowed, and it'd make a lot of sense for that to match only (1, 1), (2, 2), ('hi', 'hi'), etc. But that'd make the _ convention less useful unless it became more than a convention.

All in all, I like this idea, but I think it might be a bit too heavy to get into Python. It has the feel of requiring quite a lot of new things.
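To make the five illustrative cases concrete, here is one possible hand translation into current Python. It is a sketch under assumptions: the function name `illustrate` is hypothetical, tuple patterns are taken to mean "a tuple of exactly that length", and a subject that matches no arm yields None.

```python
from numbers import Integral


def illustrate(x):
    # Arms 1 and 2: two-element tuple patterns.
    if isinstance(x, tuple) and len(x) == 2:
        n, item = x
        if n == 0:                      # (0, _) => None
            return None
        if n < 32:                      # (n, x) if n < 32
            return ', '.join([item] * n)
    # Arms 3 and 4: a name with a str annotation, with and without a guard.
    if isinstance(x, str):
        if len(x) <= 5:
            return x
        return x[:2] + '...'
    # Arm 5: an annotated name combined with a comparison.
    if isinstance(x, Integral) and x < 32:
        return '!' * x
    return None                         # nothing matched


print(illustrate((3, 'hello')))
```

Written this way the isinstance and length checks that the pattern syntax would imply become visible, which is roughly the "quite a lot of new things" the message alludes to.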
Would this be valid?

    # Pattern matching with guards
    x = 'three'

    number = match x:
        1 => "one"
        y if y is str => f'The string is {y}'
        z if z is int => f'the int is {z}'
        _ => "anything"

    print(number)  # The string is three

If so, why are y and z both valid here? Is the match variable rebound to any other? Or even to all names?

ofc, you could improve the clarity here with:

    number = match x as y:

or any variant thereof. This way, you'd explicitly bind the variable you use for testing. If you don't, the interpreter would never know which ones to treat as rebindings and which to draw from surrounding scopes, if any.

I also haven't really seen a function where this would be better than existing syntax, and the above is the only one to actually try something not possible with dicts. The type checking one could better be:

    x = 1
    d = {
        int:"integer",
        float:"float",
        str:"str"
    }
    d.get(type(x), None)

The production datetime code could be:

    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return {
            "days":timedelta(**{unit: amount}),
            "hours":timedelta(**{unit: amount}),
            "weeks":timedelta(**{unit: amount}),
            # why not something like subtracting two dates here to get an accurate timedelta for your specific interval ?
            "months":timedelta(days = 30*amount),  # days = (365.25 / 12)*amount ? Would be a lot more accurate for average month length. (30.4375)
            "years":timedelta(days=365*amount),  # days = 365.25*amount ?
            "cal_years":now - now.replace(year=now.year - amount),
        }.get(unit)

I honestly don't see the advantages of new syntax here. Unless you hate the eager evaluation in the dict literal getting indexed, so if it's performance critical an if/else might be better. But I can't see a match statement outperforming if/else. (and if you really need faster than if/else, you should perhaps move that bit of code to C or something.)
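The eager evaluation mentioned for the dict-based rewrite affects correctness as well as speed: every value in the literal is built on each call, so `timedelta(**{unit: amount})` is evaluated even when `unit` is 'months', and `timedelta` rejects unknown keyword arguments with a TypeError. A common workaround is to store callables and call only the selected one. A sketch, with the function name `convert_time_to_timedelta_lazy` assumed for illustration:

```python
from datetime import date, timedelta


def convert_time_to_timedelta_lazy(unit, amount, now):
    # Shared handler for the units that timedelta accepts directly.
    def direct():
        return timedelta(**{unit: amount})

    table = {
        "days": direct,
        "hours": direct,
        "weeks": direct,
        "months": lambda: timedelta(days=30 * amount),
        "years": lambda: timedelta(days=365 * amount),
        "cal_years": lambda: now - now.replace(year=now.year - amount),
    }
    handler = table.get(unit)
    # Only the chosen handler runs; unknown units yield None.
    return handler() if handler is not None else None


print(convert_time_to_timedelta_lazy("months", 2, date(2018, 5, 4)))
```

This keeps the lookup-table shape while matching the if/elif version's behaviour of evaluating a single branch.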
Note that most languages that you mentioned as references are functional (so they don't have a statement/expression distinction like Python has), and those that are not, have matching statements. The only exception is Javascript, but in Javascript the distinction is not that hard given that it has the idiom (function() {stmt; stmt; stmt})() to have any statement block as an expression. And again, as I mentioned it's an outlier. Other imperative languages like C and Java have of course switch statements, which are similar.

Making a quick search for real code that could benefit from this, I mostly found situations where a matching *statement* would be required instead of a matching *expression*. To give you the examples I found in the stdlib for Python3.6 (I grepped for "elif" and looked for "similar" branches manually, covering the first ~20%):

- fnmatch.translate (match c: ... string options)
- telnetlib.Telnet.process_rawq (match len(self.iacseq): ... integer options)
- mimetypes[module __main__ body] (match opt: ... multiple str options per match)
- typing._remove_dups_flatten (match p: ... isinstance checks + custom condition) [this *could* be an expression with some creativity]
- typing.GenericMeta.__getitem__ (match self: ... single and multiple type options by identity)
- turtle.Shape.__init__ (match type_: ... str options)
- turtle.TurtleScreen._color (match len(cstr): ... int options)
- turtle.TurtleScreen.colormode (match cmode: ... mixed type options)
- turtle.TNavigator.distance (match x: ... isinstance checks) [could be an expression]
- turtle.TNavigator.towards (match x: ... isinstance checks) [could be an expression]
- turtle.TPen.color (match l: ... integer options. l is set to len(args) the line before)
- turtle._TurtleImage._setshape (match self._type: ... str options) [could be an expression]
- turtle.RawTurtle.__init__ (match canvas: ... isinstance checks)
- turtle.RawTurtle.clone (match ttype: ... str options) [could be an expression]
- turtle.RawTurtle._getshapepoly (match self._resizemode: ... str options, one with a custom condition or'ed)
- turtle.RawTurtle._drawturtle (match ttype: ... str options)
- turtle.RawTurtle.stamp (match ttype: ... str options)
- turtle.RawTurtle._undo (match action: ... str options)
- ntpath.expandvars (match c: ... str options)
- sre_parse.Subpattern.getwidth (match op: ... nonliteral int constants, actually a NamedIntConstant which subclasses int)
- sre_parse._class_escape (match c: ... string options with custom conditions, and inclusion+equality mixed)
- sre_parse._escape (match c: ... string options with custom conditions, and inclusion+equality mixed)
- sre_parse._parse (match this: ... string options with in, not in, and equality)
- sre_parse._parse (match char: ... string options with in, and equality)
- sre_parse.parse_template (match c: ... string options with in)
- netrc.netrc._parse (match tt: ... string options with custom conditions)
- netrc.netrc._parse (match tt: ... string options with custom conditions) [not a duplicate, there are two possible statements here]
- argparse.HelpFormatter._format_args (match action.nargs: ... str/None options) [this *could* be an expression with some creativity/transformations]
- argparse.ArgumentParser._get_nargs_pattern (match nargs: ... str/None options) [could be an expression]
- argparse.ArgumentParser._get_values (match action.nargs: ... str/None options with extra conditions)
- _strptime._strptime (match group_key: ... str options)
- datetime._wrap_strftime (match ch: ... str options)
- pickletools.optimize (match opcode,name: ... str options with reverse inclusion and equality)
- json/encoder._make_iterencode (match value: ... mixed options and isinstance checks)
- json/encoder._make_iterencode._iterencode_dict (match key: ... mixed options and isinstance checks)
- json/encoder._make_iterencode._iterencode_dict (match value: ... mixed options and isinstance checks)
- json/encoder._make_iterencode._iterencode (match o: ... mixed options and isinstance checks)
- json/scanner.py_make_scanner._scan_once (match nextchar: ... str options) [could be turned into expression with some transformation]
- unittest.mock._Call.__new__ (match _len: ... int options)
- unittest.mock._Call.__eq__ (match len_other: ... int options)

(I'm not saying that all these should be match statements, only that they could be).

Cases where an expression would solve the issue are somewhat uncommon (there are many state machines, including many string or argument parsers that set state depending on the option, or object builders that grow data structures). A usual situation is that some of the branches need to raise exceptions (and raise in Python is a statement, not an expression). This could be worked around by making the default a raise ValueError that can be caught and reraised as something else, but that would end up making the code deeper, and IMO, more complicated. Also, many of the situations where an expression could be used are string matches where a dictionary lookup would work well anyway.

My conclusions for this are:

1. It makes more sense to talk about a statement, not an expression
2. Good/clear support for strings, ints and isinstance checks is essential (other fancier things may help more circumstantially)
3. The "behaviour when there's no match" should be quite flexible. I saw many "do nothing" and many "do something" (with a large part of the latter being "raise an exception")
4. There's a pattern of re-evaluating something on each branch of an if/elif (like len(foo) or self.attr); it is also common to create a dummy variable just before the if/elif. This can also be fodder for PEP-572 discussion

That's what I have for now

On 4 May 2018 at 08:26, Jacco van Dorp <j.van.dorp@deonet.nl> wrote:
Would this be valid?
    # Pattern matching with guards
    x = 'three'

    number = match x:
        1 => "one"
        y if y is str => f'The string is {y}'
        z if z is int => f'the int is {z}'
        _ => "anything"

    print(number)  # The string is three
If so, why are y and z both valid here ? Is the match variable rebound to any other ? Or even to all names ?
ofc, you could improve the clarity here with:
number = match x as y:
or any variant thereof. This way, you'd explicitly bind the variable you use for testing. If you don't, the interpreter would never know which ones to treat as rebindings and which to draw from surrounding scopes, if any.
I also haven't really seen a function where this would be better than existing syntax, and the above is the only one to actually try something not possible with dicts. The type checking one could better be:
    x = 1
    d = {
        int:"integer",
        float:"float",
        str:"str"
    }
    d.get(type(x), None)
The production datetime code could be:
    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return {
            "days":timedelta(**{unit: amount}),
            "hours":timedelta(**{unit: amount}),
            "weeks":timedelta(**{unit: amount}),
            # why not something like subtracting two dates here to get an
            # accurate timedelta for your specific interval ?
            "months":timedelta(days = 30*amount),  # days = (365.25 / 12)*amount ? Would be a lot more accurate for average month length. (30.4375)
            "years":timedelta(days=365*amount),  # days = 365.25*amount ?
            "cal_years":timedelta(now - now.replace(year=now.year - amount)),
        }.get(unit)
I honestly don't see the advantages of new syntax here. Unless you hate the eager evaluation in the dict literal getting indexed, so if it's performance critical an if/else might be better. But I can't see a match statement outperforming if/else. (and if you really need faster than if/else, you should perhaps move that bit of code to C or something.)
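The eager-evaluation concern mentioned above is not only about speed. A minimal sketch, assuming a cut-down two-entry version of the table and made-up function names: because every value in a dict literal is computed before the lookup, a branch that fails for some inputs breaks the lookup of every other key as well. Wrapping each value in a lambda defers the work until the branch is chosen.

```python
from datetime import date, timedelta


def to_timedelta_eager(unit, amount, now):
    # Every value is evaluated here, whichever unit was requested.
    table = {
        "days": timedelta(days=amount),
        "cal_years": now - now.replace(year=now.year - amount),
    }
    return table.get(unit)


def to_timedelta_lazy(unit, amount, now):
    # Values are zero-argument callables; only the selected one runs.
    table = {
        "days": lambda: timedelta(days=amount),
        "cal_years": lambda: now - now.replace(year=now.year - amount),
    }
    branch = table.get(unit)
    return branch() if branch is not None else None


leap_day = date(2020, 2, 29)
print(to_timedelta_lazy("days", 3, leap_day))  # 3 days, 0:00:00

try:
    to_timedelta_eager("days", 3, leap_day)
except ValueError as exc:
    # 29 February does not exist in 2019, so the unused "cal_years"
    # entry raises while the dict is being built.
    print("eager table failed:", exc)
```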
2018-05-04 0:34 GMT+02:00 Ed Kellett <e+python-ideas@kellett.im>:
On 2018-05-03 20:17, Chris Angelico wrote:
    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return match unit:
            x if x in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
            'months' => timedelta(days=30 * amount)
            'years' => timedelta(days=365 * amount)
            'cal_years' => now - now.replace(year=now.year - amount)
And then this comes down to the same as all the other comparisons - the "x if x" gets duplicated. So maybe it would be best to describe this thus:
match <expr> : <expr> | (<comp_op> <expr>) => <expr>
If it's just an expression, it's equivalent to a comp_op of '=='. The result of evaluating the match expression is then used as the left operand for ALL the comparisons. So you could write your example as:
    return match unit:
        in ('days', 'hours', 'weeks') => timedelta(**{unit: amount})
        'months' => timedelta(days=30 * amount)
        'years' => timedelta(days=365 * amount)
        'cal_years' => now - now.replace(year=now.year - amount)
Then there's room to expand that to a comma-separated list of values, which would pattern-match a tuple.
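The "implicit left operand" idea can be approximated in today's Python, which may help when judging it. This is only a sketch: `match_first`, `equals` and `is_in` are hypothetical helpers, each case is a (comparison, operand, result) triple, and results are wrapped in lambdas so that only the selected branch is evaluated.

```python
from datetime import date, timedelta


def equals(value, operand):
    return value == operand


def is_in(value, operand):
    return value in operand


def match_first(value, cases):
    """Return the result of the first case whose comparison holds, else None."""
    for compare, operand, result in cases:
        if compare(value, operand):
            return result()
    return None


def convert_time_to_timedelta(unit, amount, now):
    # `unit` is the left operand of every comparison, top to bottom.
    return match_first(unit, [
        (is_in, ('days', 'hours', 'weeks'), lambda: timedelta(**{unit: amount})),
        (equals, 'months', lambda: timedelta(days=30 * amount)),
        (equals, 'years', lambda: timedelta(days=365 * amount)),
        (equals, 'cal_years', lambda: now - now.replace(year=now.year - amount)),
    ])


print(convert_time_to_timedelta('weeks', 2, date(2018, 5, 4)))      # 14 days, 0:00:00
print(convert_time_to_timedelta('cal_years', 1, date(2018, 5, 4)))  # 365 days, 0:00:00
```

The emulation has no destructuring at all, which is the limitation the next reply picks up on.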
I believe there are some problems with this approach. That case uses no destructuring at all, so the syntax that supports destructuring looks clumsy. In general, if you want to support something like:
    match spec:
        (None, const) => const
        (env, fmt) if env => fmt.format(**env)
then I think something like the 'if' syntax is essential for guards.
One could also imagine cases where it'd be useful to guard on more involved properties of things:
    match number_ish:
        x:str if x.lower().startswith('0x') => int(x[2:], 16)
        x:str => int(x)
        x => x  #yolo
(I know base=0 exists, but let's imagine we're implementing base=0, or something).
I'm usually against naming things, and deeply resent having to name the x in [x for x in ... if ...] and similar constructs. But in this specific case, where destructuring is kind of the point, I don't think there's much value in compromising that to avoid a name.
I'd suggest something like this instead:
    return match unit:
        _ in {'days', 'hours', 'weeks'} => timedelta(**{unit: amount})
        ...

So a match entry would be one of:

- A pattern. See below
- A pattern followed by "if" <expr>, e.g.:
  (False, x) if len(x) >= 7
- A comparison where the left-hand side is a pattern, e.g.:
  _ in {'days', 'hours', 'weeks'}

Where a pattern is one of:

- A display of patterns, e.g.:
  {'key': v, 'ignore': _}
  I think *x and **x should be allowed here.
- A comma-separated list of patterns, making a tuple
- A pattern enclosed in parentheses
- A literal (that is not a formatted string literal, for sanity)
- A name
- A name with a type annotation
To give a not-at-all-motivating but hopefully illustrative example:
    return match x:
        (0, _) => None
        (n, x) if n < 32 => ', '.join([x] * n)
        x:str if len(x) <= 5 => x
        x:str => x[:2] + '...'
        n:Integral < 32 => '!' * n
Where:

- (0, 'blorp') would match the first case, yielding None
- (3, 'hello') would match the second case, yielding "hello, hello, hello"
- 'frogs' would match the third case, yielding "frogs"
- 'frogs!' would match the fourth case, yielding "fr..."
- 3 would match the fifth case, yielding '!!!'
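For comparison, here is one way the five cases above could be spelled with existing syntax. It is a sketch, not a definition of the proposed semantics: the function name `render` is made up, and returning None when nothing matches is an assumption borrowed from the original proposal.

```python
from numbers import Integral


def render(value):
    # (0, _) => None   and   (n, x) if n < 32 => ', '.join([x] * n)
    if isinstance(value, tuple) and len(value) == 2:
        n, x = value
        if n == 0:
            return None
        if n < 32:
            return ', '.join([x] * n)
    # x:str if len(x) <= 5 => x   and   x:str => x[:2] + '...'
    if isinstance(value, str):
        return value if len(value) <= 5 else value[:2] + '...'
    # n:Integral < 32 => '!' * n
    if isinstance(value, Integral) and value < 32:
        return '!' * value
    # Assumed behaviour when no case matches.
    return None


print(render((3, 'hello')))  # hello, hello, hello
print(render('frogs'))       # frogs
print(render('frogs!'))      # fr...
print(render(3))             # !!!
```

Note that the shape check, the unpacking and the guard each need their own line here, and that a 2-tuple failing both tuple cases has to fall through to the later checks by hand.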
I think the matching process would mostly be intuitive, but one detail that might raise some questions: (x, x) could be allowed, and it'd make a lot of sense for that to match only (1, 1), (2, 2), ('hi', 'hi'), etc. But that'd make the _ convention less useful unless it became more than a convention.
All in all, I like this idea, but I think it might be a bit too heavy to get into Python. It has the feel of requiring quite a lot of new things.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
-- Daniel F. Moisset - UK Country Manager - Machinalis Limited www.machinalis.co.uk <http://www.machinalis.com> Skype: @dmoisset T: + 44 7398 827139 1 Fore St, London, EC2Y 9DT Machinalis Limited is a company registered in England and Wales. Registered number: 10574987.
On 2018-05-04 08:26, Jacco van Dorp wrote:
Would this be valid?
    # Pattern matching with guards
    x = 'three'

    number = match x:
        1 => "one"
        y if y is str => f'The string is {y}'
        z if z is int => f'the int is {z}'
        _ => "anything"

    print(number)  # The string is three
If so, why are y and z both valid here ? Is the match variable rebound to any other ? Or even to all names ?
In the match case here:

    match x:
        y if y > 3 => f'{y} is >3'  # to use an example that works

there are three parts:

"y" is a pattern. It specifies the shape of the value to match: in this case, anything at all. Nothing is bound yet.

"if" is just the word if, used as a separator, nothing to do with "if" in expressions.

"y > 3" is the guard expression for the match case. Iff the pattern matches, "y > 3" is evaluated, with names appearing in the pattern taking the values they matched.

It's important to note that the thing on the left-hand side is explicitly *not* a variable. It's a pattern, which can look like a variable, but it could also be a literal or a display.
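The "example that works" remark presumably refers to the guards in the quoted snippet: `y is str` compares the value itself with the class `str`, so it is False for any actual string. A short check in ordinary Python (nothing here depends on the proposed syntax):

```python
x = 'three'

print(x is str)            # False: x is a str instance, not the str class
print(type(x) is str)      # True: exact type comparison
print(isinstance(x, str))  # True: also accepts subclasses of str
```

So a guard meant to test the type would need `isinstance(y, str)` or similar, whatever the surrounding match syntax ends up looking like.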
ofc, you could improve the clarity here with:
number = match x as y:
or any variant thereof. This way, you'd explicitly bind the variable you use for testing. If you don't, the interpreter would never know which ones to treat as rebindings and which to draw from surrounding scopes, if any.
I don't think anything in the pattern should be drawn from surrounding scopes.
I also haven't really seen a function where this would be better than existing syntax, and the above is the only one to actually try something not possible with dicts. The type checking one could better be:
[snip]
The production datetime code could be:
    def convert_time_to_timedelta_with_match(unit:str, amount:int, now:date):
        return {
            "days":timedelta(**{unit: amount}),
            "hours":timedelta(**{unit: amount}),
            "weeks":timedelta(**{unit: amount}),
            # why not something like subtracting two dates here to get an
            # accurate timedelta for your specific interval ?
            "months":timedelta(days = 30*amount),  # days = (365.25 / 12)*amount ? Would be a lot more accurate for average month length. (30.4375)
            "years":timedelta(days=365*amount),  # days = 365.25*amount ?
            "cal_years":timedelta(now - now.replace(year=now.year - amount)),
        }.get(unit)
Don't you think the repetition of ``timedelta(**{unit: amount})'' sort of proves OP's point? Incidentally, there's no need to use the dict trick when the unit is known statically anyway. I can't decide if that would count as more repetition or less.
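To spell out the "dict trick" remark: `timedelta(**{unit: amount})` passes `amount` under whatever keyword name `unit` holds, so once the unit is fixed per branch it is the same as writing the keyword directly. A quick check, with example values chosen only for illustration:

```python
from datetime import timedelta

unit, amount = "days", 3

dynamic = timedelta(**{unit: amount})  # keyword name taken from the string
static = timedelta(days=amount)        # keyword written out

print(dynamic == static)  # True
```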
I honestly don't see the advantages of new syntax here. Unless you hate the eager evaluation in the dict literal getting indexed, so if it's performance critical an if/else might be better. But I can't see a match statement outperforming if/else. (and if you really need faster than if/else, you should perhaps move that bit of code to C or something.)
It's not really about performance. It's about power. A bunch of if statements can do many things--anything, arguably--but their generality translates into repetition when dealing with many instances of this family of cases.
Can I recommend going slow here? This is a very interesting topic where many languages have gone before. I liked Daniel F Moisset's analysis about the choices of a language designer and his conclusion that match should be a statement.

I just noticed the very similar proposal for JavaScript linked to by the OP: https://github.com/tc39/proposal-pattern-matching -- this is more relevant than what's done in e.g. F# or Swift because Python and JS are much closer. (Possibly Elixir is also relevant, it seems the JS proposal borrows from that.)

A larger topic may be how to reach decisions. If I've learned one thing from PEP 572 it's that we need to adjust how we discuss and evaluate proposals. I'll think about this and start a discussion at the Language Summit about this.

--
--Guido van Rossum (python.org/~guido)
[Guido]
Can I recommend going slow here? This is a very interesting topic where many languages have gone before. I liked Daniel F Moisset's analysis about the choices of a language designer and his conclusion that match should be a statement.
Just to be annoying ;-) , I liked the way he _reached_ that conclusion: by looking at real-life Python code that may have been written instead to use constructs "like this". I find such examination far more persuasive than abstract arguments or made-up examples.

An observation: syntax borrowed from functional languages often fails to work well in practice when grafted onto a language that's statement-oriented - it only works well for the expression subset of the language, and even then just for when that subset is being used in a functional way (e.g., the expression `object.method(arg)` is usually used for its side effects, not for its typically-None return value). OTOH, syntax borrowed from a statement-oriented language usually fails to work at all when grafted onto an "almost everything's an expression" language.

So that's an abstract argument of my own, but - according to me - should be given almost no weight unless confirmed by examining realistic code. Daniel did some of both - great!
... A larger topic may be how to reach decisions. If I've learned one thing from PEP 572 it's that we need to adjust how we discuss and evaluate proposals. I'll think about this and start a discussion at the Language Summit about this.
Python needs something akin to a dictator, who tells people how things are going to be, like it or not. But a benevolent dictator, not an evil one. And to prevent palace intrigue, they should hold that position for life. Just thinking outside the box there ;-)
04.05.18 20:48, Tim Peters wrote:
[Guido]
Can I recommend going slow here? This is a very interesting topic where many languages have gone before. I liked Daniel F Moisset's analysis about the choices of a language designer and his conclusion that match should be a statement.
Just to be annoying ;-) , I liked the way he _reached_ that conclusion: by looking at real-life Python code that may have been written instead to use constructs "like this". I find such examination far more persuasive than abstract arguments or made-up examples.
I would like to see such examination for PEP 572. And for all other syntax changing ideas. I withdrew some of my ideas and patches when my examinations showed that the number of cases in the stdlib that would benefit from rewriting using a new feature or from applying a compiler optimization is not large enough.
[Tim]
... I liked the way he _reached_ that conclusion: by looking at real-life Python code that may have been written instead to use constructs "like this". I find such examination far more persuasive than abstract arguments or made-up examples.
[Serhiy Storchaka <storchaka@gmail.com>]
I would like to see such examination for PEP 572. And for all other syntax changing ideas.
I did it myself for 572, and posted several times about what I found. It was far more productive to me than arguing (and, indeed, I sat out of the first several hundred msgs on python-ideas entirely because I spent all my time looking at code instead). Short course: I found a small win frequently, a large win rarely, but in most cases wouldn't use it. In all I expect I'd use it significantly more often than ternary "if", but far less often than augmented assignment. But that's me - everybody needs to look at their own code to apply _their_ judgment. 572 is harder than a case/switch statement to consider this way, because virtually every assignment statement binding a name could _potentially_ be changed to a binding expression instead, and there are gazillions of those. For considering case/switch additions, you can automate searches to vastly whittle down the universe of places to look at (`elif` chains, and certain nested if/else if/else if/else ... patterns).
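The automated search for `elif` chains can be done with the standard `ast` module. The following is a rough sketch of one way to do it, not the tool anyone in this thread used; `find_elif_chains`, `min_branches` and `SAMPLE` are names invented here. In the AST an `elif` is an `If` node that is the sole statement of its parent's `orelse`, which also catches the nested `else: if ...` shape.

```python
import ast


def find_elif_chains(source, min_branches=3):
    """Return (line number, branch count) for each if/elif chain found."""
    seen = set()  # ids of If nodes already counted as part of a chain
    chains = []
    # ast.walk yields parents before children, so a chain's head comes first.
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.If) or id(node) in seen:
            continue
        length = 1
        current = node
        while len(current.orelse) == 1 and isinstance(current.orelse[0], ast.If):
            current = current.orelse[0]
            seen.add(id(current))
            length += 1
        if length >= min_branches:
            chains.append((node.lineno, length))
    return chains


SAMPLE = '''
def days_in(unit):
    if unit == "days":
        return 1
    elif unit == "weeks":
        return 7
    elif unit == "years":
        return 365
    else:
        return None
'''

print(find_elif_chains(SAMPLE))  # [(3, 3)]
```

Running something like this over a code base only produces candidates; each hit still has to be read to judge whether a match statement would improve it.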
I withdrew some of my ideas and patches when my examinations showed that the number of cases in the stdlib that would benefit from rewriting using a new feature or from applying a compiler optimization is not large enough.
Good! I approve :-)
05.05.18 09:23, Tim Peters wrote:
[Tim]
... I liked the way he _reached_ that conclusion: by looking at real-life Python code that may have been written instead to use constructs "like this". I find such examination far more persuasive than abstract arguments or made-up examples.
[Serhiy Storchaka <storchaka@gmail.com>]
I would like to see such examination for PEP 572. And for all other syntax changing ideas.
I did it myself for 572, and posted several times about what I found.
Could you please give links to these results? It is hard to find something in hundreds of messages.
[Tim]
... I liked the way he _reached_ that conclusion: by looking at real-life Python code that may have been written instead to use constructs "like this". I find such examination far more persuasive than abstract arguments or made-up examples.
[Serhiy]
I would like to see such examination for PEP 572. And for all other syntax changing ideas.
[Tim]
I did it myself for 572, and posted several times about what I found.
[Serhiy]
Could you please give links to these results? It is hard to find something in hundreds of messages.
It's no easier for me to find old messages, and you'd just ask for more & more anyway ;-) The "short course" I already gave didn't skip anything vital:

    Short course: I found a small win frequently, a large win rarely, but in most cases wouldn't use it. In all I expect I'd use it significantly more often than ternary "if", but far less often than augmented assignment.

More importantly:

    But that's me - everybody needs to look at their own code to apply _their_ judgment.

It's _applying_ the approach I find persuasive & productive, not someone else writing up the results of _their_ taking the approach. I'm not trying to change peoples' minds - just suggesting a more fruitful way (than abstract arguments, fashion, ...) to make up their minds to begin with.
I withdrew some of my ideas and patches when my examinations showed that the number of cases in the stdlib that would benefit from rewriting using a new feature or from applying a compiler optimization is not large enough.
Bingo! Note your "my examinations" in that. Someone who hasn't done their own examination is basically guessing. They may or may not reach the same conclusions if they did the work, but neither eloquence nor confidence is a reliable predictor of whether they would. Passion may even be negatively correlated ;-)
Hi everyone,

I’m also a first time poster to python-ideas so I apologize if reviving a week old thread is bad form. I emailed Guido out of the blue to share some thoughts on the JavaScript pattern matching proposal’s applicability to Python and he encouraged me to post those thoughts here.

The best argument for pattern matching is to support what Daniel F Moisset above calls “structural patterns”. These go beyond simple value matching or boolean conditions that are better served with other constructs like if statements. Structural pattern matching allows for reasoning about the shape of data.

As a practical example, in my day job I work as a software engineer at a startup that builds quantum computers. Python has been a great language for writing physics experiments and doing numerical simulations. However, our codebase contains a lot of `isinstance` calls due to the need to write converters from the physics experiment definition language to quantum assembly instructions. There are some examples in our open source code as well, for instance here: https://github.com/rigetticomputing/pyquil/blob/master/pyquil/quil.py#L641 . This could be written more succinctly as:

    match instr:
        case Gate(name, params, qubits):
            result.append(Gate(name, params, [qubit_mapping[q] for q in qubits]))
        case Measurement(qubit, classical_reg):
            result.append(Measurement(qubit_mapping[qubit], classical_reg))
        else:
            result.append(instr)

Something that I also haven’t seen discussed yet is how well this would work with the new Python 3.7 dataclasses. Dataclasses allow us to create more structured data than using dicts alone. For instance, each instruction in our internal assembly language has its own dataclass. Pattern matching would make it easy to create readable code which loops over a list of these instructions and performs some sort of optimizations/transformations.

Finally I want to talk about matching on the structure of built-in data structures like lists and dicts. The JavaScript proposal does a great job of supporting these data types and I think this would also be natural for Python, which also has some destructuring bind support for these types on assignment.

Consider the example of accessing nested dictionaries. This comes up a lot when working with JSON. If you had a dictionary like this:

    target = {'galaxy': {'system': {'planet': 'jupiter'}}}

Then trying to access the value ‘jupiter’ would mean one of these alternatives:

    # Risks getting a KeyError
    my_planet = target['galaxy']['system']['planet']
    print(my_planet)

    # Awkward to read
    my_planet = target.get('galaxy', {}).get('system', {}).get('planet')
    if my_planet:
        print(my_planet)

With pattern matching this could become more simply:

    match target:
        case {'galaxy': {'system': {'planet': my_planet}}}:
            print(my_planet)

This example was stolen from the tutorial for the glom library https://sedimental.org/glom_restructured_data.html which works on nested data. Structural pattern matching is a more universal concept and could eliminate the need for these kinds of helper functions.

From reading through this thread (as well as other background reading like https://groups.google.com/forum/#!msg/python-ideas/aninkpPpEAw/wCQ1IH5mAQAJ and the JavaScript proposal) a couple things seem clear to me for pattern matching in the case of Python:

- it should be statement based
- it should have great support for built-in data types like lists, dicts, namedtuples, and dataclasses
- it should form a coherent story with other similar Python concepts like unpacking on assignment

There are a ton of details to be worked out obviously and we should go slow as Guido suggested. However, I believe that it would be worth doing the work. To that end: if there’s anyone else who’d like to collaborate and come up with a first draft of a more well-defined proposal I would love to commit my time to this, my email is below.
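To make the nested-dictionary comparison concrete, here is a small runnable sketch of the kind of helper function that structural matching might replace. The helper `dig` and its `default` parameter are invented for this illustration and are not glom's API. It also shows that a missing key on the direct path raises KeyError.

```python
target = {'galaxy': {'system': {'planet': 'jupiter'}}}


def dig(data, *keys, default=None):
    """Follow keys through nested dicts; return default if the path breaks."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data


print(dig(target, 'galaxy', 'system', 'planet'))  # jupiter
print(dig(target, 'galaxy', 'star', 'planet'))    # None

try:
    target['galaxy']['star']['planet']
except KeyError as exc:
    print('direct indexing failed on key', exc)   # direct indexing failed on key 'star'
```

Unlike the chained `.get(..., {})` version, this sketch also copes with an intermediate value that is not a dict at all.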
Also, I sent this email from PyCon in Cleveland if anyone would like to brainstorm in person :).

Steven Heidel
Rigetti Quantum Computing
steven@rigetti.com

On Friday, May 4, 2018 at 4:37:43 PM UTC, Guido van Rossum wrote:

[snip]
Hey Steven, I'm also at PyCon. Shall we take this off list and attempt to meet up and discuss?

On Friday, May 11, 2018 at 12:36:32 PM UTC-4, ste...@rigetti.com wrote:

[snip]
On Thu, May 03, 2018 at 11:36:27AM -0700, Robert Roskam wrote:
So I started extremely generally with my syntax, but it seems like I should provide a lot more examples of real use.
Yes, real-life examples will be far more compelling and useful than made up examples and pseudo-code.

Also, I think that you should delay talking about syntax until you have explained in plain English what pattern matching does, how it differs from a switch/case statement (in languages that have them) and why it is better than the two major existing idioms in Python:

- chained if...elif
- dict dispatch.

I'll make a start, and you can correct me if I get any of it wrong.

(1) Pattern matching takes a value, and compares it to a series of *patterns* until the first match, at which point it returns a specified value, skipping the rest of the patterns.

(2) Patterns typically are single values, and the match is by equality, although other kinds of patterns are available as well.

(3) Unlike a case/switch statement, there's no implication that the compiler could optimise the order of look-ups; it is purely top to bottom.

(4) Unlike if...elif, each branch is limited to a single expression, not a block. That's a feature: a match expression takes an input, and returns a value, and typically we don't have to worry about it having side-effects. So it is intentionally less general than a chain of if...elif blocks.

(5) We could think of it as somewhat analogous to a case/switch statement, a dict lookup, or if...elif, only better. (Why is it better?)

Here is a list of patterns I would hope to support, off the top of my head:

* match by equality;
* match by arbitrary predicates such as "greater than X" or "between X and Y";
* match by string prefix, suffix, or substring;
* match by type (isinstance).

I think that before we start talking about syntax, we need to know what features we need syntax for.

There's probably more to it, because so far it doesn't look like anything but a restricted switch statement. Over to someone else with a better idea of why pattern matching has become ubiquitous in functional programming.

--
Steve
This email from Steve has some good questions, let me try to help organize ideas: On 4 May 2018 at 13:11, Steven D'Aprano <steve@pearwood.info> wrote:
I'll make a start, and you can correct me if I get any of it wrong.
(1) Pattern matching takes a value, and compares it to a series of *patterns* until the first match, at which point it returns a specified value, skipping the rest of the patterns.
In a general sense, based on most other languages, patterns are a syntactic construct that can be "matched" with a value at runtime. The matching process has two effects at once:

1) check that the value has some specific form dictated by the pattern (which can have a yes/no result)
2) bind some assignable targets referenced in the pattern to components of the value matched. The binding is usually done only if there is a match according to (1)

Python actually has some form of patterns (called "target_list" in the formal syntax) that are used in assignments, for loops, and other places. As it is mostly restricted to assigning single values or decomposing iterables, we normally say "tuple unpacking" instead of "pattern matching". And there's a second type of pattern, included in the try/except statement, which matches by subtype (and can also bind a name).

As a language designer, once you have your notion of matching defined, you can choose which kinds of constructs use patterns (I just mentioned the left side of assignments, between "for" and "in", etc. in Python). Usual constructs are multi-branch statements/expressions that match a single value against several patterns, and run a branch of code depending on which pattern matched (after performing the corresponding bindings). That's not the only option: you could also implement patterns in other places, like regular assignments, or the conditions of loops and conditionals [resulting in an effect similar to some of the ones being discussed in the PEP 572 thread]; although this last sentence is beyond what the OP was suggesting and is a generalization of the idea.

(2) Patterns typically are single values, and the match is by equality, although other kinds of patterns are available as well.
Typical patterns in other languages include:

a) single values (matched by equality)
b) assignables (names, stuff like mylist[0] or self.color) which match anything and bind the value to assignables
c) type patterns (a value matches if the type of the value has a certain supertype)
d) structure patterns (a value matches if it has a certain structure, for example being a dict with certain keys, or an iterable of a certain number of elements). These usually are recursive, and components of the structure can also be patterns
e) arbitrary boolean conditions (that can use the names bound by other parts of the pattern)

Python has support for (b) and (d) in both assignment and for loops. Python supports (b) and (c) in try statements. The OP's proposal offers expanding to most of these patterns, and implementing some sort of pattern matching expression. I argued in another email that a pattern matching statement feels more adequate to Python (I'm not arguing at this point whether it's a good idea, just that IF any is a good idea, it's the statement).

As an example, you could have a pattern (invented syntax) like "(1, 'foo', bar, z: int)" which would positively match 4-element tuples that have 1 in their first position, 'foo' in their second, and an int instance in the last; when matching it would bind the names "bar" and "z" to the last 2 elements in the tuple.
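The two effects of matching (check the shape, then bind names) can be seen by writing that invented pattern out by hand. A minimal sketch, assuming a failed match is reported as None and the bindings as a dict; the function name `match_example` is hypothetical.

```python
def match_example(value):
    """Emulate the invented pattern (1, 'foo', bar, z: int)."""
    # Structure check: a tuple of exactly four elements.
    if not (isinstance(value, tuple) and len(value) == 4):
        return None
    first, second, bar, z = value
    # Value checks (by equality) and the type check on the last element.
    if first != 1 or second != 'foo' or not isinstance(z, int):
        return None
    # Binding happens only once every check has passed.
    return {'bar': bar, 'z': z}


print(match_example((1, 'foo', [2, 3], 5)))  # {'bar': [2, 3], 'z': 5}
print(match_example((1, 'foo', None, 'x')))  # None
print(match_example((1, 'foo')))             # None
```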
(3) Unlike a case/switch statement, there's no implication that the compiler could optimise the order of look-ups; it is purely top to bottom.
[we are talking about a multi-branch pattern matching statement now, not just "pattern matching"]

In most practical cases, a compiler can do relatively simple static analysis (even in Python) that could result in performance improvements. One obvious improvement is that the matched expression can be evaluated once (you can always achieve the same effect with a dummy variable assignment right before the if/elif statement). But for multiple literal string patterns (a common scenario), you can compile a string matcher that is faster than a sequence of equality comparisons (either through hashing plus a comparison fallback, or by creating some decision tree that goes through the input string once). For integers you can make lookup tables. Even an isinstance check choosing between several branches (a not so uncommon occurrence) could be implemented by a faster operation if someone considered that relevant.
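As a hand-written picture of the "hashing" option for literal string patterns: a chain of equality tests and a precomputed dict give the same answers, and the dict does a single hashed lookup instead of up to N comparisons. This is only what a programmer can write today, with made-up names and token kinds; it makes no claim about what CPython or any match implementation actually does, and no timing claim.

```python
def token_kind_chain(nextchar):
    # Top-to-bottom equality comparisons, as an if/elif chain does them.
    if nextchar == '"':
        return "string"
    elif nextchar == "{":
        return "object"
    elif nextchar == "[":
        return "array"
    else:
        return "other"


# The same mapping as a table built once, ahead of time.
TOKEN_KINDS = {'"': "string", "{": "object", "[": "array"}


def token_kind_table(nextchar):
    return TOKEN_KINDS.get(nextchar, "other")


for ch in ['"', "{", "[", "x"]:
    assert token_kind_chain(ch) == token_kind_table(ch)
print("chain and table agree")
```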
(4) Unlike if...elif, each branch is limited to a single expression, not a block. That's a feature: a match expression takes an input, and returns a value, and typically we don't have to worry about it having side-effects.
So it is intentionally less general than a chain of if...elif blocks.
That's the OP's proposal, yes (as I mentioned, I argued with some simple data that a feature like that is of more limited use. Of course, I'd love to see deeper analysis with data that proves me wrong, or an argument that what I looked at is irrelevant ;-) )
(5) We could think of it as somewhat analogous to a case/switch statement, a dict lookup, or if...elif, only better.
(Why is it better?)
I'll leave the OP to argue his side here. I've mentioned some opportunities for efficiency (which are IMO secondary) and I understand that there's an argument for readability, especially when using the binding feature.
Here is a list of patterns I would hope to support, off the top of my head:
* match by equality;
* match by arbitrary predicates such as "greater than X" or "between X and Y";
* match by string prefix, suffix, or substring;
* match by type (isinstance).
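The four kinds of match in this list can already be emulated with an ordered table of predicates. The sketch below is one possible way to do it; the function name classify, the particular rules, and the labels are all made up for illustration.

```python
def classify(value):
    """Return the label of the first rule whose predicate accepts value."""
    rules = [
        # match by equality
        (lambda v: v == 0, "zero"),
        # match by string prefix
        (lambda v: isinstance(v, str) and v.startswith("py"), "py-prefixed string"),
        # match by arbitrary predicate: "between X and Y"
        (lambda v: isinstance(v, (int, float)) and 1 <= v <= 10, "between 1 and 10"),
        # match by arbitrary predicate: "greater than X"
        (lambda v: isinstance(v, (int, float)) and v > 10, "greater than 10"),
        # match by type (isinstance)
        (lambda v: isinstance(v, str), "some other string"),
    ]
    # Rules are tried strictly top to bottom; the first hit wins.
    for predicate, label in rules:
        if predicate(value):
            return label
    return None


for sample in (0, "python", 5, 11.5, "spam", -3):
    print(repr(sample), "->", classify(sample))
```

The order of the rules matters: "python" is also "some other string", so the prefix rule has to come first.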
I think that before we start talking about syntax, we need to know what features we need syntax for.
There's probably more to it, because so far it doesn't look like anything but a restricted switch statement. Over to someone else with a better idea of why pattern matching has become ubiquitous in functional programming.
A multi-branch statement tends to be present, but it's not necessarily ubiquitous in FP. "Pattern matching" as an idea is one of those pseudo-universal generalizations of computing that FP language designers love. Essentially it covers with a single thing what we do in Python with several different features (assignment, argument passing, conditionals, exception catching, unpacking of data structures, instance checking). It works very well when you use algebraic data types (which are like unions of namedtuples) as your primary data structure, because there are very natural patterns to decompose those.

In Python, there's less value to this because, well... it already has all these features, so adding a unifying concept after the fact doesn't make it simpler, but more complicated. So the main argument to talk about here is whether the added expressivity can be of value (if we talk about pattern matching in many places of the language, it *might*).

Best,
-- Daniel F. Moisset
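To make the "unions of namedtuples" remark concrete, here is a small sketch of how such a type is decomposed in current Python, using isinstance checks plus tuple unpacking. The names Circle, Rect and area are invented for this example.

```python
from collections import namedtuple
import math

# A tiny "algebraic data type": a shape is either a Circle or a Rect.
Circle = namedtuple("Circle", "radius")
Rect = namedtuple("Rect", "width height")


def area(shape):
    # Instance checking selects the variant...
    if isinstance(shape, Circle):
        # ...and unpacking binds its components.
        (radius,) = shape
        return math.pi * radius ** 2
    if isinstance(shape, Rect):
        width, height = shape
        return width * height
    raise TypeError(f"not a shape: {shape!r}")


print(area(Rect(2, 3)))
print(area(Circle(1)))
```

Two separate features (instance checking and unpacking) do the work that a single pattern does in an FP language, which is the point being made above.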
On Sat, May 5, 2018 at 12:45 AM, Daniel Moisset <dmoisset@machinalis.com> wrote:
(3) Unlike a case/switch statement, there's no implication that the compiler could optimise the order of look-ups; it is purely top to bottom.
[we are talking about a multi-branch pattern matching statement now, not just "pattern matching"] In most practical cases, a compiler can do relatively simple static analysis (even in Python) that could result in performance improvements. One obvious improvement is that the matched expression can be evaluated once (you can always achieve the same effect with a dummy variable assignment right before the if/elif statement).
That one isn't an optimization, but part of the specification; it is an advantage of the fact that you're writing the match expression just once. But all the rest of your optimizations aren't trustworthy.
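The "evaluated once" point is easy to see with a counter. In this sketch, expensive() is a hypothetical stand-in for the matched expression, and the two describe_* functions are invented names.

```python
calls = []


def expensive():
    """Stand-in for the matched expression; records each evaluation."""
    calls.append(1)
    return 3


def describe_repeated():
    # The expression is written, and evaluated, in every condition.
    if expensive() == 1:
        return "one"
    elif expensive() == 2:
        return "two"
    elif expensive() == 3:
        return "three"
    return "anything"


def describe_once():
    # The "dummy variable" spelling: evaluate once, then compare.
    x = expensive()
    if x == 1:
        return "one"
    elif x == 2:
        return "two"
    elif x == 3:
        return "three"
    return "anything"


calls.clear()
print(describe_repeated(), len(calls))
calls.clear()
print(describe_once(), len(calls))
```

A match construct would behave like describe_once by definition, since the subject is written only once.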
But for multiple literal string patterns (a common scenario), you can compile a string matcher that is faster than a sequence of equality comparisons (either through hashing+comparison fallback or creating some decision tree that goes through the input string once).
Only if you're doing equality checks (no substrings or anything else where it might match more than one of them). And if you're doing "pattern matching" that's nothing more than string equality comparisons, a dict is a better way to spell it.
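For reference, the dict spelling of pure string equality matching looks like this. The table contents and the names ACTIONS and action_for are invented for the example.

```python
# Hypothetical command table: each literal string maps to a result.
ACTIONS = {
    "start": "starting",
    "stop": "stopping",
    "status": "reporting status",
}


def action_for(command):
    # One hash lookup replaces the chain of equality comparisons;
    # the second argument to get() plays the role of the "_" catch-all.
    return ACTIONS.get(command, "unknown command")


print(action_for("stop"))
print(action_for("reboot"))
```

One difference from an if/elif chain: the key has to be hashable, so passing a list here raises TypeError instead of falling through to the default.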
For integers you can make lookup tables.
If they're just equality checks, again, a dict is better. If they're ranges, you would have to ensure that they don't overlap (no problem if they're all literals), and then you could potentially optimize it.
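For literal, non-overlapping ranges, one optimization of the kind described here is a binary search over the sorted boundaries. This is a sketch of what a programmer can write by hand today with the bisect module, not a claim about what any compiler does; the boundaries and labels are invented.

```python
from bisect import bisect_right

# Sorted lower bounds of consecutive, non-overlapping ranges.
BOUNDS = [0, 10, 100]
# One label per gap: below BOUNDS[0], then one for each bound onwards.
LABELS = ["below zero", "0 to 9", "10 to 99", "100 or more"]


def range_label(n):
    # bisect_right counts how many bounds are <= n, which is exactly
    # the index of the range that n falls into.
    return LABELS[bisect_right(BOUNDS, n)]


for sample in (-1, 0, 9, 10, 99, 100):
    print(sample, "->", range_label(sample))
```

This only works because the ranges cannot overlap; with overlapping ranges the top-to-bottom order of the branches would have to be preserved.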
Even an isinstance check choosing between several branches (a not so uncommon occurrence) could be implemented by a faster operation if someone considered that relevant.
Only if you can guarantee that no single object can be an instance of more than one of the types. Otherwise, you still have to check in some particular order. In CPython, you can guarantee that isinstance(x, int) and isinstance(x, str) can't both be true, but that's basically a CPython implementation detail, due to the way C-implemented classes work. You can't use this to dispatch based on exception types, for instance. Let's say you try to separately dispatch ValueError, ZeroDivisionError, and OSError; and then you get this:
    >>> class DivisionByOSError(ZeroDivisionError, OSError, ValueError): pass
    ...
    >>> raise DivisionByOSError()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    __main__.DivisionByOSError
That thing really truly is all three of those types, and you have to decide how to dispatch that. So there needs to be an order to the checks, with no optimization. ChrisA
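A short sketch of why the order cannot be optimised away: the same exception dispatches differently depending on which check comes first. The helper name dispatch is invented; the exception class is the one from the session above.

```python
class DivisionByOSError(ZeroDivisionError, OSError, ValueError):
    pass


def dispatch(order):
    """Return the name of the first type in `order` that the exception matches."""
    exc = DivisionByOSError()
    # Checks run strictly in the given order, like a chain of except clauses.
    for exc_type in order:
        if isinstance(exc, exc_type):
            return exc_type.__name__
    return None


print(dispatch([ValueError, ZeroDivisionError, OSError]))
print(dispatch([OSError, ValueError, ZeroDivisionError]))
```

Every one of the three isinstance checks succeeds, so the result is decided purely by position in the list.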
On Fri, May 04, 2018 at 04:01:55AM +1000, Chris Angelico wrote:
On Fri, May 4, 2018 at 3:18 AM, Ed Kellett <e+python-ideas@kellett.im> wrote:
    def hyperop(n, a, b):
        return match (n, a, b):
            (0, _, b) => b + 1
            (1, a, 0) => a
            (2, _, 0) => 0
            (_, _, 0) => 1
            (n, a, b) => hyperop(n-1, a, hyperop(n, a, b-1))
versus:
    def hyperop(n, a, b):
        if n == 0:
            return b + 1
        if n == 1 and b == 0:
            return a
        if n == 2 and b == 0:
            return 0
        if b == 0:
            return 1
        return hyperop(n-1, a, hyperop(n, a, b-1))
I have no idea what this is actually doing
It is the Hyperoperation function.

https://en.wikipedia.org/wiki/Hyperoperation

n is the parameter that specifies the "level" of the operation, and a, b are the arguments to the operation.

hyperop(0, a, b) returns the successor of b (a is ignored) -- e.g. the successor of 1 is 2, the successor of 2 is 3, etc.

hyperop(1, a, b) returns a+b (addition, or repeated successor);

hyperop(2, a, b) returns a*b (multiplication, or repeated addition);

hyperop(3, a, b) returns a**b, or a^b in the more usual mathematical notation (exponentiation, or repeated multiplication);

hyperop(4, a, b) returns a^^b (tetration: repeated exponentiation; e.g. 3^^4 = 3^3^3^3 = 3^3^27 = 3^7625597484987 = a moderately large number);

hyperop(5, a, b) returns a^^^b (pentation: repeated tetration, and if you thought 3^^4 was big, it's nothing compared to 3^^^4);

and so forth.

While this is really useful to mathematicians, in practice we're going to run out of memory before being able to calculate any of the larger values. If we converted the *entire universe* into memory, we'd still run out. So while it's a fascinating example for maths geeks, in practical terms we might as well re-write it as:

    def hyperop(n, a, b):
        raise MemoryError("you've got to be kidding")

which aside from a few values close to zero, is nearly always the right thing to do :-)

-- Steve
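For small arguments, the levels described above can be checked against the if-based version quoted earlier in the thread. The function below is that version, copied so the snippet is self-contained; the choice of test values (a=2, b=3) is arbitrary.

```python
def hyperop(n, a, b):
    if n == 0:
        return b + 1
    if n == 1 and b == 0:
        return a
    if n == 2 and b == 0:
        return 0
    if b == 0:
        return 1
    return hyperop(n - 1, a, hyperop(n, a, b - 1))


# Levels 0 to 3 should agree with successor, +, * and **.
a, b = 2, 3
results = {n: hyperop(n, a, b) for n in range(4)}
print(results)  # {0: 4, 1: 5, 2: 6, 3: 8}
```

Anything much beyond these values is where the recursion depth and the size of the result start to bite.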
On Fri, May 4, 2018 at 4:40 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, May 04, 2018 at 04:01:55AM +1000, Chris Angelico wrote:
On Fri, May 4, 2018 at 3:18 AM, Ed Kellett <e+python-ideas@kellett.im> wrote:
    def hyperop(n, a, b):
        return match (n, a, b):
            (0, _, b) => b + 1
            (1, a, 0) => a
            (2, _, 0) => 0
            (_, _, 0) => 1
            (n, a, b) => hyperop(n-1, a, hyperop(n, a, b-1))
versus:
    def hyperop(n, a, b):
        if n == 0:
            return b + 1
        if n == 1 and b == 0:
            return a
        if n == 2 and b == 0:
            return 0
        if b == 0:
            return 1
        return hyperop(n-1, a, hyperop(n, a, b-1))
I have no idea what this is actually doing
It is the Hyperoperation function.
https://en.wikipedia.org/wiki/Hyperoperation
n is the parameter that specifies the "level" of the operation, and a, b are the arguments to the operation.
hyperop(0, a, b) returns the successor of b (a is ignored) -- e.g. the successor of 1 is 2, the successor of 2 is 3, etc.
hyperop(1, a, b) returns a+b (addition, or repeated successor);
hyperop(2, a, b) returns a*b (multiplication, or repeated addition);
hyperop(3, a, b) returns a**b, or a^b in the more usual mathematical notation (exponentiation, or repeated multiplication);
hyperop(4, a, b) returns a^^b (tetration: repeated exponentiation; e.g. 3^^4 = 3^3^3^3 = 3^3^27 = 3^7625597484987 = a moderately large number);
hyperop(5, a, b) returns a^^^b (pentation: repeated tetration, and if you thought 3^^4 was big, it's nothing compared to 3^^^4);
and so forth.
Oh. So... this is a crazy recursive way to demonstrate that a Python integer really CAN use up all your memory. Cool!
While this is really useful to mathematicians, in practice we're going to run out of memory before being able to calculate any of the larger values. If we converted the *entire universe* into memory, we'd still run out. So while it's a fascinating example for maths geeks, in practical terms we might as well re-write it as:
    def hyperop(n, a, b):
        raise MemoryError("you've got to be kidding")
which aside from a few values close to zero, is nearly always the right thing to do :-)
Got it. Well, I don't see why we can't use Python's existing primitives.

    def hyperop(n, a, b):
        if n == 0: return 1 + b
        if n == 1: return a + b
        if n == 2: return a * b
        if n == 3: return a ** b
        if n == 4: return a *** b
        if n == 5: return a **** b
        if n == 6: return a ***** b
        ...

This is a MUCH cleaner way to write it.

ChrisA
On 2018-05-03 19:57, Chris Angelico wrote:
Got it. Well, I don't see why we can't use Python's existing primitives.
    def hyperop(n, a, b):
        if n == 0: return 1 + b
        if n == 1: return a + b
        if n == 2: return a * b
        if n == 3: return a ** b
        if n == 4: return a *** b
        if n == 5: return a **** b
        if n == 6: return a ***** b
        ...
Well, it'd be infinitely long, but I suppose I'd have to concede that that's in line with the general practicality level of the example.
On Thu, May 03, 2018 at 09:04:40PM +0100, Ed Kellett wrote:
On 2018-05-03 19:57, Chris Angelico wrote:
Got it. Well, I don't see why we can't use Python's existing primitives.
    def hyperop(n, a, b):
        if n == 0: return 1 + b
        if n == 1: return a + b
        if n == 2: return a * b
        if n == 3: return a ** b
        if n == 4: return a *** b
        if n == 5: return a **** b
        if n == 6: return a ***** b
        ...
Well, it'd be infinitely long, but I suppose I'd have to concede that that's in line with the general practicality level of the example.
Yes, but only countably infinite, so at least we can enumerate them all. Eventually :-)

And aside from the tiny niggle that *** and higher order operators are syntax errors...

It's not a bad example of the syntax, but it would be a considerably more compelling use-case if it were something less obscure and impractical.

-- Steve
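For what it's worth, the operator-based version does run if it stops at the operators Python actually has and falls back to the recursive rule from level 4 upwards. A sketch, assuming non-negative integer arguments:

```python
def hyperop(n, a, b):
    # Levels 0-3 map onto existing primitives.
    if n == 0:
        return b + 1
    if n == 1:
        return a + b
    if n == 2:
        return a * b
    if n == 3:
        return a ** b
    # n >= 4: there is no *** operator, so use the recursive definition.
    if b == 0:
        return 1
    return hyperop(n - 1, a, hyperop(n, a, b - 1))


print(hyperop(4, 2, 3))  # 2^^3 = 2**(2**2) = 16
print(hyperop(4, 3, 2))  # 3^^2 = 3**3 = 27
```

It is still only usable for arguments close to zero, for the memory reasons already given.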
On 5/3/2018 8:41 AM, Robert Roskam wrote:
However, I don't see that the conversation ever really resolved, so I'd like restart the conversation on some kind of pattern matching syntax in Python.
For the cases not handled by dicts, I believe chained conditional expressions work.

"""
# Pattern matching with guards
x = 'three'
number = match x:
    1 => "one"
    y if y is str => f'The string is {y}'
    _ => "anything"
print(number) # The string is three
"""

Is handled by

    def f(x):
        return ('one' if x == 1 else
                f'The string is {x}' if isinstance(x, str) else
                'anything')

    for x in 1, '2', 3: print(f(x))

I don't like the ordering, but this was Guido's decision.
1, 2, 3, 4 => "one to four"
    "one to four" if x in (1, 2, 3, 4)
    x:int => f'{x} is a int'
    x:float => f'{x} is a float'
    x:str => f'{x} is a string'

    tx = type(x)
    f'{x} is a {tx}' if tx in (int, float, str) else None

-- Terry Jan Reedy
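The last two fragments can be turned into complete functions along these lines. This is a sketch with invented function names; it adds the else branch that a conditional expression needs and uses tx.__name__ so the output reads "int" rather than "<class 'int'>" (which also means strings are reported as "str", not "string" as in the OP's example).

```python
def one_to_four(x):
    # A conditional expression needs an else; "anything" mirrors the OP's "_" case.
    return "one to four" if x in (1, 2, 3, 4) else "anything"


def describe_type(x):
    tx = type(x)
    # Exact type comparison: subclasses such as bool do not match int here.
    return f'{x} is a {tx.__name__}' if tx in (int, float, str) else None


print(one_to_four(1))
print(describe_type(1.0))
print(describe_type([]))
```

Note that comparing type(x) directly is stricter than isinstance: describe_type(True) returns None, whereas an isinstance(x, int) check would accept it.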
participants (16)
- Alberto Berti
- Chris Angelico
- Daniel Moisset
- David Mertz
- Ed Kellett
- Guido van Rossum
- Jacco van Dorp
- Joao S. O. Bueno
- Rhodri James
- Robert Roskam
- Serhiy Storchaka
- Steven D'Aprano
- steven@rigetti.com
- Stéfane Fermigier
- Terry Reedy
- Tim Peters