ML Style Pattern Matching for Python
After learning a bit of Ocaml I started to like its pattern matching features. Since then I want to have a "match" statement in Python. I wonder if anybody else would like this too.

ML style pattern matching is syntactic sugar, that combines "if" statements with tuple unpacking, access to object attributes, and assignments. It is a compact, yet very readable syntax for algorithms, that would otherwise require nested "if" statements. It is especially useful for writing interpreters, and processing complex trees.

Instead of a specification in BNF, here is a function written with the proposed pattern matching syntax. It demonstrates the features that I find most important. The comments and the print statements explain what is done.

Proposed Syntax
---------------

    def foo(x):
        match x with
        | 1 ->                             # Equality
            print("x is equal to 1")
        | a:int ->                         # Type check
            print("x has type int: %s" % a)
        | (a, b) ->                        # Tuple unpacking
            print("x is a tuple with length 2: (%s, %s)" % (a, b))
        | {| a, b |} ->                    # Attribute existence and access
            print("x is an object with attributes 'a' and 'b'.")
            print("a=%s, b=%s" % (a, b))

        # Additional condition
        | (a, b, c) with a > b ->
            print("x is a tuple with length 3: (%s, %s, %s)" % (a, b, c))
            print("The first element is greater than the second element.")

        # Complex case
        | {| c:int, d=1 |}:Foo ->
            print("x has type Foo")
            print("x is an object with attributes 'c' and 'd'.")
            print("'c' has type 'int', 'd' is equal to 1.")
            print("c=%s, d=%s" % (c, d))

        # Default case
        | _ ->
            print("x can be anything")

Equivalent Current Python
-------------------------

The first four cases could be handled more simply, but handling all cases in the same way leads IMHO to more simple code overall.

    def foo(x):
        while True:
            # Equality
            if x == 1:
                print("x is equal to 1")
                break

            # Type check
            if isinstance(x, int):
                a = x
                print("x is an integer: %s" % a)
                break

            # Tuple unpacking
            if isinstance(x, tuple) and len(x) == 2:
                a, b = x
                print("x is a tuple with length 2: (%s, %s)" % (a, b))
                break

            # Attribute existence testing and access
            if hasattr(x, "a") and hasattr(x, "b"):
                a, b = x.a, x.b
                print("x is an object with attributes 'a' and 'b'.")
                print("a=%s, b=%s" % (a, b))
                break

            # Additional condition
            if isinstance(x, tuple) and len(x) == 3:
                a, b, c = x
                if a > b:
                    print("x is a tuple with length 3: (%s, %s, %s)" % (a, b, c))
                    print("The first element is greater than the second "
                          "element.")
                    break

            # Complex case
            if isinstance(x, Foo) and hasattr(x, "c") and hasattr(x, "d"):
                c, d = x.c, x.d
                if isinstance(c, int) and d == 1:
                    print("x has type Foo")
                    print("x is an object with attributes 'c' and 'd'.")
                    print("'c' has type 'int', 'd' is equal to 1.")
                    print("c=%s, d=%s" % (c, d))
                    break

            # Default case
            print("x can be anything")
            break

Additional Code to Run Function "foo"
-------------------------------------

    class Bar(object):
        def __init__(self, a, b):
            self.a = a
            self.b = b

    class Foo(object):
        def __init__(self, c, d):
            self.c = c
            self.d = d

    foo(1)            # Equality
    foo(2)            # Type check
    foo((1, 2))       # Tuple unpacking
    foo(Bar(1, 2))    # Attribute existence testing and access
    foo((2, 1, 3))    # Additional condition
    foo(Foo(2, 1))    # Complex case
    foo("hello")      # Default case

I left out dict and set, because I'm not sure how they should be handled. I think list should be handled like tuples. Probably there should be a universal matching syntax for all sequences, similarly to the already existing syntax:

    a, b, *c = s

I don't really like the "->" digraph at the end of each match case. A colon would be much more consistent, but I use colons already for type checking (a:int).

I generally think that Python should acquire more features from functional languages. In analogy to "RPython" it should ultimately lead to "MLPython", a subset of the Python language that can be type checked and reasoned about by external tools, similarly to what is possible with Ocaml.

Eike.
Eike Welk wrote:
Instead of a specification in BNF, here is a function written with the proposed pattern matching syntax.
My BNF is weak -- thanks for the code!
I am unfamiliar with OCAML -- if x can match more than one condition, will it match all possible, or just the first one? If just the first one, the python code below can be simplified by ditching the while loop, removing the breaks, and using elif and else.

    def foo(x):
        # Equality
        if x == 1:
            print("x is equal to 1")

        # Type check
        elif isinstance(x, int):
            a = x
            print("x is an integer: %s" % a)

        # Tuple unpacking
        elif isinstance(x, tuple) and len(x) == 2:
            a, b = x
            print("x is a tuple with length 2: (%s, %s)" % (a, b))

        # Attribute existence testing and access
        elif hasattr(x, "a") and hasattr(x, "b"):
            a, b = x.a, x.b
            print("x is an object with attributes 'a' and 'b'.")
            print("a=%s, b=%s" % (a, b))

        # Additional condition
        elif isinstance(x, tuple) and len(x) == 3:
            a, b, c = x
            if a > b:
                print("x is a tuple with length 3: (%s, %s, %s)" % (a, b, c))
                print("The first element is greater than the second "
                      "element.")

        # Complex case
        elif isinstance(x, Foo) and hasattr(x, "c") and hasattr(x, "d"):
            c, d = x.c, x.d
            if isinstance(c, int) and d == 1:
                print("x has type Foo")
                print("x is an object with attributes 'c' and 'd'.")
                print("'c' has type 'int', 'd' is equal to 1.")
                print("c=%s, d=%s" % (c, d))

        # Default case
        else:
            print("x can be anything")

One of the things I like about Python is its readability. While the OCAML inspired version is much more concise, I don't see it as significantly easier to read -- and at this point I don't see any place where I would make use of it myself.

Here's a bit of code that might be convertible:

    result = []
    for i, piece in enumerate(pieces):
        if '-' in piece:
            piece = piece.replace('-',' ')
            piece = '-'.join(NameCase(piece).split())
        elif alpha_num(piece) in ('i', 'ii', 'iii', 'iv', 'v', 'vi', \
                'vii', 'viii', 'ix', 'x', 'pc', 'llc') \
                or piece.upper() in job_titles \
                or i and piece.upper() in deg_suffixi:
            piece = piece.upper()
        elif piece in ('and', 'de', 'del', 'der', 'el', 'la', 'van', ):
            pass
        elif piece[:2] == 'mc':
            piece = 'Mc' + piece[2:].title()
        else:
            possible = mixed_case_names.get(piece, None)
            if possible is not None:
                piece = possible
            else:
                piece = piece.title()
                if piece[-2:].startswith("'"):
                    piece = piece[:-1] + piece[-1].lower()
        result.append(piece)

Ugly, I know -- but the question is: how would 'match' handle things like

    '-' in piece
    piece.upper() in ....
    piece in (....)
    piece[:2] == 'mc'

in other words, would match save me much in this circumstance, and would it be easier to read?

-1

~Ethan~
On Saturday 18.12.2010 01:13:07 Ethan Furman wrote:
I am unfamiliar with OCAML -- if x can match more than one condition, will it match all possible, or just the first one? If just the first one, the python code below can be simplified by ditching the while loop, removing the breaks, and using elif and else.
The match statement should indeed match (and execute) only the first condition. However you can't simplify the code the way you do: If one of the inner "if" statements fails, the algorithm should try the next match case.
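To make the difference concrete, here is a cut-down sketch with only the equality case, the guarded 3-tuple case and the default case. The function names are made up for this illustration, and the functions return strings (using "return" where the original uses "break") so the two results can be compared directly.

```python
def with_elif(x):
    """Guard nested inside an elif branch: a failed guard ends the chain."""
    if x == 1:
        return "equal to 1"
    elif isinstance(x, tuple) and len(x) == 3:
        a, b, c = x
        if a > b:
            return "3-tuple with a > b"
        # Guard failed, but this branch was already chosen: nothing else runs.
    else:
        return "default"
    return None


def with_fallthrough(x):
    """'while True' with early exit: a failed guard moves on to the next case."""
    while True:
        if x == 1:
            return "equal to 1"
        if isinstance(x, tuple) and len(x) == 3:
            a, b, c = x
            if a > b:
                return "3-tuple with a > b"
        return "default"


print(with_elif((1, 2, 3)))         # the default case is never reached
print(with_fallthrough((1, 2, 3)))  # falls through to the default case
```

Both versions agree on (2, 1, 3); they differ only when the tuple has three elements but the guard "a > b" fails.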
Ugly, I know -- but the question is: how would 'match' handle things like

    '-' in piece
    piece.upper() in ....
    piece in (....)
    piece[:2] == 'mc'
in other words, would match save me much in this circumstance, and would it be easier to read?
Your example can't be easily simplified with the match statement. The match statement isn't meant for string processing, but rather for nested tuples, or objects. Additionally I didn't yet think about set operations (for "in"). The strength of the match statement is taking complex objects apart, and testing some conditions on the pieces, at the same time. Eike.
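As an illustration of "taking complex objects apart and testing conditions on the pieces", here is what a tiny interpreter over nested tuples looks like in current Python. The node layout and the tag names ("num", "neg", "add", "mul") are invented for this sketch; every case repeats the same isinstance/len/unpack steps that a match statement would fold into one pattern.

```python
def evaluate(node):
    """Evaluate a tiny expression tree made of nested tuples."""
    # ("num", value)
    if isinstance(node, tuple) and len(node) == 2 and node[0] == "num":
        _, value = node
        return value
    # ("neg", operand)
    if isinstance(node, tuple) and len(node) == 2 and node[0] == "neg":
        _, operand = node
        return -evaluate(operand)
    # ("add", left, right) or ("mul", left, right)
    if isinstance(node, tuple) and len(node) == 3:
        op, left, right = node
        if op == "add":
            return evaluate(left) + evaluate(right)
        if op == "mul":
            return evaluate(left) * evaluate(right)
    raise ValueError("unknown node: %r" % (node,))


# 2 + 3 * (-4)
tree = ("add", ("num", 2), ("mul", ("num", 3), ("neg", ("num", 4))))
print(evaluate(tree))
```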
On 12/17/2010 6:21 PM, Eike Welk wrote:
After learning a bit of Ocaml I started to like its pattern matching features. Since then I want to have a "match" statement in Python. I wonder if anybody else would like this too.
-1 on adding anything like this to core Python. In your example, it
1. is completely redundant with respect to current Python syntax;
2. looks like chicken scratches;
3. is limited by the available characters to just a few of the infinity of tests one might want to run;
4. gives special prominence to tuples, which is hardly appropriate to list- or iterable-oriented code.

The third point is why Python does not try to do everything with syntax symbols. How would you test that the input is a positive number? Type testing is somewhat contrary to duck-typing. For instance, how would you test that the input is an iterable?
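For reference, this is roughly how the two tests mentioned above are written as ordinary function calls today. It is a small sketch with made-up helper names; the point is that neither check fits naturally into a fixed set of pattern symbols.

```python
from collections.abc import Iterable
from numbers import Real


def is_positive_number(x):
    # bool is a subclass of int, so it is excluded explicitly.
    return isinstance(x, Real) and not isinstance(x, bool) and x > 0


def is_iterable(x):
    # Duck-typed check: ask iter() instead of testing for a concrete type.
    try:
        iter(x)
    except TypeError:
        return False
    return True


class OldStyleSeq:
    """Iterable only through __getitem__, with no __iter__ method."""
    def __getitem__(self, index):
        if index < 3:
            return index
        raise IndexError(index)


print(is_iterable(OldStyleSeq()))           # iter() accepts it
print(isinstance(OldStyleSeq(), Iterable))  # the ABC check does not
```

The last two lines show why a pure type test is a poor fit here: the abstract base class only recognizes objects that define __iter__.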
ML style pattern matching is syntactic sugar, that combines "if" statements with tuple unpacking, access to object attributes, and assignments. It is a compact, yet very readable syntax for algorithms, that would otherwise require nested "if" statements.
Your example does not use nesting.
It is especially useful for writing interpreters, and processing complex trees.
Sounds like better suited to a special-purpose 3rd party module, like pyparsing.
-- Terry Jan Reedy
It's already trivial to make a decorator to do pattern matching, and if PJE ever finishes generic functions, it will get trivial-er. You just need to create a simple decorator to do something like this:

    @basecase
    def fac(n):
        return n * fac(n - 1)

    @fac.equals(1)
    def f(n):
        return 1

    @fac.lessthan(1)
    def f(n):
        raise Error("n less than 1, operation not defined.")

It's not that pretty, but I don't see any need for a whole new syntax for what is basically a glorified elif.
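One way such a decorator could be written is sketched below. The message only names "basecase", "equals" and "lessthan"; the class layout and the "when" helper are assumptions made for this illustration, and ValueError is used because no "Error" class is defined in the message.

```python
class basecase:
    """Wrap a default implementation and let special cases register on it."""

    def __init__(self, func):
        self.default = func
        self.cases = []   # (predicate, implementation), in registration order

    def when(self, predicate):
        # Hypothetical helper: register an implementation for a predicate.
        def register(func):
            self.cases.append((predicate, func))
            return self   # keep the decorated name bound to the dispatcher
        return register

    def equals(self, value):
        return self.when(lambda n: n == value)

    def lessthan(self, value):
        return self.when(lambda n: n < value)

    def __call__(self, n):
        # The first registered case whose predicate holds wins.
        for predicate, func in self.cases:
            if predicate(n):
                return func(n)
        return self.default(n)


@basecase
def fac(n):
    return n * fac(n - 1)

@fac.equals(1)
def fac(n):
    return 1

@fac.lessthan(1)
def fac(n):
    raise ValueError("n less than 1, operation not defined.")


print(fac(5))
```

Because "register" returns the dispatcher itself, the name "fac" stays bound to the same object after every decorated definition.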
On Fri, Dec 17, 2010 at 2:14 PM, Terry Reedy wrote:
On 12/17/2010 6:21 PM, Eike Welk wrote:
After learning a bit of Ocaml I started to like its pattern matching features. Since then I want to have a "match" statement in Python. I wonder if anybody else would like this too.
-1 on adding anything like this to core Python. In your example, it
1. is completely redundant with respect to current Python syntax;
2. looks like chicken scratches;
3. is limited by the available characters to just a few of the infinity of tests one might want to run;
4. gives special prominence to tuples, which is hardly appropriate to list- or iterable-oriented code.
On 12/17/2010 6:21 PM, Eike Welk wrote:
After learning a bit of Ocaml I started to like its pattern matching features. Since then I want to have a "match" statement in Python. I wonder if anybody else would like this too.
On Fri, Dec 17, 2010 at 4:14 PM, Terry Reedy wrote:
1. completely redundant with respect to current Python syntax;
2. looks like chicken scratches;
3. limited by the available characters to just a few of the infinity of tests one might want to run;
4. gives special prominence to tuples, which is hardly appropriate to list- or iterable-oriented code.
I'm skeptical that anything is worth adding here, but here's an alternative. I'm only looking at the decomposition part. I'm using a matches keyword and "&" and "*". (The choice here doesn't matter but I have to use something for the examples, so please don't pick on that aspect.)

    x = (3, 4)
    if x matches (&a, &b):
        print(a, b)

is equivalent to

    x = (3, 4)
    if isinstance(x, tuple) and len(x) == 2:
        a, b = x
        print(a, b)

Or a bit more complicated:

    x = [3, 4]
    if x matches [3, &b, *]:
        print(b)

is equivalent to

    x = [3, 4]
    if isinstance(x, list) and len(x) >= 2 and x[0] == 3:
        _, b, *_ = x
        print(b)

and

    if x matches {j : &k}:
        print(k)

means

    if isinstance(x, dict) and j in x and len(x) == 1:
        k = x[j]
        print(k)

Here's a more complex example:

    x matches {
        'account': (&username, &domain),
        'type': 'administrator',
        'password': &hash,
        'friends': (&best_friend, *&other_friends),
        *
    }

Whether or not this is useful at all is a big question, but I think it's at least more interesting.

This isn't perfect. For example

    x matches Foo(a=&alpha, b=&beta)

could mean checking hasattr(x, 'a') but there's no guarantee that Foo(a=1, b=2) will produce that result. Maybe that's OK.

Also, any values on the right hand side not marked by & are matched for equality, so you could write

    x matches (3 * 14 + big_complicated_expression, &y)

which is pretty ugly. And that suggests:

    x matches (f(&y), g(&z))

which either doesn't make sense or is messy. We could add a __matches__ attribute to functions so they could support this, but now this is getting pretty complicated for unclear benefit.

--- Bruce
Latest blog post: http://www.vroospeak.com/2010/11/enduring-joe-barton.html
Learn about security: http://j.mp/gruyere-security
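The decomposition part can be approximated today with a plain function. The sketch below is an assumption-laden illustration, not the proposed semantics: Var("a") stands in for "&a", Rest() for a trailing "*", Rest("name") for "*&name", captures go into a dict instead of local variables, and dict patterns always allow extra keys (as if "*" were given).

```python
class Var:
    """Stands in for '&name': matches anything and captures it."""
    def __init__(self, name):
        self.name = name


class Rest:
    """Stands in for '*' or '*&name' at the end of a sequence pattern."""
    def __init__(self, name=None):
        self.name = name


def matches(pattern, value, bindings):
    """Return True if value fits pattern, storing captures in bindings.

    On failure, bindings may still hold captures made before the mismatch.
    """
    if isinstance(pattern, Var):
        bindings[pattern.name] = value
        return True
    if isinstance(pattern, (tuple, list)):
        if type(value) is not type(pattern):
            return False
        items = list(pattern)
        rest = items.pop() if items and isinstance(items[-1], Rest) else None
        if len(value) < len(items) or (rest is None and len(value) != len(items)):
            return False
        if not all(matches(p, v, bindings) for p, v in zip(items, value)):
            return False
        if rest is not None and rest.name is not None:
            bindings[rest.name] = value[len(items):]
        return True
    if isinstance(pattern, dict):
        # Simplification: extra keys in value are always allowed.
        return isinstance(value, dict) and all(
            key in value and matches(sub, value[key], bindings)
            for key, sub in pattern.items()
        )
    return pattern == value   # anything unmarked is compared for equality


record = {
    'account': ('bob', 'example.org'),
    'type': 'administrator',
    'password': 'xyz',
    'friends': ('al', 'cy', 'di'),
    'age': 30,
}
pattern = {
    'account': (Var('username'), Var('domain')),
    'type': 'administrator',
    'password': Var('hash'),
    'friends': (Var('best_friend'), Rest('other_friends')),
}
found = {}
if matches(pattern, record, found):
    print(found['username'], found['best_friend'], found['other_friends'])
```

The sample record and its values are invented for the example.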
I would love to see this implemented! Of course I like my syntax better. :-)

On Saturday 18.12.2010 22:26:05 Bruce Leban wrote:
    x = (3, 4)
    if x matches (&a, &b):
        print(a, b)

is equivalent to

    x = (3, 4)
    if isinstance(x, tuple) and len(x) == 2:
        a, b = x
        print(a, b)
...
Whether or not this is useful at all is a big question, but I think it's at least more interesting. This isn't perfect. For example
x matches Foo(a=&alpha, b=&beta)
For this case I propose that the class "Foo" should implement a special function "__rinit__" that breaks an instance apart, into components that can be matched. These components should be chosen so that they could be given to "__init__" as its arguments.

"__rinit__" should return the following objects:
* a tuple of objects, corresponding to the regular arguments of "__init__".
* a tuple of strings that contains the argument names.
* a dict of string:object that represents the keyword only attributes.

A simple class would look like this:

    class Foo(object):
        def __init__(self, c, d):
            self.c = c
            self.d = d

        def __rinit__(self):
            return (self.c, self.d), ("c", "d"), {}

The example from above:

    x matches Foo(c=&gamma, d=&delta)

would work like this:

    if isinstance(x, Foo):
        vals, names, kwargs = x.__rinit__()
        gamma = vals[names.index("c")]
        delta = vals[names.index("d")]

Positional arguments are also possible:

    x matches Foo(&gamma, &delta)

would work like this:

    if isinstance(x, Foo):
        vals, names, kwargs = x.__rinit__()
        gamma = vals[0]
        delta = vals[1]

A type check would be expressed like this:

    x matches Foo(&*_)
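The "__rinit__" protocol described above can be tried out with a small helper. "destructure" is a hypothetical name invented for this sketch; it returns the captured values in a dict, since a function cannot create local variables in its caller.

```python
class Foo(object):
    def __init__(self, c, d):
        self.c = c
        self.d = d

    def __rinit__(self):
        # (positional values, their argument names, keyword-only attributes)
        return (self.c, self.d), ("c", "d"), {}


def destructure(obj, cls, *positions, **names):
    """Return {variable: value}, or None if obj is not an instance of cls.

    positions: variable names for positional sub-patterns, in order.
    names:     argument name -> variable name, for keyword sub-patterns.
    """
    if not isinstance(obj, cls):
        return None
    vals, argnames, kwargs = obj.__rinit__()
    bound = {}
    for index, variable in enumerate(positions):
        bound[variable] = vals[index]
    for argname, variable in names.items():
        if argname in argnames:
            bound[variable] = vals[argnames.index(argname)]
        else:
            bound[variable] = kwargs[argname]
    return bound


x = Foo(1, 2)
print(destructure(x, Foo, c="gamma", d="delta"))   # like Foo(c=&gamma, d=&delta)
print(destructure(x, Foo, "gamma", "delta"))       # like Foo(&gamma, &delta)
```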
could mean checking x.hasattr('a') but there's no guarantee that Foo(a=1, b=2) will produce that result. Maybe that's OK.
Also, any values on the right hand side not marked by & are matched for equality, so you could write
x matches (3 * 14 + big_complicated_expression, &y)
It's really neat, but maybe too much visual clutter. It should be discussed if the following looks better:

    x matches (x, y) and x == 3 * 14 + big_complicated_expression

I'm unsure about it.
which is pretty ugly. And that suggests:
x matches (f(&y), g(&z))
It can only work with the few functions that are bijective. Many algorithms destroy information and can therefore not be run backwards.
which either doesn't make sense or is messy. We could add a __matches__ attribute to functions so they could support this, but now this is getting pretty complicated for unclear benefit.
Eike
I would love to see this implemented! Of course I like my syntax better. :-)

Well, I have to agree with Terry that your syntax looks like chicken scratches. Terseness is not a synonym for elegance. In this proposal, I'm using a terse syntax in two places where I think a less terse syntax would be too verbose. For references, I used & which mirrors the familiar C reference syntax, and for extra arguments, I used * which mirrors the use of
On Sat, Dec 18, 2010 at 4:40 PM, Eike Welk wrote:
On Saturday 18.12.2010 22:26:05 Bruce Leban wrote:
Whether or not this is useful at all is a big question, but I think it's at least more interesting. This isn't perfect. For example
x matches Foo(a=&alpha, b=&beta)
For this case I propose that the class "Foo" should implement a special function "__rinit__" that breaks an instance apart, into components that can be matched. These components should be chosen so that they could be given to "__init__" as its arguments.
"__rinit__" should return the following objects: * a tuple of objects, corresponding to the regular arguments of "__init__". * a tuple of strings that contains the argument names. * a dict of string:object that represents the keyword only attributes.
This doesn't work. Positional and keyword arguments are not mutually exclusive. Furthermore, it may be a significant amount of work to unconstruct an object just to find out if it has an attribute. And some classes aren't "unconstructable".
A type check would be expressed like this:
x matches Foo(&*_)
That would just be

    x matches Foo(*)

There's no need to bind the results to something.
could mean checking x.hasattr('a') but there's no guarantee that Foo(a=1, b=2) will produce that result. Maybe that's OK.
Also, any values on the right hand side not marked by & are matched for equality, so you could write
x matches (3 * 14 + big_complicated_expression, &y)
It's really neat, but maybe too much visual clutter. It should be discussed if the following looks better:
x matches (x, y) and x == 3 * 14 + big_complicated_expression
Um, this was a *bad* example, indicating how the feature can be misused. There's no need to "discuss" the alternative you give though. The matches expression could be combined just like any other expression. So you could write:

    x matches (&k, &y) and k == some_expression

or

    k = some_expression
    x matches (k, &y)

I'm unsure about it.
which is pretty ugly. And that suggests:
x matches (f(&y), g(&z))
It can only work with the few functions that are bijective. Many algorithms destroy information and can therefore not be run backwards.
Again, this was a *bad* example.
--- Bruce
Eike Welk wrote:
After learning a bit of Ocaml I started to like its pattern matching features. Since then I want to have a "match" statement in Python. I wonder if anybody else would like this too.
I've heard of pattern matching in other languages, but never had the opportunity to play around with them. It seems to me though, that it's just an enhanced case or switch statement. Python already has had a proposal to add a case/switch to the language. Unfortunately, due to lack of consensus on functionality and syntax, and lack of any pressing need, it hasn't gone anywhere.
ML style pattern matching is syntactic sugar, that combines "if" statements with tuple unpacking, access to object attributes, and assignments. It is a compact, yet very readable syntax for algorithms, that would otherwise require nested "if" statements. It is especially useful for writing interpreters, and processing complex trees.
It doesn't seem very readable to me. It looks like it is heavy on "magic" characters... for example, what is the purpose of the leading | character and the trailing -> digraph in this, and how does {| |} imply attribute access given the rest of Python's syntax?

    match x with
    | {| a, b |} ->     # Attribute existence and access
        print("x is an object with attributes 'a' and 'b'.")
        print("a=%s, b=%s" % (a, b))

It would require a new keyword "match", and overloading an existing keyword "with" to have a second meaning. Python is *very* resistant to adding new keywords. You would need to demonstrate that pattern matching leads to significant benefits over if...elif...else in order to compensate for the pain you cause to those who use "match" as a variable name. This includes Python's own standard library, which has match functions and methods in the re module.

It also uses special characters in a way that looks more like Perl than Python. With few exceptions -- iterable slicing comes to mind -- Python tends to avoid magic characters like that.

If you are serious about pursuing this idea, I suggest you need to demonstrate some non-trivial gains from pattern matching. Perhaps you should also look at how Haskell does it, and how functions in Haskell can be written concisely. Here's a toy example:

    factorial :: Integer -> Integer
    factorial 0 = 1
    factorial n = n * factorial (n - 1)

I'm interested in pattern matching, but I think the suggested syntax is completely inappropriate for Python. I think that for this to have any hope, its best bet would be as an enhancement of the case/switch PEP, and it would need to look like Python code, not like Haskell or OCAML.

-- Steven
Eike Welk wrote:
After learning a bit of Ocaml I started to like its pattern matching features. Since then I want to have a "match" statement in Python. I wonder if anybody else would like this too.
I don't object to the general idea, but syntax such as
| {| a, b |} -> # Attribute existence and access
is far too cryptic and does nothing to improve readability. -- Greg
On Saturday 18.12.2010 03:03:06 Greg Ewing wrote:
Eike Welk wrote:
| {| a, b |} -> # Attribute existence and access
is far too cryptic and does nothing to improve readability.
List comprehensions are also cryptic if you see them for the first time. The curly brackets with bars "{| |}" should symbolize that Python objects are glorified dicts. It is also quite close to Ocaml's syntax for records.

My secret agenda was however, to later introduce a new class and object syntax for Python. I just didn't dare to propose a completely revised Python. But here we go: :-)

Class Creation
--------------

    Foo = class {| a, b |}

This should be equivalent to:

    class Foo(object):
        def __init__(self, a, b):
            self.a = a
            self.b = b

Inheritance is not so important in the context of classes that have no methods. It could be expressed with a new method ("inherits") of the metaclass. Like this:

    Foo = class {| a, b |}.inherits(Bar, Baz)

How one would create methods, and if it should be possible to create methods at all, needs to be discussed.

Instance Creation
-----------------

    foo = {| a=1, b=2 |}:Foo

This should be equivalent to:

    foo = Foo(a=1, b=2)

In the usual case, it should not be necessary to specify the class name:

    foo = {| a=1, b=2 |}

The run-time should search for a class with a matching "__init__" method. In case of ambiguities an exception would be raised. You have to name object attributes unambiguously for this to work. However it is more flexible than Ocaml because you can disambiguate it by specifying the class name.

These constructions are expressions, and can therefore be nested.

The next nice idea from Ocaml would be extending the class system ... I'm getting a bit off topic though.

I'll present a revised syntax for object access later; generalized constructors.

Eike.
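The behaviour described above can be imitated in current Python to see how the lookup and the ambiguity error would feel in practice. This is a sketch under assumptions: "record_class" and "make" are invented names, and the run-time search is simplified to a registry keyed on attribute names rather than an inspection of every "__init__" signature.

```python
_registry = []   # every class created through record_class()


def record_class(name, *fields):
    """Rough stand-in for 'Name = class {| a, b |}'."""
    def __init__(self, *args, **kwargs):
        if len(args) > len(fields):
            raise TypeError("too many positional arguments")
        values = dict(zip(fields, args))
        values.update(kwargs)
        if set(values) != set(fields):
            raise TypeError("expected exactly the fields %r" % (fields,))
        for field in fields:
            setattr(self, field, values[field])

    cls = type(name, (object,), {"__init__": __init__, "_fields": fields})
    _registry.append(cls)
    return cls


def make(**values):
    """Rough stand-in for '{| a=1, b=2 |}' without a class name."""
    candidates = [c for c in _registry if set(c._fields) == set(values)]
    if len(candidates) != 1:
        raise LookupError("%d classes match the fields %s"
                          % (len(candidates), sorted(values)))
    return candidates[0](**values)


Foo = record_class("Foo", "a", "b")
foo = make(a=1, b=2)                 # finds Foo through its field names
print(type(foo).__name__, foo.a, foo.b)
```

As soon as a second class with the fields "a" and "b" is registered, make(a=1, b=2) raises LookupError, which is the ambiguity case mentioned in the message.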
On Sat, 18 Dec 2010 12:23:45 +0100
Eike Welk
Class Creation
--------------

    Foo = class {| a, b |}

This should be equivalent to:

    class Foo(object):
        def __init__(self, a, b):
            self.a = a
            self.b = b

Inheritance is not so important in the context of classes that have no methods. It could be expressed with a new method ("inherits") of the metaclass. Like this:

    Foo = class {| a, b |}.inherits(Bar, Baz)

How one would create methods, and if it should be possible to create methods at all, needs to be discussed.
What advantage?
Instance Creation
-----------------
foo = {| a=1, b=2 |}:Foo
This should be equivalent to:
foo = Foo(a=1, b=2)
In the usual case, it should not be necessary to specify the class name:
foo = {| a=1, b=2 |}
I want composite object literal notation as well. But certainly not {| a=1, b=2 |}. Rather (a=1, b=2) or (a:1, b:2). The untyped case would create an instance of Object (but since as of now they can't have attrs, there should be another modif), or of a new FreeObject subtype of Object.
The run-time should search for a class with a matching "__init__" method.
? conflicts? Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Eike Welk wrote:
My secret agenda was however, to later introduce a new class and object syntax for Python. I just didn't dare to propose a completely revised Python. But here we go: :-)
Class Creation
--------------

    Foo = class {| a, b |}

This should be equivalent to:

    class Foo(object):
        def __init__(self, a, b):
            self.a = a
            self.b = b
Really? So

    Foobar = class {| 17, 'yellow' |}

means

    class Foobar()
        def __init__(self, 17, 'yellow'):
            self.17 = 17
            self.'yellow' = 'yellow'

Looks like I'm either stuck with always using a, b, c, etc for attribute names, or I get compile errors.

This is *definitely* not Pythonic -- offers nothing for readability, requires new strange symbols... yuck.

If you want a one-liner, make a function:

    def Class(*args, **kwargs):
        obj = type('Simple', (object, ), dict())
        for i, arg in enumerate(args):
            attr = chr(ord('a') + i)
            setattr(obj, attr, arg)
        for kw in kwargs:
            setattr(obj, kw, kwargs[kw])
        return obj

then

    Foobar = Class( 17, 'yellow', this='that' )

and season to taste.

~Ethan~
Am 18.12.2010 12:23, schrieb Eike Welk:
Instance Creation
-----------------
foo = {| a=1, b=2 |}:Foo
This should be equivalent to:
foo = Foo(a=1, b=2)
In the usual case, it should not be necessary to specify the class name:
foo = {| a=1, b=2 |}
The run-time should search for a class with a matching "__init__" method.
I'm beginning to suspect this is an elaborate trolling attempt...
On Saturday 18.12.2010 18:34:57 Georg Brandl wrote:
Am 18.12.2010 12:23, schrieb Eike Welk:
Instance Creation
-----------------
foo = {| a=1, b=2 |}:Foo
This should be equivalent to:
foo = Foo(a=1, b=2)
In the usual case, it should not be necessary to specify the class name:
foo = {| a=1, b=2 |}
The run-time should search for a class with a matching "__init__" method.
I'm beginning to suspect this is an elaborate trolling attempt...
Yes, it's a bit of fun. This would never get into Python anyway, but I'd like it nevertheless. (It is Ocaml's record syntax with some small tweaks.)

It would be possible to realize it though. With "inspect.getfullargspec" you can see a function's signature. There might be problems with efficiency, because you'd have to search through globals and locals repeatedly. You could however store which class matched somewhere in the code object. Storing this kind of information would go against Python's nature as a dynamic language, even though it would work as expected in 99% of the use cases.

Oh well ...

Eike
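The lookup Eike describes can be sketched in current Python. This is only an illustration under stated assumptions: the helper names `find_class` and `make` are hypothetical, the namespace is passed in explicitly rather than searched via globals and locals, and a class "matches" when the parameter names of its own `__init__` (minus `self`) equal the given attribute names.

```python
import inspect


class Foo(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b


class Bar(object):
    def __init__(self, c, d):
        self.c = c
        self.d = d


def find_class(namespace, **attrs):
    """Return the one class in `namespace` whose __init__ takes exactly `attrs`.

    Hypothetical helper: `namespace` is any dict of names to objects,
    for example globals().
    """
    wanted = sorted(attrs)
    matches = []
    for obj in namespace.values():
        if not inspect.isclass(obj):
            continue
        # Only consider classes that define __init__ as a plain function.
        init = vars(obj).get("__init__")
        if not inspect.isfunction(init):
            continue
        spec = inspect.getfullargspec(init)
        # Drop "self" and compare the remaining parameter names.
        if sorted(spec.args[1:]) == wanted:
            matches.append(obj)
    if len(matches) != 1:
        # No match or an ambiguous match: raise, as the proposal suggests.
        raise LookupError("%d classes match %s" % (len(matches), wanted))
    return matches[0]


def make(namespace, **attrs):
    """Rough stand-in for the proposed `{| a=1, b=2 |}` expression."""
    return find_class(namespace, **attrs)(**attrs)


foo = make({"Foo": Foo, "Bar": Bar}, a=1, b=2)
```

Every call walks the whole namespace, which is the efficiency concern raised above; caching the matched class would trade that cost against Python's dynamic behaviour.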
Am 18.12.2010 12:23, schrieb Eike Welk:
In... Yes, it's a bit of fun. This would never get into Python anyway, but I'd
I don't have any real opinion on this, but this old experiment seems related
to the discussion: http://svn.colorstudy.com/home/ianb/recipes/patmatch.py
--
Sent from a phone
On Dec 18, 2010 12:30 PM, "Eike Welk"
Eike Welk wrote:
On Saturday 18.12.2010 03:03:06 Greg Ewing wrote:
Eike Welk wrote:
| {| a, b |} -> # Attribute existence and access
is far too cryptic and does nothing to improve readability.
List comprehensions are also cryptic if you see them for the first time. The curly brackets with bars "{| |}" should symbolize that Python objects are glorified dicts. It is also quite close to Ocaml's syntax for records.
Not really. A list comp looks like a cross between a list and a for-loop, which is exactly what a list comp is.

    [2*x for x in sequence]

I believe that people -- at least some people -- could intuit the meaning of this from context. But even if they can't, it's a simple extension to existing Python syntax they should already know. And best of all, you can easily experiment on it by copying and pasting a list comp into the interactive interpreter and seeing what it does.

Your proposed syntax {| |} could be related to sets, or dicts, or both. Or it could be related to the | operator. If you don't know which, it's hard to guess. It looks more like a set than a dict:

    {a, b}                # set
    {a: value, b: value}  # dict
    {|a, b|}              # looks like a set with extra symbols

and there's probably nothing you can do with it in isolation from a full pattern match block.
My secret agenda was however, to later introduce a new class and object syntax for Python. I just didn't dare to propose a completely revised Python. But here we go: :-)
Class Creation
--------------

    Foo = class {| a, b |}

This should be equivalent to:

    class Foo(object):
        def __init__(self, a, b):
            self.a = a
            self.b = b
What's so special about Foo that it needs special syntax just to make it easier? In any case, these days I'd suggest that's probably best written as a namedtuple:
    >>> from collections import namedtuple
    >>> Spam = namedtuple('Spam', 'a b c')
    >>> x = Spam(a=23, b=42, c=None)
    >>> x
    Spam(a=23, b=42, c=None)
Inheritance is not so important in the context of classes that have no methods. It could be expressed with a new method ("inherits") of the metaclass. Like this:
Foo = class {| a, b |}.inherits(Bar, Baz)
How one would create methods, and if it should be possible to create methods at all, needs to be discussed.
I don't think it does :) I think you're falling into the trap of thinking that everything needs to be a one-liner. It doesn't.
Instance Creation
-----------------
foo = {| a=1, b=2 |}:Foo
This should be equivalent to:
foo = Foo(a=1, b=2)
Why do you think you need two ways of spelling Foo(a=1, b=2)? What's wrong with the Python syntax for it? It seems to me that you want the obfuscation of OCAML's syntax with the slowness of Python, a strange choice...
In the usual case, it should not be necessary to specify the class name:
foo = {| a=1, b=2 |}
The run-time should search for a class with a matching "__init__" method. In
Oh wow. Just ... wow. You want the Python virtual machine to do a search through N objects in K scopes, checking each one to see if it is a class, then checking the signature of the __init__ method, just to allow the caller to *implicitly* specify the class of an instance instead of explicitly. -1000 on that.
case of ambiguities an exception would be raised. You have to name object attributes unambiguously for this to work. However it is more flexible than Ocaml because you can disambiguate it by specifying the class name.
These constructions are expressions, and can therefore be nested.
*cries* -- Steven
On Sun, 19 Dec 2010 12:03:43 +1100
Steven D'Aprano
List comprehensions are also cryptic if you see them for the first time. The curly brackets with bars "{| |}" should symbolize that Python objects are glorified dicts. It is also quite close to Ocaml's syntax for records.
Not really. A list comp looks like a cross between a list and a for-loop, which is exactly what a list comp is.
And actually python (composite) objects, conceptually, are closer to named tuples than to dicts; composite objects are _not_ collections. That they are based on dicts is, imo, an implementation detail. For this reason, I'm pleased their dict attr is weirdly called "__dict__". Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
On Sunday 19.12.2010 02:03:43 Steven D'Aprano wrote:
Eike Welk wrote: Your proposed syntax {| |} could be related to sets, or dicts, or both. Or it could be related to the | operator. If you don't know which, it's hard to guess. It looks more like a set than a dict:
    {a, b}                # set
    {a: value, b: value}  # dict
    {|a, b|}              # looks like a set with extra symbols
and there's probably nothing you can do with it in isolation from a full pattern match block.
I like this argument. This might lead to a design decision for the pattern matching syntax: A pattern that matches an object should look closely like the code that creates that object.
Class Creation
--------------
Foo = class {| a, b |}
What's so special about Foo that it needs special syntax just to make it easier? In any case, these days I'd suggest that's probably best written
as a namedtuple:
    >>> from collections import namedtuple
    >>> Spam = namedtuple('Spam', 'a b c')
    >>> x = Spam(a=23, b=42, c=None)
    >>> x
    Spam(a=23, b=42, c=None)
Yes, you are right. namedtuple is really a perfect substitute for Ocaml's records. The only thing it is missing is optional type checking in the constructor, not only as a consistency test, but also as a terse form of documentation.
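The missing piece mentioned here, type checking in the constructor, can be layered on top of namedtuple. The following is a minimal sketch, not something namedtuple offers itself: the factory name `typed_namedtuple` and its keyword-argument interface are assumptions made for illustration.

```python
from collections import namedtuple


def typed_namedtuple(name, **field_types):
    """Build a namedtuple class whose constructor checks field types.

    Hypothetical helper: fields are given as keyword arguments mapping
    each field name to the type its value must have.
    """
    base = namedtuple(name, list(field_types))

    class Typed(base):
        __slots__ = ()  # keep instances as light as plain namedtuples

        def __new__(cls, *args, **kwargs):
            self = super().__new__(cls, *args, **kwargs)
            # Check each field after the tuple has been built.
            for field, expected in field_types.items():
                value = getattr(self, field)
                if not isinstance(value, expected):
                    raise TypeError("%s.%s must be %s, got %r"
                                    % (name, field, expected.__name__, value))
            return self

    Typed.__name__ = name
    return Typed


# The field declaration doubles as terse documentation of the record.
Point = typed_namedtuple("Point", x=int, y=int)
p = Point(x=1, y=2)
```

Calling `Point(1, "2")` raises `TypeError`, so the declaration acts as both a consistency test and a short description of the record's shape.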
Inheritance is not so important in the context of classes that have no methods. It could be expressed with a new method ("inherits") of the metaclass. Like this:
Foo = class {| a, b |}.inherits(Bar, Baz)
How one would create methods, and if it should be possible to create methods at all, needs to be discussed.
I don't think it does :)
I think you're falling into the trap of thinking that everything needs to be a one-liner. It doesn't.
Yes, in a "real" program you would want to document the attributes, and soon the class definition would be no longer a one liner. On the other hand: If the code is short (and readable), you can get an overview much more easily. My positive attitude towards this syntax comes from the only weakness that Python IMHO has: You can't easily see which data attributes an instance has. This information is hidden in __init__, and sometimes elsewhere. I think a mechanism like slots should be the norm, and dynamism the exception.
Instance Creation
-----------------
foo = {| a=1, b=2 |}:Foo
This should be equivalent to:
foo = Foo(a=1, b=2)
Why do you think you need two ways of spelling Foo(a=1, b=2)? What's wrong with the Python syntax for it? It seems to me that you want the obfuscation of OCAML's syntax with the slowness of Python, a strange choice...
I would like to combine the nice aspects of Ocaml, with the unproblematic nature of Python. In Ocaml I like the syntax for records, the match expression, and some aspects of the file system. Most other things are IMHO rather horrible. Eike.
On Sun, 19 Dec 2010 19:52:28 +0100
Eike Welk
My positive attitude towards this syntax comes from the only weakness that Python IMHO has: You can't easily see which data attributes an instance has. This information is hidden in __init__, and sometimes elsewhere.
Agreed. Rather commonly elsewhere, I guess.
I think a mechanism like slots should be the norm, and dynamism the exception.
I wish we could put in front place the set of intended data attributes, including ones w/o defaults and optional ones. Even better, instantiation with the same param names would automagically set those attributes.

    class Point(Object):
        x = 0
        y = 0
        d
        color
        def __init__(self, HLSColor=None):
            if color is not None:
                self.color = toRGB(color)
            self.d = self.x + self.y

    p = Point(x=1, y=2, color=HLS(0,0,50))
    assert (p.d == 3)

(untested)

I find those loads of "self.x=x" in constructors sooo stupid -- I want the machine to do it for me. __init__ should only define the essential part of obj construction; while the final constructor would do some mechanical job in addition.

Denis
-- -- -- -- -- -- --
vit esse estrany ☣ spir.wikidot.com
On Mon, Dec 20, 2010 at 6:44 AM, spir
On Sun, 19 Dec 2010 19:52:28 +0100 Eike Welk
wrote: My positive attitude towards this syntax comes from the only weakness that Python IMHO has: You can't easily see which data attributes an instance has. This information is hidden in __init__, and sometimes elsewhere.
Agreed. Rather commonly elsewhere, I guess.
I think a mechanism like slots should be the norm, and dynamism the exception.
I wish we could put in front place the set of intended data attributes, including ones w/o defaults and optional ones. Even better, instantiation with the same param names would automagically set those attributes.
These days, a nice solution to that problem is to define a named tuple and inherit from it (see the 3.2 version of urllib.parse for a number of examples). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
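As a rough sketch of the approach Nick suggests, here is Denis's Point written as a subclass of a named tuple. The field list and the derived attribute `d` follow Denis's example; making `d` a property is a choice made here for illustration.

```python
from collections import namedtuple


class Point(namedtuple("Point", "x y color")):
    """Fields are declared once, up front; no self.x = x boilerplate."""

    __slots__ = ()  # no per-instance __dict__, as with a plain namedtuple

    @property
    def d(self):
        # Derived value from Denis's example: the sum of the coordinates.
        return self.x + self.y


p = Point(x=1, y=2, color=None)
```

The intended data attributes are visible in the class header and via `Point._fields`, and the constructor accepts them by the same names. Unlike Denis's sketch, instances are immutable.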
On Sun, Dec 19, 2010 at 9:44 PM, spir
I find those loads of "self.x=x" in constructors sooo stupid --I want the machine to do it for me. __init__ should only define the essential part of obj construction; while the final constructor would do some mechanical job in addition.
Automating that is quite easy with keyword arguments:
    >>> class Foo(object):
    ...     def __init__(self, **kwargs):
    ...         self.__dict__.update(kwargs)
    ...
    >>> f = Foo(a=1, b=2)
    >>> f.a
    1
    >>> f.b
    2
If you want to play safe, filter out keys that start with '__'. Best regards and a happy new year! Mart Sõmermaa
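The "play safe" variant could look like the sketch below. Dropping the filtered keys silently is an assumption; raising an error for them would be an equally reasonable choice.

```python
class Foo(object):
    def __init__(self, **kwargs):
        # Skip names such as __class__ or __dict__ so that callers cannot
        # shadow special attributes by accident.
        safe = {k: v for k, v in kwargs.items() if not k.startswith("__")}
        self.__dict__.update(safe)


f = Foo(a=1, b=2, __class__=int)
```

Here `f.a` and `f.b` are set as before, while the `__class__` keyword is ignored and `type(f)` stays `Foo`.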
Eike Welk wrote:
My positive attitude towards this syntax comes from the only weakness that Python IMHO has: You can't easily see which data attributes an instance has.
What's wrong with dir(obj) and vars(obj)?
    >>> class Spam:
    ...     x = 1
    ...     def __init__(self):
    ...         self.y = 2
    ...
    >>> obj = Spam()
    >>> vars(obj)
    {'y': 2}
    >>> dir(obj)
    ['__doc__', '__init__', '__module__', 'x', 'y']
Python has *awesome* self-inspection abilities -- there's very little you can't easily find out about an object, so much so that people sometimes complain that you can't really hide information from the caller in Python -- there are no truly private attributes, only private by convention. See also the inspect module.
This information is hidden in __init__, and sometimes elsewhere. I think a mechanism like slots should be the norm, and dynamism the exception.
Such a language would no longer be Python. Perhaps it will be a better language, perhaps a worse one, but it won't be Python. By the way, __slots__ is intended as a memory optimization, not as a mechanism for defeating Python's dynamic nature. -- Steven
On 19 Dec 2010, at 21:26, Steven D'Aprano
Eike Welk wrote:
My positive attitude towards this syntax comes from the only weakness that Python IMHO has: You can't easily see which data attributes an instance has.
What's wrong with dir(obj) and vars(obj)?
    >>> class Spam:
    ...     x = 1
    ...     def __init__(self):
    ...         self.y = 2
    ...
    >>> obj = Spam()
    >>> vars(obj)
    {'y': 2}
    >>> dir(obj)
    ['__doc__', '__init__', '__module__', 'x', 'y']
I think the issue that Eike is pointing out is that dir(Spam) knows nothing about y (whereas it would if you used __slots__). One answer is to have class member defaults for all instance members. Michael
Python has *awesome* self-inspection abilities -- there's very little you can't easily find out about an object, so much so that people sometimes complain that you can't really hide information from the caller in Python -- there are no truly private attributes, only private by convention. See also the inspect module.
This information is hidden in __init__, and sometimes elsewhere. I think a mechanism like slots should be the norm, and dynamism the exception.
Such a language would no longer be Python. Perhaps it will be a better language, perhaps a worse one, but it won't be Python.
By the way, __slots__ is intended as a memory optimization, not as a mechanism for defeating Python's dynamic nature.
-- Steven _______________________________________________ Python-ideas mailing list Python-ideas@python.org http://mail.python.org/mailman/listinfo/python-ideas
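Michael's point about dir(Spam) and __slots__ can be checked with a short sketch. The class name `SlottedSpam` is made up for the comparison; `Spam` follows Steven's example.

```python
class Spam(object):
    x = 1

    def __init__(self):
        self.y = 2


class SlottedSpam(object):
    __slots__ = ("y",)  # declares the instance attribute on the class
    x = 1

    def __init__(self):
        self.y = 2


print("y" in dir(Spam))         # False: the class knows nothing about y
print("y" in dir(SlottedSpam))  # True: the slot is visible on the class
print("y" in dir(Spam()))       # True: the instance itself does know
```

The slot shows up on the class because __slots__ creates a descriptor named `y` there, whereas a plain class learns about `y` only when __init__ runs on an instance.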
Michael wrote:
On 19 Dec 2010, at 21:26, Steven D'Aprano
wrote: Eike Welk wrote:
My positive attitude towards this syntax comes from the only weakness that Python IMHO has: You can't easily see which data attributes an instance has. What's wrong with dir(obj) and vars(obj)?
[...]
I think the issue that Eike is pointing out is that dir(Spam) knows nothing about y (whereas it would if you used __slots__).
Then he should have said -- he explicitly said: "You can't easily see which data attributes an INSTANCE has." [emphasis added] As a general rule, you can't expect the class to know what attributes an instance has. This is a deliberate design choice. If you want to find out what attributes an object has, you ask the object, not the object's parent class. Think of this as a variation on the Law of Demeter: if you want to know how many fleas a dog has, inspect the dog, not the dog's owner.
One answer is to have class member defaults for all instance members.
Well, that's an answer, but I'm not sure what the question is.

From time to time I create a class which includes default class attributes, which then get optionally overridden by instance attributes. But that's much rarer than the usual case, where the instance has the attributes and the class doesn't.

I would find it very disturbing to see classes declare meaningless class attributes (probably set to None) just to make it easier to predict what attributes the instances will get. That's pretty close to this horror:

    def spam():
        # Declare all local variables which will be used.
        a = None
        b = None
        c = None
        # Now use them.
        a = 1
        b = function(a, 23, -2)
        c = another_func(b)
        return a+b+c

Personally, I've never found this to be a problem in practice. If I want to find out what attributes an instance of a class will have, I instantiate the class and inspect the instance. Or I read the docs. Looking at the source code is also an option (if the source code is available). For most classes, any attributes will be set in the __init__ method. If you have to read the source to determine what attributes exist, there's not much difference between:

    class Spam:
        a = 1
        b = 2
        c = 3

and

    class Spam:
        def __init__(self):
            self.a = 1
            self.b = 2
            self.c = 3

But normally I just care about methods, not data attributes, and for that dir(cls) is perfectly adequate. (There may be the odd object that includes instance methods, but they'll be rare.)

-- Steven
On 20 December 2010 10:08, Steven D'Aprano
Michael wrote:
On 19 Dec 2010, at 21:26, Steven D'Aprano
wrote: Eike Welk wrote:
My positive attitude towards this syntax comes from the only weakness
that Python IMHO has: You can't easily see which data attributes an instance has.
What's wrong with dir(obj) and vars(obj)?
[...]
I think the issue that Eike is pointing out is that dir(Spam) knows
nothing about y (whereas it would if you used __slots__).
Then he should have said -- he explicitly said:
"You can't easily see which data attributes an INSTANCE has." [emphasis added]
Well sure - just by looking at the class you can't tell what members an INSTANCE has... :-)
As a general rule, you can't expect the class to know what attributes an instance has. This is a deliberate design choice. If you want to find out what attributes an object has, you ask the object, not the object's parent class.
Think of this as a variation on the Law of Demeter: if you want to know how many fleas a dog has, inspect the dog, not the dog's owner.
One answer is to have class member defaults for all instance members.
Well, that's an answer, but I'm not sure what the question is. From time to time I create a class which includes default class attributes, which then get optionally overridden by instance attributes. But that's much rarer than the usual case, where the instance has the attributes and the class doesn't.
I would find it very disturbing to see classes declare meaningless class attributes (probably set to None) just to make it easier to predict what attributes the instances will get. That's pretty close to this horror: [snip...]
Personally, I've never found this to be a problem in practice.
The only place I've found it relevant is when mocking out objects that you don't want instantiated in your test. mock.Mock (and mock.patch) can create mock objects that behave like the original (in terms of available methods and members) if you pass it a reference to the object it is mocking. If member creation is 'hidden' inside __init__ then Mock can't know what members are available.
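The mocking limitation can be sketched with the standard library's `unittest.mock`. Using the standard-library module rather than the separate `mock` package is an assumption made here, and the class `Spam` is borrowed from the earlier example.

```python
from unittest import mock


class Spam(object):
    x = 1              # class attribute: visible when the class is the spec

    def __init__(self):
        self.y = 2     # instance attribute: exists only after __init__ runs


# Spec taken from the class, so no Spam instance is ever created.
fake = mock.Mock(spec=Spam)

print(hasattr(fake, "x"))  # True
print(hasattr(fake, "y"))  # False: Mock cannot know about y

# Passing an instance as the spec exposes y, at the cost of instantiating.
fake_from_instance = mock.Mock(spec=Spam())
print(hasattr(fake_from_instance, "y"))  # True
```

A class-level default for `y`, or a `__slots__` declaration, would make the attribute visible to the class-based spec as well.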
If I want to find out what attributes an instance of a class will have, I instantiate the class and inspect the instance. Or I read the docs. Looking at the source code is also an option (if the source code is available). For most classes, any attributes will be set in the __init__ method. If you have to read the source to determine what attributes exist, there's not much difference between:
Well, if there's anything you *can't* tell about the behaviour of objects by reading the source code then something odd is going on. ;-) Programmatic inspection is where it is relevant (to me at least). Michael
    class Spam:
        a = 1
        b = 2
        c = 3

and

    class Spam:
        def __init__(self):
            self.a = 1
            self.b = 2
            self.c = 3
But normally I just care about methods, not data attributes, and for that dir(cls) is perfectly adequate. (There may be the odd object that includes instance methods, but they'll be rare.)
-- Steven
-- http://www.voidspace.org.uk/ May you do good and not evil May you find forgiveness for yourself and forgive others May you share freely, never taking more than you give. -- the sqlite blessing http://www.sqlite.org/different.html
On Sunday 19.12.2010 22:26:26 Steven D'Aprano wrote:
Eike Welk wrote:
My positive attitude towards this syntax comes from the only weakness that Python IMHO has: You can't easily see which data attributes an instance has.
What's wrong with dir(obj) and vars(obj)?
I should have expressed it more clearly: I was referring to the source text. In the declaration of a class in C++ you can see the data attributes that its instances use. In a Python class you have to look at the "__init__" method, and possibly into other methods, to see which data attributes are used. It's much harder to get an overview over the data attributes in Python.

Python's introspection features are of course great.

Eike

P.S.: In my last mail, I also wanted to express that "I like Ocaml's type system", not its "file system".
Rather than such a large, monolithic construct, perhaps consider pursuing smaller tweaks that could improve the following.

    def _check_call(f, x):
        try:
            return f(*x)
        except TypeError:
            return False

    if x == 1:
        pass
    elif isinstance(x, int):
        pass
    elif isinstance(x, tuple) and len(x) == 2:
        pass
    elif hasattr(x, 'a') and hasattr(x, 'b'):
        pass
    elif _check_call((lambda a, b: a > b), x):
        pass
    elif isinstance(x, Foo) and isinstance(getattr(x, 'c', None), int) and getattr(x, 'd', None) == 1:
        pass
    else:
        pass

Cheers.
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sat, 18 Dec 2010 00:21:35 +0100
Eike Welk
After learning a bit of Ocaml I started to like its pattern matching features. Since then I want to have a "match" statement in Python. I wonder if anybody else would like this too.
ML style pattern matching is syntactic sugar, that combines "if" statements with tuple unpacking, access to object attributes, and assignments. It is a compact, yet very readable syntax for algorithms, that would otherwise require nested "if" statements. It is especially useful for writing interpreters, and processing complex trees.
While I generally like pattern matching in FP languages, I think it is neither appropriate nor necessary for a general application language like Python. It is an important feature in FP because (1) it matches (!) FP's general point of view or paradigm, and (2) example applications of FP languages are very much about algorithmics -- far less about general designing/modelling.
Instead of a specification in BNF, here is a function written with the proposed pattern matching syntax. It demonstrates the features that I find most important. The comments and the print statements explain what is done.
If ever pattern matching would be considered for inclusion in Python, it would have to fit Python's style, which your proposed syntax really does not. Actually, I find your current-python translation far more readable (except for the enclosing while true hack). The syntax should look switch/case, probably. A keyword seems necessary to tell apart ordinary switch conditions from pattern matching ones. But it cannot be "match", so let's say "matching". Other changes inline below:
Proposed Syntax
---------------

def foo(x):
    match x with
    matching x:
    | 1 ->                              # Equality
    case 1:
        print("x is equal to 1")
    | a:int ->                          # Type check
    case int:   # <=> isinstance(x, int) or x.__type__==int ???
        print("x has type int: %s" % a)
    | (a, b) ->                         # Tuple unpacking
    case (a, b):
        print("x is a tuple with length 2: (%s, %s)" % (a, b))
    | {| a, b |} ->                     # Attribute existence and access
    case (.a, .b):
        print("x is an object with attributes 'a' and 'b'.")
        print("a=%s, b=%s" % (a, b))
    # Additional condition
    | (a, b, c) with a > b ->
    case (.a, .b, .c) and a > b:
        print("x is a tuple with length 3: (%s, %s, %s)" % (a, b, c))
        print("The first element is greater than the second element.")
    # Complex case
    | {| c:int, d=1 |}:Foo ->
    case Foo(.c, .d) and isinstance(c,int) and d == 1:
    case Foo(.c, .d) and c.__type__ == int and d == 1:
        print("x has type Foo")
        print("x is an object with attributes 'c' and 'd'.")
        print("'c' has type 'int', 'd' is equal to 1.")
        print("c=%s, d=%s" % (c, d))
    # Default case
    | _ ->
    else:
        print("x can be anything")
Equivalent Current Python -------------------------
The first four cases could be handled more simply, but handling all cases in the same way leads IMHO to more simple code overall.
[...]
It seems the only annoying aspect in current Python is expressing secondary conditions. For instance, x can be an int and have either .a or .b, with different actions in the sub-cases. Expressing this "naturally" in Python is wrong because if x has neither .a nor .b, the other cases won't be matched. The only workaround is to have several top-level cases, repeating the top condition about x being an int. I guess.
I left out dict and set, because I'm not sure how they should be handled. I think list should be handled like tuples. Probably there should be a universal matching syntax for all sequences, similarly to the already existing syntax: a, b, *c = s
This cannot be! Even less lists. Proposal:

    case []             # check list
    case [n]            # ditto; n must be int telling length
    case {a, b, c}      # check set / elements
    case {a:, b:, c:}   # check dict / keys

An annoying thing is python has no syntax for structured objects. I find acceptable the above used workaround: (.a, .b, .c) to tell about attributes and T(.a, .b, .c) to additionally check the type.
I don't really like the "->" digraph at the end of each match case. A colon would be much more consistent, but I use colons already for type checking (a:int).
Does a:int mean isinstance(a,int) or a.__type__==int ? I would like python to have an 'isa' operator for the latter.
I generally think that Python should acquire more features from functional languages. In analogy to "RPython" it should ultimately lead to "MLPython", a subset of the Python language that can be type checked and reasoned about by external tools, similarly to what is possible with Ocaml.
I doubt about that for a handful of reasons. ref to related feature, FWIW: OMeta general structural pattern matching http://tinlizzie.org/ometa/ (with implementation for python) Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
Great that you thought about a syntax for dicts and sets.

One other point that must be solved is how to distinguish symbols that should be assigned from symbols that specify conditions. Ocaml's solution is that all symbols are assigned. Conditions must be specified with literals. Here is an example to illustrate it:

This is a simple function that compares its argument "x" to the integer one:

    def foo(x):
        match x with
        | 1 ->
            print("x is equal to 1")
        | _ ->
            print("x is not equal to 1")

In the code below we want to parametrize the value that we use for comparison. The parameter "a" is introduced for this purpose. Now we need a way to tell the compiler that "x" should be compared to "a", and not assigned to "a". In Ocaml this is impossible. The compiler would issue a warning ("Redundant case in a pattern matching.").

    def foo(x, a):
        match x with
        | a ->
            print("x is equal to %s" % a)
        | _ ->
            print("x is not equal to %s" % a)

Maybe you come up with a nice notation for this problem.

By the way, why does no one like those "|" characters? They look good and take up little space!

Eike.
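One way to make the bind-versus-compare distinction explicit in today's Python is a marker object for names that should be bound. The sketch below is an illustration, not part of the proposal: `Var` and `match` are hypothetical names, and only tuples and plain values are handled.

```python
class Var(object):
    """Marks a pattern position that binds a name instead of comparing."""

    def __init__(self, name):
        self.name = name


def match(pattern, value):
    """Return a dict of bindings if `value` fits `pattern`, else None."""
    if isinstance(pattern, Var):
        return {pattern.name: value}  # bind: always succeeds
    if isinstance(pattern, tuple):
        if not isinstance(value, tuple) or len(value) != len(pattern):
            return None
        bindings = {}
        for sub_pattern, sub_value in zip(pattern, value):
            result = match(sub_pattern, sub_value)
            if result is None:
                return None
            bindings.update(result)
        return bindings
    # Anything else is an ordinary value: compare with ==.
    return {} if pattern == value else None


def foo(x, a):
    # `a` is a plain value here, so x is compared to it, not rebound.
    if match(a, x) is not None:
        return "x is equal to %s" % a
    return "x is not equal to %s" % a
```

With this convention `match((1, Var("b")), (1, 2))` yields a binding for `b`, while `foo(2, 1)` takes the "not equal" branch because `a` is compared rather than assigned.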
On Sat, 18 Dec 2010 14:31:12 +0100
Eike Welk
Great that you thought about a syntax for dicts and sets.
One other point that must be solved is how to distinguish symbols that should be assigned from symbols that specify conditions. Ocaml's solution is that all symbols are assigned. Conditions must be specified with literals. Here is an example to illustrate it:

This is a simple function that compares its argument "x" to the integer one:

    def foo(x):
        match x with
        | 1 ->
            print("x is equal to 1")
        | _ ->
            print("x is not equal to 1")

In the code below we want to parametrize the value that we use for comparison. The parameter "a" is introduced for this purpose. Now we need a way to tell the compiler that "x" should be compared to "a", and not assigned to "a". In Ocaml this is impossible. The compiler would issue a warning ("Redundant case in a pattern matching.").

    def foo(x, a):
        match x with
        | a ->
            print("x is equal to %s" % a)
        | _ ->
            print("x is not equal to %s" % a)
Maybe you come up with a nice notation for this problem.
No, I would do it like in OCaml, I think (but may be wrong) it fits the python-way, like having no formally qualified readonly or private slots.
By the way, why does no one like those "|" characters? They look good and take up little space!
;-) It does not fit Python overall syntax look. And requires a magic id '_' for default case. And requires adding '->'. And we already have 'case x:' (and 'else') which fit perfectly. And are far more readable (no need to guess what given symbols mean, including the 'sense' of an arrow ;-).
Eike.
Denis -- -- -- -- -- -- -- vit esse estrany ☣ spir.wikidot.com
participants (14): Bruce Leban, Carl M. Johnson, Eike Welk, Ethan Furman, Georg Brandl, Greg Ewing, Ian Bicking, Mart Sõmermaa, Michael, Michael Foord, Nick Coghlan, spir, Steven D'Aprano, Terry Reedy