How do you think about these language extensions?

Hi all,

I've just finished a language extension for CPython 3.6.x that adds some new grammar, such as pattern matching, and it is compatible with CPython. I'm looking for constructive advice, and I wonder whether you will be interested in it. (The project address is https://github.com/thautwarm/flowpython)

Some examples here:

    # where syntax
    from math import pi
    r = 1   # the radius
    h = 10  # the height
    S = (2*S_top + S_side) where:
        S_top = pi*r**2
        S_side = C * h where:
            C = 2*pi*r

    # lambda&curry :
    lambda x: lambda y: lambda z: ret where:
        ret = x+y
        ret -= z

    .x -> .y -> .z -> ret where:
        ret = x+y
        ret -= z

    as-with x def as y def as z def ret where:
        ret = x+y
        ret -= z

    # arrow transform (to avoid endless parentheses and try to be more readable.)
    range(5) -> map(.x->x+2, _) -> list(_)
    [2,3,4,5,6]

    # pattern matching
    # "condic" is used as the keyword to avoid conflicts with the standard
    # library and third-party packages. "switch" and "match" both lead to conflicts.
    condic+(type) 1:
        case a:int => assert a == 1 and type(a) == 1
        [>] case 0 => assert 1 > 0
        [is not] case 1 => assert 1 is not 1
        otherwise => print("nothing")

    condic+() [1,2,3]:
        case (a,*b)->b:list => sum(b)
        +[] case [] => print('empty list')
        +[==] case (a,b):(1,2) => print("the list is [1,2]")

The grammar, with more details and examples, is described at https://github.com/thautwarm/flowpython/wiki

Does it interest you? If so, you can try it if you have CPython 3.6.x.

    pip install flowpython
    python -m flowpython -m enable/disable

Here is an example of using flowpython, which gives the permutations of a sequence.

    from copy import deepcopy
    permutations = .seq -> seq_seq where:
        condic+[] seq:
            case (a, ) => seq_seq = [a,]
            case (a, b) => seq_seq = [[a,b],[b,a]]
            case (a,*b) =>
                seq_seq = permutations(b) -> map(.x -> insertAll(x, a), _) -> sum(_, []) where:
                    insertAll = . x, a -> ret where:
                        ret = [ deepcopy(x) -> _.insert(i, a) or _ for i in (len(x) -> range(_+1)) ]

Once permutations is defined, try this in the console:

    range(3) -> permutations(_)
    [[0, 1, 2], [1, 0, 2], [1, 2, 0], [0, 2, 1], [2, 0, 1], [2, 1, 0]]

Does it seem interesting?

Thanks,
Thautwarm

Hello Thautwarm, and welcome! Sorry for the delay in responding, but this has been a very busy week for me personally, and an even busier week for my inbox, and so I missed your post until now. On Sun, Aug 13, 2017 at 12:49:45PM +0000, 王宣 ? wrote:
It is really good to see some actual practical experiments for these features, rather than just talking about them. Thank you! [...]
This has been suggested a few times. The first time, I disliked it, but I've come around to seeing its value. I like it.

I wonder: could we make the "where" clause delay evaluation until the entire block was compiled, so that we could write something like this:

    S = (2*S_top + S_side) where:
        S_top = pi*r**2
        S_side = C * h  # C is defined further on
        C = 2*pi*r

That's more how "where" is used mathematically.
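For comparison, here is a minimal sketch of how that example might be spelled in regular Python today. It assumes the "where" block does nothing more than bind local names for the expression (flowpython's actual semantics may differ), and the function name total_surface is made up for the illustration. Note that plain Python forces the definitions to be written in dependency order, which is exactly what the delayed-evaluation idea would relax.

```python
from math import pi

r = 1   # the radius
h = 10  # the height


def total_surface(r, h):
    # Each name the "where" block would bind has to be assigned
    # before the expression that uses it.
    C = 2 * pi * r       # circumference of the base
    S_side = C * h       # area of the side
    S_top = pi * r ** 2  # area of one end
    return 2 * S_top + S_side


S = total_surface(r, h)
```

Wrapping the bindings in a function keeps C, S_side and S_top out of the enclosing namespace, which is roughly the scoping benefit the "where" syntax is after.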
I'm afraid I can't make heads or tails of that. Apart from guessing that it creates a function, I have no idea what it would do.
I like the idea of chained function calls, like pipes in shell languages such as bash. I've written a proof-of-concept for that:

http://code.activestate.com/recipes/580625-collection-pipeline-in-python/

I prefer | to -> but that's just a personal preference.

I don't like the use of _ in there. Underscore already has a number of special meanings, such as:

- a convention for "don't care"
- in the interactive interpreter, the last value calculated
- used for internationalisation

I don't think that giving _ yet another special meaning, and this one built in to the language, is a good idea.
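A pipe spelled with | can be prototyped without new syntax by overloading the reflected-or operator. The following is only a rough sketch of that idea, not the code from the recipe linked above; the class name Pipe and its calling convention (the piped value is passed as the last argument) are assumptions made for the illustration.

```python
class Pipe:
    """Wrap a callable so that `value | Pipe(func, *args)` calls func(*args, value)."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def __ror__(self, value):
        # Python falls back to this for `value | self` when the left
        # operand does not know how to "or" itself with a Pipe.
        return self.func(*self.args, value)


result = range(5) | Pipe(map, lambda x: x + 2) | Pipe(list)
print(result)  # [2, 3, 4, 5, 6]
```

Because the piped value is always appended as the last positional argument, no placeholder such as _ is needed, at the cost of not being able to put the value anywhere else in the call.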
This is a hard problem to deal with, but "condic" sounds awful. What is it supposed to mean? Short for "condition"?
I don't know how to read those. [...]
I find that almost unreadable. Too many new features all at once, it's like trying to read a completely unfamiliar language. How would you translate that into regular Python? Thanks for your experiments! -- Steve

On Fri, Aug 18, 2017 at 10:06 PM, Steven D'Aprano <steve@pearwood.info> wrote:
AIUI it's not a new meaning, but another variant of the second of those examples: it means "the last value calculated". However, I'd prefer to see it done with something that's otherwise illegal syntax - so unless the expression is to the right of a "->", you cannot use that symbol in that way. I'm on the fence as to whether it'd be better to allow an implicit last argument (or implicit first argument), so you can say "-> list()" without the symbol. ChrisA

# arrow transform (to avoid endless parentheses and try to be more readable.
Parentheses aren't that bad, and as far as I can tell, this is just another way to call a function on the results of a function.

The above is now spelled:

    list(map(lambda x: x+2, range(5)))

which seems fine with me -- the only improvement I see is a more compact way to spell lambda. (Though really, a list comp is considered more "pythonic" these days, yes?)

    [x+2 for x in range(5)]

Nicely, we have list comps and generator expressions, so we can avoid the list() call.

I know this was a simple example for demonstration's sake, but it doesn't look like an improvement to me. Of course, in this case, it's chaining iterations, not "ordinary" functions, so maybe it would make more sense in other contexts.

Also, we need to remember that functions can take *args, **kwargs, etc., and can return a tuple of just about anything -- not sure how well that maps to the "pipe" model.

-CHB

--
Christopher Barker, Ph.D.
Oceanographer
Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception
Chris.Barker@noaa.gov

On Fri, Aug 18, 2017 at 11:47:40AM -0700, Chris Barker wrote:
I wouldn't say that parens are evil, but they're pretty noisy and distracting. I remember an old joke that claimed to prove that the US Defence Department was using Lisp for the SDI ("Star Wars") software: somebody had found a page covered completely edge to edge in nothing but closing brackets:

    ))))))))))))))))))))))))))))))))))))))))
    ))))))))))))))))))))))))))))))))))))))))
    )))))))))))))))))))))
    ... etc

Your example has a fairly short pipeline of calls:
list(map(lambda x: x+2, range(5)))
But even this has two clear problems:

- the trailing brackets ))) are just noise, like the SDI joke above;
- you have to read it backwards, right to left, to make sense of it.

Imagine if you had a chain of ten or twenty calls:

    )))))))))) ... you get the picture

But ultimately that's a relatively minor nuisance rather than a major problem. The thing that makes long chains of function calls painful is that you have to read them backwards:

- first range() is called;
- then map;
- finally list

even though we write them in the opposite order. When we reason about the code, say to write it in the first place, or to read the expression and understand it, I would guess that most people reason something like this:

- start with our input data, range()
- call map on it to generate new values;
- call list to generate a list.

When writing code like this, I frequently find myself having to work backwards compared to how we write the order of function calls:

    range(5)
    # move editor insertion point backwards
    map(...)
    # move editor insertion point backwards
    list(...)

Half of my key presses are moving backwards over code I've just written to insert a function call which is executed *after* what I wrote, but needs to be written *before* what I just wrote.

For a short example like this, where we can easily keep the three function calls in short-term memory, it isn't so bad, but short-term memory is very limited ("magic number seven, plus or minus two") and if you're already thinking about a couple of previous operations on earlier lines of code, you don't have a lot of stack space left for a long chain of operations.

And that's why we often fall back to temporary variables and an imperative style:

    data = range(5)
    data = map(..., data)
    data = list(data)

Perhaps not in such a short example, but for longer ones, very frequently.
We can write the code in the same order that it is executed with a pipeline and avoid needing to push functions into our short-term memory when either reading or writing:

    range(5) -> map(lambda...) -> list

This way of thinking combines the strengths of postfix notation and function call notation, without the disadvantages of either. This is very successful in shell scripting languages like bash. I don't want to oversell it as a panacea that solves everything, but it really is a powerful (and underused) software paradigm.
Aye, for such a short example. But consider a longer one: find the earliest date in a bunch of lines of text:

    result = (myfile.readlines()
              -> map(str.strip)
              -> filter( lambda s: not s.startswith('#') )
              -> sorted
              -> collapse  # collapse runs of identical lines
              -> extract_dates
              -> map(date_to_seconds)
              -> min
              )

(I've assumed that the functions map and filter have some sort of automatic currying, like in Haskell; if you don't like that, then just pretend I spelled them Map and Filter instead :-)

That's nice and easy to read and write: I wrote down exactly the steps I would have taken to solve the problem, in the same order that they need to be taken. Formatting is a breeze: the hardest decision was how far to indent subsequent lines.

Compare it to this:

    result = min(map(date_to_seconds, extract_dates(collapse(sorted(
        filter(lambda s: not s.startswith('#'), map(str.strip,
        myfile.readlines())))))))

You have to read all the way to the end to find out the most important part, namely what data you are operating on! And then you have to read backwards to understand what is done to the data. And finally you have to be prepared for a whole lot of arguments from your co-workers about how to format it :-)

    # Either the ugliest thing ever, or the One True Way
    result = min(
        map(
            date_to_seconds,
            extract_dates(
                collapse(
                    sorted(
                        filter(
                            lambda s: not s.startswith('#'),
                            map(
                                str.strip,
                                myfile.readlines()
                            )
                        )
                    )
                )
            )
        )
    )

[...]
Not everything maps well to the function pipeline model. But enough things do that I believe it is a powerful tool in the programmer's toolkit.

-- Steve
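The "earliest date" pipeline above can be approximated in current Python with a small helper function. The sketch below is only one way to do it: pipe(), collapse(), extract_dates() and date_to_seconds() are all hypothetical stand-ins written for this illustration (the thread never defines them), functools.partial plays the role of the assumed auto-currying, and the sample lines are invented and assumed to start with an ISO date.

```python
from datetime import datetime
from functools import partial
from itertools import groupby


def pipe(data, *funcs):
    """Feed data through funcs from left to right."""
    for func in funcs:
        data = func(data)
    return data


def collapse(lines):
    # Hypothetical helper: collapse runs of identical adjacent lines.
    return [key for key, _ in groupby(lines)]


def extract_dates(lines):
    # Hypothetical helper: assumes every line starts with YYYY-MM-DD.
    return [datetime.strptime(line[:10], "%Y-%m-%d") for line in lines]


def date_to_seconds(date):
    # Hypothetical helper: seconds since 1970-01-01 for a naive datetime.
    return (date - datetime(1970, 1, 1)).total_seconds()


lines = [
    "# a comment line\n",
    "2017-08-18 second entry\n",
    "2017-08-13 first entry\n",
    "2017-08-13 first entry\n",
]

result = pipe(
    lines,
    partial(map, str.strip),
    partial(filter, lambda s: not s.startswith("#")),
    sorted,
    collapse,  # collapse runs of identical lines
    extract_dates,
    partial(map, date_to_seconds),
    min,
)
```

The steps read top to bottom in execution order, as in the arrow version, but the data still has to be passed as the first argument of pipe() rather than leading the expression on its own.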

This is pretty easy to write without any syntax changes, just using a higher-order function `compose()` (possible implementation at foot). Again, I'll assume auto-currying like the map/filter versions of those functions in toolz, as Steven does:
    result = compose(map(str.strip),
                     filter(lambda s: not s.startswith('#')),
                     sorted,
                     collapse,
                     extract_dates,
                     map(date_to_seconds),
                     min
                     )(myfile.readlines())

Pretty much exactly the same thing with just a utility HOF. There's one that behaves right in `toolz`/`cytoolz`, or I've used this one in some publications and teaching material:

    def compose(*funcs):
        """Return a new function s.t.
        compose(f,g,...)(x) == f(g(...(x)))
        """
        def inner(data, funcs=funcs):
            result = data
            for f in reversed(funcs):
                result = f(result)
            return result
        return inner

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Fri, Aug 18, 2017 at 10:33:40PM -0700, David Mertz wrote:
A ~~slight~~ major nit: given the implementation of compose you quote below, this applies the functions in the wrong order. min() is called first, and map(str.strip) last.

But apart from being completely wrong *wink* that's not too bad :-)

Now we start bike-shedding the aesthetics of what looks better and reads more nicely. Your version is pretty good, except:

1) The order of function composition is backwards to that normally expected (more on this below);

2) there's that unfortunate call to "compose" which isn't actually part of the algorithm, it's just scaffolding to make it work;

3) the data being operated on is still at the far end of the chain, instead of the start;

4) and I believe that teaching a chain of function calls is easier than teaching higher order function composition. Much easier.

The standard mathematical definition of function composition operates left to right:

    (f∘g∘h)(x) = f(g(h(x)))

http://mathworld.wolfram.com/Composition.html

And that's precisely what your implementation does. Given your implementation quoted below:

    py> def add_one(x): return x + 1
    ...
    py> def double(x): return 2*x
    ...
    py> def take_one(x): return x - 1
    ...
    py>
    py> compose(add_one,
    ...     double,
    ...     take_one)(10)
    19
    py>
    py> add_one(double(take_one(10)))
    19

which is the mathematically expected behaviour.
But for chaining, we want the operations in the opposite order:

    10 -> add_one -> double -> take_one

which is equivalent to:

    take_one(double(add_one(10)))

So to use composition for chaining, we need one of the following:

- a non-standard implementation of chaining, which operates in the reverse to what mathematicians and functional programmers expect, AND to remember to use this rcompose() instead of compose();

- to stick to the standard compose(), but put the functions in the reverse order to what we want;

- or to use the standard compose, but with even more scaffolding to make it work:

    result = compose(*reversed(
        ( map(str.strip),
          filter(lambda s: not s.startswith('#')),
          sorted,
          collapse,
          extract_dates,
          map(date_to_seconds),
          min
        )))(myfile.readlines())
-- Steve
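The difference between the two orderings discussed above is easy to check side by side. This is a minimal sketch: the name rcompose() comes from the message above, but both implementations here are illustrative ones built on functools.reduce, not the code quoted in the thread.

```python
from functools import reduce


def compose(*funcs):
    """Mathematical order: compose(f, g, h)(x) == f(g(h(x)))."""
    return lambda x: reduce(lambda acc, f: f(acc), reversed(funcs), x)


def rcompose(*funcs):
    """Pipeline order: rcompose(f, g, h)(x) == h(g(f(x)))."""
    return lambda x: reduce(lambda acc, f: f(acc), funcs, x)


def add_one(x): return x + 1
def double(x): return 2 * x
def take_one(x): return x - 1


print(compose(add_one, double, take_one)(10))   # 19, same as add_one(double(take_one(10)))
print(rcompose(add_one, double, take_one)(10))  # 21, same as take_one(double(add_one(10)))
```

Only rcompose() matches the reading order of `10 -> add_one -> double -> take_one`, which is the point of the first option in the list above.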

On Sat, Aug 19, 2017 at 09:05:36AM -0700, David Mertz wrote:
You are right, of course. Mine does the order wrong. But an 'rcompose()' or 'pipe()' or 'funchain()' is easy enough to put in the right order.
Indeed. I said earlier that your solution (corrected for its error) was a pretty neat solution, and it was mostly down to a sense of aesthetics which we might prefer. I think a pipe or arrow is aesthetically nicer, and speaks much more closely to the intent.

Analogy: We don't need operators + - * / etc, since it's trivial to get the same effect using the functions in the operator module. But operators look nicer and are closer to the way people think of arithmetic.

I think that function composition is a neat and powerful tool for those who already think functionally, but higher order functions are harder to teach and even experts can mess them up.

(The lesson here is that the pipe operator | is like a postfix version of the composition operator ∘ .)

-- Steve

On Aug 19, 2017 3:44 AM, "Steven D'Aprano" <steve@pearwood.info> wrote:

> 2) there's that unfortunate call to "compose" which isn't actually part of the algorithm, its just scaffolding to make it work;

I see this as an ADVANTAGE, actually. We can save the composed function under another name before applying it to various data later. Or 'rcomposed' or whatever name.

Moreover, composition is associative.

    op1 = compose(a, b, c)
    op2 = compose(d, e, f)
    op3 = compose(op1, op2)

This is useful for creating compound operations that might be useful in themselves. The pipe operator doesn't lend itself nearly as well to this scenario.

FWIW, while I think using a different function name is better, you could use a 'reversed=True' keyword argument on a compose() function.
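Both points in this message, the associativity of composition and the 'reversed=True' keyword, can be shown in a few lines. This is only a sketch of how such a compose() might look; the keyword name is taken from the message above, everything else (including the toy functions a, b and c) is assumed for the illustration.

```python
def compose(*funcs, reversed=False):
    """compose(f, g)(x) == f(g(x)); with reversed=True it is g(f(x))."""
    # The keyword shadows the builtin reversed(), so slicing is used
    # to flip the tuple: by default the last function is applied first.
    ordered = funcs if reversed else funcs[::-1]

    def inner(data):
        for func in ordered:
            data = func(data)
        return data

    return inner


def a(x): return x + 1
def b(x): return x * 2
def c(x): return x - 3


# Grouping does not change the result: composition is associative.
flat = compose(a, b, c)
left = compose(compose(a, b), c)
right = compose(a, compose(b, c))
print(flat(10), left(10), right(10))  # 15 15 15

# A saved composite is an ordinary function that can be reused on other data.
piped = compose(a, b, c, reversed=True)
print(piped(10))  # 19, same as c(b(a(10)))
```

Because each composite is itself a plain callable, it can be named, stored and passed to further compose() calls, which is the reuse argument made above.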

participants (6)

- Chris Angelico
- Chris Barker
- David Mertz
- MRAB
- Steven D'Aprano
- 王宣 ?