Reduce/fold and scan with generator expressions and comprehensions
The idea is to let generator expressions and list/set comprehensions have a clean syntax for accessing their last output. That would allow them to serve as an alternative syntax for the scan higher-order function [1] (today implemented in the itertools.accumulate function), which in turn leads to an alternative way to write a fold/reduce. It would be nice to have something like:
    >>> last(abs(prev - x) for x in [3, 4, 5] from prev = 2)
    2
instead of a reduce:
    >>> from functools import reduce
    >>> reduce(lambda prev, x: abs(prev - x), [3, 4, 5], 2)
    2
or an imperative approach:
    >>> prev = 2
    >>> for x in [3, 4, 5]:
    ...     prev = abs(prev - x)
    ...
    >>> prev
    2
or getting the last from accumulate:
    >>> from itertools import accumulate
    >>> list(accumulate([2, 3, 4, 5], lambda prev, x: abs(prev - x)))[-1]
    2
or...
    >>> [prev for prev in [2]
    ...       for x in [3, 4, 5]
    ...       for prev in [abs(prev - x)]
    ... ][-1]
    2
Actually, I already wrote a solution for something similar to that: PyScanPrev [2]. I'm using bytecode manipulation to modify the generator expression and set/list comprehension semantics to create a "scan", but it has the limitation of accepting only code with valid syntax as input, so I can't use "from" inside a generator expression / list comprehension. The workaround was to put the first output into the iterable and define the "prev" name elsewhere:
    >>> last(abs(prev - x) for x in [2, 3, 4, 5])
    2
That line works with PyScanPrev (on Python 3.4 and 3.5) when defined in a function with a @enable_scan("prev") decorator. That was enough to create a "test suite" of doctest-based examples showing several scan use cases [2].

This discussion started in a Brazilian list when someone asked how she could solve a simple uppercase/lowercase problem [3]. The goal was to alternate the upper/lower case of a string while skipping the chars that don't apply (i.e., to "keep the state" when the char isn't a letter). After the discussion, I wrote the PyScanPrev package, and recently I've added this historical "alternate" function as the "conditional toggling" example [4].

Then I ask: can Python include that "scan" access to the last output in its list/set/dict comprehension and generator expression syntax? There are several possible applications for the scan itself as well as for the fold/reduce (signal processing, control theory, physics, economics, etc.), some of which I included as PyScanPrev examples. Some friends (people who like control engineering and/or signal processing) liked the "state-space model" example, where I included a "leaking bucket-spring-damper" simulation using the scan-enabled generator expressions [5].

About the syntax, there are several ideas on how that can be written. Given a "prev" identifier, a "target" identifier, an input "iterable" and an optional "start" value (and perhaps an optional "echo_start", which I assume True by default), some of them are:

    [func(prev, target) for target in iterable from prev = start]
    [func(prev, target) for target in iterable] -> prev = start
    [func(prev, target) for target in iterable] -> prev as start
    [func(prev, target) for target in iterable] from prev = start
    [func(prev, target) for target in iterable] from prev as start
    [func(prev, target) for target in iterable] with prev as start
    prev = start -> [func(prev, target) for target in iterable]
    prev(start) -> [func(prev, target) for target in iterable]
    [func(prev, target) for prev -> target in start -> iterable]
    [prev = start -> func(prev, target) for target in iterable]

    # With ``start`` being the first value of the iterable, i.e.,
    # iterable = prepend(start, data)
    [func(prev, target) for target in iterable from prev]
    [func(prev, target) for target in iterable] -> prev
    [func(prev, target) for target in iterable] from prev
    prev -> [func(prev, target) for target in iterable]

Before writing PyScanPrev, in [6] (Brazilian Portuguese) I used stackfull [7] to implement that idea; an accumulator example using that library is:
    >>> import itertools
    >>> from stackfull import push, pop, stack
    >>> [push(pop() + el if stack() else el) for el in range(5)]
    [0, 1, 3, 6, 10]
    >>> list(itertools.accumulate(range(5)))
    [0, 1, 3, 6, 10]
There is more I could say (e.g. the pyscanprev.scan function has a "start" value and an "echo_start" keyword argument, features I missed in itertools.accumulate), but the links below already have a lot of information.

[1] https://en.wikipedia.org/wiki/Prefix_sum#Scan_higher_order_function
[2] https://pypi.python.org/pypi/pyscanprev
[3] https://groups.google.com/forum/#!topic/grupy-sp/wTIj6G5_5S0
[4] https://github.com/danilobellini/pyscanprev/blob/v0.1.0/examples/conditional-toggling.rst
[5] https://github.com/danilobellini/pyscanprev/blob/v0.1.0/examples/state-space.rst
[6] https://groups.google.com/forum/#!topic/grupy-sp/UZp-lVSWK1s
[7] https://pypi.python.org/pypi/stackfull

--
Danilo J. S. Bellini
"It is not our business to set up prohibitions, but to arrive at conventions." (R. Carnap)
What is `last(inf_iter)`? E.g. `last(count())`? To me, the obvious spelling is:

    for last in it:
        pass
    doSomething(last)

This makes it clear that it is the user's job to make sure `it` terminates. There's no general way to get the last item without looking through all the earlier ones.
> What is `last(inf_iter)`? E.g. `last(count())`?
The "last" is just a helper function that gets the last value of an iterable. On sequences, it can be written to get the item at index -1, to avoid traversing them (a sketch of such a helper follows below). Using it on endless iterables makes no sense.

> This makes it clear that it is the user's job to make sure `it` terminates.

If one calls "last" on something that doesn't terminate, an "endless" iterable, well, it's pretty obvious that it won't "end" nicely. It's not Python's job to solve the Entscheidungsproblem. If you called "sorted" on an endless iterable, it would behave like "last", wouldn't it?

The whole point of this idea is the scan as a generator expression or list/set comprehension that can access the previous iteration output. Reduce/fold is just the last value of a scan, and the scan is still defined when there's no "last value".

-- Danilo J. S. Bellini
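A minimal sketch of such a helper, with the sequence fast path just described (raising StopIteration on empty input follows the PyScanPrev reference implementation mentioned later in the thread; the names are only illustrative):

    from collections.abc import Sequence

    def last(iterable):
        """Return the final item of an iterable."""
        if isinstance(iterable, Sequence):
            return iterable[-1]      # O(1): no traversal for sequences
        item = sentinel = object()
        for item in iterable:        # otherwise, consume the whole iterable
            pass
        if item is sentinel:
            raise StopIteration("empty iterable")
        return item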
Of course. But if you want last(), why not just spell the utility function as I did? I.e. as a function:

    def last(it):
        for item in it:
            pass
        return item

That works fine for any iterable (including a list, array, etc.), whether or not it's a reduction/accumulation.
On Sun, Oct 23, 2016 at 08:37:07AM -0700, David Mertz wrote:
> Of course. But if you want last(), why not just spell the utility function as I did? I.e. as a function:
>
>     def last(it):
>         for item in it:
>             pass
>         return item
>
> That works fine for any iterable (including a list, array, etc.), whether or not it's a reduction/accumulation.
That's no good, because it consumes the iterator. Yes, you get the last value, but you actually needed to do work on all the previous values too. -- Steve
Consuming the iterator is *necessary* to get the last item. There's no way around that. Obviously, you could itertools.tee() it first if you don't mind the cache space. But there cannot be a generic "jump to the end" of an iterator without being destructive.
On Sun, Oct 23, 2016 at 08:47:12AM -0700, David Mertz wrote:
> Consuming the iterator is *necessary* to get the last item. There's no way around that.
>
> Obviously, you could itertools.tee() it first if you don't mind the cache space. But there cannot be a generic "jump to the end" of an iterator without being destructive.
Right. But you're missing the point of Danilo's proposal. He isn't asking for a function to "jump to the end" of an iterator. Look at his example. The word "last" is a misnomer: he seems to me to be talking about having a special variable in comprehensions that holds the *previous* value of the loop variable, with special syntax to set its FIRST value, before the loop is entered. So "last" is a misleading name, unless you understand it as "last seen" rather than "very last, at the end".

So given an iterator [1, 2, 4, 8], and an initial value of -1, we would see something like this:

    [(previous, this) for this in [1, 2, 4, 8] with previous as -1]
    # or some other syntax

returns:

    [(-1, 1), (1, 2), (2, 4), (4, 8)]

So a dedicated function that does nothing but scan to the end of the iterator and return the last/final value seen is no alternative to his proposal.

-- Steve
On Mon, Oct 24, 2016 at 10:22 AM, Steven D'Aprano <steve@pearwood.info> wrote:
> Right. But you're missing the point of Danilo's proposal. He isn't asking for a function to "jump to the end" of an iterator. Look at his example. The word "last" is a misnomer: he seems to me to be talking about having a special variable in comprehensions that holds the *previous* value of the loop variable, with special syntax to set its FIRST value, before the loop is entered. So "last" is a misleading name, unless you understand it as "last seen" rather than "very last, at the end".
Sounds like the PostgreSQL "lag" function [1]. Perhaps that's a better name? Conceptually, what you have is another iteration point that lags behind where you currently are (a sketch of that over plain iterables follows below).

ChrisA

[1] https://www.postgresql.org/docs/current/static/functions-window.html
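A minimal sketch of such a lag over arbitrary iterables, assuming a tee-based buffer (the function name and defaults are only illustrative); it reproduces Steven's (previous, this) example above:

    from itertools import chain, repeat, tee

    def lag(iterable, offset=1, default=None):
        # Pair each item with the one `offset` positions earlier,
        # padding the start of the lagged stream with `default`.
        a, b = tee(iterable)
        return zip(chain(repeat(default, offset), a), b)

    >>> list(lag([1, 2, 4, 8], default=-1))
    [(-1, 1), (1, 2), (2, 4), (4, 8)]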
On Mon, Oct 24, 2016 at 10:22:32AM +1100, Steven D'Aprano wrote:
> On Sun, Oct 23, 2016 at 08:47:12AM -0700, David Mertz wrote:
> > Consuming the iterator is *necessary* to get the last item. There's no way around that.
> >
> > Obviously, you could itertools.tee() it first if you don't mind the cache space. But there cannot be a generic "jump to the end" of an iterator without being destructive.
> Right. But you're missing the point of Danilo's proposal.
Ah, actually it may be that I have misunderstood Danilo's proposal, because his example does include BOTH a suggestion of new magic syntax for retrieving the *previous* loop value inside a comprehension AND what seems to be a new built-in(?) function last() which seems to do exactly what you suggest: jump right to the end of an iterable and return the final value. My apologies for the confusion. -- Steve
On Sun, Oct 23, 2016 at 4:22 PM, Steven D'Aprano <steve@pearwood.info> wrote:
> Right. But you're missing the point of Danilo's proposal. He isn't asking for a function to "jump to the end" of an iterator. Look at his example. The word "last" is a misnomer: he seems to me to be talking about having a special variable in comprehensions that holds the *previous* value of the loop variable, with special syntax to set its FIRST value, before the loop is entered.
OK... but that's exactly itertools.accumulate (or maybe a thin wrapper around it like I showed earlier in the thread). I'm not sure Danilo was clear in what he's proposing. In the thread he suggested that he wanted to special-case indexing on sequences, which doesn't seem to make sense for your meaning.

It feels like there might be a case here for a new function in itertools that makes use of the last-seen item in an iterable, then combines it somehow with the current item. I'm not sure about the spelling, but it definitely sounds like a function to me, not a need for new syntax. I've only rarely had that specific need.

That said, here's a function I use in teaching to show some of what you can do with combining iterators, especially using itertools:

    from itertools import accumulate, tee

    def item_with_total(iterable):
        "Generically transform a stream of numbers into a pair of (num, running_sum)"
        s, t = tee(iterable)
        yield from zip(t, accumulate(s))

This might not be *exactly* what Danilo wants, but it's a similar concept (a quick usage illustration follows below). I wrap together an iterator (including an infinite one) with an accumulation. This just takes the default `operator.add` function for accumulate(), but it could take a function argument easily enough.

--
Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
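A quick doctest-style illustration of that helper (the input values here are just an example):

    >>> list(item_with_total([10, 20, 30]))
    [(10, 10), (20, 30), (30, 60)]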
> Of course. But if you want last(), why not just spell the utility function as I did? [...]
I'm not against a general "last", I just said the main idea of this thread is the access to the previous iteration output in a list/set/dict comprehension or generator expression.
Actually, your code is similar to the reference implementation I wrote for PyScanPrev; the main difference is that my "last" raises a StopIteration on an empty input instead of an UnboundLocalError: https://github.com/danilobellini/pyscanprev/blob/v0.1.0/pyscanprev.py#L148

When the input is a sequence, it should be optimized to get the item at index -1.
> That works fine for any iterable (including a list, array, etc.), whether or not it's a reduction/accumulation.

Lists and arrays don't need to be traversed.

> Consuming the iterator is *necessary* to get the last item. There's no way around that.
Not if there's enough information to create the last value. Perhaps on it = iter(range(9999999)) one can get 2 values (call next(it) twice) and then use its __length_hint__ to create the last value. But I think only sequences should have such an optimization, not iterators.

-- Danilo J. S. Bellini
Sure, a better last() should try to index into it[-1] first as an efficiency. And there are lots of iterators where the last element is predictable without looking through all the prior items. I know the last item of itertools.repeat(7, sys.maxsize) without having to loop for hours. But the general case is that you need to get all the head elements to determine last().

If the main idea is "to access the previous iteration", then we already have itertools.accumulate() for exactly that purpose. If you want something that behaves a little differently from accumulate, writing generator functions is really easy.
On Sun, Oct 23, 2016 at 12:57:10PM -0200, Danilo J. S. Bellini wrote:
> The idea is to let generator expressions and list/set comprehensions have a clean syntax for accessing their last output. That would allow them to serve as an alternative syntax for the scan higher-order function [1] (today implemented in the itertools.accumulate function), which in turn leads to an alternative way to write a fold/reduce. It would be nice to have something like:
> [cut suggested syntax]
> instead of a reduce:
> [cut four existing ways to solve the problem]

Why do we need a FIFTH way to solve this problem? What you are describing is *exactly* the use case for a reduce or fold function. Why add special magic syntax to make comprehensions do even more? Not everything needs to be a one-liner. It's okay to import reduce to do a reduce. It's okay to write an actual for-loop.
> Actually, I already wrote a solution for something similar to that: PyScanPrev [2].
Ah, that makes SIX existing solutions. Do we need a seventh? -- Steve
> Ah, that makes SIX existing solutions. Do we need a seventh?
It might have dozens of solutions, perhaps an infinity of solutions. Brainfuck and assembly can be used, or even Turing machine instructions... But there should be one, and preferably only one, OBVIOUS way to do it. Readability counts.

Reduce lost its built-in status in Python 3. Lambdas lost decomposing arguments like "lambda (a, b), c: a + b * c". Using a for-loop section inside a generator expression or list/set/dict comprehension allows decomposing arguments and doesn't need a function to be imported. Also, itertools.accumulate and functools.reduce have their parameters reversed, and accumulate doesn't have a "start" parameter.

Actually, the example I gave in this thread is about a fold/reduce, trying to show it's way simpler than the other solutions. I didn't paste here any scan use case because I sent links with several use cases; should I paste their contents here? The PyScanPrev link (https://github.com/danilobellini/pyscanprev) has several use case examples (including some just for comparison with other possible solutions) and even has a full rationale for this idea.

-- Danilo J. S. Bellini
On Oct 23, 2016 9:12 AM, "Danilo J. S. Bellini" <danilo.bellini@gmail.com> wrote:

> Actually, itertools.accumulate and functools.reduce have their parameters reversed, and accumulate doesn't have a "start" parameter.

    def accumulate2(fn=operator.add, it, start=None):
        if start is not None:
            it = itertools.chain([start], it)
        return itertools.accumulate(it, fn)

I would have preferred this signature to start with, but it's easy to wrap.
> I would have preferred this signature to start with, but it's easy to wrap.
Indeed, but a default value for the first argument requires default values for all the arguments that follow, so that signature is a syntax error; still, I agree a "range-like" signature like that would be better. My reference scan implementation (that's how I thought itertools.accumulate should be): https://github.com/danilobellini/pyscanprev/blob/v0.1.0/pyscanprev.py#L171

A new "functools.scan" with a signature like the one from the link above would be nice (a sketch follows below), but it would overlap with itertools.accumulate in some sense. The advantages would be:

1 - The scan signature and the functools.reduce signature are the same (the function as the first parameter, like map/filter)
2 - The module, functools, is the same one that has the reduce function

-- Danilo J. S. Bellini
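A minimal sketch of such a scan, with the reduce-like argument order and the "start"/"echo_start" behavior described above (my reading of that signature, not the actual PyScanPrev code):

    _sentinel = object()

    def scan(func, iterable, start=_sentinel, *, echo_start=True):
        # Like functools.reduce, but yields every intermediate result.
        it = iter(iterable)
        if start is _sentinel:
            try:
                acc = next(it)   # no start given: first item seeds the scan
            except StopIteration:
                return
            yield acc
        else:
            acc = start
            if echo_start:
                yield acc
        for item in it:
            acc = func(acc, item)
            yield acc

For instance, scan(lambda prev, k: prev * k, [5, 2, 4, 3], 1) yields 1, 5, 10, 40, 120, and echo_start=False would drop the leading 1.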
On 23 October 2016 at 17:10, Danilo J. S. Bellini <danilo.bellini@gmail.com> wrote:
> > Ah, that makes SIX existing solutions. Do we need a seventh?
> It might have dozens of solutions, perhaps an infinity of solutions. Brainfuck and assembly can be used, or even Turing machine instructions...
> But there should be one, and preferably only one, OBVIOUS way to do it. Readability counts.
Sure, but you haven't explained why your proposal is more obvious than any of the other six. Built in does not equate to obvious. More obvious is often to have a dedicated tool, in a module designed to provide tools in that particular area. That's partially why reduce got moved to the functools module (another part is the fact that Guido doesn't find functional-style approaches that "obvious" - and what's obvious to a Dutchman is the benchmark here :-)) I'm not against powerful "windowed function" capabilities - my background is SQL, and windowed functions in SQL have even more power than the sort of thing we're talking about here. But I wouldn't call them "obvious" - at least not based on the number of times I get to do explanations of them to colleagues, or the number of tutorials on them I see. So the idea seems fine to me, but I'd very definitely class it as an "advanced" feature, and typically that sort of feature in Python is handled in a library.
> Reduce lost its built-in status in Python 3. Lambdas lost decomposing arguments like "lambda (a, b), c: a + b * c".
Which can be interpreted as evidence that this type of approach is not considered a core feature. In general, I like the idea, but I don't think it fits well in Python in its proposed form. Paul
On Sun, Oct 23, 2016 at 02:10:42PM -0200, Danilo J. S. Bellini wrote:
> > Ah, that makes SIX existing solutions. Do we need a seventh?
> It might have dozens of solutions, perhaps an infinity of solutions. Brainfuck and assembly can be used, or even Turing machine instructions...
No, those are *implementations*. You can implement your solution in any language you like. For integration with Python, any of C, Fortran, Rust, Julia, Cython and (of course) pure Python are proven to work well. Using Turing machine instructions requires some sort of TM compiler... good luck with that.

But you cut out the most important part of my post. You've given lots of existing solutions. Why aren't they satisfactory? Even if somebody wants to write in a functional style, the reduce() solution you show seems perfectly clean and conventional to anyone used to functional code:

    from functools import reduce
    reduce(lambda prev, x: abs(prev - x), [3, 4, 5], 2)

returns 2. What is wrong with this solution? That is the obvious solution for somebody looking for a functional style: something called reduce or fold. And there's no harm in needing to import reduce. Not every function has to be a built-in.

Whereas your suggestion needs TWO new features: new syntax:

    (abs(prev - x) for x in [3, 4, 5] from prev = 2)

plus a new built-in function last() which extracts the final value from an iterator. That means you will risk encouraging people to wastefully generate a large container of unneeded and unwanted intermediate values:

    last([abs(prev - x) for x in range(100000) from prev = 2])

which will generate a list 100000 items long just to extract the final one. reduce() is better; that is exactly what reduce() is designed for.
> But there should be one, and preferably only one, OBVIOUS way to do it. Readability counts.
Right. And the obvious way is the imperative approach (only this time I will use a better variable name):

    py> result = 2
    py> for x in [3, 4, 5]:
    ...     result = abs(result - x)
    ...
    py> result
    2

For those who think in functional programming terms, reduce() is the obvious way.

Also, I feel that your proposal could have been explained better. I felt overloaded by the sheer mass of different alternatives, and misled by your use of the name "prev" for something that I see now, on more careful reading, is *not* the previous value of the loop variable (as I first understood) but the running calculation result. In fairness I am sick, and if I were well I might have been able to keep this straight in my head, but calling the variable "prev" is actively misleading. I was misled, and (I think) Chris, who just suggested this was similar to the SQL "lag" function, may have been misled as well. (Or perhaps he was just misled by me, in which case, sorry Chris!)

-- Steve
On Mon, Oct 24, 2016 at 11:29 AM, Steven D'Aprano <steve@pearwood.info> wrote:
> In fairness I am sick, and if I were well I might have been able to keep this straight in my head, but calling the variable "prev" is actively misleading. I was misled, and (I think) Chris, who just suggested this was similar to the SQL "lag" function, may have been misled as well. (Or perhaps he was just misled by me, in which case, sorry Chris!)
All of the above are possible. I'm skimming the thread, not reading it in full, and I'm a bit lost as to the point of the proposal, so it's entirely possible that lag() is unrelated. But if it is indeed just reduce(), then it's even simpler. ChrisA
The proposal is mostly about scan/accumulate. Reduce/fold is a "corollary", as it's just the last value of a scan. The idea is to have a way of using the previous iteration output inside a list comprehension (and anything alike). That is, to make them recursive.

> last([abs(prev - x) for x in range(100000) from prev = 2])
Why not [abs(prev - x) for x in range(100000) from prev = 2][-1]? How about list(some_iterable)[-1]? Probably a "last" function would avoid these, but "last" is a pretty easy function to write. This proposal is about the list comprehension syntax (and other things alike). The "last" function and the "scan" function can be seen as secondary proposals; the main point is a syntax to access the previous iteration output value inside a list comprehension. For example, a product:
    >>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
    [1, 5, 10, 40, 120]
That makes sense for me, and seems simpler than:
    >>> from itertools import accumulate, chain
    >>> list(accumulate(chain([1], [5, 2, 4, 3]), lambda prev, k: prev * k))
    [1, 5, 10, 40, 120]
Which is still simpler than using reduce:
    >>> from functools import reduce
    >>> list(reduce(lambda hist, k: hist + [hist[-1] * k], [5, 2, 4, 3], [1]))
    [1, 5, 10, 40, 120]
The first is explicit. The imperative approach for that would be much more like the reduce than the scan, as "hist" is the result.
    >>> hist = [1]
    >>> for k in [5, 2, 4, 3]:
    ...     prev = hist[-1]
    ...     hist.append(prev * k)
    ...
    >>> hist
    [1, 5, 10, 40, 120]
The very idea of preferring these approaches instead of the proposal sounds strange to me. What is simpler about them: the McCabe complexity? Number of tokens? Number of repeated tokens? AST tree height?

AFAIK, GvR prefers the list comprehension syntax to the map/filter higher-order functions. He even said somewhere that a reduce can be written as a list comprehension, and it wasn't obvious to me that a "3-for-sections" list comprehension repeating a target variable name would be valid Python code, yet that's required to get a recursive list comprehension. What I'm proposing is to allow a list comprehension syntax to behave like itertools.accumulate without the "3-for-sections" kludge. The rationale for the proposal is here: https://github.com/danilobellini/pyscanprev/tree/v0.1.0#the-world-without-th...

On Haskell, the above example would be:

    Prelude> scanl (*) 1 [5, 2, 4, 3]
    [1,5,10,40,120]

And that's what I'm trying to write as a list comprehension. Some months ago, thinking on how I could write this proposal, I started writing PyScanPrev. Among the examples I wrote on PyScanPrev, there are use cases on:

- maths
- physics
- economics
- string processing
- signal processing
- control engineering
- dynamic / time-varying model simulation
- gray code generation

So I can say that's not niche/specific. The most sophisticated scan example I wrote is this one, to plot the trajectory of a "leaking bucket-spring-damper" system: https://github.com/danilobellini/pyscanprev/blob/v0.1.0/examples/state-space...

Lag and windowing seem unrelated, something that would be solved with itertools.tee, zip, or perhaps a collections.deque and a function (and I remember doing so on AudioLazy).
-- Danilo J. S. Bellini
Danilo J. S. Bellini writes:
> >>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
> [1, 5, 10, 40, 120]
> Among the examples I wrote on PyScanPrev, there are use cases on:
> - maths
> - physics
> - economics
As a practicing economist, I wonder what use cases you're referring to. I can't think of any use cases where, if one previous value is useful, having all previous values available (i.e., an arbitrary temporal structure, at the modeler's option) isn't vastly more useful. This means that in modern econometrics, for example, simple procedures like Cochrane-Orcutt (which handles one previous value of the dependent variable in a single-equation regression) are subsumed in ARIMA and VAR estimation, which generalize the number of equations and/or the number of lags to greater than one.

BTW, numerical accuracy considerations often mean you don't want to use the compact "for ... in ... if ..." expression syntax anyway, as accuracy can often be greatly improved with appropriate reordering of values in the series. Even for "online" regression algorithms, where you might think to write

    ( updated_model(datum, prev) for datum in sensor_data() from prev = something )

'prev' needs to refer not to the previous value of 'datum', but to the previous value of 'updated_model()' (since you need a sufficient statistic for all previous data). And 'prev' as currently conceived is just plain useless for any long-period moving average, etc. So in the end, even if there are plausible use cases for quick and dirty code, an experienced implementer wouldn't use them anyway, as more powerful tools are likely to be immediately to hand.
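For what it's worth, that "previous output" semantics is what itertools.accumulate already expresses today; a minimal sketch, where update, data and initial stand in for the hypothetical updated_model, sensor_data() and something above:

    from itertools import accumulate, chain

    def running_models(update, data, initial):
        # prev is the previous *output* (the running model); each new
        # datum folds into it -- the semantics described above.
        return accumulate(chain([initial], data),
                          lambda prev, datum: update(datum, prev))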
> As a practicing economist, I wonder what use cases you're referring to. I can't think of any use cases where, if one previous value is useful, having all previous values available (i.e., an arbitrary temporal structure, at the modeler's option) isn't vastly more useful.
Well, see the itertools.accumulate examples yourself then, the ones at docs.python.org... We can start with something really simple like interest rates or uniform series, but... before arguing here, please convince other people to update the Wikipedia: "Recurrence relations, especially linear recurrence relations, are used extensively in both theoretical and empirical economics." https://en.wikipedia.org/wiki/Recurrence_relation#Economics So in the end, even if there are plausible use cases for quick and dirty
code, [...]
The proposal isn't about quick and dirty code. The State-space example includes a general linear time-varying MIMO simulation implementation trying to keep the syntax as similar as possible to the control theory engineers are used to. Also, my goal when I was looking for a scan syntax to solve the conditional toggling example was to make it cleaner. If you aren't seeing the examples I wrote, I wonder what are you writing about. There was a discussion about this a while ago. And where's the link? [...]. From what
> I remember, the conclusion reached was that there are too many degrees of freedom to be able to express reduction operations in a comprehension-like way that's any clearer
And where's the link? I don't know if that's a conclusion from any other thread, but that's wrong. The only extra "freedom" required is a way to access the previous output (or "accumulator", "memory", "state"... you can use the name you prefer, but they're all the same). How many parameters does a scan (like itertools.accumulate) have? And map/filter? I can't talk about a discussion I didn't read; it would be unfair, disrespectful. Perhaps that discussion was about a specific proposal and not about the requirements to express a scan/fold.

This proposal should be renamed to "recursive list comprehension": 3 words, and it's a complete description of what I'm talking about. For people from a functional programming background, it's about an alternative syntax to write the scan higher-order function. Forget the word "reduce"; some people here seem to have way too much taboo with that word, and I know there are people who would prefer a higher McCabe complexity just to avoid it. Perhaps there are people who prefer masochist rituals instead of using "reduce", who knows? Who cares? I like reduce, but I'm just proposing a cleaner syntax for recursive list comprehensions, and "reduce" isn't the general use case for that. On the contrary, "reduce" is just the specific scenario where only the last value matters.

> [...] you better have a VERY GOOD reason.

I spent months writing PyScanPrev, mainly the examples in several reStructuredText files, not because I was forcing that, but because there are way too many use cases for it, and I know those examples aren't exhaustive. https://pypi.python.org/pypi/pyscanprev

But talking about "good reasons" reminds me of the "annotations"! A function that uses annotations for one library can't use annotations for another library unless their annotation values are the same, so if one package/framework needs "type information" from your function parameters/result and another package/framework collects "documentation" from them, there's no way to get that working together. Something like "x : int = 2" makes the default assignment seem like an assignment to the annotation, and there's even a new token "->" for annotations. When I saw Python annotations at first I thought it was a joke; now I know it's something serious, with [mutually incompatible] libraries/packages using them. I strongly agree that everything should need a good reason, but I wrote a lot about the scan use cases and no one here seems to have read what I wrote, and the only reason that matters seems to be a kind of social status, not really "reason". I probably wrote way more reasons for this proposal than annotations could ever have. But if no one seems to care enough to read, then why should I insist? That's like my pprint bugfix patch some months ago: was it applied? AFAIK not even core developers giving +1 was enough for it to be applied. This mailing list isn't very inviting... but I hope some of you at least try to read the rationale and the examples.

-- Danilo J. S. Bellini
Danilo J. S. Bellini writes:

No attribution. Please attribute, at least when you mix quotes from different people.
> > As a practicing economist, I wonder what use cases you're referring to. I can't think of any use cases where, if one previous value is useful, having all previous values available (i.e., an arbitrary temporal structure, at the modeler's option) isn't vastly more useful.
> Well, see the itertools.accumulate examples yourself then, the ones at docs.python.org...
And this: "list(accumulate(data, max))" needs syntax, why? I scratch my head over how you can improve on that.
> We can start with something really simple like interest rates or uniform series, but...
Don't waste the list's time being snide. My point is that although new syntax may be useful for simple cases, serious applications will worry about computational accuracy and likely will provide packages that handle general cases that nest these simple cases. Given they exist, most modelers will prefer using those packages to writing their own comprehensions. That may not apply in other fields, but AFAICS it does apply in economics. So if you can't handle the complex cases, the syntax is just cognitive overhead: TOOWTDI will be the high-quality general packages even where they're theoretically overkill. The basic list comprehension doesn't need to deal with that kind of issue.
> before arguing here, please convince other people to update the Wikipedia:
Irrelevant and rude. Please, don't.

> "Recurrence relations, especially linear recurrence relations, are used extensively in both theoretical and empirical economics."
> https://en.wikipedia.org/wiki/Recurrence_relation#Economics

I didn't contest that, as quoted above. What I contest is the claim that in empirical economics syntactic sugar for 'accumulate' would be particularly useful. Sure, you *can* express a second-order difference equation as two first-order equations, and perhaps you would actually calculate it that way (a sketch of that pair-state reformulation follows below). But in economics we normally express a second-order diff eq as a second-order diff eq. If the input is a second-order equation that needs to be reformulated as a system of first-order equations, then with the code required to implement such transformations, it is not obvious to me that having syntax for first-order equations is going to be worth the extra syntax in the language. Most of the complexity is going to be in the transformation, which AFAICS is likely to be problem-specific, and therefore unavoidable by the modeler. OTOH, as PyScanPrev shows, the complexity of recursion can be hidden in a decorator, which the modeler can cargo-cult.

Furthermore, in much of modern time-series econometrics, the order of the equation (number of lagged values to include) will be determined from the data and may differ from variable to variable and across equations, in which case you're effectively going to be carrying around a huge chunk of the data set as "state" (much of it unused) in each list element, which seems like a pretty clunky way to think about such problems computationally, however useful it may be in the theory of mathematical dynamics. I grant that 40 years in the field studying econometrics in terms of fixed data matrices has probably caused my synapses to clot -- and that's precisely why I'm asking *you* to explain to me how to beautify such code using the constructs you propose. I think the examples presented are already quite pretty without new syntax.

As for economic theory, theory papers in economics don't include Python programs that I've seen. So having syntax for this feature in Python seems unlikely to improve the presentation of economic theory. (I suppose that keeping "theoretical" in the quote was just an accident, but I could be missing something.)
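A minimal sketch of that pair-state reformulation, using Fibonacci as the stock example and itertools.accumulate as the first-order driver (this function is illustrative, not from the thread):

    from itertools import accumulate, chain, repeat

    def fib(n):
        # F[k] = F[k-1] + F[k-2], rewritten as a first-order step over
        # the state pair (F[k], F[k+1]); accumulate drives the recursion.
        steps = repeat(None, max(n - 1, 0))
        pairs = accumulate(chain([(0, 1)], steps),
                           lambda s, _: (s[1], s[0] + s[1]))
        return [a for a, _ in pairs][:n]

    >>> fib(8)
    [0, 1, 1, 2, 3, 5, 8, 13]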
> The proposal isn't about quick and dirty code. The state-space example includes a general linear time-varying MIMO simulation implementation, trying to keep the syntax as similar as possible to what control theory engineers are used to.
Fine, but I only questioned economics. I'm not a rocket scientist, I'll let the rocket scientists question that. If they don't, *I* certainly will concede you have a point in those other fields.
> Also, my goal when I was looking for a scan syntax to solve the conditional toggling example was to make it cleaner.
    >>> @enable_scan("p")
    ... def ltvss(A, B, C, D, u, x0=0):
    ...     Ak, Bk, Ck, Dk = map(iter, [A, B, C, D])
    ...     u1, u2 = itertools.tee(u, 2)
    ...     x = (next(Ak) * p + next(Bk) * uk for uk in prepend(x0, u1))
    ...     y = (next(Ck) * xk + next(Dk) * uk for xk, uk in zip(x, u2))
    ...     return y

And this needs syntax now ... why?
> And where's the link?
To what?
> [...]. From what I remember, the conclusion reached was that there are too many degrees of freedom to be able to express reduction operations in a comprehension-like way that's any clearer
I didn't write this. Please keep your attributions straight.
> I can't talk about a discussion I didn't read; it would be unfair, disrespectful.
You're perfectly willing to tell other people what to read, though. I realize times have changed since 1983, but as long as I've been on the 'net, it's always been considered polite to investigate previous discussions, and AFAIK it still is. Of course that has to be balanced against search costs, but now that you know that what you're looking for exists, the expected benefit of search has jumped, and of course you could ask David for hints to minimize the search costs. No?
> Perhaps there are people who prefer masochist rituals instead of using "reduce", who knows? Who cares?
(1) There are. (2) Evidently you don't. (3) You should, because the leading (literally) person who prefers writing code himself to using "reduce" is Guido van Rossum.
> This mailing list isn't very inviting...
Why do you say that? Because people aren't falling over themselves to accept your proposal? You've been asked a simple question several times: why does this feature need to be implemented as syntax rather than a function? The only answers I've seen are the importance of the applications (not contested by anybody AFAICS), and your preference for syntax over a function. You have provided strong motivation for the feature, but not for an implementation via new syntax.
> but I hope some of you at least try to read the rationale and the examples.
I did. They are very persuasive ... up to the point where you ask for syntax for something that appears (now that you've done it, kudos!) to be perfectly do-able with a function. It's not for me to say yes or no, but I can tell you that the outcomes of past discussions of this kind indicate that it will be unlikely that this proposal will be accepted without better justification for adding new syntax, preferably something that's impossible to implement performantly with a function. Or perhaps a "killer example" that persuades Guido or one of the other Most Senior Devs that this is way too cool to go without syntax. Steve
On 25 October 2016 at 15:02, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
> I did. They are very persuasive ... up to the point where you ask for syntax for something that appears (now that you've done it, kudos!) to be perfectly do-able with a function.
This is an important point by the way. Ideas on this list typically start with proposals along the lines of "let's make X a builtin". Discussion tends to centre around whether X is sufficiently important to be built in - and the OP experiences a lot of pushback. That's natural - the bar for getting something into the stdlib is very high, for a builtin higher still, and for new syntax even higher. The fact that we *do* get proposals accepted is a testament to the high quality of some of the proposals. But it doesn't mean that everything warrants being accepted. The majority won't be. On the other hand, the *ideas* are really interesting and valuable. I'm certainly planning on looking at PyScanPrev when I get the chance. And the discussions can frequently make people rethink their beliefs. So people posting ideas here should expect pushback - and should be prepared to learn how to think about the wider context in which changes to Python need to exist. That pushback won't[1] be hostile or negative, although it can feel that way to newcomers. But if a poster is inclined to take challenges to their idea personally, and even more so if they respond negatively, things can get tense. So please don't :-) So, bringing this back on topic - Danilo, what is your justification for suggesting that this technique should be language syntax, as opposed to simply being a 3rd party module (which you've already written, which is great)? Do you know what sorts of things would be viewed as evidence in favour of promoting this to syntax, or can we help in clarifying the sort of evidence you'd need to collect? Are the relevant design guidelines (things like "there should be one obvious way to do it" that frequently get quoted around here without much explanation) clear to you, or do you have questions? Hopefully we can change your mind about how inviting you find us :-) Paul [1] With the occasional exception that we regret - we're only human, although we try to hold to high standards.
2016-10-25 12:29 GMT-02:00 Paul Moore <p.f.moore@gmail.com>:
> On the other hand, the *ideas* are really interesting and valuable. I'm certainly planning on looking at PyScanPrev when I get the chance. And the discussions can frequently make people rethink their beliefs.
>
> So people posting ideas here should expect pushback - and should be prepared to learn how to think about the wider context in which changes to Python need to exist. That pushback won't[1] be hostile or negative, although it can feel that way to newcomers. But if a poster is inclined to take challenges to their idea personally, and even more so if they respond negatively, things can get tense. So please don't :-)
>
> So, bringing this back on topic - Danilo, what is your justification for suggesting that this technique should be language syntax, as opposed to simply being a 3rd party module (which you've already written, which is great)? Do you know what sorts of things would be viewed as evidence in favour of promoting this to syntax, or can we help in clarifying the sort of evidence you'd need to collect? Are the relevant design guidelines (things like "there should be one obvious way to do it" that frequently get quoted around here without much explanation) clear to you, or do you have questions?
>
> Hopefully we can change your mind about how inviting you find us :-)
Thanks for your message, Paul. When I said the list isn't inviting, I'm not talking about everyone, but about quite a few people who are disruptive. I'm just trying to help improve the language. I'm waiting for your opinion about PyScanPrev. =)

2016-10-25 16:01 GMT-02:00 Brendan Barnwell <brenbarn@brenbarn.net>:
> Recurrence relations are much more general than just "have access to the previous value". They may have access to any of the earlier values, and/or multiple earlier values. So if what we wanted was to be able to use recurrence relations, your proposal would be insufficient.
Brendan, please see the PyScanPrev examples, mainly the Fibonacci and the state-space model examples. Recursion is enough to give you that. The proposal isn't about lag and windowing, but if you've got an idea to improve that, I'd like to know.

2016-10-25 15:55 GMT-02:00 Rob Cliffe <rob.cliffe@btinternet.com>:
> On 24/10/2016 06:11, Danilo J. S. Bellini wrote:
> > >>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
> > [1, 5, 10, 40, 120]
> >
> > That makes sense for me, and seems simpler than:
> >
> > >>> from itertools import accumulate, chain
> > >>> list(accumulate(chain([1], [5, 2, 4, 3]), lambda prev, k: prev * k))
> > [1, 5, 10, 40, 120]
>
> Well, if you want an instant reaction from someone skimming this thread: I looked at the first example and couldn't understand it. Then I looked at the second one, and could understand it (even though I may never have used "chain" or heard of "accumulate"). Obviously your mileage varies.
Rob, would it be the same if it were the other way around? I'm also somewhat familiar with map/filter, but I think you would need more time (and documentation) to understand what the second does if you hadn't seen the first. What do you think?

2016-10-25 23:36 GMT-02:00 David Mertz <mertz@gnosis.cx>:
> After reading every post in the thread, I still don't understand the proposed new syntax really.
The proposal is to create a new syntax for recursive set/list/dict comprehensions and generator expressions, and that's all. It could be a new magic keyword, as you said, but there are alternatives, like the "from" example I gave.

2016-10-25 23:36 GMT-02:00 David Mertz <mertz@gnosis.cx>:
> If you give up a fear of using `import` and stop arbitrarily converting a possibly infinite iterator to a concrete list
Ad hominem, and that's just offensive. Who said I "fear" an import? Who is "converting a possibly infinite iterator to a concrete list"?! You know that "it was a built-in, now it's in a module" is meaningful, and when it comes to what happened to "reduce", that's an argument for my proposal, not against it. Why should I fear?

-- Danilo J. S. Bellini
On Tue, Oct 25, 2016 at 05:18:46AM -0200, Danilo J. S. Bellini wrote:
> [...]. From what I remember, the conclusion reached was that there are too many degrees of freedom to be able to express reduction operations in a comprehension-like way that's any clearer
Danilo, if you are going to quote somebody, especially when you copy and paste their words out of a completely different email from the one you are replying to, please tell us who you are quoting. It is not polite to quote somebody without attribution and out of context.
> I don't know if that's a conclusion from any other thread, but that's wrong. The only extra "freedom" required is a way to access the previous output (or "accumulator", "memory", "state"... you can use the name you prefer, but they're all the same). How many parameters does a scan (like itertools.accumulate) have? And map/filter?
I do not know which email thread is being talked about here, but the conclusion is correct. In the most general case, you might want:

- the current value of some running total or result;
- the previous value of the running result;
- the value of the running result before that ("previous previous");
- and so on;
- more than one running calculation at the same time;
- the current sequence value (for x in [1, 2, 3]);
- the previous sequence value (previous value of x);
- the sequence value before that ("previous previous x");
- etc.

Recurrence relations are not all linear, and they don't always involve only the previous value. Fibonacci numbers need to track two previous running values:

    F[n] = F[n-1] + F[n-2]

The lagged Fibonacci generator is a pseudo-random number generator that uses a similar recurrence, except that instead of remembering only the previous two results, it remembers some arbitrary N previous results. E.g. we might say:

    S[n] = S[n - 4] + S[n - 9]

and so we use the 4th-previous and 9th-previous results to generate the new one. Just having the current running result is not sufficient (a sketch of such a lagged generator follows below). [...]
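A minimal sketch of that lagged Fibonacci generator, assuming a seed of the k most recent values; collections.deque keeps exactly the window of state needed:

    from collections import deque
    from itertools import islice

    def lagged_fib(seed, j=4, k=9):
        # S[n] = S[n-j] + S[n-k]: needs the last k results, not just one.
        history = deque(seed, maxlen=k)   # oldest value drops off automatically
        assert j < k and len(history) == k
        while True:
            value = history[-j] + history[-k]
            history.append(value)
            yield value

    >>> list(islice(lagged_fib(range(1, 10)), 5))
    [7, 9, 11, 13, 12]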
> Forget the word "reduce"; some people here seem to have way too much taboo with that word, and I know there are people who would prefer a higher McCabe complexity just to avoid it. Perhaps there are people who prefer masochist rituals instead of using "reduce", who knows?
And perhaps there are people who think that using "reduce" and other functional idioms instead of good, clean, easy-to-understand, easy-to- debug imperative code is a "masochist ritual".
but I wrote a lot about the scan use cases and no one here seems to have read what I wrote, and the only reason that seems to matter is a kind of social status, not really "reason". I probably wrote way more reasons for that proposal than annotations could ever have.
I read your use-cases. I went to the GitHub page and looked at the examples and the documentation. I didn't comment because it doesn't matter whether there is one use-case or a million use-cases; the fundamental question still applies: why do we need ANOTHER way of solving these problems when there are already so many? More use-cases don't answer that question. A million weak use-cases is still weak. The obvious way to solve these use-cases is still an imperative for-loop. Python is not Haskell; it is not a functional programming language. It has some functional programming features, but it is not and never will be intended for arbitrarily complex code to be written in a pure functional style. Comprehensions are intentionally kept simple. They are not a substitute for all loops, only the easy 80% or 90%. As confirmed by Nick Coghlan, comprehensions are absolutely and fundamentally intended as syntactic sugar for ONE and ONLY one pattern. For example:

    [expr for x in seq for y in seq2 if cond]

is sugar for:

    result = []
    for x in seq:
        for y in seq2:
            if cond:
                result.append(expr)

If you need to keep the previous sequence item, or a running total, or break out of the loop early, or use a while loop, you cannot use a comprehension. So I'm afraid that completely rules out your idea of having a running total or result in a comprehension. That simply isn't compatible with the design of comprehensions as ruled by Guido and the core devs. Could you change their mind? It's not impossible. If you have a compelling answer to the questions "why does this need to be part of comprehension syntax? why not use a for-loop, or itertools.accumulate?" then of course they will reconsider. But the barrier is very high. It is not a matter of more use-cases. We already have solutions to those use-cases. You have to explain why the existing solutions don't work. -- Steve
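For concreteness, the imperative loop referred to above, applied to the running-product example from this thread (a sketch; the initial 1 is echoed so the output matches the proposed-syntax version):

    result = [1]  # echo the start value, as the proposal does
    prev = 1
    for k in [5, 2, 4, 3]:
        prev = prev * k
        result.append(prev)
    # result == [1, 5, 10, 40, 120]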
On 2016-10-25 00:18, Danilo J. S. Bellini wrote:
Well, see the itertools.accumulate examples yourself then, the ones at docs.python.org... We can start with something really simple like interest rates or uniform series, but... before arguing here, please convince other people to update the Wikipedia:
"Recurrence relations, especially linear recurrence relations, are used extensively in both theoretical and empirical economics." https://en.wikipedia.org/wiki/Recurrence_relation#Economics <https://en.wikipedia.org/wiki/Recurrence_relation#Economics>
The fact that that page is about recurrence relations supports the position that your proposed change is too specific. Recurrence relations are much more general than just "have access to the previous value". They may have access to any of the earlier values, and/or multiple earlier values. So if what we wanted was to able to use recurrence relations, your proposal would be insufficient. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
On 24/10/2016 06:11, Danilo J. S. Bellini wrote:
For example, a product:
>>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
[1, 5, 10, 40, 120]
That makes sense to me, and seems simpler than:
>>> from itertools import accumulate, chain
>>> list(accumulate(chain([1], [5, 2, 4, 3]), lambda prev, k: prev * k))
[1, 5, 10, 40, 120]
Well, if you want an instant reaction from someone skimming this thread: I looked at the first example and couldn't understand it. Then I looked at the second one, and could understand it (even though I may never have used "chain" or heard of "accumulate"). Obviously your mileage varies. Best wishes, Rob Cliffe
On Tue, Oct 25, 2016 at 1:55 PM, Rob Cliffe <rob.cliffe@btinternet.com> wrote:
>>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
[1, 5, 10, 40, 120]
That makes sense to me, and seems simpler than:
>>> from itertools import accumulate, chain
>>> list(accumulate(chain([1], [5, 2, 4, 3]), lambda prev, k: prev * k))
[1, 5, 10, 40, 120]
Well, if you want an instant reaction from someone skimming this thread: I looked at the first example and couldn't understand it.
After reading every post in the thread, I still don't really understand the proposed new syntax. How does 'prev' get bound in the loop? Is it a new magic keyword for "last accumulated element"? Does the binding in the "from" clause magically say that it should get rebound in the loop where it's no longer mentioned? Why is the word `from` used here when there's no obvious relation to other uses? The alternate version looks much better if you don't try so hard to make it look bad. The much more obvious spelling is:

    from operator import mul
    from itertools import accumulate, chain

    accumulate(chain([1], nums), mul)

If you give up a fear of using `import` and stop arbitrarily converting a possibly infinite iterator to a concrete list, this form is extremely short and obvious. Moreover, in the example, it is extra strange that the multiplicative identity is added onto the front of the iterator. This is exactly the same thing as simply spelling it:

    accumulate(nums, mul)

Which is even shorter. It feels very artificially contrived to insist that the initial element must live somewhere other than the iterator itself. But it would be easy enough to write a wrapper to massage an iterator for this special case:

    def prepend(item, it):
        return itertools.chain([item], it)

Doesn't save any characters, but the name might emphasize the intent. Maybe this implementation forces the point even more:

    def prepend(item, it):
        yield item
        yield from it

-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
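For comparison, a quick check of the two spellings above on the thread's running-product data; they differ only in whether the seed value is echoed at the front:

    from operator import mul
    from itertools import accumulate, chain

    nums = [5, 2, 4, 3]
    list(accumulate(chain([1], nums), mul))  # [1, 5, 10, 40, 120]
    list(accumulate(nums, mul))              # [5, 10, 40, 120]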
2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
My point is that although new syntax may be useful for simple cases, serious applications will worry about computational accuracy and likely will provide packages that handle general cases that nest these simple cases. Given they exist, most modelers will prefer using those packages to writing their own comprehensions. That may not apply in other fields, but AFAICS it does apply in economics. So if you can't handle the complex cases, the syntax is just cognitive overhead: TOOWTDI will be the high-quality general packages even where they're theoretically overkill.
[...] Sure, you *can* express a second-order
difference equation as two first-order equations, and perhaps you would actually calculate it that way. But in economics we normally express a second-order diff eq as a second-order diff eq. If the input is a second-order equation that needs to be reformulated as a system of first-order equations, then with the code required to implement such transformations, it is not obvious to me that having syntax for first-order equations is going to be worth the extra syntax in the language. Most of the complexity is going to be in the transformation which AFAICS is likely to be problem-specific, and therefore unavoidable by the modeler. OTOH, as PyScanPrev shows, the complexity of recursion can be hidden in a decorator, which the modeler can cargo-cult.
I'm not talking about "simple cases", but about quite general cases, general in the sense that they allow modeling time-varying models (and the proposal is about recursion, whose computational power should be obvious). What would be more general than a linear MIMO (multiple input, multiple output) time-varying state-space model? It's not so common to find a package that works with time-varying models. Which ones do you know? Actually, the state-space equations (as used in a PyScanPrev example) are a "modern" way of doing general [linear time-varying] control engineering using first-order equations and an n-dimensional state vector, as opposed to the "classical" difference equations and their Z transform, which are the core of my other package, AudioLazy: https://pypi.python.org/pypi/audiolazy You're right when you say that "serious applications will worry about computational accuracy"; indeed that was a reason I wrote the gammatone filters in AudioLazy: the cascading filter model was a requirement to avoid getting just amplified floating point quantization garbage as a result (as used to happen when using a single call to scipy.signal.lfilter). A "rule of thumb" in signal processing is to use cascading/parallel biquads instead of higher order [recursive] filters directly, due to the computational accuracy (but that doesn't apply to comb filters, for example). Anyway, that makes this talk about computational accuracy sound like an argument for my proposal here, not against it. I STRONGLY disagree with "will provide packages that handle general cases", it simply doesn't make sense. At best, a package would just mirror what happens in math: is there any general solution for general ODEs? Packages handle specific things, while syntax deals with the general stuff: once you find something that a package can't do, you need to use something different, using only what the language has (or another package, if there's any available). There are packages that include converters of difference equations from/to state-space equations, but these couldn't be used when you're talking about time-varying models. You can't use LTI tools for something that isn't time-invariant. 2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
Fine, but I only questioned economics. I'm not a rocket scientist, I'll let the rocket scientists question that. If they don't, *I* certainly will concede you have a point in those other fields.
OK, and I'm from a digital signal processing background (EE and an MSCS on it), mainly for audio applications, though I work today in a fintech. But we're not here to give just ad hominem / ad verecundiam pseudo-arguments (I mean, we're not talking about me or you). The simplest examples I've ever seen in digital control theory were about economics, mainly because these models are useful and don't need any kind of "continuous time to discrete time" conversion, as the time is already discrete in the first place. I created the leaking bucket example from scratch mainly to avoid copying an economics example from the Wilson J. Rugh "Linear Systems Theory" book into PyScanPrev. If I understood you correctly, you're saying that economics is simpler than I thought and doesn't require anything beyond a basic LTI difference equation, and for that restricted scenario there are packages that already do the job. OTOH, if your point is that the scan higher order function / PyScanPrev and the like aren't enough for properly modeling every economic model there is without some undesired adaptation for a subset of these models, that makes some sense. I'm just saying it's useful for a lot of things, including applications in economics. Does it need to be the Harry Potter wand to be "accepted"? 2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
This mailing list isn't very inviting...
Why do you say that? Because people aren't falling over themselves to accept your proposal?
[...]
You're perfectly willing to tell other people what to read, though. I realize times have changed since 1983, but as long as I've been on the 'net, it's always been considered polite to investigate previous discussions, and AFAIK it still is.
In the second part of this citation, you seem to realize the reason for what I said. Do you trust anyone who gives a value judgement about something he/she hasn't read, without at least disclaiming that lack of reading? It's not me who is telling other people to judge, but it's an idea I wrote about, and that's the reason this mailing list exists, isn't it? I firmly believe "list comprehension" wouldn't be "accepted" today if it weren't in the Python syntax yet. Is there any previous discussion on this topic or something related to it? Where's the link? (I asked it before.) It's always been considered polite to investigate before judging. 2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
Perhaps there are people who prefer masochist rituals instead of using "reduce", who knows? Who cares?
(1) There are. (2) Evidently you don't. (3) You should, because the leading (literally) person who prefers writing code himself to using "reduce" is Guido van Rossum.
Actually, that's an argument for this proposal, not against. I'm proposing a way to avoid itertools.accumulate and functools.reduce by using an explicit syntax instead of the functions. Or does GvR prefer map/filter instead of list comprehensions? If you're right, then we should be consistent and propose the elimination of list comprehensions and every similar syntax. The arguments you're giving against my proposal are the same. 2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
I don't know if that's a conclusion from any other thread, but that's wrong. The only extra "freedom" required is a way to access the previous output (or "accumulator", "memory", "state"... you can use the name you prefer, but they're all the same). How many parameters does itertools.accumulate have? And map/filter?
I do not know which email thread is being talked about here, but the conclusion is correct. In the most general case, you might want:
- the current value of some running total or result;
- the previous value of the running result;
- the value of the running result before that ("previous previous");
- and so on;
- more than one running calculation at the same time;
- the current sequence value (for x in [1, 2, 3]);
- the previous sequence value (previous value of x);
- the sequence value before that ("previous previous x");
- etc.
Wrong.

1. The current value is already there; it's not something new.
2. The accumulator (previous result) is the only thing I'm talking about; anything beyond that is also beyond my new syntax proposal.
3. The "previous previous" and so on are just lag/windowing. I'm not proposing a way to solve these, but my examples did that. Complaining about that is proof that my examples weren't read. We can talk about a syntax to solve these as well, but that's not what my proposal is about. That includes the "previous sequence value" history.

2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
Recurrence relations are not all linear, and they don't always involve only the previous value. Fibonacci numbers need to track two previous running values:
F[n] = F[n-1] + F[n-2]
Did you know Fibonacci has been an example in PyScanPrev for a long time? And not all recurrence relations are time-invariant; I gave an example of a time-varying recurrence relation. Do you have any example of a non-linear recurrence relation? Do you know how to linearize it, or do you have a proof that it can't be linearized as a time-varying model? Why does it make any difference for this proposal? 2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
And perhaps there are people who think that using "reduce" and other functional idioms instead of good, clean, easy-to-understand, easy-to- debug imperative code is a "masochist ritual".
If you're talking about people who are misusing "reduce", then you're arguing for alternatives, and this proposal has an alternative as an obvious corollary. Or do you think everyone should prefer nested code instead of something more expression-oriented, though that's again something in PEP 20? If you don't like "reduce", go ahead with the triple-for-sections. Good luck with that. But that's not the point of my proposal; I just found out "reduce" is taboo, and that reminds me of when I said all that matters here is social status. 2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
why do we need ANOTHER way of solving these problems when there are already so many? More use-cases doesn't answer that question
I already said: it's explicit and cleaner. Easier to memorize. Lower McCabe complexity. Fewer tokens (not characters). Fewer repeated tokens when compared with the triple-for-section comprehensions. And how about the AST tree height, wouldn't it be the same as a common list comprehension's? But what would answer that question for plain common list comprehensions? Does the same complaint apply, or do you think that attacking map/filter is the same as attacking list comprehensions, as you did here with my proposal and reduce? I think one should then propose that list comprehensions be removed, just to keep the consistency. Oh, it won't be, due to backwards compatibility, I know.
-- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
Danilo J. S. Bellini writes:
I'm not talking about "simple cases", but about quite general cases, general in the sense that it allows modeling time varying models (and the proposal is about recursion, whose computational power should be obvious).
I think I admitted that in the text you snipped. If not, consider it stipulated here. My point is that for me functions like accumulate are sufficient without new syntax, and if you are going to do time-varying stuff in the kind of economics I studied and research, you are going to need to carry around so much state that the relative amount of simplicity you can gain with syntax is very small. As for recursion, the syntax you proposed doesn't say "recursion" to me, it says "let's add more complexity to a syntax that already is hard to fit on a line so we can turn a sequence into a series."
Anyway, that makes this talking about computational accuracy sound like an argument for my proposal here, not against it.
Not as I see it. My point is that functions like accumulate already get me as much of your proposal as I can see being useful in my own applications, and so I'd rather spend effort on the inherent complexity of accurate computation than on learning new syntax which as far as I can see buys me no extra simplicity. Consider the existing comprehension syntax. I use it all the time because it's very expressive, it "looks like" what I would write in a declarative mathematical definition, and the constructive alternative is not a function call, it's a loop, in many cases with a nested if test as well. However, when you try to express more than one loop with a comprehension, you end up with a counterintuitive ordering based on the current "this expands naturally to for loops and if conditionals according to this simple rule" syntax, along with a separation of the result expression from the loop where it's actually produced. It's not obvious to me that use of comprehension syntax beyond [ f(x) for x in y if g(x) ] is all that useful, but that very basic form is extremely useful by itself. If your proposal will help make more complex comprehensions easy to understand, then you've got something I'd like to have. But I don't yet see how it fixes that, and if not, I ain't gonna need it and I don't want to have to explain it to my students when they ain't gonna need it either.
Is there any previous discussion on this topic or something related to it? Where's the link? (I asked it before)
I don't have one off-hand (I really am interested only in the claimed uses in economics). As I wrote earlier, maybe David Mertz does.
Actually, that's an argument for this proposal, not against. I'm proposing a way to avoid itertools.accumulate and functools.reduce using an explicit syntax instead of the functions. Or does GvR prefer map/filter instead of list comprehensions?
itertools functions are widely used and generally well-thought-of. Avoiding accumulate is a non-goal AFAIK. AIUI, Guido does think that avoiding reduce is a goal, but I believe that's because he thinks the semantics are just plain hard to understand. If your syntax is in large part a substitute for reduce, I doubt he will like it more than reduce just because it's syntax. (I do not speak for Guido; I offer my observation in the hope it will help you prepare to present your idea to him in the most favorable light.)
If you're right, then we should be consistent and propose the elimination of list comprehension and every syntax alike. The arguments you're giving against my proposal are the same.
"A foolish consistency is the hobgoblin of small minds." If Emerson hadn't lived a century before Tim, I'm sure it would be in the Zen. That is, comprehensions are clearly useful, and I use them in almost every program I write. I still don't see when I'd ever strongly prefer the syntax you propose to the itertools we already have, so I don't care if it's a consistent extension. That doesn't mean it's not useful to you or others, it just means I'm not convinced -- and so I will not join the proponents. That's all.
2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
As for recursion, the syntax you proposed doesn't say "recursion" to me, it says "let's add more complexity to a syntax that already is hard to fit on a line so we can turn a sequence into a series."
1. The "fit on a line" is just another weird taboo: 1.1. An expression isn't a physical line. 1.2. Sub-expressions in an expression might be on other statements (e.g. assignments, other functions). 1.4. Several of my examples in PyScanPrev aren't oneliners, including one you pasted in an e-mail. 1.3. When you call itertools.accumulate, it's still an expression. 1.5. List comprehensions are expressions, including the triple-for-section comprehensions with repeated target variable names. 1.6. Being anti-expression because of an anti-oneliner bias sounds like moving towards an assembly language (using only atomic imperative commands) or something merely bureaucratic. 2. The itertools.accumulate function signature is broken: 2.1. Its arguments are reversed when compared to other higher order functions like map, filter and reduce. 2.2. It lacks a "start" parameter, requiring more complexity to include it (nesting a itertools.chain or a "prepend" function call). 3. It's not about "adding complexity", it's the other way around: 3.1. The proposal is about explicit recursion in a list comprehension (a way to access the previous output value/result, a.k.a. accumulator/state/memory). 3.2. Today you can only do (3.1) using either: 3.2.1. A triple-for-section list comprehension with repeated target names (as described in the PyScanPrev rationale section). 3.2.2. Functions for stack frame manipulation (like the example I gave in this maillist using stackfull). 3.2.3. Bytecode manipulation (the PyScanPrev approach), or something alike (e.g. AST manipulation). 3.2.4. Raw Python code pre-processing (e.g. by customizing some import hooks). Only 3.2.4 allows a new syntax. I'm not sure what you meant with "turn a sequence into a series", that sounds like the itertools.accumulate from Python 3.2, which didn't have a function as a parameter. 2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u. tsukuba.ac.jp>:
Anyway, that makes this talking about computational accuracy sound like an argument for my proposal here, not against it.
Not as I see it. My point is that functions like accumulate already get me as much of your proposal as I can see being useful in my own applications, and so I'd rather spend effort on the inherent complexity of accurate computation than on learning new syntax which as far as I can see buys me no extra simplicity.
What do you mean by the word "accuracy"? IMHO, now you're not talking about accuracy anymore. It's an argument like "map/filter and generator expressions are the same" again. That's just an argument against generator expressions and list/set/dict comprehensions in general, as they "already get us as much of" map/filter. About the effort, do you really find the examples below with the new proposed syntax difficult to understand?
>>> # Running product
>>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
[1, 5, 10, 40, 120]

>>> # Accumulate (prefix sum / cumulative sum)
>>> [prev + k for k in [5, 2, 4, 3] from prev = 0]
[0, 5, 7, 11, 14]

>>> # Pairs to be used in the next example (nothing new here)
>>> pairs = [(a, b) for a in [0, 1, 2, 3] for b in [-2, 2] if b != a]
>>> pairs
[(0, -2), (0, 2), (1, -2), (1, 2), (2, -2), (3, -2), (3, 2)]

>>> # The recursive list comprehension with the new syntax
>>> [b + prev * a for a, b in pairs from prev = 0]
[0, -2, 2, 0, 2, 2, 4, 14]

>>> # The same with itertools.accumulate (for comparison)
>>> from itertools import accumulate, chain
>>> list(accumulate(chain([0], pairs),
...                 lambda prev, pair: pair[1] + prev * pair[0]))
[0, -2, 2, 0, 2, 2, 4, 14]

>>> # The same in a single expression using the new syntax
>>> [b + prev * a for a in [0, 1, 2, 3]
...               for b in [-2, 2]
...               if b != a
...               from prev = 0]
[0, -2, 2, 0, 2, 2, 4, 14]
Everything needs some effort to be understood, some effort to be memorized, some effort to be maintained, etc. But Rob Cliffe could understand the running product when he read this thread, while disclaiming that he had never heard of accumulate. For someone who doesn't know what accumulate/scan/fold/reduce is, I think the proposed syntax is more descriptive/explicit/instructional/didactic/maintainable. Please read the "Rationale" section and the "Comparison" example in PyScanPrev. Also, please keep in mind that even with bytecode manipulation, PyScanPrev is still limited to valid Python syntax, and the syntax proposed here is different and more general (e.g. the last example in this e-mail can't be done in PyScanPrev as a single list comprehension). 2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
Consider the existing comprehension syntax. I use it all the time because it's very expressive, it "looks like" what I would write in a declarative mathematical definition, and the constructive alternative is not a function call, it's a loop, in many cases with a nested if test as well. However, when you try to express more than one loop with a comprehension, you end up with a counterintuitive ordering based on the current "this expands naturally to for loops and if conditionals according to this simple rule" syntax, along with a separation of the result expression from the loop where it's actually produced. It's not obvious to me that use of comprehension syntax beyond [ f(x) for x in y if g(x) ] is all that useful, but that very basic form is extremely useful by itself.
And what I'm proposing is a very expressive syntax that "looks like" what I would write in a declarative mathematical definition. The state-space example is really explicit about this. Also, it's something VERY useful: I've written several examples and I know they aren't exhaustive. Counterintuitive ordering in list comprehensions? More than one loop? You don't like the multiple-for-sections Cartesian product in a list comprehension, as it seems. That's not the point of this proposal, anyway. In your example, we can use either the generator expression (f(x) for x in y if g(x)) or the point-free map(f, filter(g, y)). Why do you prefer the first? You said "it's not a function call, it's a loop", but at the end you wrote "f(x)" and "g(x)". Maybe that's why you didn't see the recursion: it's tail-recursive / stackless. Say we have:

    [f(p, x) for x in y from p = z]

The result for y = [y0, y1, y2, ...] would be:

    [z, f(z, y0), f(f(z, y0), y1), f(f(f(z, y0), y1), y2), ...]

2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
If your proposal will help make more complex comprehensions easy to understand, then you've got something I'd like to have. But I don't yet see how it fixes that, and if not, I ain't gonna need it and I don't want to have to explain it to my students when they ain't gonna need it either.
Nice to know. The proposed:
[f(p, x) for x in y from p = z]
Should return the same as this, if one insists on using a list comprehension today without the new syntax:
>>> (lambda start: [start] + [p for p in [start]
...                             for x in y
...                             for p in [f(p, x)]]
... )(z)
The proposed comprehension is easier/simpler than that by every criterion I'm aware of. 2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
itertools functions are widely used and generally well-thought-of. Avoiding accumulate is a non-goal AFAIK. AIUI, Guido does think that avoiding reduce is a goal, but I believe that's because he thinks the semantics are just plain hard to understand. If your syntax is in large part a substitute for reduce, I doubt he will like it more than reduce just because it's syntax. (I do not speak for Guido; I offer my observation in the hope it will help you prepare to present your idea to him in the most favorable light.)
I doubt Guido is against solving problems that require a fold to be solved. AFAIK he's against the "reduce" higher order function itself, not against what it does. Mixing those up is like acting against list comprehensions just because Guido doesn't seem to like the map/filter higher order functions. Guido either prefers what exists or an alternative; I'm pretty sure he wouldn't ever say "solving this kind of problem is forbidden in Python". And that's again an argument for my proposal, as getting a fold from a scan is obvious: it's just the last result. Using reduce, let's get the last value after scanning through the "pairs" I defined in a previous example in this e-mail:
>>> # The last value with reduce (Python 2 only, for comparison)
>>> reduce(lambda prev, (a, b): b + prev * a, pairs, 0)
14

>>> # ... and with functools.reduce (Python 2 and 3)
>>> from functools import reduce
>>> reduce(lambda prev, pair: pair[1] + prev * pair[0], pairs, 0)
14
Actually, using a [-1] index or iterating until we get the last element of the previous [0, -2, 2, 0, 2, 2, 4, 14] result would be way more explicit (and wouldn't require the "reduce" function itself). I've proposed a "last" function to do that, but that's another point, not so required in the sense that one can write it as a function elsewhere. In Python 2.7, it's worth noting that reduce doesn't require the import, and lambdas are more powerful as they allow the "prev, (a, b)" signature. This "unpacking" syntax was lost in Python 3, but not when we're dealing with "for" loops/comprehensions. Something similar could be said about itertools.accumulate. 2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
That is, comprehensions are clearly useful, and I use them in almost every program I write. I still don't see when I'd ever strongly prefer the syntax you propose to the itertools we already have, so I don't care if it's a consistent extension. That doesn't mean it's not useful to you or others, it just means I'm not convinced -- and so I will not join the proponents. That's all.
That's not really about consistency; it sounds more like a "Don't touch!". There are 2 consistencies we're talking about here: API consistency and argumentation consistency. APIs can have backward compatibility issues and legacy code, which are sources of inconsistency. OTOH, inconsistency in argumentation is either a mistake or deceitful behavior. For me, the idea I'm proposing is clearly useful. I wanted to use it last Friday to create a function that auto-indents some text based on some simple rules, where the resulting indentation is the "accumulation" (scan) of indent/dedent markers. But I couldn't even use itertools.accumulate, because that project requires Python 2.7. I need a really strong reason to migrate to Python 3 some proprietary legacy code that isn't mine, and any "anti-functional" / "anti-expression" bias is a strong reason not to do so. Also, it's worth noting that the "functional" package on PyPI doesn't work on Python 3. Stuff like the f-strings is really nice, I can't count how many times I wrote some_string_literal.format(**locals()), but replacing all lambda signatures by cryptic indexing or naming them elsewhere still isn't worth it (Would you prefer to replace every single list comprehension body by a named function defined with "def"? If not, why should others?). I'm proposing something that would make Python 3 more inviting, and not only for people coming from a functional programming background. -- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
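A minimal generator capturing the semantics given above for "[f(p, x) for x in y from p = z]", plus the "last" helper that turns the scan into a fold (a sketch: the proposed syntax itself is not valid Python, and these helper names are illustrative):

    import collections

    def scan(func, iterable, start):
        # Yields: z, f(z, y0), f(f(z, y0), y1), ...
        prev = start
        yield prev
        for x in iterable:
            prev = func(prev, x)
            yield prev

    def last(iterable):
        # Final element of a non-empty iterable, without materializing it all.
        return collections.deque(iterable, maxlen=1)[0]

With these, last(scan(f, y, z)) computes the same value as functools.reduce(f, y, z).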
Danilo J. S. Bellini writes:
About the effort, do you really find the examples below with the new proposed syntax difficult to understand?
No. I just don't see how they would become tools I would use. My main interest here was in your claim to have economic applications, but the examples you give don't seem to offer big wins for the kind of calculations I, my students, or my colleagues do. Perhaps you will have better luck interesting/persuading others.
2016-11-06 18:00 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp>:
Danilo J. S. Bellini writes:
About the effort, do you really find the examples below with the new proposed syntax difficult to understand?
No. I just don't see how they would become tools I would use. My main interest here was in your claim to have economic applications, but the examples you give don't seem to offer big wins for the kind of calculations I, my students, or my colleagues do. Perhaps you will have better luck interesting/persuading others.
If you want something simple, the itertools.accumulate examples from docs.python.org include a simple "loan amortization" example:
>>> # Amortize a 5% loan of 1000 with 4 annual payments of 90
>>> cashflows = [1000, -90, -90, -90, -90]
>>> list(accumulate(cashflows, lambda bal, pmt: bal*1.05 + pmt))
[1000, 960.0, 918.0, 873.9000000000001, 827.5950000000001]
>>> # With the proposed syntax
>>> payments = [90, 90, 90, 90]
>>> [bal * 1.05 - pmt for pmt in payments from bal = 1000]
[1000, 960.0, 918.0, 873.9000000000001, 827.5950000000001]
From the Wilson J. Rugh "Linear Systems Theory" book, chapter 20 "Discrete Time State Equations", p. 384-385 (the very first example on the topic):
""" A simple, classical model in economics for national income y(k) in year k describes y(k) in terms of consumer expenditure c(k), private investment i(k), and government expenditure g(k) according to: y(k) = c(k) + i(k) + g(k) These quantities are interrelated by the following assumptions. First, consumer expenditure in year k+1 is proportional to the national income in year k, c(k+1) = α·y(k) where the constant α is called, impressively enough, the marginal propensity to consume. Second, the private investment in year k+1 is proportional to the increase in consumer expenditure from year k to year k+1, i(k+1) = β·[c(k+1) - c(k)] where the constant β is a growth coefficient. Typically 0 < α < 1 and β > 0.
From these assumptions we can write the two scalar difference equations
    c(k+1) = α·c(k) + α·i(k) + α·g(k)
    i(k+1) = (β·α-β)·c(k) + β·α·i(k) + β·α·g(k)

Defining state variables as x₁(k) = c(k) and x₂(k) = i(k), the output as y(k), and the input as g(k), we obtain the linear state equation

             ⎡   α      α  ⎤        ⎡ α ⎤
    x(k+1) = ⎢             ⎥·x(k) + ⎢   ⎥·g(k)
             ⎣ β·(α-1)  β·α⎦        ⎣β·α⎦

    y(k) = [1  1]·x(k) + g(k)

Numbering the years by k = 0, 1, ..., the initial state is provided by c(0) and i(0).
"""

You can use my "ltiss" or "ltvss" (if alpha/beta are time-varying) functions from the PyScanPrev state-space example to simulate that, or some dedicated function. The linear time-varying version with the proposed syntax would be (assuming alpha, beta and g are sequences like lists/tuples):
>>> from numpy import mat
>>> def first_digital_linear_system_example_in_book(alpha, beta, c0, i0, g):
...     A = (mat([[a,       a  ],
...               [b*(a-1), b*a]]) for a, b in zip(alpha, beta))
...     B = (mat([[a  ],
...               [b*a]]) for a, b in zip(alpha, beta))
...     x0 = mat([[c0],
...               [i0]])
...     x = (Ak*xk + Bk*gk for Ak, Bk, gk in zip(A, B, g) from xk = x0)
...     return [xk.sum() + gk for xk, gk in zip(x, g)]
If A and B were constants, it would be simpler, as the scan line would be:

    x = (A*xk + B*gk for gk in g from xk = x0)

-- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
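For reference, a rough stdlib equivalent of that constant-coefficient case, prepending the seed state as discussed earlier in the thread (a sketch: simulate_lti is an illustrative name, and A, B, x0 are assumed to be numpy matrices shaped as above):

    from itertools import accumulate, chain

    def simulate_lti(A, B, x0, g):
        # Scan x[k+1] = A*x[k] + B*g[k]; the output is y[k] = sum(x[k]) + g[k].
        x = accumulate(chain([x0], g), lambda xk, gk: A*xk + B*gk)
        return [xk.sum() + gk for xk, gk in zip(x, g)]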
- So, IIUC, for recursive list comprehensions:
  - "prev" = x_(n-1)
  - there is a need to define an initial value: chain([1000], [...])
- sometimes, we actually need a window function:
  - __[0] = x_(n-1)
  - __[1] = x_(n-2) # this
  - __[-1] = x_(n-2) # or this
- this can be accomplished with a deque: __ = deque([1000], maxlen)
- for recursive list comprehensions, we'd want to bind e.g. __ to a deque:

    [f(__[0], x) for x in y with __ = deque((1000,), 1)]

But the proposed syntax differs from this interpretation:

- "from bal = 1000" # ~= with prev = deque((1000,), 1)[-1]

(Recursive) Fibonacci would then require a deque(..., 2). Other than brevity, is there any advantage to list comprehensions over a for loop? IIRC, reduce() and fold() can avoid unnecessary variable binding, but require lower-level debugging. A recursive list comprehension syntax would be cool. Is there a better variable name than '__'?
2016-11-06 23:27 GMT-02:00 Wes Turner <wes.turner@gmail.com>:
- So, IIUC, for recursive list comprehensions:
  - "prev" = x_(n-1)
  - there is a need to define an initial value: chain([1000], [...])
- sometimes, we actually need a window function:
  - __[0] = x_(n-1)
  - __[1] = x_(n-2) # this
  - __[-1] = x_(n-2) # or this
- this can be accomplished with a deque: __ = deque([1000], maxlen)
- for recursive list comprehensions, we'd want to bind e.g. __ to a deque
[f(__[0], x) for x in y with __ = deque((1000,), 1)]
If I understood correctly, that's an alternative to get a general recursive list comprehension with a syntax like:

    [f(hist, x) for x in y with hist = deque([start_values], size)]

You're trying to solve the output lag/window problem using a circular queue with random/indexed access to its values (a collections.deque instance). You're using "with" instead of "from" to distinguish it from my first proposal. That's not a scan anymore, but something more general. Some information can be removed from that syntax; for example, the size can be defined to be the starting iterable/memory/history data size, and the deque can be something internal. Also, using the negative indices would be more explicit, as hist[-1] would be the previous iteration result, hist[-2] would be its former result, and so on. The syntax would be:

    [func(hist, target) for target in iterable with hist = start_iterable]

i.e., this idea is about a new "with hist = start_iterable" at the end (or "from" instead of "with"). The resulting list size would be len(list(start_iterable)) + len(list(iterable)). As a generator instead of a list, that can be implemented as this "windowed scan" generator function:
>>> import collections
>>> def wscan(func, iterable, start_iterable):
...     pre_hist = []
...     for item in start_iterable:
...         yield item
...         pre_hist.append(item)
...     hist = collections.deque(pre_hist, len(pre_hist))
...     for target in iterable:
...         item = func(hist, target)
...         yield item
...         hist.append(item)
The Fibonacci example would be written as:
>>> list(wscan(lambda fibs, unused: fibs[-1] + fibs[-2], range(10), [0, 1]))
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
With the "windowed scan" syntax proposal, it would become:
>>> [fibs[-1] + fibs[-2] for unused in range(10) with fibs = [0, 1]]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
Or:
>>> [fibs[-1] + fibs[-2] for unused in range(10) from fibs = [0, 1]]
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
-- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
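For comparison, the same sequence is reachable today as a strictly first-order scan by folding the two-element window into a state pair (a stdlib-only sketch; fib_pairs is an illustrative name):

    from itertools import accumulate, chain

    def fib_pairs(n):
        # Each step maps the state (a, b) to (b, a + b), seeded with (0, 1);
        # the first element of each pair walks through F[0], F[1], F[2], ...
        states = accumulate(chain([(0, 1)], range(n - 1)),
                            lambda s, _: (s[1], s[0] + s[1]))
        return [a for a, _ in states]

    # fib_pairs(12) == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]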
I think I understand now. Still, for the more general case, I think it could be useful to be able to specify a max window size for the deque (or indexable list-like structure), though defaulting to None and storing all values may be additionally useful.
On Sun, Nov 06, 2016 at 04:46:42PM -0200, Danilo J. S. Bellini wrote:
1.2. Sub-expressions in an expression might be on other statements (e.g. assignments, other functions).
Not in Python it can't be. Assignment is not an expression, you cannot say (x = 2) + 1.
2. The itertools.accumulate function signature is broken: 2.1. Its arguments are reversed when compared to other higher order functions like map, filter and reduce.
That's only "broken" if you think that we should be able to swap accumulate for map, filter and reduce. But that's not the case: accumulate is NOT broken because the API is not intended to be the same as map, filter and reduce. The API for accumulate is that the iterable is mandatory, but the function argument is *not*.
2.2. It lacks a "start" parameter, requiring more complexity to include it (nesting a itertools.chain or a "prepend" function call).
Then just create your own wrapper:

    import itertools

    def accum(func, iterable, start=None):
        if start is not None:
            iterable = itertools.chain([start], iterable)
        return itertools.accumulate(iterable, func)
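A quick check of that wrapper on the running-product data from this thread (assuming the definition above):

    import operator

    list(accum(operator.mul, [5, 2, 4, 3], start=1))
    # [1, 5, 10, 40, 120]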
3. It's not about "adding complexity", it's the other way around:
No, I'm sorry, it does add complexity.
3.1. The proposal is about explicit recursion in a list comprehension (a way to access the previous output value/result, a.k.a. accumulator/state/memory).
List comprehensions are not intended for these sorts of complex calculations. Regular for-loops are easy to use and can be as general as you like.
I'm not sure what you meant with "turn a sequence into a series",
I think Stephen is referring to the mathematical concept of sequences and series. A sequence is an ordered set of numbers obeying some rule, e.g.:

    [1, 2, 3, 4, 5]

while a series is the partial sums found by adding each term to the previous sum:

    [1, 3, 6, 10, 15]

-- Steve
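That partial-sums series is exactly what itertools.accumulate computes with its default function:

    from itertools import accumulate

    list(accumulate([1, 2, 3, 4, 5]))
    # [1, 3, 6, 10, 15]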
2016-11-06 23:55 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
On Sun, Nov 06, 2016 at 04:46:42PM -0200, Danilo J. S. Bellini wrote:
1.2. Sub-expressions in an expression might be on other statements (e.g. assignments, other functions).
Not in Python it can't be. Assignment is not an expression, you cannot say (x = 2) + 1.
O.o ... how is that (x = 2) + 1 related to what I wrote? Say you have "d = a + b + c"; you can write it as "h = a + b" and then "d = h + c"... The expression "a + b + c" and the new "h + c" are equivalent because the sub-expression "a + b" was assigned to "h" elsewhere (another statement). 2016-11-06 23:55 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
2. The itertools.accumulate function signature is broken: 2.1. Its arguments are reversed when compared to other higher order functions like map, filter and reduce.
That's only "broken" if you think that we should be able to swap accumulate for map, filter and reduce. But that's not the case: accumulate is NOT broken because the API is not intended to be the same as map, filter and reduce. The API for accumulate is that the iterable is mandatory, but the function argument is *not*.
I'd say not even map and filter require a function, as "None" means "lambda x: x" for them, strangely enough. All of them are higher order functions in standard Python, and you can include other functions like itertools.takewhile and itertools.dropwhile in this comparison. The accumulate signature is simply broken/different/surprising in that context. 2016-11-06 23:55 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
3. It's not about "adding complexity", it's the other way around:
No, I'm sorry, it does add complexity.
Compared to the alternatives, it's the other way around: my proposal removes complexity from the resulting code that requires a scan. Unless you're saying so because it's a proposal to change CPython, and as CPython itself would get a new feature, it would "be more complex"... I agree: any change would either add complexity somewhere or be backwards incompatible. Which one is better? Why does this mailing list exist? 2016-11-06 23:55 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
List comprehensions are not intended for these sorts of complex calculations.
I know list comprehensions aren't yet intended for a scan; otherwise, why would I propose this? Triple-for-section list comprehensions are complicated, and it's actually surprising that Python allows using the same target variable name twice in them. But the calculation itself and its declarative description aren't complex. Even you wrote such a declarative description in your e-mail, when explaining what "series" are. -- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
On Mon, Nov 07, 2016 at 02:21:04AM -0200, Danilo J. S. Bellini wrote:
2016-11-06 23:55 GMT-02:00 Steven D'Aprano <steve@pearwood.info>:
On Sun, Nov 06, 2016 at 04:46:42PM -0200, Danilo J. S. Bellini wrote:
1.2. Sub-expressions in an expression might be on other statements (e.g. assignments, other functions).
Not in Python it can't be. Assignment is not an expression, you cannot say (x = 2) + 1.
O.o ... how is that (x = 2) + 1 related to what I wrote?
I misread your comment as "Sub-expressions in an expression might be other statements". Sorry for the misunderstanding. -- Steve
+1, especially given that reduce is not something you use very often. You loop and you filter every day, but you definitely don't need the cumulative result of a sequence every day. Python already has good any, all, sum and string concatenation stories, so most of the usual FP suspects are taken care of. And remember that even when we do have something missing that we use often, it's not always enough to convince Guido to change the language for it. E.g., we have an old and recurrent debate about adding a keyword to assign a temporary calculation in a list comprehension, so that:

    [x[0].upper() for x in stuff() if x[0].upper()]

can become:

    [x[0].upper() as word for x in stuff() if word]

(and many other variants) All went to a dead end. So if you want to add the accumulate feature to the syntax, you'd better have a VERY GOOD reason. On 23/10/2016 at 17:59, Steven D'Aprano wrote:
On Sun, Oct 23, 2016 at 12:57:10PM -0200, Danilo J. S. Bellini wrote:
The idea is to let generator expressions and list/set comprehensions have a clean syntax to access its last output. That would allow them to be an alternative syntax to the scan higher-order function [1] (today implemented in the itertools.accumulate function), which leads to an alternative way to write a fold/reduce. It would be nice to have something like:
[cut suggested syntax]
instead of a reduce:
[cut four existing ways to solve the problem]
Why do we need a FIFTH way to solve this problem? What you are describing is *exactly* the use case for a reduce or fold function. Why add special magic syntax to make comprehensions do even more?
Not everything needs to be a one liner. It's okay to import reduce to do a reduce. Its okay to write an actual for-loop.
Actually, I already wrote a solution for something similar to that: PyScanPrev [2].
Ah, that makes SIX existing solutions. Do we need a seventh?
There was a discussion about this a while ago. From what I remember, the conclusion reached was that there are too many degrees of freedom to be able to express reduction operations in a comprehension-like way that's any clearer than just using reduce() or writing out the appropriate loops. -- Greg
participants (11)
- Brendan Barnwell
- Chris Angelico
- Danilo J. S. Bellini
- David Mertz
- Greg Ewing
- Michel Desmoulin
- Paul Moore
- Rob Cliffe
- Stephen J. Turnbull
- Steven D'Aprano
- Wes Turner