Proposal: A Reduce-Map Comprehension and a "last" builtin
Dear all,

In Python, I often find myself building lists where each element depends on the last. This generally means writing a for loop, creating an initial list and appending to it in the loop, or writing a generator function. Both of these feel more verbose than necessary.

I was thinking it would be nice to be able to encapsulate this common type of operation in a more compact comprehension.

I propose a new "Reduce-Map" comprehension that allows us to write:

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]

Instead of:

    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float = 0.):
        average = initial_value
        for xt in signal:
            average = (1-decay)*average + decay*xt
            yield average

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = list(exponential_moving_average(signal, decay=0.05))

I've created a complete proposal at https://github.com/petered/peps/blob/master/pep-9999.rst (and a pull request: https://github.com/python/peps/pull/609), and I'd be interested to hear what people think of this idea.

Combined with the new "last" builtin discussed in the proposal, this would allow us to replace "reduce" with a more Pythonic comprehension-style syntax.

- Peter
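For comparison, the reduce-style behaviour that the proposed "last" builtin targets is already available today via functools.reduce; a minimal sketch (compute_avg is a hypothetical helper wrapping the update expression, with decay assumed to be 0.05):

    import functools
    import math
    import random

    decay = 0.05

    def compute_avg(average, x):
        # one step of the exponential moving average from the proposal
        return (1 - decay) * average + decay * x

    signal = [math.sin(i * 0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    last_smooth_signal = functools.reduce(compute_avg, signal, 0.0)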
On 05/04/18 17:52, Peter O'Connor wrote:
Dear all,
In Python, I often find myself building lists where each element depends on the last. This generally means writing a for loop, creating an initial list and appending to it in the loop, or writing a generator function. Both of these feel more verbose than necessary.
I was thinking it would be nice to be able to encapsulate this common type of operation into a more compact comprehension.
I propose a new "Reduce-Map" comprehension that allows us to write:
    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
Ew. This looks magic (and indeed is magic) and uses single equals inside the expression (inviting "=" vs "==" gumbies). I think you are trying to do too much in one go, and something like this is complex enough that it shouldn't be in a comprehension in the first place.
Instead of:
    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float = 0.):
        average = initial_value
        for xt in signal:
            average = (1-decay)*average + decay*xt
            yield average
    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = list(exponential_moving_average(signal, decay=0.05))
Aside from unnecessarily being a generator, this reads better to me! -- Rhodri James *-* Kynesim Ltd
On 2018 Apr 5, at 12:52 pm, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Dear all,
In Python, I often find myself building lists where each element depends on the last. This generally means writing a for loop, creating an initial list and appending to it in the loop, or writing a generator function. Both of these feel more verbose than necessary.
I was thinking it would be nice to be able to encapsulate this common type of operation into a more compact comprehension.
I propose a new "Reduce-Map" comprehension that allows us to write:

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]

Instead of:

    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float = 0.):
        average = initial_value
        for xt in signal:
            average = (1-decay)*average + decay*xt
            yield average
    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = list(exponential_moving_average(signal, decay=0.05))

I've created a complete proposal at https://github.com/petered/peps/blob/master/pep-9999.rst (and a pull request) and I'd be interested to hear what people think of this idea.
Combined with the new "last" builtin discussed in the proposal, this would allow us to replace "reduce" with a more Pythonic comprehension-style syntax.
See itertools.accumulate, comparing the rough implementation in the docs to your exponential_moving_average function:

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]

    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x

    smooth_signal = accumulate([initial_average] + signal, compute_avg)

-- Clint
Ah, that's nice, I didn't know that itertools.accumulate now has an optional "func" parameter. Although to get the exact same behaviour (output the same length as input) you'd actually have to do:

    smooth_signal = itertools.islice(itertools.accumulate([initial_average] + signal, compute_avg), 1, None)

And you'd also have to use itertools.chain to concatenate the initial_average to the rest if "signal" were a generator instead of a list, so the fully general version would be:

    smooth_signal = itertools.islice(itertools.accumulate(itertools.chain([initial_average], signal), compute_avg), 1, None)

I find this a bit awkward, and maintain that it would be nice to have this as a built-in language construct to do this natively. You have to admit:

    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]

is a lot cleaner and more intuitive than:

    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x

    smooth_signal = itertools.islice(itertools.accumulate(itertools.chain([initial_average], signal), compute_avg), 1, None)

Moreover, if added along with the "last" builtin proposed in the link, it could also kill the need for reduce, as you could instead use:

    last_smooth_signal = last(average = (1-decay)*average + decay*x for x in signal from average=0.)

On Thu, Apr 5, 2018 at 1:48 PM, Clint Hepner <clint.hepner@gmail.com> wrote:
On 2018 Apr 5, at 12:52 pm, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Dear all,
In Python, I often find myself building lists where each element depends on the last. This generally means writing a for loop, creating an initial list and appending to it in the loop, or writing a generator function. Both of these feel more verbose than necessary.
I was thinking it would be nice to be able to encapsulate this common type of operation into a more compact comprehension.
I propose a new "Reduce-Map" comprehension that allows us to write:

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]

Instead of:

    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float = 0.):
        average = initial_value
        for xt in signal:
            average = (1-decay)*average + decay*xt
            yield average
    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = list(exponential_moving_average(signal, decay=0.05))

I've created a complete proposal at https://github.com/petered/peps/blob/master/pep-9999.rst (and a pull request) and I'd be interested to hear what people think of this idea.
Combined with the new "last" builtin discussed in the proposal, this would allow us to replace "reduce" with a more Pythonic comprehension-style syntax.
See itertools.accumulate, comparing the rough implementation in the docs to your exponential_moving_average function:
signal = [math.sin(i*0.01) + random.normalvariate(0,0.1) for i in range(1000)]
    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x
smooth_signal = accumulate([initial_average] + signal, compute_avg)
-- Clint
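Assembled into a runnable sketch, the fully general version Peter describes above looks like this (decay and initial_average are assumed values; compute_avg is as in Clint's snippet):

    import itertools
    import math
    import random

    decay = 0.05
    initial_average = 0.0

    def compute_avg(avg, x):
        return (1 - decay) * avg + decay * x

    signal = (math.sin(i * 0.01) + random.normalvariate(0, 0.1) for i in range(1000))
    smooth_signal = list(itertools.islice(
        itertools.accumulate(itertools.chain([initial_average], signal), compute_avg),
        1, None))
    # len(smooth_signal) == 1000: the seed value is dropped by islice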
On 5 April 2018 at 22:26, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
I find this a bit awkward, and maintain that it would be nice to have this as a built-in language construct to do this natively. You have to admit:
smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
Is a lot cleaner and more intuitive than:
    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x
smooth_signal = itertools.islice(itertools.accumulate(itertools.chain([initial_average], signal), compute_avg), 1, None)
Not really, I don't... In fact, factoring out compute_avg() is the first step I'd take in converting the proposed syntax into something I'd find readable and maintainable. (It's worth remembering that when you understand the subject of the code very well, it's a lot easier to follow complex constructs than when you're less familiar with it - and the person who's unfamiliar with it could easily be you in a few months.)

The string of itertools functions is *not* readable, but I'd fix that by expanding it into an explicit loop:

    smooth_signal = []
    average = 0
    for x in signal:
        average = compute_avg(average, x)
        smooth_signal.append(average)

If I have that wrong, it's because I misread *both* the itertools calls *and* the proposed syntax. But I doubt anyone would claim that it's possible to misunderstand the explicit loop.
Moreover, if added with the "last" builtin proposed in the link, it could also kill the need for reduce, as you could instead use:
last_smooth_signal = last(average = (1-decay)*average + decay*x for x in signal from average=0.)
    last_smooth_signal = 0
    for x in signal:
        last_smooth_signal = compute_avg(last_smooth_signal, x)

or functools.reduce(compute_avg, signal, 0), if you prefer reduce() - I'm not sure I do.

Sorry, this example has pretty much confirmed for me that an explicit loop is *far* more readable.

Paul.
Well, whether you factor out the loop-function is a separate issue. Let's say we do:

    smooth_signal = [average = compute_avg(average, x) for x in signal from average=0]

Is just as readable and maintainable as your expanded version, but saves 4 lines of code. What's not to love?

On Thu, Apr 5, 2018 at 5:55 PM, Paul Moore <p.f.moore@gmail.com> wrote:
On 5 April 2018 at 22:26, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
I find this a bit awkward, and maintain that it would be nice to have this as a built-in language construct to do this natively. You have to admit:
smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
Is a lot cleaner and more intuitive than:
    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x
smooth_signal = itertools.islice(itertools.accumulate(itertools.chain([initial_average], signal), compute_avg), 1, None)
Not really, I don't... In fact, factoring out compute_avg() is the first step I'd take in converting the proposed syntax into something I'd find readable and maintainable. (It's worth remembering that when you understand the subject of the code very well, it's a lot easier to follow complex constructs, than when you're less familiar with it - and the person who's unfamiliar with it could easily be you in a few months).
The string of itertools functions is *not* readable, but I'd fix that by expanding it into an explicit loop:
    smooth_signal = []
    average = 0
    for x in signal:
        average = compute_avg(average, x)
        smooth_signal.append(average)
If I have that wrong, it's because I misread *both* the itertools calls *and* the proposed syntax. But I doubt anyone would claim that it's possible to misunderstand the explicit loop.
Moreover, if added with the "last" builtin proposed in the link, it could also kill the need for reduce, as you could instead use:
last_smooth_signal = last(average = (1-decay)*average + decay*x for x in signal from average=0.)
    last_smooth_signal = 0
    for x in signal:
        last_smooth_signal = compute_avg(last_smooth_signal, x)
or functools.reduce(compute_avg, signal, 0), if you prefer reduce() - I'm not sure I do.
Sorry, this example has pretty much confirmed for me that an explicit loop is *far* more readable.
Paul.
On Thu, Apr 05, 2018 at 06:24:25PM -0400, Peter O'Connor wrote:
Well, whether you factor out the loop-function is a separate issue. Let's say we do:
smooth_signal = [average = compute_avg(average, x) for x in signal from average=0]
Is just as readable and maintainable as your expanded version, but saves 4 lines of code. What's not to love?
Be careful about asking questions which you think are rhetorical but aren't. I can think of at least half a dozen objections to this:

- I'd have no idea what it means without the context of reading this thread.
- That you call it "Reduce-Map" while apparently doing something different from what other people call MapReduce: https://en.wikipedia.org/wiki/MapReduce
- That it uses = as an expression, and the keyword `from` in a weird way that doesn't make sense to me.
- The fact that it requires new syntax, so it isn't backwards compatible. Even if I loved it and your proposal was accepted, I couldn't use it for at least two years. If I'm writing a library that has to work with older versions of Python, probably not for a decade.
- That there are no obvious search terms to google for if you come across this in code and don't know what it means ("that thing that looks like a list comprehension but has from in it"). (And yes, before you object, list comps have the same downside.)
- The fact that this uses a functional idiom in the first place, which many people don't like or get, especially when they start getting complex.

If you haven't already done so, you ought to read the numerous threads from last month on statement-local name bindings: https://mail.python.org/pipermail/python-ideas/2018-March/thread.html

The barrier to adding new syntax to the language is very high. I suspect that the *only* chance you have for this sort of comprehension will be if one of the name binding proposals is accepted. That will give you *half* of what you want:

    [(compute_avg(average, x) as average) for x in signal]

    [(average := compute_avg(average, x)) for x in signal]

only needing a way to give it an initial value. Depending on the way comprehensions work, this might be all you need:

    average = 0
    smooth_signal = [(average := compute_avg(average, x)) for x in signal]

assuming the := syntax is accepted. An alternative would be to push for a variant of functools.reduce that yields its values lazily, giving us:

    smooth_signal = list(lazy_reduce(compute_avg, signal, 0))

-- Steve
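A minimal sketch of the lazily-yielding reduce Steven mentions (lazy_reduce is a hypothetical helper, not part of functools):

    def lazy_reduce(func, iterable, initial):
        # like functools.reduce, but yields each intermediate result
        acc = initial
        for x in iterable:
            acc = func(acc, x)
            yield acc

    # e.g. smooth_signal = list(lazy_reduce(compute_avg, signal, 0))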
On 04/05/2018 03:24 PM, Peter O'Connor wrote:
Well, whether you factor out the loop-function is a separate issue. Let's say we do:
smooth_signal = [average = compute_avg(average, x) for x in signal from average=0]
Is just as readable and maintainable as your expanded version, but saves 4 lines of code. What's not to love?
It is not readable and it is not Python (and hopefully never will be). -- ~Ethan~
On Thu, Apr 05, 2018 at 05:31:41PM -0700, Ethan Furman wrote:
On 04/05/2018 03:24 PM, Peter O'Connor wrote:
Well, whether you factor out the loop-function is a separate issue. Let's say we do:
smooth_signal = [average = compute_avg(average, x) for x in signal from average=0]
Is just as readable and maintainable as your expanded version, but saves 4 lines of code. What's not to love?
It is not readable and it is not Python (and hopefully never will be).
Be fair. Strip out the last "from average = 0" and we have little that isn't either in Python or is currently being proposed elsewhere. Change the syntax for assignment within the comprehension to one of the preferred syntax variants from last month's "Statement local name bindings" thread, and we have something that is strongly being considered:

    [(average := compute_avg(average, x)) for x in signal]

    [(compute_avg(average, x) as average) for x in signal]

All we need now is a way to feed in the initial value for average. And that could be as trivial as assigning a local name for it:

    average = 0

before running the comprehension.

-- Steve
On Fri, Apr 6, 2018 at 10:37 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, Apr 05, 2018 at 05:31:41PM -0700, Ethan Furman wrote:
On 04/05/2018 03:24 PM, Peter O'Connor wrote:
Well, whether you factor out the loop-function is a separate issue. Let's say we do:
smooth_signal = [average = compute_avg(average, x) for x in signal from average=0]
Is just as readable and maintainable as your expanded version, but saves 4 lines of code. What's not to love?
It is not readable and it is not Python (and hopefully never will be).
Be fair. Strip out the last "from average = 0" and we have little that isn't either in Python or is currently being proposed elsewhere. Change the syntax for assignment within the comprehension to one of the preferred syntax variants from last month's "Statement local name bindings" thread, and we have something that is strongly being considered:
[(average := compute_avg(average, x)) for x in signal]
[(compute_avg(average, x) as average) for x in signal]
All we need now is a way to feed in the initial value for average. And that could be as trivial as assigning a local name for it:
average = 0
before running the comprehension.
That would only work if the comprehension is executed in the same context as the surrounding code, instead of (as currently) being in a nested function. Otherwise, there'd need to be an initializer inside the comprehension - but that can be done (although it won't be particularly beautiful). ChrisA
On Fri, Apr 06, 2018 at 11:02:30AM +1000, Chris Angelico wrote:
On Fri, Apr 6, 2018 at 10:37 AM, Steven D'Aprano <steve@pearwood.info> wrote:
[...]
All we need now is a way to feed in the initial value for average. And that could be as trivial as assigning a local name for it:
average = 0
before running the comprehension.
That would only work if the comprehension is executed in the same context as the surrounding code, instead of (as currently) being in a nested function. Otherwise, there'd need to be an initializer inside the comprehension - but that can be done (although it won't be particularly beautiful).
Not necessarily: we could keep the rule that comprehensions are executed in their own scope. We just add the rule that if a name is used as a sublocal name binding, then (and only then) it is initialised from the surrounding scopes. If there is no such surrounding name, then the sublocal remains uninitialised and trying to evaluate it will give UnboundLocalError.

That's similar to how Lua works with locals/globals, and yes, I'm aware of the irony that I'm proposing this. I don't like the way it works in Lua where it applies *everywhere*, but I think it is justifiable and useful if applied specifically to comprehensions.

A contrived example: suppose we want the running sum of a list, written as a list comprehension. This runs, but doesn't do what we want:

    [((x as spam) + spam) for x in [1, 2, 3]]
    => returns [2, 4, 6]

This version fails as we try to evaluate spam before it is defined:

    [(spam + (x as spam)) for x in [1, 2, 3]]

But if spam were copied from the surrounding scope, this would work:

    spam = 0
    [(spam + (x as spam)) for x in [1, 2, 3]]
    => returns [1, 3, 5]

and of course this would allow Peter's reduce/map without the ugly and awkward "from spam=0" initialiser syntax. (Sorry Peter.)

If you don't like that implicit copying, let's make it explicit:

    spam = 0
    [(spam + (x as nonlocal spam)) for x in [1, 2, 3]]

(Should we allow global spam as well? Works for me.)

Or if you prefer the Pascal-style assignment syntax that Guido favours:

    [(spam + (nonlocal spam := x)) for x in [1, 2, 3]]

-- Steve
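For readers puzzling over those hypothetical-syntax examples, the intended result (each element added to the *previous* element, starting from 0) can be checked with a plain loop in today's Python; this is a sketch of the semantics only, not of the proposed syntax:

    spam = 0
    out = []
    for x in [1, 2, 3]:
        out.append(spam + x)   # uses the previous value of spam...
        spam = x               # ...then rebinds it, as "(x as spam)" would
    print(out)   # [1, 3, 5]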
On 04/05/2018 05:37 PM, Steven D'Aprano wrote:
On Thu, Apr 05, 2018 at 05:31:41PM -0700, Ethan Furman wrote:
[snip unkind words]
Be fair. Strip out the last "from average = 0" and we have little that isn't either in Python or is currently being proposed elsewhere.
Ugh. Thanks for reminding me, Steven. Peter, my apologies. It's been a frustrating day for me and I shouldn't have taken it out on you. -- ~Ethan~
On Thu, Apr 5, 2018, 5:32 PM Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
I find this a bit awkward, and maintain that it would be nice to have this as a built-in language construct to do this natively. You have to admit:
smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
Is a lot cleaner and more intuitive than:
    def compute_avg(avg, x):
        return (1 - decay)*avg + decay * x
The proposed syntax strikes me as confusing and mysterious to do something I do only occasionally. In contrast, itertools.accumulate() is straightforward and far more general. Definitely -100 on the proposal.
On Thu, Apr 05, 2018 at 12:52:17PM -0400, Peter O'Connor wrote:
I propose a new "Reduce-Map" comprehension that allows us to write:
    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
I've already commented on this proposed syntax. A few further comments below.
Instead of:
    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float = 0.):
        average = initial_value
        for xt in signal:
            average = (1-decay)*average + decay*xt
            yield average
What I like about this is that it is testable in isolation and re-usable. It can be documented, the implementation changed if needed without having to touch all the callers of that function, and the name is descriptive.

(I don't understand why so many people have such an aversion to writing functions and seek to eliminate them from their code.)

Here's another solution which I like, one based on what we used to call coroutines until that term was taken for async functions. So keeping in mind that this version of "coroutine" has nothing to do with async:

    import functools

    def coroutine(func):
        """Decorator to prime coroutines when they are initialised."""
        @functools.wraps(func)
        def started(*args, **kwargs):
            cr = func(*args, **kwargs)
            cr.send(None)
            return cr
        return started

    @coroutine
    def exponential_moving_average(decay=0.5):
        """Exponentially weighted moving average (EWMA).

        Coroutine returning a moving average with exponentially
        decreasing weights. By default the decay factor is one half,
        which is equivalent to averaging each value (after the first)
        with the previous moving average:

        >>> aver = exponential_moving_average()
        >>> [aver.send(x) for x in [5, 1, 2, 4.5]]
        [5, 3.0, 2.5, 3.5]
        """
        average = (yield None)
        x = (yield average)
        while True:
            average = decay*x + (1-decay)*average
            x = (yield average)

I wish this sort of coroutine were better known and loved. You can run more than one of them at once, you can feed values into them lazily, they can be paused and put aside to come back to them later, and if you want to use them eagerly, you can just drop them into a list comprehension.

-- Steve
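Applied to the smoothing example from earlier in the thread, the coroutine drops straight into a comprehension; a sketch, with signal and the decay value assumed from the earlier posts:

    aver = exponential_moving_average(decay=0.05)
    smooth_signal = [aver.send(x) for x in signal]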
On 2018-04-05 21:18, Steven D'Aprano wrote:
(I don't understand why so many people have such an aversion to writing functions and seek to eliminate them from their code.)
I think I am one of those people that have an aversion to writing functions! I hope you do not mind that I attempt to explain my aversion here. I want to clarify my thoughts on this, and maybe others will find something useful in this explanation, maybe someone has wise words for me. I think this is relevant to python-ideas because someone with this aversion will make different language suggestions than those that don't. Here is why I have an aversion to writing functions: Every unread function represents multiple unknowns in the code. Every function adds to code complexity by mapping an inaccurate name to specific functionality. When I read code, this is what I see:
    x = you_will_never_guess_how_corner_cases_are_handled(a, b, c)
    y = you_dont_know_I_throw_a_BaseException_when_I_do_not_like_your_arguments(j, k, l)
Not everyone sees code this way: I see people read method calls, make a number of wild assumptions about how those methods work, AND THEY ARE CORRECT! How do they do it!? It is as if there are some unspoken conventions about how code should work that are opaque to me.

For example, before I read the docs on itertools.accumulate(list_of_length_N, func), here are the unknowns I see:

* Does it return N, or N-1 values?
* How are initial conditions handled?
* Must `func` perform the initialization by accepting just one parameter, and accumulate with more-than-one parameter?
* If `func` is a binary function, and `accumulate` returns N values, what's the Nth value?
* If `func` is a non-commutative binary function, what order are the arguments passed?
* Maybe accumulate expects func(*args)?
* Is there a window size? Is it equal to the number of arguments of `func`?

These are not all answered by reading the docs, they are answered by reading the code. The code tells me the first value is a special case; the first parameter of `func` is the accumulated `total`; `func` is applied in order; and an iterator is returned. Despite all my questions, notice I missed asking what `accumulate` returns? It is the unknown unknowns that get me most.

So, `itertools.accumulate` is a kinda-inaccurate name given to a specific functionality: not a problem on its own, and even delightfully useful if I need it often.

What if I am in a domain where I see `accumulate` only a few times a year? Or how about a program that uses `accumulate` in only one place? For me, I must (re)read the `accumulate` source (or run the caller through the debugger) before I know what the code is doing. In these cases I advocate for in-lining the function code to remove these unknowns. Instead of an inaccurate name, there is explicit code. If we are lucky, that explicit code follows idioms that make the increased verbosity easier to read.

Consider Serhiy Storchaka's elegant solution, which I reformatted for readability:
    smooth_signal = [
        average
        for average in [0]
        for x in signal
        for average in [(1-decay)*average + decay*x]
    ]
We see the initial conditions, we see the primary function, we see how the accumulation happens, we see the number of returned values, and we see it's a list. It is a compact, easy read, from top to bottom. Yes, we must know `for x in [y]` is an idiom for assignment, but we can reuse that knowledge in all our other list comprehensions. So, in the specific case of this Reduce-Map thread, I would advocate using the list comprehension. In general, all functions introduce non-trivial code debt: This debt is worth it if the function is used enough; but, in single-use or rare-use cases, functions can obfuscate. Thank you for your time.
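As a quick check that the idiom behaves as described, here is a sketch with a tiny hand-made signal and decay assumed to be 0.5:

    decay = 0.5
    signal = [1.0, 1.0, 1.0]
    smooth_signal = [
        average
        for average in [0]                                  # initial condition
        for x in signal                                     # main loop
        for average in [(1 - decay) * average + decay * x]  # the update, via the for-in-singleton idiom
    ]
    print(smooth_signal)   # [0.5, 0.75, 0.875]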
Kyle Lahnakoski wrote:
Consider Serhiy Storchaka's elegant solution, which I reformatted for readability
    smooth_signal = [
        average
        for average in [0]
        for x in signal
        for average in [(1-decay)*average + decay*x]
    ]
"Elegant" isn't the word I would use, more like "clever". Rather too clever, IMO -- it took me some head scratching to figure out how it does what it does. And it would have taken even more head scratching, except there's a clue as to *what* it's supposed to be doing: the fact that it's assigned to something called "smooth_signal" -- one of those "inaccurate names" that you disparage so much. :-) -- Greg
With the increased emphasis on iterators and generators in Python 3.x, the lack of a simple expression level equivalent to "for item in iterable: pass" is occasionally irritating, especially when demonstrating behaviour at the interactive prompt.
I've sometimes thought that exhaust(iterator) or iterator.exhaust() would be a good thing to have - I've often written code that basically does "call this function for every element in this container, and I don't care about return values", but find myself using a list comprehension instead of a generator. I guess it's such an edge case that exhaust(iterator) as a builtin would be overkill (but perhaps itertools could have it?), and most people don't pass around iterators, so (f(x) for x in y).exhaust() might not look natural to most people. It could return the value for the last() semantics, but I think exhaustion would often be more important than the last value.

2018-04-09 0:58 GMT+02:00 Greg Ewing <greg.ewing@canterbury.ac.nz>:
Kyle Lahnakoski wrote:
Consider Serhiy Storchaka's elegant solution, which I reformatted for readability
smooth_signal = [ average for average in [0] for x in signal for average in [(1-decay)*average + decay*x] ]
"Elegant" isn't the word I would use, more like "clever". Rather too clever, IMO -- it took me some head scratching to figure out how it does what it does.
And it would have taken even more head scratching, except there's a clue as to *what* it's supposed to be doing: the fact that it's assigned to something called "smooth_signal" -- one of those "inaccurate names" that you disparage so much. :-)
-- Greg
[Jacco van Dorp <j.van.dorp@deonet.nl>]
I've sometimes thought that exhaust(iterator) or iterator.exhaust() would be a good thing to have - I've often written code that basically does "call this function for every element in this container, and I don't care about return values", but find myself using a list comprehension instead of a generator. I guess it's such an edge case that exhaust(iterator) as a builtin would be overkill (but perhaps itertools could have it?), and most people don't pass around iterators, so (f(x) for x in y).exhaust() might not look natural to most people.
"The standard" clever way to do this is to create a 0-sized deque:
    >>> from collections import deque
    >>> deque((i for i in range(1000)), 0)
    deque([], maxlen=0)
The deque constructor consumes the entire iterable "at C speed", but throws all the results away because the deque's maximum size is too small to hold any of them ;-)
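Wrapped up as a helper, the trick might look like this (exhaust is a hypothetical name, not an existing builtin or itertools function):

    from collections import deque

    def exhaust(iterator):
        # Consume an iterator completely, discarding all values.
        deque(iterator, maxlen=0)

    exhaust(print(i) for i in range(3))   # prints 0, 1, 2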
It could return the value for the last() semantics, but I think exhaustion would often be more important than the last value.
For last(),
    >>> deque((i for i in range(1000)), 1)[0]
    999
In that case the deque only has enough room to remember one element, and so remembers the last one it sees. Of course this generalizes to larger values too:
    >>> for x in deque((i for i in range(1000)), 5):
    ...     print(x)
    995
    996
    997
    998
    999
I think I'd like to see itertools add a `drop(iterable, n=None)` function. If `n` is not given, it would consume the entire iterable. Else, for an integer n >= 0, it would return an iterator that skips over the first `n` values of the input iterable.

`drop n xs` has been in Haskell forever, and is also in the Python itertoolz package: http://toolz.readthedocs.io/en/latest/api.html#toolz.itertoolz.drop

I'm not happy about switching the argument order from those, but would really like to omit `n` as a way to spell "pretend n is infinity", so there would be no more need for the "empty deque" trick.
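A rough sketch of such a drop() in terms of existing itertools (hypothetical: the name and argument order follow Tim's description, not any actual stdlib function):

    import itertools
    from collections import deque

    def drop(iterable, n=None):
        if n is None:
            deque(iterable, maxlen=0)   # "pretend n is infinity": consume everything
            return iter(())
        return itertools.islice(iterable, n, None)

    print(list(drop(range(10), 3)))   # [3, 4, 5, 6, 7, 8, 9]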
Greg Ewing writes:
Kyle Lahnakoski wrote:
Consider Serhiy Storchaka's elegant solution, which I reformatted for readability
    smooth_signal = [
        average
        for average in [0]
        for x in signal
        for average in [(1-decay)*average + decay*x]
    ]
"Elegant" isn't the word I would use, more like "clever". Rather too clever, IMO -- it took me some head scratching to figure out how it does what it does.
After reading the thread where it was first mentioned (on what, I now forget; I guess it was a PEP 572 precursor discussion?), I cannot unsee the "variable for variable in singleton" initialization idiom. YMMV, of course. That's just my experience.
And it would have taken even more head scratching, except there's a clue as to *what* it's supposed to be doing: the fact that it's assigned to something called "smooth_signal"
Of course that hint was welcome, and hand to scalp motion was initiated. But then I "got it" and scratched my dog's head instead of my own. :-) Could we find a better syntax to express this? Probably, but none of the ones I've seen so far (including PEP 572) grab me and make my heart throb. Is this TOOWTDI? Not yet, and maybe never. But for now it works. Steve
Kyle, you sounded so reasonable when you were trashing itertools.accumulate (which I now agree is horrible). But then you go and support Serhiy's madness:

    smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]

which I agree is clever, but reads more like a riddle than readable code.

Anyway, I continue to stand by:

    (y := f(y, x) for x in iter_x from y=initial_y)

And, if that's not offensive enough, to its extension:

    (z, y := f(z, x) -> y for x in iter_x from z=initial_z)

Which carries state "z" forward but only yields "y" at each iteration. (See proposal: https://github.com/petered/peps/blob/master/pep-9999.rst)

Why am I so obsessed? Because it will allow you to conveniently replace classes with more clean, concise, functional code. People who thought they never needed such a construct may suddenly start finding it indispensable once they get used to it.

How many times have you written something of the form?:

    class StatefulThing(object):

        def __init__(self, initial_state, param_1, param_2):
            self._param_1 = param_1
            self._param_2 = param_2
            self._state = initial_state

        def update_and_get_output(self, new_observation):  # (or just __call__)
            self._state = do_some_state_update(self._state, new_observation, self._param_1)
            output = transform_state_to_output(self._state, self._param_2)
            return output

    processor = StatefulThing(initial_state=initial_state, param_1=1, param_2=4)
    processed_things = [processor.update_and_get_output(x) for x in x_gen]

I've done this many times. Video encoding, robot controllers, neural networks, any iterative machine learning algorithm, and probably lots of things I don't know about - they all tend to have this general form.

And how many times have I had issues like "Oh no, now I want to change param_1 on the fly instead of just setting it on initialization, I guess I have to refactor all usages of this class to pass param_1 into update_and_get_output instead of __init__"?

What if instead I could just write:

    def update_and_get_output(last_state, new_observation, param_1, param_2):
        new_state = do_some_state_update(last_state, new_observation, param_1)
        output = transform_state_to_output(last_state, param_2)
        return new_state, output

    processed_things = [state, output := update_and_get_output(state, x, param_1=1, param_2=4) -> output
                        for x in observations from state=initial_state]

Now we have:
- No mutable objects (which cuts off a whole slew of potential bugs and anti-patterns familiar to people who do OOP).
- Fewer lines of code.
- Looser assumptions on usage and less refactoring. (If I want to now pass in param_1 at each iteration instead of just initialization, I need to make no changes to update_and_get_output.)
- No need for state getters/setters, since state is passed around explicitly.

I realize that calling for changes to syntax is a lot to ask - but I still believe that the main objections to this syntax would also have been raised as objections to the now-ubiquitous list comprehensions - they seem hostile and alien-looking at first, but very lovable once you get used to them.

On Sun, Apr 8, 2018 at 1:41 PM, Kyle Lahnakoski <klahnakoski@mozilla.com> wrote:
On 2018-04-05 21:18, Steven D'Aprano wrote:
(I don't understand why so many people have such an aversion to writing functions and seek to eliminate them from their code.)
I think I am one of those people that have an aversion to writing functions!
I hope you do not mind that I attempt to explain my aversion here. I want to clarify my thoughts on this, and maybe others will find something useful in this explanation, maybe someone has wise words for me. I think this is relevant to python-ideas because someone with this aversion will make different language suggestions than those that don't.
Here is why I have an aversion to writing functions: Every unread function represents multiple unknowns in the code. Every function adds to code complexity by mapping an inaccurate name to specific functionality.
When I read code, this is what I see:
    x = you_will_never_guess_how_corner_cases_are_handled(a, b, c)
    y = you_dont_know_I_throw_a_BaseException_when_I_do_not_like_your_arguments(j, k, l)
Not everyone sees code this way: I see people read method calls, make a number of wild assumptions about how those methods work, AND THEY ARE CORRECT! How do they do it!? It is as if there are some unspoken convention about how code should work that's opaque to me.
For example before I read the docs on itertools.accumulate(list_of_length_N, func), here are the unknowns I see:
* Does it return N, or N-1 values?
* How are initial conditions handled?
* Must `func` perform the initialization by accepting just one parameter, and accumulate with more-than-one parameter?
* If `func` is a binary function, and `accumulate` returns N values, what's the Nth value?
* If `func` is a non-commutative binary function, what order are the arguments passed?
* Maybe accumulate expects func(*args)?
* Is there a window size? Is it equal to the number of arguments of `func`?
These are not all answered by reading the docs, they are answered by reading the code. The code tells me the first value is a special case; the first parameter of `func` is the accumulated `total`; `func` is applied in order; and an iterator is returned. Despite all my questions, notice I missed asking what `accumulate` returns? It is the unknown unknowns that get me most.
So, `itertools.accumulate` is a kinda-inaccurate name given to a specific functionality: Not a problem on its own, and even delightfully useful if I need it often.
What if I am in a domain where I see `accumulate` only a few times a year? Or how about a program that uses `accumulate` in only one place? For me, I must (re)read the `accumulate` source (or run the caller through the debugger) before I know what the code is doing. In these cases I advocate for in-lining the function code to remove these unknowns. Instead of an inaccurate name, there is explicit code. If we are lucky, that explicit code follows idioms that make the increased verbosity easier to read.
Consider Serhiy Storchaka's elegant solution, which I reformatted for readability
    smooth_signal = [
        average
        for average in [0]
        for x in signal
        for average in [(1-decay)*average + decay*x]
    ]
We see the initial conditions, we see the primary function, we see how the accumulation happens, we see the number of returned values, and we see it's a list. It is a compact, easy read, from top to bottom. Yes, we must know `for x in [y]` is an idiom for assignment, but we can reuse that knowledge in all our other list comprehensions. So, in the specific case of this Reduce-Map thread, I would advocate using the list comprehension.
In general, all functions introduce non-trivial code debt: This debt is worth it if the function is used enough; but, in single-use or rare-use cases, functions can obfuscate.
Thank you for your time.
I continue to find all this weird new syntax to create absurdly long one-liners confusing and mysterious. Python is not Perl for a reason. On Mon, Apr 9, 2018, 5:55 PM Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Kyle, you sounded so reasonable when you were trashing itertools.accumulate (which I now agree is horrible). But then you go and support Serhiy's madness: "smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]" which I agree is clever, but reads more like a riddle than readable code.
Anyway, I continue to stand by:
(y:= f(y, x) for x in iter_x from y=initial_y)
And, if that's not offensive enough, to its extension:
(z, y := f(z, x) -> y for x in iter_x from z=initial_z)
Which carries state "z" forward but only yields "y" at each iteration. (see proposal: https://github.com/petered/peps/blob/master/pep-9999.rst)
Why am I so obsessed? Because it will allow you to conveniently replace classes with more clean, concise, functional code. People who thought they never needed such a construct may suddenly start finding it indispensable once they get used to it.
How many times have you written something of the form?:
class StatefulThing(object):
        def __init__(self, initial_state, param_1, param_2):
            self._param_1 = param_1
            self._param_2 = param_2
            self._state = initial_state
        def update_and_get_output(self, new_observation):  # (or just __call__)
            self._state = do_some_state_update(self._state, new_observation, self._param_1)
            output = transform_state_to_output(self._state, self._param_2)
            return output
    processor = StatefulThing(initial_state=initial_state, param_1=1, param_2=4)
    processed_things = [processor.update_and_get_output(x) for x in x_gen]
I've done this many times. Video encoding, robot controllers, neural networks, any iterative machine learning algorithm, and probably lots of things I don't know about - they all tend to have this general form.
And how many times have I had issues like "Oh no now I want to change param_1 on the fly instead of just setting it on initialization, I guess I have to refactor all usages of this class to pass param_1 into update_and_get_output instead of __init__".
What if instead I could just write:
    def update_and_get_output(last_state, new_observation, param_1, param_2):
        new_state = do_some_state_update(last_state, new_observation, param_1)
        output = transform_state_to_output(last_state, param_2)
        return new_state, output
    processed_things = [state, output := update_and_get_output(state, x, param_1=1, param_2=4) -> output
                        for x in observations from state=initial_state]
Now we have:
- No mutable objects (which cuts off a whole slew of potential bugs and anti-patterns familiar to people who do OOP).
- Fewer lines of code.
- Looser assumptions on usage and less refactoring. (If I want to now pass in param_1 at each iteration instead of just initialization, I need to make no changes to update_and_get_output.)
- No need for state getters/setters, since state is passed around explicitly.
I realize that calling for changes to syntax is a lot to ask - but I still believe that the main objections to this syntax would also have been raised as objections to the now-ubiquitous list-comprehensions - they seem hostile and alien-looking at first, but very lovable once you get used to them.
On Sun, Apr 8, 2018 at 1:41 PM, Kyle Lahnakoski <klahnakoski@mozilla.com> wrote:
On 2018-04-05 21:18, Steven D'Aprano wrote:
(I don't understand why so many people have such an aversion to writing functions and seek to eliminate them from their code.)
I think I am one of those people that have an aversion to writing functions!
I hope you do not mind that I attempt to explain my aversion here. I want to clarify my thoughts on this, and maybe others will find something useful in this explanation, maybe someone has wise words for me. I think this is relevant to python-ideas because someone with this aversion will make different language suggestions than those that don't.
Here is why I have an aversion to writing functions: Every unread function represents multiple unknowns in the code. Every function adds to code complexity by mapping an inaccurate name to specific functionality.
When I read code, this is what I see:
    x = you_will_never_guess_how_corner_cases_are_handled(a, b, c)
    y = you_dont_know_I_throw_a_BaseException_when_I_do_not_like_your_arguments(j, k, l)
Not everyone sees code this way: I see people read method calls, make a number of wild assumptions about how those methods work, AND THEY ARE CORRECT! How do they do it!? It is as if there are some unspoken convention about how code should work that's opaque to me.
For example before I read the docs on itertools.accumulate(list_of_length_N, func), here are the unknowns I see:
* Does it return N, or N-1 values?
* How are initial conditions handled?
* Must `func` perform the initialization by accepting just one parameter, and accumulate with more-than-one parameter?
* If `func` is a binary function, and `accumulate` returns N values, what's the Nth value?
* If `func` is a non-commutative binary function, what order are the arguments passed?
* Maybe accumulate expects func(*args)?
* Is there a window size? Is it equal to the number of arguments of `func`?
These are not all answered by reading the docs, they are answered by reading the code. The code tells me the first value is a special case; the first parameter of `func` is the accumulated `total`; `func` is applied in order; and an iterator is returned. Despite all my questions, notice I missed asking what `accumulate` returns? It is the unknown unknowns that get me most.
So, `itertools.accumulate` is a kinda-inaccurate name given to a specific functionality: Not a problem on its own, and even delightfully useful if I need it often.
What if I am in a domain where I see `accumulate` only a few times a year? Or how about a program that uses `accumulate` in only one place? For me, I must (re)read the `accumulate` source (or run the caller through the debugger) before I know what the code is doing. In these cases I advocate for in-lining the function code to remove these unknowns. Instead of an inaccurate name, there is explicit code. If we are lucky, that explicit code follows idioms that make the increased verbosity easier to read.
Consider Serhiy Storchaka's elegant solution, which I reformatted for readability
    smooth_signal = [
        average
        for average in [0]
        for x in signal
        for average in [(1-decay)*average + decay*x]
    ]
We see the initial conditions, we see the primary function, we see how the accumulation happens, we see the number of returned values, and we see it's a list. It is a compact, easy read, from top to bottom. Yes, we must know `for x in [y]` is an idiom for assignment, but we can reuse that knowledge in all our other list comprehensions. So, in the specific case of this Reduce-Map thread, I would advocate using the list comprehension.
In general, all functions introduce non-trivial code debt: This debt is worth it if the function is used enough; but, in single-use or rare-use cases, functions can obfuscate.
Thank you for your time.
On 10/04/2018 at 00:54, Peter O'Connor wrote:
Kyle, you sounded so reasonable when you were trashing itertools.accumulate (which I now agree is horrible). But then you go and support Serhiy's madness: "smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]" which I agree is clever, but reads more like a riddle than readable code.
Anyway, I continue to stand by:
(y:= f(y, x) for x in iter_x from y=initial_y)
And, if that's not offensive enough, to its extension:
(z, y := f(z, x) -> y for x in iter_x from z=initial_z)
Which carries state "z" forward but only yields "y" at each iteration. (see proposal: https://github.com/petered/peps/blob/master/pep-9999.rst <https://github.com/petered/peps/blob/master/pep-9999.rst>)
Why am I so obsessed? Because it will allow you to conveniently replace classes with more clean, concise, functional code. People who thought they never needed such a construct may suddenly start finding it indispensable once they get used to it.
How many times have you written something of the form?:
    class StatefulThing(object):

        def __init__(self, initial_state, param_1, param_2):
            self._param_1 = param_1
            self._param_2 = param_2
            self._state = initial_state

        def update_and_get_output(self, new_observation):  # (or just __call__)
            self._state = do_some_state_update(self._state, new_observation, self._param_1)
            output = transform_state_to_output(self._state, self._param_2)
            return output

    processor = StatefulThing(initial_state=initial_state, param_1=1, param_2=4)
    processed_things = [processor.update_and_get_output(x) for x in x_gen]

I've done this many times. Video encoding, robot controllers, neural networks, any iterative machine learning algorithm, and probably lots of things I don't know about - they all tend to have this general form.
Personally I never have to do that very often. But let's say for the sake of the argument there is a class of problem a part of the Python community often solves with this pattern. After all, Python is a versatile language with a very large and diverse user base.

First, why would a class be a bad thing? It's clear, easy to understand, debug and extend. Besides, do_some_state_update and transform_state_to_output may very well be methods.

Second, if you really don't want a class, use a coroutine, that's exactly what they are for:

    def stateful_thing(state, param_1, param_2, output=None):
        while True:
            new_observation = yield output
            state = do_some_state_update(state, new_observation, param_1)
            output = transform_state_to_output(state, param_2)

    processor = stateful_thing(1, 1, 4)
    next(processor)
    processed_things = [processor.send(x) for x in x_gen]

If you have that much of a complex workflow, you really should not make that a one-liner.

And before trying to ask for a new syntax in the language, try to solve the problem with the existing tools.

I know, I get the frustration. I've been trying to get slicing on generators and inline try/except on this mailing list for years and I've been told no again and again. It's hard. But it's also why Python stayed sane for decades.
First, why would a class be a bad thing? It's clear, easy to understand, debug and extend.
- Lots of redundant-looking "frameworky" lines of code: "self._param_1 = param_1"
- Potential for opaque state changes: the caller doesn't know if "y = my_object.do_something(x)" has any side effect, whereas with ("y, new_state = do_something(state, x)" / "y = do_something(state, x)") it's clear that there (is / is not).
- Makes more assumptions on usage (should I add "param_1" as an arg to "StatefulThing.__init__" or to "StatefulThing.update_and_get_output"?).
And before trying to ask for a new syntax in the language, try to solve the problem with the existing tools.
Oh I have, and of course there are ways but I find them all clunkier than needed. I added your coroutine to the freak show: https://github.com/petered/peters_example_code/blob/master/peters_example_co...
    processor = stateful_thing(1, 1, 4)
    next(processor)
    processed_things = [processor.send(x) for x in x_gen]
I *almost* like the coroutine thing but find it unusable because the peculiarity of having to initialize the generator when you use it (you do it with next(processor)) is pretty much guaranteed to lead to errors when people forget to do it. Earlier in the thread Steven D'Aprano showed how a @coroutine decorator can get around this: https://github.com/petered/peters_example_code/blob/master/peters_example_co... - Still, the whole coroutine thing feels a bit magical, hacky and "clever". Also, the use of generator.send will probably confuse around 90% of programmers.
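A sketch of how the two pieces combine so that the priming step can't be forgotten (coroutine is Steven's decorator from earlier in the thread; stateful_thing, do_some_state_update and transform_state_to_output are from Michel's example):

    @coroutine
    def stateful_thing(state, param_1, param_2, output=None):
        while True:
            new_observation = yield output
            state = do_some_state_update(state, new_observation, param_1)
            output = transform_state_to_output(state, param_2)

    processor = stateful_thing(1, 1, 4)   # already primed by the decorator, no next() needed
    processed_things = [processor.send(x) for x in x_gen]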
If you have that much of a complex workflow, you really should not make that a one-liner.
It's not a complex workflow, it's a moving average. It just seems complex because we don't have a nice, compact way to describe it.
I've been trying to get slicing on generators and inline try/except on this mailing list for years and I've been told no again and again. It's hard. But it's also why Python stayed sane for decades.
Hey, I'll support your campaign if you support mine.

On Tue, Apr 10, 2018 at 4:18 AM, Michel Desmoulin <desmoulinmichel@gmail.com> wrote:
On 10/04/2018 at 00:54, Peter O'Connor wrote:
Kyle, you sounded so reasonable when you were trashing itertools.accumulate (which I now agree is horrible). But then you go and support Serhiy's madness: "smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]" which I agree is clever, but reads more like a riddle than readable code.
Anyway, I continue to stand by:
(y:= f(y, x) for x in iter_x from y=initial_y)
And, if that's not offensive enough, to its extension:
(z, y := f(z, x) -> y for x in iter_x from z=initial_z)
Which carries state "z" forward but only yields "y" at each iteration. (see proposal: https://github.com/petered/peps/blob/master/pep-9999.rst <https://github.com/petered/peps/blob/master/pep-9999.rst>)
Why am I so obsessed? Because it will allow you to conveniently replace classes with more clean, concise, functional code. People who thought they never needed such a construct may suddenly start finding it indispensable once they get used to it.
How many times have you written something of the form?:
class StatefulThing(object):
        def __init__(self, initial_state, param_1, param_2):
            self._param_1 = param_1
            self._param_2 = param_2
            self._state = initial_state
        def update_and_get_output(self, new_observation):  # (or just __call__)
            self._state = do_some_state_update(self._state, new_observation, self._param_1)
            output = transform_state_to_output(self._state, self._param_2)
            return output
    processor = StatefulThing(initial_state=initial_state, param_1=1, param_2=4)
    processed_things = [processor.update_and_get_output(x) for x in x_gen]
I've done this many times. Video encoding, robot controllers, neural networks, any iterative machine learning algorithm, and probably lots of things I don't know about - they all tend to have this general form.
Personally I never have to do that very often. But let's say for the sake of the argument there is a class of problem a part of the Python community often solves with this pattern. After all, Python is a versatile language with a very large and diverse user base.
First, why would a class be a bad thing? It's clear, easy to understand, debug and extend. Besides, do_some_state_update and transform_state_to_output may very well be methods.
Second, if you really don't want a class, use a coroutine, that's exactly what they are for:
    def stateful_thing(state, param_1, param_2, output=None):
        while True:
            new_observation = yield output
            state = do_some_state_update(state, new_observation, param_1)
            output = transform_state_to_output(state, param_2)
    processor = stateful_thing(1, 1, 4)
    next(processor)
    processed_things = [processor.send(x) for x in x_gen]
If you have that much of a complex workflow, you really should not make that a one-liner.
And before trying to ask for a new syntax in the language, try to solve the problem with the existing tools.
I know, I get the frustration.
I've been trying to get slicing on generators and inline try/except on this mailing list for years and I've been told no again and again. It's hard. But it's also why Python stayed sane for decades.
On Tue, Apr 10, 2018 at 12:18:27PM -0400, Peter O'Connor wrote: [...]
I added your coroutine to the freak show:
Peter, I realise that you're a fan of functional programming idioms, and I'm very sympathetic to that. I'm a fan of judicious use of FP too, and while I'm not keen on your specific syntax, I am interested in the general concept and would like it to have the best possible case made for it. But even I find your use of dysphemisms like "freak show" for non-FP solutions quite off-putting. (I think this is the second time you've used the term.) Python is not a functional programming language like Haskell, it is a multi-paradigm language with strong support for OO and procedural idioms. Notwithstanding the problems with OO idioms that you describe, many Python programmers find OO "better", simpler to understand, learn and maintain than FP. Or at least more familiar. The rejection or approval of features into Python is not a popularity contest, ultimately it only requires one person (Guido) to either reject or approve a new feature. But popular opinion is not irrelevant either: like all benevolent dictators, Guido has a good sense of what's popular, and takes it into account in his considerations. If you put people off-side, you hurt your chances of having this feature approved. [...]
I *almost* like the coroutine thing but find it unusable because the peculiarity of having to initialize the generator when you use it (you do it with next(processor)) is pretty much guaranteed to lead to errors when people forget to do it. Earlier in the thread Steven D'Aprano showed how a @coroutine decorator can get around this:
I agree that the (old-style, pre-async) coroutine idiom is little known, in part because of the awkwardness needed to make it work. Nevertheless, I think your argument about it leading to errors is overstated: if you forget to initialize the coroutine, you get a clear and obvious failure:

    py> def co():
    ...     x = (yield 1)
    ...
    py> a = co()
    py> a.send(99)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: can't send non-None value to a just-started generator
- Still, the whole coroutine thing feels a bit magical, hacky and "clever". Also the use of generator.send will probably confuse around 90% of programmers.
In my experience, heavy use of FP idioms will probably confuse about the same percentage. Including me: I like FP in moderation, I wouldn't want to use a strict 100% functional language, and if someone even says the word "Monad" I break out in hives.
If you have that much of a complex workflow, you really should not make
that a one-liner.
It's not a complex workflow, it's a moving average. It just seems complex because we don't have a nice, compact way to describe it.
Indeed. But it seems to me that itertools.accumulate() with an initial value probably will solve that issue. Besides... moving averages aren't so common that they *necessarily* need syntactic support. Wrapping the complexity in a function, then calling the function, may be an acceptable solution instead of putting the complexity directly into the language itself. The Conservation Of Complexity Principle suggests that complexity cannot be created or destroyed, only moved around. If we reduce the complexity of the Python code needed to write a moving average, we invariably increase the complexity of the language, the interpreter, and the amount of syntax people need to learn in order to be productive with Python. -- Steve
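As a concrete sketch of the suggestion above (itertools.accumulate had no initial= argument at the time, it gained one in Python 3.8, so here the seed is supplied with itertools.chain and then dropped; signal is assumed to be the list defined earlier in the thread):

    from itertools import accumulate, chain

    decay = 0.05
    # chain() supplies the seed value; the trailing [1:] drops it from the output.
    smooth_signal = list(accumulate(chain([0.0], signal),
                                    lambda avg, x: (1 - decay) * avg + decay * x))[1:]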
On 10/04/18 18:32, Steven D'Aprano wrote:
On Tue, Apr 10, 2018 at 12:18:27PM -0400, Peter O'Connor wrote:
[...]
I added your coroutine to the freak show: Peter, I realise that you're a fan of functional programming idioms, and I'm very sympathetic to that. I'm a fan of judicious use of FP too, and while I'm not keen on your specific syntax, I am interested in the general concept and would like it to have the best possible case made for it.
But even I find your use of dysphemisms like "freak show" for non-FP solutions quite off-putting. (I think this is the second time you've used the term.)
Thank you for saying that, Steven. I must admit I was beginning to find the implicit insults rather grating. -- Rhodri James *-* Kynesim Ltd
But even I find your use of dysphemisms like "freak show" for non-FP solutions quite off-putting.
Ah, I'm sorry, "freak show" was not meant to be disparaging to the authors or even the code itself, but to describe the variety of strange solutions (my own included) to this simple problem.

Indeed. But it seems to me that itertools.accumulate() with an initial value probably will solve that issue.

Kyle Lahnakoski made a pretty good case for not using itertools.accumulate() earlier in this thread, and Tim Peters made the point that its non-initialized behaviour can be extremely unintuitive (try "print(list(itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y))))"). These convinced me that itertools.accumulate should be avoided altogether.

Alternatively, if anyone has a proposed syntax that does the same thing as Serhiy Storchaka's:

    smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]

but in a way that more intuitively expresses the intent of the code, it would be great to have more options on the market.
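For reference, the surprising accumulate behaviour mentioned above looks like this: the first element is yielded untouched rather than being passed through the function, so the result mixes types:

>>> import itertools
>>> list(itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y)))
[1, '12', '123']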
On 10 April 2018 at 19:25, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Kyle Lahnakoski made a pretty good case for not using itertools.accumulate() earlier in this thread
I wouldn't call it a "pretty good case". He argued that writing *functions* was a bad thing, because the name of a function didn't provide all the details of what was going on in the same way that explicitly writing the code inline would do. That seems to me to be a somewhat bizarre argument - after all, encapsulation and abstraction are pretty fundamental to programming. I'm not even sure he had any specific comments about accumulate other than his general point that as a named function it's somehow worse than writing out the explicit loop.
But in a way that more intuitively expresses the intent of the code, it would be great to have more options on the market.
It's worth adding a reminder here that "having more options on the market" is pretty directly in contradiction to the Zen of Python - "There should be one-- and preferably only one --obvious way to do it". Paul
On Tue, Apr 10, 2018 at 08:12:14PM +0100, Paul Moore wrote:
On 10 April 2018 at 19:25, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Kyle Lahnakoski made a pretty good case for not using itertools.accumulate() earlier in this thread
I wouldn't call it a "pretty good case". He argued that writing *functions* was a bad thing, because the name of a function didn't provide all the details of what was going on in the same way that explicitly writing the code inline would do. That seems to me to be a somewhat bizarre argument - after all, encapsulation and abstraction are pretty fundamental to programming. I'm not even sure he had any specific comments about accumulate other than his general point that as a named function it's somehow worse than writing out the explicit loop.
I agree with Paul here -- I think that Kyle's argument is idiosyncratic. It isn't going to stop me from writing functions :-)
But in a way that more intuitively expresses the intent of the code, it would be great to have more options on the market.
It's worth adding a reminder here that "having more options on the market" is pretty directly in contradiction to the Zen of Python - "There should be one-- and preferably only one --obvious way to do it".
I'm afraid I'm going to (mildly) object here. At least you didn't misquote the Zen as "Only One Way To Do It" :-)

The Zen here is not a prohibition against there being multiple ways to do something -- how could it be, given that Python is a general purpose programming language and there are always going to be multiple ways to write any piece of code? Rather, it exhorts us to make sure that there are one or more ways to "do it", at least one of which is obvious.

And since "it" is open to interpretation, we can legitimately wonder whether (for example):

- for loops
- list comprehensions
- list(generator expression)

etc are three different ways to do "it", or three different "it"s. If we wish to dispute the old slander that Python has Only One Way to do anything, then we can emphasise the similarities and declare them three ways; if we want to defend the Zen, we can emphasise the differences and declare them to be three different "it"s.

So I think Peter is on reasonable ground to suggest this, if he can make a good enough case for it.

Personally, I still think the best approach here is a combination of itertools.accumulate, and the proposed name-binding as an expression feature:

    total = 0
    running_totals = [(total := total + x) for x in values]
    # alternative syntax
    running_totals = [(total + x as total) for x in values]

If you don't like the dependency on an external variable (or if that turns out not to be practical) then we could have:

    running_totals = [(total := total + x) for total in [0] for x in values]

-- Steve
On Wed, Apr 11, 2018 at 1:41 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Personally, I still think the best approach here is a combination of itertools.accumulate, and the proposed name-binding as an expression feature:
    total = 0
    running_totals = [(total := total + x) for x in values]
    # alternative syntax
    running_totals = [(total + x as total) for x in values]
If you don't like the dependency on an external variable (or if that turns out not to be practical) then we could have:
running_totals = [(total := total + x) for total in [0] for x in values]
That last one works, but it's not exactly pretty. Using an additional 'for' loop to initialize variables feels like a gross hack. Unfortunately, the first one is equivalent to this (in a PEP 572 world):

    total = 0
    def <listcomp>():
        result = []
        for x in values:
            result.append(total := total + x)
        return result
    running_totals = <listcomp>()

Problem: it's still happening in a function, which means this bombs with UnboundLocalError.

Solution 1: Use the extra loop to initialize 'total' inside the comprehension. Ugly.

Solution 2: Completely redefine comprehensions to use subscopes instead of a nested function. I used to think this was a good thing, but after the discussions here, I've found that this creates as many problems as it solves.

Solution 3: Have some way for a comprehension to request that a name be imported from the surrounding context. Effectively this:

    total = 0
    def <listcomp>(total=total):
        result = []
        for x in values:
            result.append(total := total + x)
        return result
    running_totals = <listcomp>()

This is how, in a PEP 572 world, the oddities of class scope are resolved. (I'll be posting a new PEP as soon as I fix up three failing CPython tests.) It does have its own problems, though. How do you know which names to import like that? What if 'total' wasn't assigned to right there, but instead was being lifted from a scope further out?

Solution 4: Have *all* local variables in a comprehension get initialized to None.

    def <listcomp>():
        result = []
        total = x = None
        for x in values:
            result.append(total := (total or 0) + x)
        return result
    running_totals = <listcomp>()

    running_totals = [(total := (total or 0) + x) for total in [0] for x in values]

That'd add to the run-time cost of every list comp, but probably not measurably. (Did you know, for instance, that "except Exception as e:" will set e to None before unbinding it?) It's still not exactly pretty, though, and having to explain why you have "or 0" in a purely arithmetic operation may not quite work.

Solution 5: Allow an explicit initializer syntax. Could work, but you'd have to come up with one that people are happy with.

None of these is truly ideal IMO.

ChrisA
On 11 April 2018 at 04:41, Steven D'Aprano <steve@pearwood.info> wrote:
But in a way that more intuitively expresses the intent of the code, it would be great to have more options on the market.
It's worth adding a reminder here that "having more options on the market" is pretty directly in contradiction to the Zen of Python - "There should be one-- and preferably only one --obvious way to do it".
I'm afraid I'm going to (mildly) object here. At least you didn't misquote the Zen as "Only One Way To Do It" :-)
The Zen here is not a prohibition against there being multiple ways to do something -- how could it be, given that Python is a general purpose programming language and there are always going to be multiple ways to write any piece of code? Rather, it exhorts us to make sure that there are one or more ways to "do it", at least one of which is obvious.
I apologise if I came across as implying that I thought the Zen said that having multiple ways was prohibited. I don't (and certainly the Zen doesn't mean that). Rather, I was saying that using "it gives us an additional way to do something" is a bad argument in favour of a proposal for Python. At a minimum, the proposal needs to argue why the new feature is "more obvious" than the existing ways (bonus points if the proposer is Dutch - see the following Zen item ;-)), or why it offers a capability that isn't possible with the existing language. And I'm not even saying that the OP hasn't attempted to make such arguments (even if I disagree with them). All I was pointing out was that the comment "it would be great to have more options on the market" implies a misunderstanding of the design goals of Python (hence my "reminder" of the principle I think is relevant here). Sorry again if that's not what it sounded like. Paul
It's worth adding a reminder here that "having more options on the market" is pretty directly in contradiction to the Zen of Python - "There should be one-- and preferably only one --obvious way to do it".
I've got to start minding my words more. By "options on the market" I meant it more in a "candidates for the job" sense: in the end we'd select just one, which would in retrospect (or if Dutch) seem like the obvious choice. Not that "everyone who uses Python should have more ways to do this". My reason for starting this is that there isn't "one obvious way" to do this type of operation now (as the diversity of the exponential-moving-average "zoo" <https://github.com/petered/peters_example_code/blob/master/peters_example_co...> attests).

------

Let's look at a task where there is "one obvious way".

Suppose someone asks: "How can I build a list of squares of the first 100 odd numbers [1, 9, 25, 49, ....] in Python?" The answer is now obvious - few people would do this:

    list_of_odd_squares = []
    for i in range(100):
        list_of_odd_squares.append((i*2+1)**2)

or this:

    def iter_odd_squares(n):
        for i in range(n):
            yield (i*2+1)**2

    list_of_odd_squares = list(iter_odd_squares(100))

Because it's just more clean, compact, readable and "obvious" to do:

    list_of_odd_squares = [(i*2+1)**2 for i in range(100)]

Maybe I'm being presumptuous, but I think most Python users would agree.

-------

Now let's switch our task to computing the exponential moving average of a list. This is a stand-in for a HUGE range of tasks that involve carrying some state-variable forward while producing values.

Some would do this:

    smooth_signal = []
    average = 0
    for x in signal:
        average = (1-decay)*average + decay*x
        smooth_signal.append(average)

Some would do this:

    def moving_average(signal, decay, initial=0):
        average = initial
        for x in signal:
            average = (1-decay)*average + decay*x
            yield average

    smooth_signal = list(moving_average(signal, decay=decay))

Lovers of one-liners like Serhiy would do this:

    smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]

Some would scoff at the cryptic one-liner and do this:

    def update_moving_average(avg, x, decay):
        return (1-decay)*avg + decay*x

    smooth_signal = list(itertools.accumulate(itertools.chain([0], signal), func=functools.partial(update_moving_average, decay=decay)))

And others would scoff at that and make a class, or use coroutines.

------

There've been many suggestions in this thread (all documented here: https://github.com/petered/peters_example_code/blob/master/peters_example_co...) and that's good, but it seems clear that people do not agree on an "obvious" way to do things. I claim that if

    smooth_signal = [average := (1-decay)*average + decay*x for x in signal from average=0.]

were allowed, it would become the "obvious" way.

Chris Angelico's suggestions are close to this and have the benefit of requiring no new syntax in a PEP 572 world:

    smooth_signal = [(average := (1-decay)*average + decay*x) for average in [0] for x in signal]

or

    smooth_signal = [(average := (1-decay)*(average or 0) + decay*x) for x in signal]

or

    average = 0
    smooth_signal = [(average := (1-decay)*average + decay*x) for x in signal]

But they all have oddities that detract from their "obviousness", and the oddities stem from there not being a built-in way to initialize. In the first, there is the odd "for average in [0]" initializer. The second relies on a hidden "average = None" which is not obvious at all, and the third has the problem that the initial value is bound to the defining scope instead of belonging to the generator. All seem to have oddly redundant brackets whose purpose is not obvious, but maybe there's a good reason for that.
If people are happy with these solutions and still see no need for the initialization syntax, we can stop this, but as I see it there is a "hole" in the language that needs to be filled.
On Thu, Apr 12, 2018 at 12:37 AM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Let's look at a task where there is "one obvious way"
Suppose someone asks: "How can I build a list of squares of the first 100 odd numbers [1, 9, 25, 49, ....] in Python?" The answer is now obvious - few people would do this:
    list_of_odd_squares = []
    for i in range(100):
        list_of_odd_squares.append((i*2+1)**2)
or this:
    def iter_odd_squares(n):
        for i in range(n):
            yield (i*2+1)**2
list_of_odd_squares = list(iter_odd_squares(100))
Because it's just more clean, compact, readable and "obvious" to do:
    list_of_odd_squares = [(i*2+1)**2 for i in range(100)]
Maybe I'm being presumptuous, but I think most Python users would agree.
Or:

    squares = [i**2 for i in range(1, 200, 2)]

So maybe even the obvious examples aren't quite as obvious as you might think.

ChrisA
On 11 April 2018 at 15:37, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
If people are happy with these solutions and still see no need for the initialization syntax, we can stop this, but as I see it there is a "hole" in the language that needs to be filled.
Personally, I'm happy with those solutions and see no need for the initialisation syntax. In particular, I'm happiest with the named moving_average() function, which may reflect to some extent my lack of familiarity with the subject area. I don't *care* how it's implemented internally - an explicit loop is fine with me, but if a domain expert wants to be clever and use something more complex, I don't need to know. An often missed disadvantage of one-liners is that they get put inline, meaning that people looking for a higher level overview of what the code does get confronted with all the gory details. Paul
On Wed, Apr 11, 2018 at 10:50 AM, Paul Moore <p.f.moore@gmail.com> wrote:
In particular, I'm happiest with the named moving_average() function, which may reflect to some extent my lack of familiarity with the subject area. I don't *care* how it's implemented internally - an explicit loop is fine with me, but if a domain expert wants to be clever and use something more complex, I don't need to know. An often missed disadvantage of one-liners is that they get put inline, meaning that people looking for a higher level overview of what the code does get confronted with all the gory details.
I'm all in favour of hiding things away into functions - I just think those functions should be as basic as possible, without implicit assumptions about how they will be used. Let me give an example:

----

Let's look at your preferred method (A):

    def moving_average(signal_iterable, decay, initial=0):
        last_average = initial
        for x in signal_iterable:
            last_average = (1-decay)*last_average + decay*x
            yield last_average

    moving_average_gen = moving_average(signal, decay=decay, initial=initial)

And compare it with (B), which would require the proposed syntax:

    def moving_average_step(last_average, x, decay):
        return (1-decay)*last_average + decay*x

    moving_average_gen = (average := moving_average_step(average, x, decay=decay) for x in signal from x=initial)

-----

Now, suppose we want to change things so that the "decay" changes with every step. The moving_average function (A) now has to be changed, because what we once thought would be a fixed parameter is now a variable that changes between calls. Our options are:

- Make "decay" another iterable (in which case other functions calling "moving_average" need to be changed).
- Leave an option for "decay" to be a float which gets transformed to an iterable with "decay_iter = (decay for _ in itertools.count(0)) if isinstance(decay, (int, float)) else decay". (Awkward, because 95% of usages don't need this. If you do this for more parameters you suddenly have this weird implementation with iterators everywhere, even though in most cases they're not needed.)
- Factor out the "pure" "moving_average_step" from "moving_average", and create a new "moving_average_with_dynamic_decay" wrapper. (But now we have to maintain two wrappers - with the duplicated arguments - which starts to require a lot of maintenance when you're passing down several parameters, or you can use the dreaded **kwargs.)

With approach (B) on the other hand, "moving_average_step" and all the functions calling it can stay the same: we just change the way we call it in this instance to:

    moving_average_gen = (average := moving_average_step(average, x, decay=decay) for x, decay in zip(signal, decay_schedule) from x=initial)

----

Now let's imagine this were a more complex function with 10 parameters. I see these kinds of examples a lot in machine-learning and robotics programs, where you'll have parameters like "learning rate", "regularization", "minibatch_size", "maximum_speed", "height_of_camera" which might initially be considered initialization parameters, but then later it turns out they need to be changed dynamically.

This is why I think the "(y := f(y, x) for x in xs from y=initial)" syntax can lead to cleaner, more maintainable code.
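For comparison, the same separation of a pure step function from the iteration can be sketched today with a tiny generic helper. This is only a sketch (the scan name is invented here; moving_average_step, signal, decay_schedule and initial are taken from the discussion above), not a claim that it removes the need for the proposed syntax:

    def scan(step, xs, state):
        # Generic driver: feed each x plus the previous result into a pure step function.
        for x in xs:
            state = step(state, x)
            yield state

    # Fixed decay:
    moving_average_gen = scan(lambda avg, x: moving_average_step(avg, x, decay=decay),
                              signal, initial)

    # Dynamic decay: only the call site changes; moving_average_step stays the same.
    moving_average_gen = scan(lambda avg, pair: moving_average_step(avg, pair[0], decay=pair[1]),
                              zip(signal, decay_schedule), initial)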
* correction to example:

    moving_average_gen = (average := moving_average_step(average, x, decay=decay) for x in signal from average=initial)
On Wed, Apr 11, 2018 at 1:41 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Personally, I still think the best approach here is a combination of itertools.accumulate, and the proposed name-binding as an expression feature:
    total = 0
    running_totals = [(total := total + x) for x in values]
    # alternative syntax
    running_totals = [(total + x as total) for x in values]
If you don't like the dependency on an external variable (or if that turns out not to be practical) then we could have:
running_totals = [(total := total + x) for total in [0] for x in values]
Linking this to the PEP 572 thread, this is an open question now: https://www.python.org/dev/peps/pep-0572/#importing-names-into-comprehension... Anyone who's interested in (or intrigued by) this potential syntax is very much welcome to hop over to the PEP 572 threads and join in. ChrisA
On 2018-04-08 10:41, Kyle Lahnakoski wrote:
For example before I read the docs on itertools.accumulate(list_of_length_N, func), here are the unknowns I see:
It sounds like you're saying you don't like using functions because you have to read documentation. That may be so, but I don't have much sympathy for that position. One of the most useful features of functions is that they exist as defined chunks of code that can be explicitly documented. Snippets of inline code are harder to document and harder to "address" in the sense of identifying precisely which chunk of code is being documented. If the documentation for accumulate doesn't give the information that people using it need to know, that's a documentation bug for sure, but it doesn't mean we should stop using functions. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown
05.04.18 19:52, Peter O'Connor wrote:
I propose a new "Reduce-Map" comprehension that allows us to write:
    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
Using currently supported syntax:

    smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]
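For readers puzzled by the trick, the nested-for comprehension expands to roughly the following loop (a sketch, assuming signal and decay are defined); each single-element inner loop plays the role of an assignment:

    smooth_signal = []
    for average in [0]:                                      # "initialise" average
        for x in signal:
            for average in [(1-decay)*average + decay*x]:    # "update" average
                smooth_signal.append(average)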
Hi all, thank you for the feedback. I laughed, I cried, and I learned. I looked over all your suggestions and recreated them here: https://github.com/petered/peters_example_code/blob/master/peters_example_co...

I still favour my (y = f(y, x) for x in xs from y=initializer) syntax for a few reasons:

1) By adding an "initialized generator" as a special language construct, we could add a "last" builtin (similar to "next") so that "last(initialized_generator)" returns the initializer if the initialized_generator yields no values (and thus replaces reduce).

2) Declaring the initial value as part of the generator lets us pass the generator around so it can be run in other scopes without keeping alive the scope it's defined in, and without bringing up awkward questions like "What if the initializer variable in the scope that created the generator changes after the generator is defined but before it is used?"

3) The idea that an assignment operation "a = f()" returns a value (a) is already consistent with the "chained assignment" syntax of "b=a=f()" (which can be thought of as "b=(a=f())"). I don't know why we feel the need for new constructs like "(a:=f())" or "(f() as a)" when we could just think of assignments as returning values (unless that breaks something that I'm not aware of).

However, it looks like I'd be fighting a raging current if I were to try and push this proposal. It's also encouraging that most of the work would be done anyway if the "Statement Local Name Bindings" thread passes. So some more humble proposals would be:

1) An initializer to itertools.accumulate. functools.reduce already has an initializer; I can't see any controversy in adding an initializer to itertools.accumulate.

2) Assignment returns a value (basically what's already in the "Statement Local Name Bindings" discussion): `a=f()` returns a value of a. This would allow updating variables in a generator (I don't see the need for ":=" or "f() as a"), but that's another discussion.

Is there any interest in (or disagreement with) these more humble proposals?

- Peter
On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
3) The idea that an assignment operation "a = f()" returns a value (a) is already consistent with the "chained assignment" syntax of "b=a=f()" (which can be thought of as "b=(a=f())"). I don't know why we feel the need for new constructs like "(a:=f())" or "(f() as a)" when we could just think of assignments as returning values (unless that breaks something that I'm not aware of)
Consider
    if x = 1:
        print("What did I just do?")
Ah, ok, I suppose that could easily lead to typo-bugs. Ok, then I agree that "a:=f()" returning a is better.
On Fri, Apr 6, 2018 at 7:47 AM, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Hi all, thank you for the feedback. I laughed, I cried, and I learned.
You'll be a language designer yet. :-)
However, it looks like I'd be fighting a raging current if I were to try and push this proposal. It's also encouraging that most of the work would be done anyway if ("Statement Local Name Bindings") thread passes. So some more humble proposals would be:
1) An initializer to itertools.accumulate functools.reduce already has an initializer, I can't see any controversy to adding an initializer to itertools.accumulate
See if that's accepted in the bug tracker.
2) Assignment returns a value (basically what's already in the "Statement local name bindings" discussion) `a=f()` returns a value of a This would allow updating variables in a generator (I don't see the need for ":=" or "f() as a") but that's another discussion
Please join the PEP 572 discussion. The strongest contender currently is `a := f()` and for good reasons. -- --Guido van Rossum (python.org/~guido)
I'm not sure if my suggestion for 572 has been considered:

    ``name! expression``

I'm curious what the pros and cons of this form would be. My arguments for it were in a previous message, but there do not seem to be any responses to it.

Cammil
Seems to me it's much more obvious that "name:=expression" is assigning expression to name than "name!expression". The ! is also confusing because "!=" means "not equals", so the "!" symbol is already sort of associated with "not".
On Fri, Apr 06, 2018 at 03:27:45PM +0000, Cammil Taank wrote:
I'm not sure if my suggestion for 572 has been considered:
``name! expression``
I'm curious what the pros and cons of this form would be (?).
I can't see any pros for it. In what way is ! associated with assignment or binding? It might as well be an arbitrary symbol. (Yes, I know that ultimately *everything* is an arbitrary symbol, but some of them have very strong associations built on years or decades or centuries of usage.) As Peter says, ! is associated with negation, as in !=, and to those of us with a maths background, name! simply *screams* "FACTORIAL" at the top of its voice.
My arguments for were in a previous message but there do not seem to be any responses to it.
Care to repeat those arguments? -- Steve
Care to repeat those arguments?
Indeed.

*Minimal use of characters*

The primary benefit for me would be the minimal use of characters, which within list comprehensions I think is not an insignificant benefit:

    stuff = [[(f(x) as y), x/y] for x in range(5)]  # seems quite syntactically busy

    stuff = [[y := f(x), x/y] for x in range(5)]  # better

    stuff = [[y! f(x), x/y] for x in range(5)]  # two fewer characters (if you include the space after the identifier)

*Thoughts on odd usage of "!"*

In the English language, `!` signifies an exclamation, and I am imagining a similar usage to that of introducing something by its name in an energetic way. For example a boxer walking in to the ring: "Muhammed_Ali! <in walks Muhammed Ali>", "x! get_x()"

I get that `!` is associated with "not", and factorial, but I couldn't think of another character already used that would work in this usage. I also think `name! expression` would be hard to interpret as a comparison or factorial.

I suppose the trade-off here is efficiency vs. idiosyncrasy. I very much appreciate this is all very tentative, but I wanted to explain why this syntax does not sit terribly with me.

Cammil
On 07/04/18 09:54, Cammil Taank wrote:
Care to repeat those arguments?
Indeed.
*Minimal use of characters*
Terseness is not necessarily a virtue. While it's good not to be needlessly verbose, Python is not Perl and we are not trying to do everything on one line. Overly terse code is much less readable, as all the obfuscation competitions demonstrate. I'm afraid I count this one *against* your proposal.
*Thoughts on odd usage of "!"*
In the English language, `!` signifies an exclamation, and I am imagining a similar usage to that of introducing something by its name in an energetic way. For example a boxer walking in to the ring:
"Muhammed_Ali! <in walks Muhammed Ali>", "x! get_x()"
I'm afraid that's a very personal interpretation. In particular, '!' normally ends a sentence very firmly, so expecting the expression to carry on is a little counter-intuitive. For me, my expectations of '!' run roughly as:

* factorial (from my maths degree)
* array dereference (because I am old: a!2 was the equivalent of a[2] in BCPL)
* an exclamation, much overused in writing
* the author was bitten by Yahoo! at an early age.

-- Rhodri James *-* Kynesim Ltd
On 09/04/18 11:52, Rhodri James wrote:
* factorial (from my maths degree)
* array dereference (because I am old: a!2 was the equivalent of a[2] in BCPL)
* an exclamation, much overused in writing
* the author was bitten by Yahoo! at an early age.
Also logical negation in C-like languages, of course. Sorry, I'm a bit sleep-deprived this morning. -- Rhodri James *-* Kynesim Ltd
On Fri, Apr 06, 2018 at 08:06:45AM -0700, Guido van Rossum wrote:
Please join the PEP 572 discussion. The strongest contender currently is `a := f()` and for good reasons.
Where has that discussion moved to? The threads on python-ideas seem to have gone quiet, and the last I heard you said that you, Chris and Nick were discussing some issues privately. -- Steve
On Sat, Apr 7, 2018 at 9:50 AM, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Apr 06, 2018 at 08:06:45AM -0700, Guido van Rossum wrote:
Please join the PEP 572 discussion. The strongest contender currently is `a := f()` and for good reasons.
Where has that discussion moved to? The threads on python-ideas seem to have gone quiet, and the last I heard you said that you, Chris and Nick were discussing some issues privately.
I'm still working on getting some code done, and I'm stuck due to a lack of time on my part. It'll likely move forward this weekend, and if I can do what I'm trying to do, I'll have a largely rewritten PEP to discuss. (Never call ANYTHING "trivial" or "simple" unless you already know the solution to it. Turns out that there are even more subtleties to "make it behave like assignment" than I had thought.) ChrisA
On 7 April 2018 at 09:50, Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Apr 06, 2018 at 08:06:45AM -0700, Guido van Rossum wrote:
Please join the PEP 572 discussion. The strongest contender currently is `a := f()` and for good reasons.
Where has that discussion moved to? The threads on python-ideas seem to have gone quiet, and the last I heard you said that you, Chris and Nick were discussing some issues privately.
Yeah, there were some intersecting questions between "What's technically feasible in CPython?" and "What stands even a remote chance of being accepted as a language change?" that Guido wanted to feed into the next iteration on the PEP, but were getting lost in the "Class scopes do what now?" subthreads on here.

The next PEP update will have a lot more details on the related rationale, but the gist of what's going to change at the semantic level is:

* the notion of hidden sublocal scopes is going away, so folks will need to use "del" or nested scopes to avoid unwanted name bindings at class and module scope (similar to iteration variables in for loops), but the proposed feature should be much easier to explain conceptually
* comprehensions and generator expressions will switch to eagerly capturing referenced names from the scope where they're defined in order to eliminate most of their current class body scoping quirks (this does introduce some new name resolution quirks related to comprehensions-inside-regular-loops, but they'll at least be consistent across different scope types)
* as a result of the name capturing change, the evaluation of the outermost expression in comprehensions and generator expressions can be moved inside the nested scope (so any name bindings there won't leak either)

(At a syntactic level, the proposed spelling will also be switching to "name := expr".)

There will still be plenty of open design questions to discuss from that point, but it's a big enough shift from the previous draft that it makes sense to wait until Chris has a sufficiently complete reference implementation for the revised semantics to be confident that we can make things work the way the revised PEP proposes.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 6 April 2018 at 02:52, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Combined with the new "last" builtin discussed in the proposal, this would allow u to replace "reduce" with a more Pythonic comprehension-style syntax.
I think this idea was overshadowed by the larger syntactic proposal in the rest of your email (I know I missed it initially and only noticed it in the thread subject line later). With the increased emphasis on iterators and generators in Python 3.x, the lack of a simple expression level equivalent to "for item in iterable: pass" is occasionally irritating, especially when demonstrating behaviour at the interactive prompt. Being able to reliably exhaust an iterator with "last(iterable)" or "itertools.last(iterable)" would be a nice reduction function to offer, in addition to our existing complement of builtin reducers like sum(), any() and all(). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
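A minimal sketch of such a helper (the exact name and the default-on-empty behaviour here are assumptions, not a settled API):

    def last(iterable, default=None):
        # Exhaust the iterable, returning its final item, or `default` if it was empty.
        result = default
        for result in iterable:
            pass
        return result

With the moving_average generator from earlier in the thread, last(moving_average(signal, decay=0.05)) would then play the role of a reduce() call.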
On 5 April 2018 at 13:52, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:

I was thinking it would be nice to be able to encapsulate this common type of operation into a more compact comprehension.
I propose a new "Reduce-Map" comprehension that allows us to write:
    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
Instead of:
    def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.):
        average = initial_value
        for xt in signal:
            average = (1-decay)*average + decay*xt
            yield average

    signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
    smooth_signal = list(exponential_moving_average(signal, decay=0.05))
I wrote in this mailing list the very same proposal some time ago. I was trying to let the scan higher order function (itertools.accumulate with a lambda, or what was done in the example above) fit into a simpler list comprehension. As a result, I wrote this project, which adds the "scan" feature to Python comprehensions using a decorator that performs bytecode manipulation (and it had to fit in with a valid Python syntax): https://github.com/danilobellini/pyscanprev

On that GitHub page I've written several examples and a rationale on why this would be useful.

-- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
Hi Danilo,

The idea of decorating a function to show that the return variables could be fed back in, in a scan form, is interesting and could solve my problem in a nice way without new syntax.

I looked at your code but got a bit confused as to how it works (there seems to be some magic where the decorator injects the scanned variable into the namespace). Are you able to show how you'd implement the moving average example with your package?

I tried:

    @enable_scan("average")
    def exponential_moving_average_pyscan(signal, decay, initial=0):
        yield from ((1-decay)*(average or initial) + decay*x for x in signal)

    smooth_signal_9 = list(exponential_moving_average_pyscan(signal, decay=decay))[1:]

Which almost gave the right result, but seemed to get the initial conditions wrong.

- Peter
In any case, although I find the magic variable-injection stuff quite strange, I like the decorator. Something like:

    @scannable(average=0)  # Wrap function so that it has a "scan" method which can be used to generate a stateful scan object
    def exponential_moving_average(average, x, decay):
        return (1-decay)*average + decay*x

    stateful_func = exponential_moving_average.scan(average=initial)
    smooth_signal = [stateful_func(x) for x in signal]

Seems appealing because it allows you to define the basic function without, for instance, assuming that decay will be constant. If you wanted dynamic decay, you could easily have it without changing the function:

    stateful_func = exponential_moving_average.scan(average=initial)
    smooth_signal = [stateful_func(x, decay=decay) for x, decay in zip(signal, decay_schedule)]

And you pass around state explicitly.
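A rough sketch of how such a scannable decorator could be written today (hypothetical: this is not an existing package, and the single-state-variable assumption and the decay default below are assumptions made for illustration):

    def scannable(**default_state):
        # Wrap a pure step function f(state, x, **params) with a .scan()
        # constructor that returns a stateful callable feeding its own
        # previous result back in as the first argument.
        def decorator(step_fn):
            def scan(**state):
                merged = {**default_state, **state}
                (value,) = merged.values()   # assume exactly one state variable
                def stateful(x, **params):
                    nonlocal value
                    value = step_fn(value, x, **params)
                    return value
                return stateful
            step_fn.scan = scan
            return step_fn
        return decorator

    @scannable(average=0)
    def exponential_moving_average(average, x, decay=0.05):
        return (1 - decay) * average + decay * x

    stateful_func = exponential_moving_average.scan(average=0.0)
    smooth_signal = [stateful_func(x) for x in [1.0, 2.0, 3.0]]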
Hi Danilo,
The idea of decorating a function to show that the return variables could be fed back in in a scan form is interesting and could solve my problem in a nice way without new syntax.
I looked at your code but got a bit confused as to how it works (there seems to be some magic where the decorator injects the scanned variable into the namespace). Are you able to show how you'd implement the moving average example with your package?
I tried:
@enable_scan("average") def exponential_moving_average_pyscan(signal, decay, initial=0): yield from ((1-decay)*(average or initial) + decay*x for x in signal)
smooth_signal_9 = list(exponential_moving_average_pyscan(signal, decay=decay))[1:]
Which almost gave the right result, but seemed to get the initial conditions wrong.
- Peter
On Sat, Apr 14, 2018 at 3:57 PM, Danilo J. S. Bellini <danilo.bellini@gmail.com> wrote:
On 5 April 2018 at 13:52, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
I was thinking it would be nice to be able to encapsulate this common type of operation into a more compact comprehension.
I propose a new "Reduce-Map" comprehension that allows us to write:
signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
smooth_signal = [average = (1-decay)*average + decay*x for x in signal from average=0.]
Instead of:
def exponential_moving_average(signal: Iterable[float], decay: float, initial_value: float=0.):
    average = initial_value
    for xt in signal:
        average = (1-decay)*average + decay*xt
        yield average

signal = [math.sin(i*0.01) + random.normalvariate(0, 0.1) for i in range(1000)]
smooth_signal = list(exponential_moving_average(signal, decay=0.05))
I wrote in this mail list the very same proposal some time ago. I was trying to let the scan higher order function (itertools.accumulate with a lambda, or what was done in the example above) fit into a simpler list comprehension.
As a result, I wrote this project, that adds the "scan" feature to Python comprehensions using a decorator that performs bytecode manipulation (and it had to fit in with a valid Python syntax): https://github.com/danilobellini/pyscanprev
In that GitHub page I've written several examples and a rationale on why this would be useful.
-- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
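For concreteness, a decorator along these lines could look roughly like the sketch below. It is not taken from any existing package - the names scannable and scan come from Peter's example above, and the restriction to a single state variable is an assumption made only to keep the sketch short:

def scannable(**defaults):
    # Decorator sketch: give the wrapped step function a .scan(**state)
    # constructor that returns a stateful callable. Assumes exactly one
    # state variable, passed as the step function's first argument.
    def decorator(step):
        def scan(**state):
            current = dict(defaults, **state)
            (name, value), = current.items()    # exactly one state variable
            box = [value]                        # mutable cell holding the state
            def stateful(x, **kwargs):
                box[0] = step(box[0], x, **kwargs)
                return box[0]
            return stateful
        step.scan = scan
        return step
    return decorator

@scannable(average=0.0)
def exponential_moving_average(average, x, decay):
    return (1 - decay) * average + decay * x

stateful_func = exponential_moving_average.scan(average=0.0)
smooth_signal = [stateful_func(x, decay=0.05) for x in [1.0, 2.0, 3.0]]

The state lives in the closure returned by scan(), so each call to scan() creates an independent accumulator, which seems to be the property Peter is after.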
On 16 April 2018 at 10:49, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Are you able to show how you'd implement the moving average example with your package?
Sure! The single pole IIR filter you've shown is implemented here: https://github.com/danilobellini/pyscanprev/blob/master/examples/iir-filter.rst

I tried:
@enable_scan("average") def exponential_moving_average_pyscan(signal, decay, initial=0): yield from ((1-decay)*(average or initial) + decay*x for x in signal)
smooth_signal_9 = list(exponential_moving_average_pyscan(signal, decay=decay))[1:]
Which almost gave the right result, but seemed to get the initial conditions wrong.
I'm not sure what you were expecting. A sentinel as the first "average" value? Before the loop begins, this scan-generator just echoes the first input, like itertools.accumulate. That is, the first value this generator yields is the first "signal" value, which is then the first "average" value. To put an initial memory state, you should do something like this (I've removed the floating point trailing noise):
>>> from pyscanprev import enable_scan, prepend
>>> @enable_scan("y")
... def iir_filter(signal, decay, memory=0):
...     return ((1 - decay) * y + decay * x for x in prepend(memory, signal))
...
>>> list(iir_filter([1, 2, 3, 2, 1, -1, -2], decay=.1, memory=5))
[5, 4.6, 4.34, 4.206, 3.9854, 3.68686, 3.218174, 2.6963566]
In that example, "y" is the "previous result" (a.k.a. accumulator, or what had been called "average" here). -- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
To give this old horse a kick: The "given" syntax in the recent thread could give a nice solution for the problem that started this thread. Instead of my proposal of:

smooth_signal = [average := (1-decay)*average + decay*x for x in signal from average=0.]

We could use given for both the in-loop variable update and the variable initialization:

smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal] given average=0.

This especially makes sense for the extended syntax, where my proposal of:

(z, y := f(z, x) -> y for x in iter_x from z=initial_z)

Becomes:

(y given z, y = f(z, x) for x in iter_x) given z=initial_z

So instead of adding 2 symbols and a keyword, we just need to add the one "given" keyword.

It's worth noting, as Serhiy pointed out, that this is already supported in Python, albeit with a very clunky syntax:

smooth_signal = [average for average in [0] for x in signal for average in [(1-decay)*average + decay*x]]

(y for z in [initial_z] for x in iter_x for z, y in [f(z, x)])

[A small runnable check of this existing-syntax trick follows the quoted messages below.]

On Tue, Apr 17, 2018 at 12:02 AM, Danilo J. S. Bellini <danilo.bellini@gmail.com> wrote:
On 16 April 2018 at 10:49, Peter O'Connor <peter.ed.oconnor@gmail.com> wrote:
Are you able to show how you'd implement the moving average example with your package?
Sure! The single pole IIR filter you've shown is implemented here: https://github.com/danilobellini/pyscanprev/blob/master/examples/iir-filter.rst
I tried:
@enable_scan("average") def exponential_moving_average_pyscan(signal, decay, initial=0): yield from ((1-decay)*(average or initial) + decay*x for x in signal)
smooth_signal_9 = list(exponential_moving_average_pyscan(signal, decay=decay))[1:]
Which almost gave the right result, but seemed to get the initial conditions wrong.
I'm not sure what you were expecting. A sentinel as the first "average" value?
Before the loop begins, this scan-generator just echoes the first input, like itertools.accumulate. That is, the first value this generator yields is the first "signal" value, which is then the first "average" value.
To put an initial memory state, you should do something like this (I've removed the floating point trailing noise):
>>> from pyscanprev import enable_scan, prepend
>>> @enable_scan("y")
... def iir_filter(signal, decay, memory=0):
...     return ((1 - decay) * y + decay * x for x in prepend(memory, signal))
...
>>> list(iir_filter([1, 2, 3, 2, 1, -1, -2], decay=.1, memory=5))
[5, 4.6, 4.34, 4.206, 3.9854, 3.68686, 3.218174, 2.6963566]
In that example, "y" is the "previous result" (a.k.a. accumulator, or what had been called "average" here).
-- Danilo J. S. Bellini --------------- "*It is not our business to set up prohibitions, but to arrive at conventions.*" (R. Carnap)
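As a quick sanity check, the "clunky" nested-for form mentioned above does run on any Python 3 as-is - a minimal sketch using the toy signal from the original post:

import math, random

decay = 0.05
signal = [math.sin(i * 0.01) + random.normalvariate(0, 0.1) for i in range(1000)]

# Initialise and thread the accumulator through single-element "for ... in [...]" clauses:
smooth_signal = [average
                 for average in [0.0]
                 for x in signal
                 for average in [(1 - decay) * average + decay * x]]

assert len(smooth_signal) == len(signal)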
On Thu, May 24, 2018 at 02:06:03PM +0200, Peter O'Connor wrote:
To give this old horse a kick: The "given" syntax in the recent thread could give a nice solution for the problem that started this thread.
Your use-case is one of the motivating examples for PEP 572. Unless I'm confused, your use-case is intentionally left out of Nick's "given" proposal. He doesn't want to support your example. (Nick, please correct me if I'm mistaken.)
Instead of my proposal of: smooth_signal = [average := (1-decay)*average + decay*x for x in signal from average=0.]
This would become:

average = 0
smooth_signal = [average := (1-decay)*average + decay*x for x in signal]

under PEP 572. If you insist on a one-liner (say, to win a bet), you could abuse the "or" operator:

smooth_signal = (average := 0) or [average := (1-decay)*average + decay*x for x in signal]

but I think that's the sort of example that people who dislike this proposal are worried about, so please don't do that in serious code :-)
We could use given for both the in-loop variable update and the variable initialization: smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal] given average=0.
I don't think that will work under Nick's proposal, as Nick does not want assignments inside the comprehension to be local to the surrounding scope. (Nick, please correct me if I'm wrong.) So in your example, the OUTER "given" creates a local variable in the current scope, average=0, but the INNER "given" inside the comprehension exists inside a separate, sub-local comprehension scope, where you will get an UnboundLocalError when it tries to evaluate (1-decay)*average the first time. [...]
So instead of adding 2 symbols and a keyword, we just need to add the one "given" keyword.
PEP 572 will not only have the semantics you desire, but it requires only a single new symbol. If Nick writes up "given" as a PEP, I expect that it won't help your use-case. -- Steve
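For reference, PEP 572 was subsequently accepted, and on Python 3.8+ the first form Steven shows above works as written - a minimal sketch with a toy signal, just to illustrate the scoping:

decay = 0.05
signal = [0.0, 1.0, 2.0, 3.0]

average = 0.0
smooth_signal = [average := (1 - decay) * average + decay * x for x in signal]
# The := target binds in the enclosing scope, so after the comprehension
# "average" holds the final smoothed value and smooth_signal holds each step.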
[Peter O'Connor]
... We could use given for both the in-loop variable update and the variable initialization: smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal] given average=0.
[Steven D'Aprano <steve@pearwood.info>]
I don't think that will work under Nick's proposal, as Nick does not want assignments inside the comprehension to be local to the surrounding scope. (Nick, please correct me if I'm wrong.)
Nick appears to have moved on from "given" to more-general augmented assignment expressions. See PEP 577, but note that it's still a work-in-progress: https://github.com/python/peps/pull/665

Under that PEP,

average = 0
smooth_signal = [(average := (1-decay)*average + decay*x) for x in signal]

Or, for the running sums example:

total = 0
sums = [(total += x) for x in data]

I'm not entirely clear on whether the "extra" parens are needed, so added 'em anyway to make grouping clear.
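As it turned out, PEP 577 was later withdrawn, so the (total += x) spelling never became valid syntax; the := operator from PEP 572 did land, though, so on Python 3.8+ the running-sums example can be written as below (itertools.accumulate gives the same result and is worth comparing against):

from itertools import accumulate

data = [1, 2, 3, 4]

total = 0
sums = [(total := total + x) for x in data]   # [1, 3, 6, 10]

assert sums == list(accumulate(data))         # accumulate defaults to addition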
On 26 May 2018 at 04:14, Tim Peters <tim.peters@gmail.com> wrote:
[Peter O'Connor]
... We could use given for both the in-loop variable update and the variable initialization: smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal] given average=0.
[Steven D'Aprano <steve@pearwood.info>]
I don't think that will work under Nick's proposal, as Nick does not want assignments inside the comprehension to be local to the surrounding scope. (Nick, please correct me if I'm wrong.)
Nick appears to have moved on from "given" to more-general augmented assignment expressions.
Aye, while I still don't want comprehensions to implicitly create new locals in their parent scope, I've come around on the utility of letting inline assignment targets be implicitly nonlocal references to the nearest block scope.
See PEP 577, but note that it's still a work-in-progress:
https://github.com/python/peps/pull/665
Under that PEP,
average = 0
smooth_signal = [(average := (1-decay)*average + decay*x) for x in signal]
Or, for the running sums example:
total = 0
sums = [(total += x) for x in data]
I'm not entirely clear on whether the "extra" parens are needed, so added 'em anyway to make grouping clear.
I think the parens would technically be optional (as in PEP 572), since "EXPR for" isn't legal syntax outside parentheses/brackets/braces, so the parser would terminate the assignment expression when it sees the "for" keyword. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Nick Coghlan wrote:
Aye, while I still don't want comprehensions to implicitly create new locals in their parent scope, I've come around on the utility of letting inline assignment targets be implicitly nonlocal references to the nearest block scope.
What if you're only intending to use it locally within the comprehension? Would you have to put a dummy assignment in the surrounding scope to avoid a NameError? That doesn't sound very nice. -- Greg
On 28 May 2018 at 10:17, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
Nick Coghlan wrote:
Aye, while I still don't want comprehensions to implicitly create new locals in their parent scope, I've come around on the utility of letting inline assignment targets be implicitly nonlocal references to the nearest block scope.
What if you're only intending to use it locally within the comprehension? Would you have to put a dummy assignment in the surrounding scope to avoid a NameError? That doesn't sound very nice.
The draft PEP discusses that - it isn't saying "Always have them raise TargetNameError, now and forever", it's saying "Have them raise TargetNameError in the first released iteration of the capability, so we can separate the discussion of binding semantics in scoped expressions from the discussion of declaration semantics". I still want to leave the door open to giving comprehensions and lambdas a way to declare and bind truly local variables, and that gets more difficult if we go straight to having the binding expressions they contain *implicitly* declare new variables in the parent scope (rather than only binding previously declared ones). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, May 24, 2018 at 2:49 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 24, 2018 at 02:06:03PM +0200, Peter O'Connor wrote:
We could use given for both the in-loop variable update and the variable initialization: smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal] given average=0.
So in your example, the OUTER "given" creates a local variable in the current scope, average=0, but the INNER "given" inside the comprehension exists inside a separate, sub-local comprehension scope, where you will get an UnboundLocalError when it tries to evaluate (1-decay)*average the first time.
You're right, having re-thought it, it seems that the correct way to write it would be to define both of them in the scope of the comprehension:

smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal given average=0.]

This makes sense and follows a simple rule: "B given A" just causes A to be executed before B - that holds true whether B is a variable or a loop declaration like "for x in x_gen". So

a_gen = (g(a) given a=f(a, x) for x in x_gen given a=0)

would be a compact form of:

def a_gen_func(x_gen):
    a = 0
    for x in x_gen:
        a = f(a, x)
        yield g(a)
a_gen = a_gen_func()
On 30/05/2018 17:05, Peter O'Connor wrote:
On Thu, May 24, 2018 at 2:49 PM, Steven D'Aprano <steve@pearwood.info> wrote:
On Thu, May 24, 2018 at 02:06:03PM +0200, Peter O'Connor wrote:
> We could use given for both the in-loop variable update and the variable initialization:
> smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal] given average=0.
So in your example, the OUTER "given" creates a local variable in the current scope, average=0, but the INNER "given" inside the comprehension exists inside a separate, sub-local comprehension scope, where you will get an UnboundLocalError when it tries to evaluate (1-decay)*average the first time.
You're right, having re-thought it, it seems that the correct way to write it would be to define both of them in the scope of the comprehension:
smooth_signal = [average given average=(1-decay)*average + decay*x for x in signal given average=0.]
This makes sense and follows a simple rule: "B given A" just causes A to be executed before B - that holds true whether B is a variable or a loop declaration like "for x in x_gen".
So
a_gen = (g(a) given a=f(a, x) for x in x_gen given a=0)
would be a compact form of:
def a_gen_func(x_gen):
    a = 0
    for x in x_gen:
        a = f(a, x)
        yield g(a)
a_gen = a_gen_func()
[There is a typo here - a_gen_func is defined to take 1 argument but is called with none.]

After - *I think* - understanding this, I would try to make the one-liner clearer by parenthesizing it thus (whether or not the grammar required it):

a_gen = ( ((g(a) given a=f(a, x)) for x in x_gen) given a=0 )

Even then, it would make my head spin if I came across it. I hope no-one would write code like that.

I'm not keen on given, but I must admit that ISTM that this example shows something that can only be done with given: putting some initialisation, viz. "a=0", into a generator expression. With :=, it would need a trick:

a_gen = (g(a := f(a, x)) for x in [x_gen, a := 0][0])

or

a_gen = (g(a := f(a, x)) for a in [0] for x in x_gen)

Of course, in the case of a list comprehension (as opposed to a genexp), the initialisation could be done separately:

a = 0
a_list = [g(a := f(a, x)) for x in x_gen]

Rob Cliffe
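For what it's worth, the nested single-element for-clause trick already runs on any Python 3, since it handles both the initialisation and the update; a minimal sketch follows, with f, g and x_gen as placeholder names used only for illustration. (The := variants above may not survive contact with PEP 572 as eventually accepted, which appears to forbid rebinding a comprehension iteration variable and using := inside a comprehension's iterable expression, so the plain nested-for form is the one that runs unchanged.)

# Placeholder step and output functions, purely for illustration.
def f(a, x):
    return a + x

def g(a):
    return 2 * a

x_gen = range(5)

# Initialisation and state threading via single-element "for ... in [...]" clauses:
a_gen = (g(a) for a in [0] for x in x_gen for a in [f(a, x)])
print(list(a_gen))  # [0, 2, 6, 12, 20]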
participants (22)
- Brendan Barnwell
- Cammil Taank
- Chris Angelico
- Clint Hepner
- Danilo J. S. Bellini
- David Mertz
- Eric Fahlgren
- Ethan Furman
- Greg Ewing
- Guido van Rossum
- Jacco van Dorp
- Kyle Lahnakoski
- Michel Desmoulin
- Nick Coghlan
- Paul Moore
- Peter O'Connor
- Rhodri James
- Rob Cliffe
- Serhiy Storchaka
- Stephen J. Turnbull
- Steven D'Aprano
- Tim Peters