[Python-ideas] Reduce/fold and scan with generator expressions and comprehensions

Danilo J. S. Bellini danilo.bellini at gmail.com
Wed Nov 2 14:14:16 EDT 2016


2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.tsukuba.ac.jp>:

> My point is that although new syntax may be useful for simple cases,
> serious applications will worry about computational accuracy and
> likely will provide packages that handle general cases that nest these
> simple cases.  Given they exist, most modelers will prefer using those
> packages to writing their own comprehensions.  That may not apply in
> other fields, but AFAICS it does apply in economics.  So if you can't
> handle the complex cases, the syntax is just cognitive overhead:
> TOOWTDI will be the high-quality general packages even where they're
> theoretically overkill.


> [...] Sure, you *can* express a second-order
> difference equation as two first-order equations, and perhaps you
> would actually calculate it that way.  But in economics we normally
> express a second-order diff eq as a second-order diff eq.  If the
> input is a second-order equation that needs to be reformulated as a
> system of first-order equations, then with the code required to
> implement such transformations, it is not obvious to me that having
> syntax for first-order equations is going to be worth the extra syntax
> in the language.  Most of the complexity is going to be in the
> transformation which AFAICS is likely to be problem-specific, and
> therefore unavoidable by the modeler.  OTOH, as PyScanPrev shows, the
> complexity of recursion can be hidden in a decorator, which the
> modeler can cargo-cult.
>

I'm not talking about "simple cases", but about quite general ones, general
in the sense that they allow modeling time-varying systems (and the
proposal is about recursion, whose computational power should be obvious).
What would be more general than a linear MIMO (multiple input, multiple
output) time-varying state-space model? It's not so common to find a
package that works with time-varying models. Which ones do you know?

Actually, the state-space equations (as used in a PyScanPrev example) are a
"modern" way of doing general [linear time-varying] control engineering
using first-order equations and an n-dimensional state vector, as opposed
to the "classical" difference equations and their Z transform, which are
the core of my other package, AudioLazy:

https://pypi.python.org/pypi/audiolazy
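For concreteness, a first-order time-varying state-space update is itself a
scan: with today's tools it can be written with itertools.accumulate, at the
cost of routing the time index through the iterable. This is an illustrative
sketch (the scalar coefficients and the `step` name are assumptions, not
PyScanPrev's API):

```python
from itertools import accumulate, chain

# x[n+1] = a[n] * x[n] + b[n] * u[n]   (scalar time-varying state-space)
a = [0.5, 1.0, 2.0]   # time-varying state coefficient
b = [1.0, 1.0, 1.0]   # time-varying input coefficient
u = [1.0, 0.0, 1.0]   # input signal

def step(x, n):
    # one scan step: previous state x, plus the time index n
    return a[n] * x + b[n] * u[n]

# Prepend the initial state, then scan over the time indices.
states = list(accumulate(chain([0.0], range(len(u))), step))
# states == [0.0, 1.0, 1.0, 3.0]
```

The point is that the time dependence forces the index into the iterable; an
LTI-only tool has no slot for it.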

You're right when you say that "serious applications will worry about
computational accuracy"; indeed, that was a reason I wrote the gammatone
filters in AudioLazy. The cascading filter model was a requirement to avoid
getting just amplified floating point quantization garbage as a result (as
used to happen when using a single call to scipy.signal.lfilter). A "rule
of thumb" in signal processing is to use cascading/parallel biquads instead
of higher-order [recursive] filters directly, due to computational accuracy
(though that doesn't apply to comb filters, for example). Anyway, that
makes this talk about computational accuracy sound like an argument for my
proposal, not against it.
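The cascading idea can be sketched in plain Python with functools.reduce;
this is an illustrative toy, not AudioLazy's actual API (the `biquad` and
`cascade` names and the coefficient layout are assumptions):

```python
from functools import reduce

def biquad(sos):
    """One second-order section (b0, b1, b2, a1, a2), Direct Form II transposed."""
    b0, b1, b2, a1, a2 = sos
    def filt(signal):
        w1 = w2 = 0.0   # internal state of this single section
        for s in signal:
            y = b0 * s + w1
            w1 = b1 * s - a1 * y + w2
            w2 = b2 * s - a2 * y
            yield y
    return filt

def cascade(sections, signal):
    # Feed the signal through each second-order section in turn: a plain reduce.
    return reduce(lambda sig, sos: biquad(sos)(sig), sections, signal)

# Identity sections (b0=1, everything else 0) pass the signal unchanged.
out = list(cascade([(1.0, 0.0, 0.0, 0.0, 0.0)] * 2, [1.0, 2.0, 3.0]))
# out == [1.0, 2.0, 3.0]
```

Each section keeps the polynomial order low, which is where the accuracy
benefit over a single high-order filter comes from.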

I STRONGLY disagree with "will provide packages that handle general cases";
it simply doesn't make sense. At best, a package would just mirror what
happens in math: is there any general solution for general ODEs? Packages
handle specific things, while syntax deals with the general stuff: once
you find something a package can't do, you need to use something different,
using only what the language has (or another package, if any is available).
There are packages that include converters between difference equations and
state-space equations, but these can't be used when you're talking about
time-varying models. You can't use LTI tools for something that isn't
time-invariant.


2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.tsukuba.ac.jp>:

> Fine, but I only questioned economics.  I'm not a rocket scientist,
> I'll let the rocket scientists question that.  If they don't, *I*
> certainly will concede you have a point in those other fields.


OK, and I'm from a digital signal processing background (EE and an MSCS on
it), mainly for audio applications, though I work today in a fintech. But
we're not here to give just ad hominem / ad verecundiam pseudo-arguments (I
mean, we're not talking about me or you).

The simplest examples I've ever seen in digital control theory were about
economics, mainly because these models are useful and don't need any kind
of "continuous time to discrete time" conversion, as the time is already
discrete in the first place. I created the leaking bucket example in
PyScanPrev from scratch mainly to avoid copying an economics example from
Wilson J. Rugh's "Linear Systems Theory" book. If I understood you
correctly, you're saying that economics is simpler than I thought and
doesn't require anything beyond a basic LTI difference equation, and for
that restricted scenario there are packages that already do the job.

OTOH, if your point is that the scan higher-order function / PyScanPrev and
the like aren't enough to properly model every economic model there is
without some undesired adaptation for a subset of those models, that makes
some sense. I'm just saying it's useful for a lot of things, including
applications in economics. Does it need to be the Harry Potter wand to be
"accepted"?


2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.tsukuba.ac.jp>:

>  > This maillist isn't very inviting...
>
> Why do you say that?  Because people aren't falling over themselves to
> accept your proposal?

> [...]
>
> You're perfectly willing to tell other people what to read, though.  I
> realize times have changed since 1983, but as long as I've been on the
> 'net, it's always been considered polite to investigate previous
> discussions, and AFAIK it still is.
>

In the second part of this citation, you seem to realize the reason for
what I said. Do you trust anyone who gives a value judgement about
something he/she didn't read, without at least disclosing that lack of
reading?

It's not me who is telling other people to judge, but it's an idea I wrote
about, and that's the reason this mailing list exists, isn't it? I firmly
believe "list comprehension" wouldn't be "accepted" today if it weren't
already part of Python's syntax.

Is there any previous discussion on this topic or something related to it?
Where's the link? (I asked for it before.) It's always been considered
polite to investigate before judging.


2016-10-25 12:02 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.tsukuba.ac.jp>:

>  > Perhaps there are people who prefer masochist rituals instead of
>  > using "reduce", who knows? Who cares?
>
> (1) There are.  (2) Evidently you don't.  (3) You should, because the
> leading (literally) person who prefers writing code himself to using
> "reduce" is Guido van Rossum.
>

Actually, that's an argument for this proposal, not against it. I'm
proposing a way to avoid itertools.accumulate and functools.reduce by using
an explicit syntax instead of those functions. Or does GvR prefer
map/filter to list comprehensions?

If you're right, then we should be consistent and propose the elimination
of list comprehensions and every similar syntax. The arguments you're
giving against my proposal apply to those just as well.


2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve at pearwood.info>:

> > I don't know if that's a conclusion from any other thread, but that's
> > wrong. The only extra "freedom" required is a way to access the previous
> > output (or "accumulator", "memory", "state"... you can use the name you
> > prefer, but they're all the same). How many parameters does
> > itertools.scan have? And map/filter?
>
> I do not know which email thread is being talked about here, but the
> conclusion is correct. In the most general case, you might want:
>
> - the current value of some running total or result;
> - the previous value of the running result;
> - the value of the running result before that ("previous previous");
> - and so on;
> - more than one running calculation at the same time;
> - the current sequence value (for x in [1, 2, 3]);
> - the previous sequence value (previous value of x);
> - the sequence value before that ("previous previous x");
> - etc.
>

Wrong.

1. The current value is already there; it's not something new.
2. The accumulator (previous result) is the only thing I'm talking about;
anything beyond that is also beyond my new syntax proposal.
3. The "previous previous" and so on are just lag/windowing. I'm not
proposing a way to solve these, but my examples did handle that;
complaining about it suggests my examples weren't read. We can talk about a
syntax to solve these as well, but that's not what my proposal is about.
The same goes for the "previous sequence value" history.
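To illustrate the lag/windowing point: any fixed number of "previous" values
can be carried explicitly in the accumulator itself, so the scan still needs
only one state slot. A sketch with today's itertools (not the proposed
syntax; the `step` name is mine):

```python
from itertools import accumulate, chain

# Scan state = (running sum, previous running sum): the "previous previous"
# is just carried along explicitly inside the single accumulator value.
def step(state, x):
    running, previous = state
    return (running + x, running)   # new running sum, remember the old one

states = list(accumulate(chain([(0, 0)], [1, 2, 3, 4]), step))
# states == [(0, 0), (1, 0), (3, 1), (6, 3), (10, 6)]
```

Each tuple holds the current and the lagged value, which is the windowing
trick the PyScanPrev examples use.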


2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve at pearwood.info>:
>
> Recurrence relations are not all linear, and they don't always involve
> only the previous value. Fibonacci numbers need to track two previous
> running values:
>
> F[n] = F[n-1] + F[n-2]


Did you know Fibonacci has been an example in PyScanPrev for a long time?

And not all recurrence relations are time-invariant; I gave an example of a
time-varying recurrence relation.

Do you have any example of a non-linear recurrence relation? Do you know
how to linearize it, or do you have a proof that it can't be linearized as
a time-varying model? Why does it make any difference for this proposal?


2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve at pearwood.info>:

> And perhaps there are people who think that using "reduce" and other
> functional idioms instead of good, clean, easy-to-understand, easy-to-
> debug imperative code is a "masochist ritual".


If you're talking about people who are misusing "reduce", then you're
arguing for alternatives, and this proposal has an alternative as an
obvious corollary. Or do you think everyone should prefer nested code to
something more expression-oriented, even though that's again something in
PEP 20?

If you don't like "reduce", go ahead with the triple-for-sections. Good
luck with that. But that's not the point of my proposal; I just found out
that "reduce" is taboo, and that reminds me of when I said all that matters
here is social status.
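For readers who haven't seen it, the "triple-for-section" workaround
mentioned above abuses nested `for` clauses to rebind an accumulator inside
a plain list comprehension:

```python
# The accumulator prev is bound and then rebound by single-element "for"
# clauses wrapped around the real loop over k.
result = [prev for prev in [1] for k in [5, 2, 4, 3] for prev in [prev * k]]
# result == [5, 10, 40, 120]  (the initial value 1 itself is not yielded)
```

It works today, but the repeated single-element `for ... in [...]` sections
are exactly the token noise the proposed `from prev = 1` clause would avoid.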


2016-10-25 13:00 GMT-02:00 Steven D'Aprano <steve at pearwood.info>:

> why do we need ANOTHER way of
> solving these problems when there are already so many? More use-cases
> doesn't answer that question


I already said: it's explicit and cleaner. Easier to memorize. Lower McCabe
complexity. Fewer tokens (not characters). Fewer repeated tokens when
compared with the triple-for-section comprehensions. And how about the AST
height? Wouldn't it be the same as a common list comprehension's?

But what would answer that question for plain common list comprehensions?
Does the same complaint apply, or do you think that attacking map/filter is
the same as attacking list comprehensions, as you did here with my proposal
and reduce?

I think one should then propose that list comprehensions be removed, just
to keep the consistency. Oh, it won't happen, due to backwards
compatibility, I know.


2016-10-25 23:36 GMT-02:00 David Mertz <mertz at gnosis.cx>:

> On Tue, Oct 25, 2016 at 1:55 PM, Rob Cliffe <rob.cliffe at btinternet.com>
> wrote:
>>
>> >>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
>> [1, 5, 10, 40, 120]
>>
>> That makes sense for me, and seem simpler than:
>>
>> >>> from itertools import accumulate, chain
>> >>> list(accumulate(chain([1], [5, 2, 4, 3]), lambda prev, k: prev * k))
>> [1, 5, 10, 40, 120]
>>
>> Well, if you want an instant reaction from someone skimming this thread:
>> I looked at the first example and couldn't understand it.
>>
>
> After reading every post in the thread, I still don't understand the
> proposed new syntax really.
>
> How does 'prev' get bound in the loop? Is it a new magic keyword for "last
> accumulated element?" Does the binding in the "from" clause magically say
> that it should get rebound in the loop where it's no longer mentioned? Why
> is the word `from` used here when there's no obvious relation to other uses?
>
> The alternate version looks much better if you don't try so hard to make
> it look bad.  The much more obvious spelling is:
>
> from operator import mul
>
> from itertools import accumulate, chain
>
> accumulate(chain([1], nums), mul)
>
>
> If you give up a fear of using `import` and stop arbitrarily converting a
> possibly infinite iterator to a concrete list, this form is extremely short
> and obvious.  Moreover, in the example, it is extra strange that the
> multiplicative identity is added into the front of the iterator.  This is
> exactly the same thing as simply spelling it:
>
> accumulate(nums, mul)
>
> Which is even shorter.  It's feels very artificially contrived to insist
> that the initial element must live somewhere other than the iterator
> itself.  But it would be easy enough to write a wrapper to massage an
> iterator for this special case:
>
> def prepend(item, it):
>
>     return itertools.chain([item], it)
>
>
> Doesn't save any characters, but the name might emphasize the intent.
> Maybe this implementation forces the point even more:
>
> def prepend(item, it):
>
>     yield item
>
>     yield from it
>
>
> --
> Keeping medicines from the bloodstreams of the sick; food
> from the bellies of the hungry; books from the hands of the
> uneducated; technology from the underdeveloped; and putting
> advocates of freedom in prisons.  Intellectual property is
> to the 21st century what the slave trade was to the 16th.



-- 
Danilo J. S. Bellini
---------------
"*It is not our business to set up prohibitions, but to arrive at
conventions.*" (R. Carnap)

