[Python-ideas] Reduce/fold and scan with generator expressions and comprehensions

Sun Nov 6 13:46:42 EST 2016

2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.
tsukuba.ac.jp>:

> As for recursion, the syntax you proposed doesn't say "recursion" to
> me, it says "let's add more complexity to a syntax that already is
> hard to fit on a line so we can turn a sequence into a series."
>

1. The "fit on a line" is just another weird taboo:
1.1. An expression isn't a physical line.
1.2. Sub-expressions in an expression might be on other statements (e.g.
assignments, other functions).
1.4. Several of my examples in PyScanPrev aren't oneliners, including one
you pasted in an e-mail.
1.3. When you call itertools.accumulate, it's still an expression.
1.5. List comprehensions are expressions, including the triple-for-section
comprehensions with repeated target variable names.
1.6. Being anti-expression because of an anti-oneliner bias sounds like
moving towards an assembly language (using only atomic imperative commands)
or something merely bureaucratic.

2. The itertools.accumulate function signature is broken:
2.1. Its arguments are reversed when compared to other higher order
functions like map, filter and reduce.
2.2. It lacks a "start" parameter, requiring more complexity to include it
(nesting a itertools.chain or a "prepend" function call).

3. It's not about "adding complexity", it's the other way around:
3.1. The proposal is about explicit recursion in a list comprehension (a
way to access the previous output value/result, a.k.a.
accumulator/state/memory).
3.2. Today you can only do (3.1) using either:
3.2.1. A triple-for-section list comprehension with repeated target names
(as described in the PyScanPrev rationale section).
3.2.2. Functions for stack frame manipulation (like the example I gave in
this maillist using stackfull).
3.2.3. Bytecode manipulation (the PyScanPrev approach), or something alike
(e.g. AST manipulation).
3.2.4. Raw Python code pre-processing (e.g. by customizing some import
hooks).

Only 3.2.4 allows a new syntax.

I'm not sure what you meant with "turn a sequence into a series", that
sounds like the itertools.accumulate from Python 3.2, which didn't have a
function as a parameter.

2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.
tsukuba.ac.jp>:

>  > Anyway, that makes this talking about computational accuracy sound
>  > like an argument for my proposal here, not against it.
>
> Not as I see it.  My point is that functions like accumulate already
> get me as much of your proposal as I can see being useful in my own
> applications, and so I'd rather spend effort on the inherent
> complexity of accurate computation than on learning new syntax which
> as far as I can see buys me no extra simplicity.
>

What do you mean by the word "accuracy"? IMHO, now you're not talking about
accuracy anymore. It's an argument like "map/filter and generator
expressions are the same" again. That's just an argument against generator
expressions and list/set/dict comprehensions in general, as they "already
get us as much of" map/filter.

About the effort, do you really find the examples below with the new
proposed syntax difficult to understand?

>>> # Running product
>>> [prev * k for k in [5, 2, 4, 3] from prev = 1]
[1, 5, 10, 40, 120]

>>> # Accumulate (prefix sum / cumulative sum)
>>> [prev + k for k in [5, 2, 4, 3] from prev = 0]
[0, 5, 7, 11, 14]

>>> # Pairs to be used in the next example (nothing new here)
>>> pairs = [(a, b) for a in [0, 1, 2, 3] for b in [-2, 2] if b != a]
>>> pairs
[(0, -2), (0,2), (1, -2), (1, 2), (2, -2), (3, -2), (3, 2)]

>>> # The recursive list comprehension with the new syntax
>>> [b + prev * a for a, b in pairs from prev = 0]
[0, -2, 2, 0, 2, 2, 4, 14]

>>> # The same with itertools.accumulate (for comparison)
>>> from itertools import accumulate, chain
>>> list(accumulate(chain([0], pairs),
...                 lambda prev, pair: pair[1] + prev * pair[0]))
[0, -2, 2, 0, 2, 2, 4, 14]

>>> # The same in a single expression using the new syntax
>>> [b + prev * a for a in [0, 1, 2, 3]
...               for b in [-2, 2]
...               if b != a
...               from prev = 0]
[0, -2, 2, 0, 2, 2, 4, 14]

Everything needs some effort to be understood, some effort to be memorized,
some effort to be maintained, etc.. But Rob Cliffe could understand the
running product when he had read this thread and disclaimed that he never
heard about accumulate. For someone who doesn't know what
accumulate/scan/fold/reduce is, I think the proposed syntax is more
descriptive/explicit/instructional/didactic/maintainable.

Please read the "Rationale" section and the "Comparison" example in
PyScanPrev. Also, please keep in mind that even with bytecode manipulation,
PyScanPrev is still limited to valid Python syntax, and the syntax proposed
here is different and more general (e.g. the last example in this e-mail
can't be done in PyScanPrev as a single list comprehension).

2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.
tsukuba.ac.jp>:

> Consider the existing comprehension syntax.  I use it all the time
> because it's very expressive, it "looks like" what I would write in a
> declarative mathematical definition, and the constructive alternative
> is not a function call, it's a loop, in many cases with a nested if
> test as well.  However, when you try to express more than one loop
> with a comprehension, you end up with a counterintuitive ordering
> based on the current "this expands naturally to for loops and if
> conditionals according to this simple rule" syntax, along with a
> separation of the result expression from the loop where it's actually
> produced.  It's not obvious to me that use of comprehension syntax
> beyond [ f(x) for x in y if g(x) ] is all that useful, but that very
> basic form is extremely useful by itself.
>

And what I'm proposing is a very expressive syntax that "looks like" what I
would write in a declarative mathematical definition. The state-space
example is really explicit on this. Also, it's something VERY useful, I've
written several examples and I know they aren't exhaustive.

Counterintuitive ordering in list comprehensions? More than one loop? You
don't like the multiple for sections cartesian product in a list
comprehension, as it seems. That's not the point in this proposal, anyway.

In your example, we can use either the generator expression (f(x) for x in
y if g(x)) or the point-free map(f, filter(g, y)). Why do you prefer the
first?

You said "it's not a function call, it's a loop", but at the end you wrote
"f(x)" and "g(x)". Maybe that's why you didn't see the recursion: it's
tail-recursive / stackless. Say we have:

[f(p, x) for x in y from p = z]

The result for y = [y0, y1, y2, ...] would be:

[z,
 f(z, y0),
 f(f(z, y0), y1),
 f(f(f(z, y0), y1), y2),
 ...]

2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.
tsukuba.ac.jp>:

> If your proposal will help make more complex comprehensions easy to
> understand, then you've got something I'd like to have.  But I don't
> yet see how it fixes that, and if not, I ain't gonna need it and I
> don't want to have to explain it to my students when they ain't gonna
> need it either.
>

Nice to know. The proposed:

>>> [f(p, x) for x in y from p = z]

Should return the same from this, if one insists on using a list
comprehension today without the new syntax:

>>> (lambda start: [start] + [p for p in [start]
...                             for x in y
...                             for p in [f(p, x)]]
... )(z)

The proposed comprehension is easier/simpler than that in every criterion
I'm aware of.

2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.
tsukuba.ac.jp>:

> itertools functions are widely used and generally well-thought-of.
> Avoiding accumulate is a non-goal AFAIK.  AIUI, Guido does think that
> avoiding reduce is a goal, but I believe that's because he thinks the
> semantics are just plain hard to understand.  If your syntax is in
> large part a substitute for reduce, I doubt he will like it more than
> reduce just because it's syntax.  (I do not speak for Guido; I offer
> my observation in the hope it will help you prepare to present your
> idea to him in the most favorable light.)
>

I doubt Guido is against solving problems that requires a fold to be
solved. AFAIK he's against the "reduce" higher order function itself, not
against what it does. Mixing that is like acting against list
comprehensions just because Guido doesn't seem to like the map/filter
higher order functions. Guido either prefers what exists or an alternative,
I'm pretty sure he wouldn't ever say "solving this kind of problem is
forbidden in Python".

And that's again an argument for my proposal, as getting a fold from a scan
is obvious: it's just the last result. Using reduce, let's get the last
value after scanning through the "pairs" I defined in a previous example in
this e-mail:

>>> # The last value with reduce (Python 2 only, for comparison)
>>> reduce(lambda prev, (a, b): b + prev * a, pairs, 0)
14
>>> # ... and with functools.reduce (Python 2 and 3)
>>> from functools import reduce
>>> reduce(lambda prev, pair: pair[1] + prev * pair[0], pairs, 0)
14

Actually, using a [:-1] slice or iterating until we get the last element in
the previous [0, -2, 2, 0, 2, 2, 4, 14] result would be way more explicit
(and wouldn't require the "reduce" function itself). I've proposed a "last"
function to do that, but that's another point, not so required in the sense
that one can write it as a function elsewhere.

In Python 2.7, it's worth noting that reduce doesn't require the import,
and lambdas are more powerful as they allow the "prev, (a, b)" signature.
This "unpacking" syntax was lost in Python 3 but not when we're dealing
with "for" loops/comprehensions. Something similar could be said about
itertools.accumulate.

2016-11-03 8:10 GMT-02:00 Stephen J. Turnbull <turnbull.stephen.fw at u.
tsukuba.ac.jp>:

> That is, comprehensions are clearly useful, and I use them in almost
> every program I write.  I still don't see when I'd ever strongly
> prefer the syntax you propose to the itertools we already have, so I
> don't care if it's a consistent extension.  That doesn't mean it's not
> useful to you or others, it just means I'm not convinced -- and so I
> will not join the proponents.  That's all.
>

That's not really about consistency, it sounds more like a "Don't touch!".
There are 2 consistencies we're talking about here: API consistency and
argumentation consistency. APIs can have backward compatibility issues and
legacy code, which are sources of inconsistency. OTOH, inconsistency in
argumentation is either a mistake or a deceitful behavior.

For me, the idea I'm proposing is clearly useful. I wanted to use it last
friday to create a function that auto-indents some text based on some
simple rules, and the resulting indentation is the "accumulation" (scan) of
indent/dedent markers. But I couldn't even use itertools.accumulate,
because that project requires Python 2.7.

I need a really strong reason to migrate to Python 3 a proprietary legacy
code that isn't mine, and any "anti-functional" / "anti-expression" bias is
a strong reason not to do so. Also, it's worth noting that the "functional"
package in PyPI doesn't work on Python 3. Stuff like the f-strings are
really nice, I can't count how many times I wrote
some_string_literal.format(**locals()), but replacing all lambda signatures
by cryptic indexing or naming them elsewhere still doesn't worth it (Would
you prefer to replace every single list comprehension body by a named
function defined with "def"? If not, why others should?). I'm proposing
something that would make Python 3 more inviting, and not only for people
coming from a functional programming background.

-- 
Danilo J. S. Bellini
---------------
"*It is not our business to set up prohibitions, but to arrive at
conventions.*" (R. Carnap)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20161106/d3288f1e/attachment-0001.html>