[Python-ideas] Reduce/fold and scan with generator expressions and comprehensions
Danilo J. S. Bellini
danilo.bellini at gmail.com
Sun Oct 23 10:57:10 EDT 2016
The idea is to let generator expressions and list/set comprehensions have a
clean syntax to access its last output. That would allow them to be an
alternative syntax to the scan higher-order function [1] (today implemented
in the itertools.accumulate function), which leads to an alternative way to
write a fold/reduce. It would be nice to have something like:
>>> last(abs(prev - x) for x in [3, 4, 5] from prev = 2)
2
instead of a reduce:
>>> from functools import reduce
>>> reduce(lambda prev, x: abs(prev - x), [3, 4, 5], 2)
2
or an imperative approach:
>>> prev = 2
>>> for x in [3, 4, 5]:
... prev = abs(prev - x)
>>> prev
2
or getting the last from accumulate:
>>> from itertools import accumulate
>>> list(accumulate([2, 3, 4, 5], lambda prev, x: abs(prev - x)))[-1]
2
or...
>>> [prev for prev in [2]
... for x in [3, 4, 5]
... for prev in [abs(prev - x)]
... ][-1]
2
Actually, I already wrote a solution for something similar to that:
PyScanPrev [2]. I'm using bytecode manipulation to modify the generator
expression and set/list comprehensions semantics to create a "scan", but it
has the limitation of using only code with a valid syntax as the input, so
I can't use "from" inside a generator expression / list comprehension. The
solution was to put the first output into the iterable and define the
"prev" name elsewhere:
>>> last(abs(prev - x) for x in [2, 3, 4, 5])
2
That line works with PyScanPrev (on Python 3.4 and 3.5) when defined in a
function with a @enable_scan("prev") decorator. That was enough to create a
"test suite" of doctest-based examples that shows several scan use cases
[2].
This discussion started in a Brazilian list when someone asked how she
could solve a simple uppercase/lowercase problem [3]. The goal was to
alternate the upper/lower case of a string while neglecting the chars that
doesn't apply (i.e., to "keep the state" when the char isn't a letter). After
the discussion, I wrote the PyScanPrev package, and recently I've added
this historical "alternate" function as the "conditional toggling" example
[4].
Then I ask, can Python include that "scan" access to the last output in its
list/set/dict comprehension and generator expression syntax? There are
several possible applications for the scan itself as well as for the
fold/reduce (signal processing, control theory, physics, economics, etc.),
some of them I included as PyScanPrev examples. Some friends (people who
like control engineering and/or signal processing) liked the "State-space
model" example, where I included a "leaking bucket-spring-damper"
simulation using the scan-enabled generator expressions [5].
About the syntax, there are several ideas on how that can be written. Given
a "prev" identifier, a "target" identifier, an input "iterable" and an
optional "start" value (and perhaps an optional "echo_start", which I
assume True by default), some of them are:
[func(prev, target) for target in iterable from prev = start]
[func(prev, target) for target in iterable] -> prev = start
[func(prev, target) for target in iterable] -> prev as start
[func(prev, target) for target in iterable] from prev = start
[func(prev, target) for target in iterable] from prev as start
[func(prev, target) for target in iterable] with prev as start
prev = start -> [func(prev, target) for target in iterable]
prev(start) -> [func(prev, target) for target in iterable]
[func(prev, target) for prev -> target in start -> iterable]
[prev = start -> func(prev, target) for target in iterable]
# With ``start`` being the first value of the iterable, i.e.,
# iterable = prepend(start, data)
[func(prev, target) for target in iterable from prev]
[func(prev, target) for target in iterable] -> prev
[func(prev, target) for target in iterable] from prev
prev -> [func(prev, target) for target in iterable]
Before writing PyScanPrev, in [6] (Brazilian Portuguese) I used stackfull
[7] to implement that idea, an accumulator example using that library is:
>>> from stackfull import push, pop, stack
>>> [push(pop() + el if stack() else el) for el in range(5)]
[0, 1, 3, 6, 10]
>>> list(itertools.accumulate(range(5)))
[0, 1, 3, 6, 10]
There are more I can say (e.g. the pyscanprev.scan function has a "start"
value and an "echo_start" keyword argument, resources I missed in
itertools.accumulate) but the links below already have a lot of information.
[1] https://en.wikipedia.org/wiki/Prefix_sum#Scan_higher_order_function
[2] https://pypi.python.org/pypi/pyscanprev
[3] https://groups.google.com/forum/#!topic/grupy-sp/wTIj6G5_5S0
[4]
https://github.com/danilobellini/pyscanprev/blob/v0.1.0/examples/conditional-toggling.rst
[5]
https://github.com/danilobellini/pyscanprev/blob/v0.1.0/examples/state-space.rst
[6] https://groups.google.com/forum/#!topic/grupy-sp/UZp-lVSWK1s
[7] https://pypi.python.org/pypi/stackfull
--
Danilo J. S. Bellini
---------------
"*It is not our business to set up prohibitions, but to arrive at
conventions.*" (R. Carnap)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20161023/0e75a4ae/attachment.html>
More information about the Python-ideas
mailing list