map, filter, reduce methods for generators
Hi!
Sorry if someone has already talked about it (my simple search did not show any results).
What do you think about adding map, flatmap, filter and reduce methods to generator type?
I must admit I've seen and I like Java 8 notation and I think it might be more readable than Python way in a few occasions.
I would like to be able to write:
    range(100).\
        filter( f1 ).\
        map( f2 ).\
        filter( f3 ).\
        map( f4 ).\
        reduce(operator.add)
in addition to current Pythonic way of
    reduce( operator.add,
        f4(x) for x in
            ( f2(y) for y in range(100) if f1(y))
        if f3(x) )
Though longer - Java way seems to be a bit more readable as the notation follows the data flow sequence.
Do you think it is worth a PEP?
BR,
Jacek
I think it's a great idea that probably won't have much uptake on the list; FWIW in Scala you'd write
    // Sum of the squares of all odd numbers up to a hundred
    (0 until 100).filter(_ % 2 == 1)
                 .map(math.pow(_, 2))
                 .reduce(_ + _)
But method chaining isn't really a thing in the Python world, and people don't seem to like it as much as I do.
On Thu, Apr 10, 2014 at 2:12 PM, Jacek Pliszka <jacek.pliszka@gmail.com> wrote:
Hi!
Sorry if someone has already talked about it (my simple search did not show any results).
What do you think about adding map, flatmap, filter and reduce methods to generator type ?
I must admit I've seen and I like Java 8 notation and I think it might be more readable than Python way in a few occasions.
I would like to be able to write:
    range(100).\
        filter( f1 ).\
        map( f2 ).\
        filter( f3 ).\
        map( f4 ).\
        reduce(operator.add)
in addition to current Pythonic way of
    reduce( operator.add,
        f4(x) for x in
            ( f2(y) for y in range(100) if f1(y))
        if f3(x) )
Though longer - Java way seems to be a bit more readable as the notation follows the data flow sequence.
Do you think it is worth a PEP?
BR,
Jacek
OK, if method chaining is a problem, what about:
    from StreamAlgebra import Map, Filter, Reduce
    from operator import add
    range(100) | Filter(f1) | Map(f2) | Filter(f3) | Map(f4) | Reduce(add)
BR,
Jacek
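For readers wondering how the `|` spelling could work at all, here is a minimal sketch. `StreamAlgebra` is not a module known to exist; the `Stage`, `Map`, `Filter` and `Reduce` classes below, and the helper functions `is_odd` and `square`, are hypothetical stand-ins written only for illustration. The sketch relies on Python's reflected-operator protocol: because `range` (and the `map`/`filter` objects) do not define `|` for these operands, Python falls back to the right-hand operand's `__ror__`.

```python
from functools import reduce
from operator import add


class Stage:
    """Hypothetical pipeline stage: `iterable | stage` ends up in __ror__."""

    def __init__(self, func):
        self.func = func

    def __ror__(self, iterable):
        # Called when the left operand cannot handle `|` itself.
        return self.apply(iterable)


class Map(Stage):
    def apply(self, iterable):
        return map(self.func, iterable)


class Filter(Stage):
    def apply(self, iterable):
        return filter(self.func, iterable)


class Reduce(Stage):
    def apply(self, iterable):
        # Terminal stage: collapses the iterable into a single value.
        return reduce(self.func, iterable)


def is_odd(n):
    return n % 2 == 1


def square(n):
    return n * n


# Sum of the squares of the odd numbers below 10.
total = range(10) | Filter(is_odd) | Map(square) | Reduce(add)
```

Every stage except `Reduce` returns a lazy iterator, so nothing is computed until the final stage consumes the pipeline.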
On 10 April 2014 23:09, Jacek Pliszka <jacek.pliszka@gmail.com> wrote:
OK, if method chaining is a problem, what about:
    from StreamAlgebra import Map, Filter, Reduce
    from operator import add
    range(100) | Filter(f1) | Map(f2) | Filter(f3) | Map(f4) | Reduce(add)
It's not so much that these types of things are "a problem" - more that they are simply not common usage in Python, and so are unfamiliar. You can certainly write code that uses method chaining, or your StreamAlgebra module. They are entirely legitimate as third-party code. What's less clear is that they belong in the core / stdlib, because core code typically should be idiomatic, and these styles are not idiomatic Python (for better or worse).
Paul
Some guy did a similar thing, just better: https://github.com/JulienPalard/Pipe
BTW, changing the built-in iterators would require a change in the ABCs.
João Bernardo
On 10 April 2014 19:09, Jacek Pliszka <jacek.pliszka@gmail.com> wrote:
OK, if method chaining is a problem, what about:
    from StreamAlgebra import Map, Filter, Reduce
    from operator import add
    range(100) | Filter(f1) | Map(f2) | Filter(f3) | Map(f4) | Reduce(add)
BR,
Jacek
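One possible reading of the remark about the ABCs (an interpretation, not something spelled out in the thread) is that built-in iterators only count as `collections.abc.Iterator` instances virtually, so a method added to the ABC alone would not show up on them. The short check below illustrates that; the variable names are arbitrary.

```python
from collections.abc import Iterator

list_iter = iter([1, 2, 3])
gen = (x for x in range(3))

# Both objects pass the isinstance() check against the ABC...
list_iter_is_iterator = isinstance(list_iter, Iterator)
gen_is_iterator = isinstance(gen, Iterator)

# ...but neither type actually inherits from it, so mixin methods
# defined on Iterator are not available on these objects.
list_iter_inherits = Iterator in type(list_iter).__mro__
gen_inherits = Iterator in type(gen).__mro__
```

Under this reading, giving every iterator a `map` or `filter` method would mean touching each concrete iterator type as well as the ABC.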
On Fri, Apr 11, 2014 at 6:55 AM, João Bernardo <jbvsmo@gmail.com> wrote:
Some guy did a similar thing, just better: https://github.com/JulienPalard/Pipe
João Bernardo
It's interesting how that one puts the logic in the pipe elements instead of the iterators. Here's one that does the method chaining route, kind of an "Itertools: the class".
http://code.activestate.com/recipes/498272-rich-iterator-wrapper/
I like that it connects indexing to `islice`.
Mark Daoust
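To make the "method chaining plus indexing via `islice`" idea concrete, here is a small sketch. It is not the code from the linked recipe; the class name `RichIter` and its methods are hypothetical and written from scratch, assuming only the standard `itertools.islice` and `functools.reduce`.

```python
from functools import reduce
from itertools import islice
from operator import add


class RichIter:
    """Hypothetical wrapper that adds chainable methods to any iterable."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        return RichIter(map(func, self._it))

    def filter(self, pred):
        return RichIter(filter(pred, self._it))

    def reduce(self, func):
        return reduce(func, self._it)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slicing stays lazy by delegating to islice
            # (which supports only non-negative bounds).
            return RichIter(islice(self._it, index.start, index.stop, index.step))
        # A single non-negative index consumes items up to that position.
        for item in islice(self._it, index, None):
            return item
        raise IndexError("RichIter index out of range")


first_three = list(
    RichIter(range(100)).filter(lambda n: n % 2 == 1).map(lambda n: n * n)[:3]
)
total = RichIter(range(10)).map(lambda n: n + 1).reduce(add)
```

Because the wrapper accepts any iterable, it sidesteps the question of which built-in types would have to grow new methods, at the cost of an explicit wrapping call.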
On 10 April 2014 22:12, Jacek Pliszka <jacek.pliszka@gmail.com> wrote:
I would like to be able to write:
    range(100).\
        filter( f1 ).\
        map( f2 ).\
        filter( f3 ).\
        map( f4 ).\
        reduce(operator.add)
in addition to current Pythonic way of
    reduce( operator.add,
        f4(x) for x in
            ( f2(y) for y in range(100) if f1(y))
        if f3(x) )
The current Pythonic way is
    longname_1 = (f2(shortname) for shortname in range(100) if f1(shortname))
    longname_2 = sum(f4(shortname) for shortname in longname_1 if f3(shortname))
If "shortname" and "longname_i" aren't to taste, find a context and assign good ones. Further, find me a plausibly-real-world example where my 2-line version is *less* readable than the Java one. Then I'll consider it a fair fight.
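As a quick sanity check that these spellings compute the same thing, here is a runnable sketch. The bodies of `f1` to `f4` are invented placeholders (the thread never defines them). Note that the nested form only parses when the generator expression passed to `reduce` is wrapped in its own parentheses, since it is not the sole argument.

```python
from functools import reduce
import operator


# Hypothetical stand-ins for the thread's f1..f4.
def f1(n):
    return n % 2 == 1  # keep odd numbers


def f2(n):
    return n * n


def f3(n):
    return n % 3 == 0  # keep multiples of three


def f4(n):
    return n + 1


# Nested single-expression form (generator expression parenthesized).
nested = reduce(
    operator.add,
    (f4(x) for x in (f2(y) for y in range(100) if f1(y)) if f3(x)),
)

# Two-step form with named intermediate generators.
squares = (f2(n) for n in range(100) if f1(n))
two_step = sum(f4(s) for s in squares if f3(s))

# Built-in map/filter form, read inside out.
builtin_form = sum(map(f4, filter(f3, map(f2, filter(f1, range(100))))))
```

All three are lazy until the final `sum` or `reduce` call; they differ only in how the data flow reads on the page.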
From: Jacek Pliszka <jacek.pliszka@gmail.com> Sent: Thursday, April 10, 2014 2:12 PM
What do you think about adding map, flatmap, filter and reduce methods to generator type ?
That wouldn't help your intended use case, because range is not a generator. In fact, most iterables are not generators. Non-iterators like list and dict, iterators defined as classes, iterators returned by builtins and C extension modules, etc. are not generators either.
So, do you want to somehow add this to all possible iterable types? Or do you want to force people to wrap an iterable inside an unnecessary generator (x for x in spam) just so they can call these methods on the wrapper? Or… ?
And this isn't just a side issue; this gets to the heart of the difference between Python and Java. Java requires everything to be hammered into its OO paradigm. These are methods in Java because everything has to be a method in Java. In C++, Haskell, OCaml, or just about anything besides Java (and its cousins, like Ruby and various .NET languages), they're generic or polymorphic or duck-typed free functions that are defined once and work on any type that makes sense, instead of methods that have to be defined on every possible type that might have a use for them.
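The point about range can be checked directly. The sketch below uses only the standard library; the sample inputs are made up for illustration. It shows that `range(100)` is not a generator, that wrapping it produces one, and that free functions such as `map` and `sum` already work on every kind of iterable without any of them needing new methods.

```python
import types

r = range(100)
range_is_generator = isinstance(r, types.GeneratorType)

# The "unnecessary wrapper" mentioned above really is a generator.
wrapped = (x for x in r)
wrapper_is_generator = isinstance(wrapped, types.GeneratorType)

# The same free functions accept a list, tuple, dict (iterating its keys),
# range, list iterator, and generator alike.
sources = [
    [-1, 2],
    (-1, 2),
    {-1: "a", 2: "b"},
    range(-1, 3, 3),
    iter([-1, 2]),
    (x for x in (-1, 2)),
]
totals = [sum(map(abs, src)) for src in sources]
```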
I must admit I've seen and I like Java 8 notation and I think it might be more readable than Python way in a few occasions.
I would like to be able to write:
    range(100).\
        filter( f1 ).\
        map( f2 ).\
        filter( f3 ).\
        map( f4 ).\
        reduce(operator.add)
in addition to current Pythonic way of
    reduce( operator.add,
        f4(x) for x in
            ( f2(y) for y in range(100) if f1(y))
        if f3(x) )
This is not at all the Pythonic way to write it. And the fact that you think it is implies that maybe you're trying to solve a problem that doesn't exist.
First, you're using reduce(add) instead of sum. I think this creates a false problem: if you think in terms of "map, filter, reduce", then there's no way to get rid of some of the function calls piling up on the left. But if you really think about it, map and filter are different from reduce: they transform an iterable into an iterable, so you can call them any number of times in your sequence of transformations, but reduce transforms an iterable into a single value, so you only call it once. Which means there aren't function calls piling up on the left, there's exactly one function call on the left.
Also, you're trying to cram everything into one expression for no good reason, which forces you to come up with some idiosyncratic way to wrap it to 80 columns. In Java, creating unnecessary temporary variables is often considered an anti-pattern, probably because of its C heritage (where it can be a performance issue). In Python, this is instead a very common idiom.
Let's start with the simplest possible way to write this:
    r = range(100)
    r = filter(f1, r)
    r = map(f2, r)
    r = filter(f3, r)
    r = map(f4, r)
Now, taking advantage of comprehensions:
    r = range(100)
    r = (f2(x) for x in r if f1(x))
    r = (f4(x) for x in r if f3(x))
This may look like the syntax is hiding the real functionality, but that's only because the real functionality is invisible in your example, because you've named the functions f1, f2, f3, and f4. Try an example with realistic function names and it will look a lot different. And of course half the time, you don't actually have a function lying around, you just want to map or filter with some expression, in which case the sequence of comprehensions wins even bigger.
Compare:
    found_squares = (x**2 for x in range(100) if x in found)
    weighted_sum = sum(x / dups[x] for x in found_squares if x in found_dups)

    weighted_sum = range(100). \
        filter(lambda x: x in found). \  # or found.__contains__ if you insist
        map(lambda x: x**2). \
        filter(lambda x: x in found_dups). \
        map(lambda x: x / dups[x]). \
        reduce(operator.add)
You really think the second one is more readable? Notice that in the first one, everything is happening in the order of the data flow, you're not piling up function calls on the left, etc.; all the advantages you're looking for.
If you haven't read David Beazley's "Generator Tricks for System Programmers", google it for some great realistic examples (and some great background discussion, too).
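A runnable version of the comparison might look like the sketch below. The contents of `found`, `found_dups` and `dups` are invented sample data, since the thread never defines them, and the sketch assumes the second comprehension is meant to consume `found_squares`. Because the chained-method spelling does not exist in Python, the second form is written with the existing `functools.reduce`, `map` and `filter` instead.

```python
from functools import reduce
import operator

# Hypothetical sample data; the thread leaves these names undefined.
found = {1, 2, 3, 4}
found_dups = {4, 9, 25}
dups = {4: 2, 9: 3, 25: 5}

# Sequence of comprehensions, in data-flow order.
found_squares = (x**2 for x in range(100) if x in found)
weighted_sum = sum(x / dups[x] for x in found_squares if x in found_dups)

# The same pipeline with map/filter/reduce and lambdas, read inside out.
weighted_sum_functional = reduce(
    operator.add,
    map(
        lambda x: x / dups[x],
        filter(
            lambda x: x in found_dups,
            map(lambda x: x**2, filter(lambda x: x in found, range(100))),
        ),
    ),
)
```

With this data the squares 1, 4, 9 and 16 are produced, 4 and 9 survive the second filter, and both spellings add 4/2 and 9/3.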
participants (7)
- Andrew Barnert
- Haoyi Li
- Jacek Pliszka
- Joshua Landau
- João Bernardo
- Mark Daoust
- Paul Moore