[Python-ideas] Re: Enhancing iterator objects with map, filter, reduce methods

Nov. 28, 2021

Hi Stephen,

Stephen J. Turnbull wrote:
      ...
1.  Is dataflow/fluent programming distinguishable from whatever it
    was that Guido didn't like about method chaining idioms?  If so,
    how?
Are you referring to this
https://mail.python.org/pipermail/python-dev/2003-October/038855.html?
He mentioned (if I may summarize) (1) familiarity with the API and
(2) making mistakes. In fluent programming (at least in the implementation
that I suggested for iterators), it is possible to make mistakes. E.g.

    pipeline([1,2,3]).reduce(lambda a,b: a+b).map(lambda x: x+1)

is a mistake, because some methods reduce and therefore return a
non-sequence object instead of self.

But with all due respect, you _do_ need to be familiar with the API to use
it, so I don't see why (1) could be an issue. And with familiarity, you would
make fewer mistakes.
...
2.  Is the method chaining syntax preferable to an alternative
    operator?
I don't have an answer to this. I personally like method chaining, and
this is partly because the handful of languages that I'm familiar with
use method chaining.
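
The reduce-then-map mistake from the answer to question 1 can be reproduced with a minimal sketch. The `Pipeline` class and its method names are hypothetical stand-ins for the suggested API, not an existing stdlib type. Here `map` returns a new wrapper so the chain can continue, while `reduce` returns a plain value and ends it.

```python
from functools import reduce as _reduce


class Pipeline:
    """Hypothetical wrapper around an iterable (illustration only)."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        # Intermediate step: returns another Pipeline, so chaining continues.
        return Pipeline(map(func, self._it))

    def reduce(self, func):
        # Terminal step: returns whatever the reduction produces (here an int).
        return _reduce(func, self._it)


# Correct order: transform first, reduce last.
total = Pipeline([1, 2, 3]).map(lambda x: x + 1).reduce(lambda a, b: a + b)

# Mistaken order: reduce returns an int, which has no .map attribute.
error = None
try:
    Pipeline([1, 2, 3]).reduce(lambda a, b: a + b).map(lambda x: x + 1)
except AttributeError as exc:
    error = exc
```

In this sketch the mistake surfaces immediately as an AttributeError, which is the kind of error that familiarity with the API would help avoid.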
...
3.  Is there an obvious choice for the implementation?  Specifically,
    there are at least three possibilities:
    a.  Restrict it to mutable sequences, and do the transformations
        in place.
    b.  Construct nested iterators and listify the result only if
        desired.
    c.  Both.
The aim of this implementation is to replace chained function calls
from the itertools module (including the builtins map and filter):

    list(starmap(..., filter(..., chain(..., map(..., ...)))))

with something similar that reads from left to right instead. And because
the functions from the itertools module take any iterable (regardless of
mutability), the implementation should do the same, which is (b)
in your list.
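
As a rough illustration of the difference in reading order, the sketch below writes the same transformation twice: once as nested calls and once with a hypothetical `Pipeline` wrapper. The class, its methods, and the sample data are assumptions made for this example only. Each method simply wraps the corresponding lazy function, so nothing is listified until `list()` is called, which corresponds to option (b).

```python
from itertools import chain, starmap


class Pipeline:
    """Hypothetical left-to-right wrapper over lazy iterators (option b)."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        return Pipeline(map(func, self._it))

    def chain(self, *iterables):
        return Pipeline(chain(self._it, *iterables))

    def filter(self, pred):
        return Pipeline(filter(pred, self._it))

    def starmap(self, func):
        return Pipeline(starmap(func, self._it))


data = [(1, 2), (3, 4)]
extra = [(10, 20)]

# Nested style: the first step (map) is the innermost call.
nested = list(
    starmap(
        lambda a, b: a + b,
        filter(
            lambda p: p[0] > 2,
            chain(map(lambda p: (p[0] * 2, p[1]), data), extra),
        ),
    )
)

# Chained style: the same steps, written in the order they run.
chained = list(
    Pipeline(data)
    .map(lambda p: (p[0] * 2, p[1]))
    .chain(extra)
    .filter(lambda p: p[0] > 2)
    .starmap(lambda a, b: a + b)
)
```

Both expressions build the same nested iterators; only the order in which the steps appear on the page differs.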
...
4.  Is this really so tricky that the obvious implementation of the
    iterator approach (Chris's) needs to be in the stdlib with tons of
    methods on it, or does it make more sense have applications write
    one with the specific methods needed for the application?
      Or perhaps instead of creating a generic class prepopulated with
    methods, maybe this should be a factory function which takes a
    collection of mapping functions, and adds them to the dataflow
    object on the fly?
I think this boils down to the itertools module (was thinking about 
it over the weekend).

I find that the itertools module and some builtins like map and filter
don't do themselves justice when chained together. It's okay for
one or two function calls. But the design makes it seem like it was
never meant to be chained (or was it?). Attempts to do
so lead to code that must be read from right to left, making it an
awkward API to use for transforming collections (as most of us
might agree).

If it was indeed built for one or two function calls, then I would
argue that it's not really a usable or practical module, because
we often perform not just one or two but multiple
transformations on collections.

So, to answer this question, I don't think the issue is whether the 
implementation is tricky such that the stdlib should do it. Rather, 
*our* itertools module itself is tricky to use, because fundamentally 
its design is not user-friendly, or rather limiting to the users. And this 
is a problem. Head over to StackOverflow and most people wouldn't 
recommend using it. It's not well-liked (except maybe by Lisp-ers). 
It's most probably because of what I mentioned in the previous 
paragraph.

What does this mean for us? I think it's a good opportunity for us 
to rethink the design to make it more usable. Hence, I'm putting 
the onus on us (stdlib), instead of relying on 3rd party libraries to 
improve on it. 

As a proposal to improve the design, I suggested above a higher-
level API for the itertools module that says "oh you want to use the 
itertools module? yeah it's a low-level module that is not meant to 
be used directly so here's a higher level API you can use instead."
The implementation doesn't have to be method chaining because 
I'm generally proposing a higher-level API.
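
Since the higher-level API need not be method chaining, one possible shape is a plain function that applies steps from left to right. The `pipe` helper below is a hypothetical sketch, not an existing itertools function; it only assumes that each step is a callable taking an iterable and returning an iterable (or, for the last step, a final value).

```python
from functools import partial
from itertools import islice


def pipe(iterable, *steps):
    """Hypothetical helper: feed `iterable` through each step in order."""
    result = iterable
    for step in steps:
        result = step(result)
    return result


def take(n, iterable):
    # Same idea as the `take` recipe in the itertools docs, but kept lazy.
    return islice(iterable, n)


evens_squared = pipe(
    range(10),
    partial(filter, lambda x: x % 2 == 0),  # keep even numbers
    partial(map, lambda x: x * x),          # square them
    partial(take, 3),                       # first three results
    list,                                   # materialize at the end
)
```

The steps are still the existing builtins and itertools functions; the helper only changes the order in which they are written.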

Now, I've stated the usability problem of the itertools module pretty
much as a matter of fact, and put the onus on us to rework
it. But what does everyone else think about this? Do you share the
same concerns?
...
5.  map() and zip() take multiple iterables.  Should this feature
    handle those cases?  Note that the factory function approach
    allows the client to make this decision for themselves.
I would say no for map and yes for zip, viewing it from the perspective
of the underlying iterator. The .map() instance method only refers to the
underlying iterator, so it should only take a function that will transform every
element in the underlying iterator. For zip, we can take multiple iterables
because we are zipping them with the underlying iterator. But this is my
opinion and is a detail that we can come to a consensus on later.
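
A minimal sketch of that split, assuming a hypothetical `Pipeline` wrapper (the class and method signatures are illustrative, not a settled design): `.map()` accepts only a function, while `.zip()` accepts any number of additional iterables.

```python
class Pipeline:
    """Hypothetical wrapper; only .map() and .zip() are sketched."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def map(self, func):
        # Only the underlying iterator is mapped; no extra iterables.
        return Pipeline(map(func, self._it))

    def zip(self, *others):
        # The underlying iterator is zipped with any number of iterables.
        return Pipeline(zip(self._it, *others))


pairs = list(
    Pipeline([1, 2, 3])
    .map(lambda x: x * 10)
    .zip("abc", [True, False, True])
)
```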
...
6.  What are the names that you propose for the APIs?  They need to
    indicate the implementations since there are various ways to
    implement.
I propose the names be similar to those in builtins + itertools:

    map (map_every to indicate a different implementation? though not
    conventional), filter, reduce, starmap, starfilter, zip, enumerate

some from the Itertools Recipes section that might be more common:

    flatten, nth, take

some 'reductional' ones:

    reduce, sum, all, any, min, max, join (for string iterators)

some hybrid

    flat_map, filter_map

some which 

    for_each (returning None, though this is a for-loop).
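
To make a few of these names concrete, here is a sketch of how the hybrid methods and for_each might behave. The semantics are assumptions, since the list above only gives names: flat_map is taken to mean "map, then flatten one level", filter_map to mean "map, then drop None results", and the `Pipeline` class itself is hypothetical.

```python
from itertools import chain


class Pipeline:
    """Hypothetical wrapper; method semantics are assumed for illustration."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self._it

    def flat_map(self, func):
        # Assumed meaning: map, then flatten one level.
        return Pipeline(chain.from_iterable(map(func, self._it)))

    def filter_map(self, func):
        # Assumed meaning: map, then drop results that are None.
        return Pipeline(x for x in map(func, self._it) if x is not None)

    def for_each(self, func):
        # Consumes the iterator for side effects and returns None.
        for item in self._it:
            func(item)


letters = list(Pipeline(["ab", "cd"]).flat_map(list))
halves = list(
    Pipeline([1, 2, 3, 4]).filter_map(lambda x: x // 2 if x % 2 == 0 else None)
)
seen = []
result = Pipeline([1, 2]).for_each(seen.append)
```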

Raimi bin Karim