[Python-ideas] Suggested MapView object (Re: __len__() for map())
Chris Barker - NOAA Federal
chris.barker at noaa.gov
Tue Dec 11 12:01:27 EST 2018
Perhaps I got confused by the early part of this discussion.
My point was that there is no “map-like” object at the Python level.
(That is no Map abc).
Py2’s map produced a sequence. Py3’s map produced an iterable.
So any API that was expecting a sequence could accept the result of a
py2 map, but not a py3 map. There is absolutely nothing special about
map here.
The example of range has been brought up, but I don’t think it’s
analogous — py2 range returns a list, py3 range returns an immutable
sequence. Because that’s as close as we can get to a sequence while
preserving the lazy evaluation that is wanted.
I _think_ someone may be advocating that map() could return an
iterable if it is passed a iterable, and a sequence of it is passed a
sequence. Yes, it could, but that seems like a bad idea to me.
But folks are proposing a “map” that would produce a lazy-evaluated
sequence. Sure — as Paul said, put it up on pypi and see if folks find
it useful.
Personally, I’m still finding it hard to imagine a use case where you
need the sequence features, but also lazy evaluation is important.
Sure: range() has that, but it came at almost zero cost, and I’m not
sure the sequence features are used much.
Note: the one use-case I can think of for a lazy evaluated sequence
instead of an iterable is so that I can pick a random element with
random.choice(). (Try to pick a random item from. a dict), but that
doesn’t apply here—pick a random item from the source sequence
instead.
But this is specific example of a general use case: you need to access
only a subset of the mapped sequence (or access it out of order) so
using the iterable version won’t work, and it may be large enough that
making a new sequence is too resource intensive.
Seems rare to me, and in many cases, you could do the subsetting
before applying the function, so I think it’s a pretty rare use case.
But go ahead and make it — I’ve been wrong before :-)
-CHB
Sent from my iPhone
> On Dec 11, 2018, at 6:47 AM, Steven D'Aprano <steve at pearwood.info> wrote:
>
>> On Mon, Dec 10, 2018 at 05:15:36PM -0800, Chris Barker via Python-ideas wrote:
>> [...]
>> I'm still confused -- what's so wrong with:
>>
>> list(map(func, some_iterable))
>>
>> if you need a sequence?
>
> You might need a sequence. Why do you think that has to be an *eager*
> sequence?
>
> I can think of two obvious problems with eager sequences: space and
> time. They can use too much memory, and they can take too much time to
> generate them up-front and too much time to reap when they become
> garbage. And if you have an eager sequence, and all you want is the
> first item, you still have to generate all of them even though they
> aren't needed.
>
> We can afford to be profligate with memory when the data is small, but
> eventually you run into cases where having two copies of the data is one
> copy too many.
>
>
>> You can, of course mike lazy-evaluated sequences (like range), and so you
>> could make a map-like function that required a sequence as input, and would
>> lazy evaluate that sequence. This could be useful if you weren't going to
>> work with the entire collection,
>
> Or even if you *are* going to work with the entire collection, but you
> don't need them all at once. I once knew a guy whose fondest dream was
> to try the native cuisine of every nation of the world ... but not all
> in one meal.
>
> This is a classic time/space tradeoff: for the cost of calling the
> mapping function anew each time we index the sequence, we can avoid
> allocating a potentially huge list and calling a potentially expensive
> function up front for items we're never going to use. Instead, we call
> it only on demand.
>
> These are the same principles that justify (x)range and dict views. Why
> eagerly generate a list up front, if you only need the values one at a
> time on demand? Why make a copy of the dict keys, if you don't need a
> copy? These are not rhetorical questions.
>
> This is about avoiding the need to make unnecessary copies for those
> times we *don't* need an eager sequence generated up front, keeping the
> laziness of iterators and the random-access of sequences.
>
> map(func, sequence) is a great candidate for this approach. It has to
> hold onto a reference to the sequence even as an iterator. The function
> is typically side-effect free (a pure function), and if it isn't,
> "consenting adults" applies. We've already been told there's at least
> one major Python project, Sage, where this would have been useful.
>
> There's a major functional language, Haskell, where nearly all sequence
> processing follows this approach.
>
> I suggest we provide a separate mapview() type that offers only the lazy
> sequence API, without trying to be an iterator at the same time. If you
> want an eager sequence, or an iterator, they're only a single function
> call away:
>
> list(mapview_instance)
> iter(mapview_instance) # or just stick to map()
>
> Rather than trying to guess whether people want to treat their map
> objects as sequences or iterators, we let them choose which they want
> and be explicit about it.
>
> Consider the history of dict.keys(), values() and items() in Python 2.
> Originally they returned eager lists. Did we try to retrofit view-like
> and iterator-like behaviour onto the existing dict.keys() method,
> returning a cunning object which somehow turned from a list to a view to
> an iterator as needed? Hell no! We introduced *six new methods* on
> dicts:
>
> - dict.iterkeys()
> - dict.viewkeys()
>
> and similar for items() and values().
>
> Compared to that, adding a single variant on map() that expects a
> sequence and returns a view on the sequence seems rather timid.
>
>
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
More information about the Python-ideas
mailing list