On Sat, May 9, 2020 at 1:03 PM Andrew Barnert <abarnert@yahoo.com> wrote:

> I haven’t read the whole thing yet, but one thing immediately jumped out at me:
>
>> and methods on containers, such as dict.keys return iterators in Python 3, 
>
> No they don’t. They return views—objects that are collections in their own right (in particular, they’re not one-shot; they can be iterated over and over) but just delegate to another object rather than storing the data.
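To make the distinction concrete, here is a minimal sketch of the view behaviour described above, using nothing but a plain dict (the variable names are only for illustration):

```python
d = {"a": 1, "b": 2}
keys = d.keys()            # a view object, not a one-shot iterator

first_pass = list(keys)
second_pass = list(keys)   # iterating again works; nothing was consumed

d["c"] = 3                 # the view delegates to the dict, so it sees this
after_update = list(keys)

# Views also support len() and membership tests without copying the keys.
size_and_membership = (len(keys), "c" in keys)
```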

Thanks -- that's the kind of thing that led me to say that this is probably not ready for a PEP.

But I don't think that invalidates the idea at all -- there is debate about what an "islice" should return, but an iterable view would be a good option.

I'm inclined to think that it would be a bad idea to have it return a full sequence view object, and not sure it should do anything other than be iterable.
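For reference, a small sketch of the two options that exist today: plain slicing, which copies, and itertools.islice, which gives a one-shot iterator. An iterable view would presumably sit somewhere in between; the variable names here are just for illustration.

```python
from itertools import islice

data = list(range(10))

copied = data[2:8:2]           # list slicing builds a new list
lazy = islice(data, 2, 8, 2)   # itertools.islice returns a one-shot iterator

lazy_first = list(lazy)
lazy_second = list(lazy)       # the iterator is already exhausted
```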

> People also commonly say that range is an iterator instead of a function that returns a list in Python 3,

Sure, but I don't say that :-) -- a range object is actually a pretty full immutable sequence -- which is pretty handy. But when people say that, they are often being careless, rather than wrong.
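A quick sketch of what "pretty full immutable sequence" means in practice (the variable names are only for illustration):

```python
r = range(0, 20, 2)

length = len(r)
fourth = r[3]            # indexing works
tail = r[-2:]            # slicing a range gives another range, not a list
found = 8 in r           # membership test
position = r.index(8)    # sequence methods are available too

# A range can be iterated any number of times.
same_each_time = list(r) == list(r)
```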

At least I'd like to claim that about my saying dict.keys() returns an iterator ;-) -- the point of that part of the document is that many things in Py3 do NOT return fully realized copies, like Py2 did.

> And this is important here, because a view is what you ideally _want_. The reason range, key view, etc. are views rather than iterators isn’t that it’s easier to implement or explain or anything, it’s that it’s a little harder to implement and explain but so much more useful that it’s worth it. It’s something people take advantage of all the time in real code.

Maybe -- but "all the time"? I'd venture to say that absolutely the most common thing done with, e.g., dict.keys() is to iterate over it. But yes, having it be a view with other features is handy.

> And this is pretty easy to implement. I have a quick and dirty version at https://github.com/abarnert/slices, but I think I may have a better version somewhere with more unit tests.

Thanks -- I'll take a look.

> For prior art specifically on slicing as a view, rather than just views in general, see memoryview (which only works on buffers, not all sequences) and NumPy (which is weird in many ways, but people rely on slicing giving you a storage-sharing view)

I am a long-time numpy user, and yes, I very much take advantage of the memory sharing view.

But I do not think that would be a good idea for the standard library. A numpy slice returns a full-fledged numpy array that shares its data buffer with its "host" array. That is really helpful for performance -- moving large blocks of data around is expensive -- but it's also pretty confusing. And it would be a lot more problematic with, e.g., lists, as the underlying buffer can be reallocated. numpy arrays are mutable but not resizable: once you've made one, its data buffer does not change.
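Since numpy isn't in the standard library, here is a rough sketch of the same two points using memoryview over a bytearray (the other prior art mentioned above) instead of numpy. It shows storage sharing through a slice, and that CPython refuses to resize a bytearray while a view is exported; a list has no such guard. The variable names are only for illustration.

```python
buf = bytearray(b"abcdefgh")
base = memoryview(buf)
view = base[2:6]             # slicing a memoryview shares storage, no copy

view[0] = ord("X")           # writing through the view changes buf
shared = bytes(buf)

try:
    buf.append(0)            # resizing could move the buffer under the view
    resize_allowed = True
except BufferError:
    resize_allowed = False

# Once every view is released, the bytearray can be resized again.
view.release()
base.release()
buf.append(0)
final_length = len(buf)
```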

> The reason I never proposed this for the stdlib (even though that would allow adding methods directly onto the builtin container types, as your proposal does) is that I always want to build a _complete_ view library, with replacements for map, zip, enumerate, all of itertools, etc., and with enough cleverness to present exactly as much functionality as is possible.

And I have my doubts about it anyway :-)

> But just replacing islice is a much simpler task (mainly because the input has to be a sequence and the output is always a sequence, so the only complexity that arises is whether you want to allow mutable views into mutable sequences), and it may well be useful on its own.

Agreed. And while yes, dict_keys and friends are not JUST iterators, they also aren't very functional views, either. They are not sequences, certainly not mutable sequences. And:

> (in particular, they’re not one-shot; they can be iterated over and over)

yes -- and each call to iter() on one returns a new, independent iterator, so no iteration state is carried over from one pass to the next.

So you can iterate over one as often as you like, but that, plus len() and membership tests, is about all you get; there is no indexing or slicing.

In short -- not having thought about it deeply at all, but I'm thinking that making a SliceIterator very similar to dict_keys and friends would make a lot of sense.
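As a very rough sketch of what I have in mind -- the class name SliceView and its signature are made up for illustration, not a worked-out proposal -- something that only supports iteration and len(), much as dict_keys does, and never copies the underlying sequence:

```python
class SliceView:
    """Hypothetical iterable-only view of a slice of a sequence (no copy)."""

    def __init__(self, seq, start=None, stop=None, step=None):
        self._seq = seq
        self._slice = slice(start, stop, step)

    def _indices(self):
        # Resolve the slice against the current length, so the view
        # follows later changes to the underlying sequence.
        return range(*self._slice.indices(len(self._seq)))

    def __len__(self):
        return len(self._indices())

    def __iter__(self):
        seq = self._seq
        for i in self._indices():
            yield seq[i]


data = list(range(10))
view = SliceView(data, 2, 8, 2)

first = list(view)
second = list(view)       # can be iterated again
data[4] = "changed"       # the view delegates to data, so it sees this
after = list(view)
```

Whether such an object should also grow indexing, or handle the sequence being resized mid-iteration more gracefully, is exactly the open question.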


Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython