On Sat, May 9, 2020 at 9:11 PM Andrew Barnert <abarnert@yahoo.com> wrote:

> I don’t think it invalidates the basic idea at all, just that it suggests the design should be different.

Originally, dict returned lists for keys, values, and items. In 2.2, iterator variants were added. In 3.0, the list and iterator variants were both replaced with view versions, which were enough of an improvement that they were backported to 2.x. Because a view does cover almost all of the uses of both a sequence copy and an iterator. And I think the same is true here.

Probably yes.
I'm inclined to think that it would be a bad idea to have it return a full sequence view object, and not sure it should do anything other than be iterable.

Why? What’s the downside to being able to do more with them for the same performance cost and only a little more up-front design work?

I'm not worried about the design work -- writing a PEP is a LOT more work than writing the code for this kind of thing :-) And I'll bet folks smarter than me will want to help out with the code part, if this goes anywhere.
> And this is important here, because a view is what you ideally _want_. The reason range, key view, etc. are views rather than iterators isn’t that it’s easier to implement or explain or anything, it’s that it’s a little harder to implement and explain but so much more useful that it’s worth it. It’s something people take advantage of all the time in real code.

Maybe -- but "all the time?" I'd venture to say that absolutely the most common thing done with, e.g. dict.keys() is to iterate over it.

Really? When I just want to iterate over a dict’s keys, I iterate the dict itself.

True -- I was thinking more of ALL the various "iterables that were concretized lists in py2" -- dict_keys() is actually unique in that dict itself provides an iterator of the keys. I've seen a lot of code like so:

for k in dict.keys():
    ...

and

if k in dict.keys():
    ...

both of which are completely unnecessary. So actually, I'd say that dict.keys() gets used either less often, or when it's not really needed. But you're right: given that, when dict_keys is used where it should be, it would be for other reasons. I'll bet it's kind of rare though.
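A small check of why the .keys() call is redundant in both snippets. The dict `d` and the variable names are made up for illustration:

```python
d = {"a": 1, "b": 2, "c": 3}

# Iterating the dict itself yields the same keys, in the same order,
# as iterating d.keys().
keys_direct = [k for k in d]
keys_via_view = [k for k in d.keys()]

# A membership test on the dict is a membership test on its keys.
in_direct = "b" in d
in_via_view = "b" in d.keys()

print(keys_direct == keys_via_view)  # True
print(in_direct == in_via_view)      # True
```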

And dict_items and dict_values are probably most often used as iterables.

> That’s no more of a problem for a list slice view than for any of the existing views. The simplest way to implement a view is to keep a reference to the underlying object and delegate to it, which is effectively what the dict views do.

Fair enough. Though you still could get potentially surprising behavior if the original sequence's length is changed.
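A minimal sketch of the "keep a reference and delegate" approach, to make the length-change behavior concrete. The class name `SliceView` and everything about its API are hypothetical; this is not what the draft PEP specifies, just one way such a view could behave:

```python
from collections.abc import Sequence


class SliceView(Sequence):
    """Hypothetical read-only view of seq[start:stop:step] (illustration only)."""

    def __init__(self, seq, slc):
        self._seq = seq    # keep a reference to the underlying sequence, not a copy
        self._slice = slc

    def _range(self):
        # Resolve the slice against the *current* length of the underlying sequence.
        return range(*self._slice.indices(len(self._seq)))

    def __len__(self):
        return len(self._range())

    def __getitem__(self, index):
        if isinstance(index, slice):
            raise TypeError("slicing a SliceView is left out of this sketch")
        # range indexing raises IndexError when out of bounds, which is what
        # the Sequence mixin methods (such as __iter__) rely on.
        return self._seq[self._range()[index]]


data = list(range(10))
view = SliceView(data, slice(2, 8))

before = list(view)   # [2, 3, 4, 5, 6, 7]
del data[4:]          # shrink the underlying list
after = list(view)    # [2, 3]
data.clear()
empty = list(view)    # []
```

With this design the view silently shrinks as the underlying list does, which is the potentially surprising part.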

> (You _could_ instead refuse to allow expanding a sequence when there’s a live view, as bytearray does with memoryview, but I don’t think that’s necessary here. It’s only needed there a consequence of the fact that the buffer protocol is provided in C rather than in Python. For a slice view, it would just make things more complicated and less functional for no good reason.)

But it would also be, well, weird -- you create a view with a slice of a given length, and then the underlying sequence is changed, and then your view object is, well, totally different, it may not even exist (well, be length-zero, I suppose).

And you probably don't want to lock the "host" anyway -- that could be very confusing if the view is kept alive somewhere far from the code trying to change the sequence.
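For comparison, this is roughly what the "locked host" behavior looks like today with bytearray and memoryview. The variable names are made up for the example:

```python
buf = bytearray(b"abcdef")
view = memoryview(buf)

try:
    buf.extend(b"gh")      # resizing while a memoryview is exported
    resized_while_viewed = True
except BufferError:
    resized_while_viewed = False

view.release()             # drop the export
buf.extend(b"gh")          # resizing is allowed again

print(resized_while_viewed)  # False
print(buf)                   # bytearray(b'abcdefgh')
```

The BufferError shows up at the resize, not where the view was created, which is where the confusion could come from if the two are far apart.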

This is all a bit less complicated for the dict views, because none of them provide a mapping interface anyway.

The other question is -- should a view of a mutable sequence be mutable (and mutate the underlying sequence)? That's how numpy arrays work, but it does require a certain finesse to keep track of.
> But just replacing islice is a much simpler task (mainly because the input has to be a sequence and the output is always a sequence, so the only complexity that arises is whether you want to allow mutable views into mutable sequences), and it may well be useful on its own.

Agreed. And while yes, dict_keys and friends are not JUST iterators, they also aren't very functional views, either. They are not sequences,

That’s not true. They are very functional—as functional as reasonably makes sense. The only reason they’re not Sequences is that they’re views on dicts, so indexing makes little sense, but set operations do—and they are in fact Sets. (Except for values.)

certainly not mutable sequences.

Well, yes, but mutating a dict through its views wouldn’t make sense in the first place:

    >>> d = {1: 2}
    >>> k = d.keys()
    >>> k |= 3

not for keys, but it would at least be possible for dict_items, and even potentially for dict_values, though yes, that would be really confusing.

> So I think it might be better to leave mutation out of the original version anyway unless someone has a need to it (at which point we can use the examples to think through the best answers to the design issues).

Yeah, I'm heading that way too.

> yes, but they are only a single iterator -- if you call iter() on one you always get the same one back, and its state is preserved.

No, that’s not true. Each call to iter() returns a completely independent iterator each time, with its own independent state that starts at the head of the view.
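A quick way to see this for a keys view (the example dict is arbitrary):

```python
d = {"a": 1, "b": 2, "c": 3}
k = d.keys()

i1 = iter(k)
i2 = iter(k)

first = next(i1)     # 'a'
second = next(i1)    # 'b'
other = next(i2)     # 'a' -- i2 starts from the head, unaffected by i1

distinct = i1 is not i2              # True: two separate iterator objects
view_has_next = hasattr(k, "__next__")  # False: the view itself is not an iterator
```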

Sorry -- total brain blip on my part -- I tested that out before posting, but had a typo that totally invalidated the test --arrg!

I'm still a bit confused about what a dict.* view actually is -- for instance, a dict_keys object pretty much acts like a set, but it isn't a subclass of set, and it has an isdisjoint() method, but not .union or any of the other set methods. But it does have what at a glance looks like a pretty complete set of dunders....
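One plausible reading, sketched below: dict_keys is registered with the collections.abc.KeysView ABC, which derives from collections.abc.Set. That ABC covers the operator dunders plus isdisjoint(), but not the named methods (union, intersection, ...) that the concrete set type adds. The example dict is arbitrary:

```python
from collections.abc import KeysView, Set

d = {"a": 1, "b": 2}
keys = d.keys()

is_keys_view = isinstance(keys, KeysView)      # True
is_abstract_set = isinstance(keys, Set)        # True: KeysView derives from Set
is_builtin_set = isinstance(keys, set)         # False: not a subclass of set
has_isdisjoint = hasattr(keys, "isdisjoint")   # True
has_union = hasattr(keys, "union")             # False

# The set operators work and return ordinary sets.
common = keys & {"b", "c"}                     # {'b'}

# The view stays live: it reflects later changes to the dict.
d["c"] = 3
sees_new_key = "c" in keys                     # True
```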

Anyway, a Sequence view is simpler, because it could probably simply be an immutable sequence -- not much need for contemplating every bit of the API.

I do see a possible objection here though. Making a small view of a large sequence would keep that sequence alive, which could be a memory issue. Which is one reason why slices don't do that by default. And it could simply be a buyer beware issue. But the more featureful you make a view, the more likely it is that they will get used and passed around and kept alive without the programmer realizing the implications of that.
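The existing dict views already behave this way, which gives a small illustration of the keep-alive concern. The sizes and names below are arbitrary:

```python
d = {n: None for n in range(100_000)}
keys = d.keys()

del d                      # drop the only direct name for the dict

# The view holds its own reference to the dict, so the whole dict is
# still alive and still reachable through the view.
still_there = len(keys)    # 100000

# A slice of a list, by contrast, is an independent copy.
big = list(range(100_000))
small = big[:3]
del big                    # nothing in `small` refers back to the big list
copied = small             # [0, 1, 2]
```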

Food for thought.

Now I need to think about how to write this all up -- which is why I wasn't sure I was ready to bring this up, but now I have, so more to do!

PRs accepted on my draft!

https://github.com/PythonCHB/islice-pep/blob/master/islice.py

    >>> d[7] = 8
    >>> next(i1)
    RuntimeError: dictionary changed size during iteration
    >>> i3 = iter(k)
    >>> next(i3)

That's probably a feature we'd want to emulate.
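For reference, a self-contained version of that behavior. The dict contents here are made up and are not the ones from the quoted session:

```python
d = {1: 2, 3: 4}
k = d.keys()

i1 = iter(k)
next(i1)                   # consume one key so the iterator is mid-stream

d[7] = 8                   # change the dict's size under the live iterator

try:
    next(i1)
    message = None
except RuntimeError as exc:
    message = str(exc)     # 'dictionary changed size during iteration'

# The view itself is fine; a fresh iterator starts over on the updated dict.
i3 = iter(k)
first_after = next(i3)     # 1
all_keys = list(k)         # [1, 3, 7]
```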

> Basically, views are not like iterators at all, except in that they save time and space by being lazy.

Well, this is a vocabulary issue -- an "iterable" or "iterator" is anything that follows the protocol, so yes, they very much ARE iterables (and iterators) even though they also have some additional behavior.

Which is why it's not wrong to say that a range object is an iterator, but it IS wrong to say that it's JUST an iterator ...

> Such a resettable-iterator thing (which would have some precedent in file objects, I suppose) would actually be harder to implement, on top of being less powerful and potentially confusing. And the same is true for slices.

but the dict_keys iterator does seem to do that ...

In [48]: dk
Out[48]: dict_keys(['0', '1', '2', '3', '4', '5', '6', '7', '8', '9'])

In [49]: list(dk)
Out[49]: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

In [50]: list(dk)
Out[50]: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']


In short -- not having thought about it deeply at all, but I'm thinking that making a SliceIterator very similar to dict_keys and friends would make a lot of sense.

Yes, as long as that means being a full-featured normal collection (in this case a Sequence rather than a Set), not a resettable iterator.

Yup -- I was pretty much only disagreeing due to my ignorance of the dict views -- thanks for the lesson!

-CHB


--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython