
Andrew Barnert writes:
On May 10, 2020, at 22:36, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Andrew Barnert via Python-ideas writes:
A lot of people get this confused. I think the problem is that we don’t have a word for “iterable that’s not an iterator”,
I think part of the problem is that people rarely see explicit iterator objects in the wild. Most of the time we encounter iterator objects only implicitly.
We encounter iterators in the wild all the time, we just don’t usually _care_ that they’re iterators instead of “some kind of iterable”, and I think that’s the key distinction you’re looking for.
It *is* the distinction I'm making with the word "explicit". I never use "next" on an open file. I'm not sure your more precise statement is better. I think the real difference is that I'm thinking of "people" as including my students who have no clue what an iterator does and don't care what an iterable is, they just cargo cult

    with open("file") as f:
        for line in f:
            do_stuff(line)

while as you point out (and I think is appropriate in this discussion) some people who are discussing proposed changes are using the available terminology incorrectly, and that's not good.
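To make the "implicit iterator" point concrete, here is a minimal sketch showing that an opened file object is its own iterator, so the usual `for line in f` loop is driving `next` behind the scenes. The temporary file and its contents are made up purely for illustration.

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained
# (the file name and contents are illustrative only).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first\nsecond\nthird\n")
    path = tmp.name

with open(path) as f:
    same_object = iter(f) is f    # a file object is its own iterator
    first = next(f)               # explicit use of the iterator protocol
    rest = [line for line in f]   # the loop resumes after the line already read

os.remove(path)

print(same_object, repr(first), rest)
```

The loop picks up at the second line because there is only one iteration state, and it lives in the file object itself.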
Still, having clear names with simple definitions would help that problem without watering down the benefits.
I disagree. I agree there's "amortized zero" cost to the crowd who would use those names fairly frequently in design discussions, but there is a cost to the "lazy in the technical sense" programmer, who might want to read the documentation if it gave "simple answers to simple questions", but not if they have to wade through a thicket of "twisty subtle definitions all alike" to get to the simple answer, and especially not if it's not obvious after all that what the answer is. It also makes conversations with experts fraught, as those experts will tend to provide more detail and precision than the questioner wants (speaking for myself, anyway!) "Not every one-sentence explanation needs terminology in the documentation."
But that last thing is exactly the behavior you expect from “things like list, dict, etc.”, and it’s hard to explain, and therefore hard to document.
Um, you just did *explain* it, quite well IMHO, you just didn't *name* it. ;-)
Well, it was a long, and redundant, explanation, not something you’d want to see in the docs or even a PEP.
The part I was referring to was the three or so lines preceding in which you defined the behavior desired for views etc. I guess to define terminology for all the variations that might be relevant would be long (and possibly unavoidably redundant).
“lazy” as in it creates something that acts like a list or a set, but hasn’t actually stored a list or set or other data structure in memory or done a bunch of up-front CPU work. You’re right that a more precise definition would probably include range but not dict_keys, but I think people do use it in a way that includes both, and that’s part of the reason they’re equally confused into thinking both are iterators.
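A short sketch of why neither `range` nor a keys view is an iterator, even though both are "lazy" in the loose sense used above. It uses only the standard `collections.abc` classes; the variable names are illustrative.

```python
from collections.abc import Iterable, Iterator

r = range(3)
d = {"a": 1, "b": 2}
keys = d.keys()

# Both are iterables, but neither is an iterator.
range_is_iterable = isinstance(r, Iterable)
range_is_iterator = isinstance(r, Iterator)
keys_is_iterable = isinstance(keys, Iterable)
keys_is_iterator = isinstance(keys, Iterator)

# Every pass over an iterable asks for a fresh iterator,
# so a second pass starts from the beginning again.
first_pass = list(r)
second_pass = list(r)

# A keys view stores no copy of the keys; it reflects later changes to the dict.
d["c"] = 3
keys_after = list(keys)

# An actual iterator, by contrast, is used up after one pass.
it = iter(r)
once = list(it)
twice = list(it)

print(range_is_iterator, keys_is_iterator, second_pass, keys_after, twice)
```

The last two lines are the behavior people usually have in mind when they (incorrectly) call a range or a view an iterator.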
This is another reason why I am not optimistic that more (and preferably, better ;-) terminology would help. We're already abusing terms that have fairly precise definitions in an analogous but wrong context. And there are better analogies. Instead of saying "views are lazy X" (I don't even know what X is being made lazy here!), we could borrow from Scheme and say views are "hygienic aliases". But we don't. Before we invent more terms for Humpty Dumpty to abuse, we should teach Humpty Dumpty a thing or two about the words he already knows.
And not having names for things, even if they _are_ well explained somewhere, makes that problem hard to solve. A shorthand description is usually vague and it’s not clear where to go to get clarification; a name is at least as vague but it’s obvious what to search for to get the exact definition (if there’s not already a link right there).
In principle, I agree. In practice, nothing's perfect, and there are countervailing issues (especially misuse of the new names).
Isn't manual reset exactly what you want from a resettable iterator, though?
Yes. I certainly use seek(0) on files, and it’s a perfectly cromulent concept, it’s just not the concept I’d want on a range or a keys view or a sequence slice.
But you *don't* use seek(0) on files (which are not iterators, and in fact don't actually exist inside of Python, only names for them do). You use it on opened *file objects*, which are iterators. When you open a file again, by default you get a new iterator which begins at the beginning, as you want for those others. My point is that none of the other types you mention are iterators. The difference with files is just that they happen to exist in Python as iterables. But after

    r = range(n)
    ri = iter(r)
    for i in ri:
        if i > n_2: break

you want the next "for j in ri:" to start where you left off, no?

Did you confuse iterable with iterator, or did I miss your point, or is there a third possibility? ;-)

Steve
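The range example can be run as a rough sketch like the one below. The concrete values of `n` and `half` (standing in for `n_2`) are assumptions chosen for illustration, and `io.StringIO` is used as a stand-in for an opened file object so that the snippet needs no file on disk.

```python
import io

n = 10
half = n // 2          # assumed stand-in for n_2 in the example above

r = range(n)           # an iterable, not an iterator
ri = iter(r)           # an explicit iterator over it

seen_first = []
for i in ri:
    if i > half:
        break          # the value that triggered the break is already consumed
    seen_first.append(i)

seen_second = [j for j in ri]   # continues where the first loop stopped
restarted = [j for j in r]      # iterating the range itself starts over

# A file-like object is an iterator with a manual reset via seek(0).
f = io.StringIO("a\nb\n")
first_read = list(f)
exhausted = list(f)             # nothing left without a reset
f.seek(0)
after_seek = list(f)

print(seen_first, seen_second, restarted[:3], exhausted, after_seek)
```

Looping over `ri` resumes, looping over `r` restarts, and the file-like object only restarts when it is explicitly told to.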