On Sun, Oct 2, 2011 at 10:20 PM, Greg Ewing wrote:
Terry Reedy wrote:
I had the same reaction as Guido. Iteration is the *only* generic way to tell if an item is in a sequence or other collection.
I think the root cause of this problem is our rather cavalier attitude to the distinction between iterables and iterators. They're really quite different things, but we started out with the notion that "for x in stuff" should be equally applicable to both, and hence decided to give every iterator an __iter__ method that returns itself. By doing that, we made it impossible for any generic protocol function to reliably tell them apart.
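To illustrate the point above, here is a minimal sketch (the variable names are only illustrative, not from the original message). It shows that a list and an iterator over it both answer to iter(), and that the iterator simply hands back itself, so a generic function calling iter() gets no signal about whether it is starting a new iteration or continuing an old one.

```python
from collections.abc import Iterable

items = [1, 2, 3]   # an iterable: each iter() call starts a new iteration
it = iter(items)    # an iterator: carries the state of one iteration

# Both count as "iterable", because iterators also define __iter__.
both_iterable = isinstance(items, Iterable) and isinstance(it, Iterable)

# An iterator's __iter__ returns the iterator itself...
iterator_returns_self = iter(it) is it

# ...whereas the list hands out a distinct, fresh iterator on each call.
list_returns_fresh = iter(items) is not iter(items)

print(both_iterable, iterator_returns_self, list_returns_fresh)
```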
If I were designing the iterator protocol over again, I think I would start by recognising that starting a new iteration and continuing with an existing one are very different operations, and that you almost always intend the former rather than the latter. So I would declare that "for x in stuff" always implies a *new* iteration, and devise another syntax for continuing an existing one, such as "for x from stuff". I would define iterables and iterators as disjoint categories of object, and give __iter__ methods only to iterables, not iterators.
However, at least until Py4k comes around, we're stuck with the present situation, which seems to include accepting that "x in y" will occasionally gobble an iterator that you were saving for later.
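A short sketch of the "gobbling" behaviour described above, using built-in list iterators (the names here are just for illustration). Since an iterator has no __contains__ of its own, "x in y" falls back to iterating, which consumes items up to and including the match, or the whole iterator if there is no match.

```python
numbers = iter([1, 2, 3, 4, 5])

# The membership test advances the iterator past 1, 2 and 3.
found = 3 in numbers
remaining = list(numbers)   # only what the test left behind

# A failed test has to look at everything, so nothing is left afterwards.
others = iter([1, 2, 3])
missing = 10 in others
leftover = list(others)

print(found, remaining)     # True [4, 5]
print(missing, leftover)    # False []
```

Running the same tests against the list itself rather than an iterator over it leaves the list untouched, since each test starts a new iteration.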
I think that's a fine analysis.

--
--Guido van Rossum (python.org/~guido)