[Python-ideas] Introduce collections.Reiterable
mistersheik at gmail.com
Fri Sep 20 12:18:47 CEST 2013
On Fri, Sep 20, 2013 at 5:48 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Thu, Sep 19, 2013 at 11:02:57PM +1000, Nick Coghlan wrote:
> > On 19 September 2013 22:18, Steven D'Aprano <steve at pearwood.info> wrote:
> > > At the moment, dict views aren't directly iterable (you can't call
> > > next() on them). But in principle they could have been designed as
> > > re-iterable iterators.
> > That's not what iterable means. The iterable/iterator distinction is
> > well defined and reflected in the collections ABCs:
> Actually, I think the collections ABC gets it wrong, according to both
> common practice and the definition given in the glossary:
Where does the glossary disagree with collections.abc?
> More on this below.
> As for my comment above, dict views don't obey the iterator protocol
> themselves, as they have no __next__ method, nor do they obey the
> sequence protocol, as they are not indexable. Hence they are not
> *directly* iterable, but they are *indirectly* iterable, since they have
> an __iter__ method which returns an iterator.
What you're calling "indirectly iterable" is what the docs call "Iterable"
and what collections.abc calls Iterable, right?
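
To make that reading concrete, here is a small sketch, assuming Python 3.3+ where the ABCs live in collections.abc. The dict d and the variable names are only illustrative:

```python
from collections.abc import Iterable, Iterator

d = {"a": 1, "b": 2}  # illustrative data
view = d.keys()

# A dict view has __iter__ but no __next__: Iterable, but not an Iterator.
view_is_iterable = isinstance(view, Iterable)
view_is_iterator = isinstance(view, Iterator)

# Each pass asks the view for a fresh iterator, so it can be traversed twice.
first_pass = list(view)
second_pass = list(view)

# An iterator returns itself from __iter__ and is exhausted after one pass.
it = iter(view)
it_is_self = iter(it) is it
consumed = list(it)
leftover = list(it)  # nothing left on the second attempt
```

So the view is "indirectly iterable" in exactly the sense that the Iterable ABC captures, while the object returned by iter(view) is the one-shot iterator.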
> I don't think this is a critical distinction. I think it is fine to call
> views "iterable", since they can be iterated over. On the rare occasion
> that it matters, we can just do what I did above, and talk about objects
> which are directly iterable (e.g. iterators, sequences, generator
> objects) and those which are indirectly iterable (e.g. dict views).
> > * iterables are objects that return iterators from __iter__.
> That definition is incomplete, because iterable objects include those
> that obey the sequence protocol. This is not only by long-standing
> tradition (pre-dating the introduction of iterators, if I remember
> correctly), but also as per the definition in the glossary. Alas,
> collections.Iterable gets this wrong:
> py> from collections.abc import Iterable
> py> class Seq:
> ...     def __getitem__(self, index):
> ...         if 0 <= index < 5: return index+1000
> ...         raise IndexError
> ...
> py> s = Seq()
> py> isinstance(s, Iterable)
> False
> py> list(s)  # definitely iterable
> [1000, 1001, 1002, 1003, 1004]
PEP 3119 makes it clear that isinstance(obj, collections.Sequence) is the de
facto way of checking whether something is a sequence. Converting to a list
is not. Therefore, Seq is neither Iterable nor a Sequence according to
collections.abc. If you inherit from collections.Sequence (you'll also need
to implement __len__), you get the Iterable behaviour for free, as desired:
Sequence subclasses Iterable.
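
A minimal sketch of that suggestion, assuming Python 3.3+ (collections.abc). It reuses the five values from the quoted Seq; the fixed length of 5 is simply chosen to match that example:

```python
from collections.abc import Iterable, Sequence


class Seq(Sequence):
    """Same five values as the quoted Seq, but declared as a Sequence."""

    def __getitem__(self, index):
        if 0 <= index < 5:
            return index + 1000
        raise IndexError(index)

    def __len__(self):
        return 5


s = Seq()
is_sequence = isinstance(s, Sequence)
is_iterable = isinstance(s, Iterable)
items = list(s)  # uses the __iter__ mixin that Sequence provides
```

With the ABC as a base class, both isinstance checks succeed and iteration goes through the inherited __iter__ rather than the legacy __getitem__ fallback.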
> (Note that although Seq obeys the sequence protocol, and can be
> iterated over, it is not a fully-fledged Sequence since it has no
I guess we disagree that Seq obeys the sequence protocol.
> I think this is a bug in the Iterable ABC, but I'm not sure how one
> might fix it.
> > * iterators are the subset of iterables that return "self" from
> > __iter__, and expose a next (2.x) or __next__ (3.x) method
> That is certainly correct. All iterators are iterables, but not all
> iterables are iterators.
> > That "iterators return self from __iter__" is important, since almost
> > everywhere Python iterates over something, it calls "_itr = iter(obj)"
> > first.
> And then falls back on the sequence protocol.
> > So, my question is a genuine one. While, *in theory*, an object can
> > define a stateful __iter__ method that (e.g.) only works the first
> > time it is called, or returns a separate object that still stores its
> > "current position" information on the original container, I simply
> > can't think of a non-pathological case where "isinstance(obj,
> > Iterable) and not isinstance(obj, Iterator)" would give the wrong
> > answer.
> > In theory, yes, an object could obviously pass that test and still not
> > be Reiterable, but I'm interested in what's true in *practice*.
> I don't think you and I are actually in disagreement here. This is
> Python, and one could write an iterator class that is reiterable, or an
> iterable object (as determined by isinstance) which cannot be iterated
> over, but I think we can dismiss them as pathological cases. Even if
> such unusual objects are useful, it is the caller's responsibility, not
> the callee's, to use them safely and appropriately with functions that
> are expecting them.
Is it possible to minimize the mental load on the caller by encapsulating the
distinction between parameters that accept iterables and reiterables? One
of the big problems with C++ for example is the great care that must be
taken, e.g. to not write past the ends of arrays. A small mistake can take
a week to track down. One does become more careful with years of
experience, but it is much simpler if the language prevents such
catastrophes. For me, Python has been this language in many ways.
Reiterables would be another such defensively motivated distinction. Of
course, you could just ask callers to "be more careful", but I don't see
the problem with fixing the language specification so that Antoine's
Reiterable adaptor works properly.
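
To illustrate the kind of defensive check I have in mind, here is a rough sketch built on the test Nick quoted ("isinstance(obj, Iterable) and not isinstance(obj, Iterator)"). The names looks_reiterable and min_and_max are hypothetical, and this is not Antoine's adaptor, only a heuristic with the known gaps discussed in this thread:

```python
from collections.abc import Iterable, Iterator


def looks_reiterable(obj):
    """Heuristic from the thread: iterable, but not its own iterator."""
    return isinstance(obj, Iterable) and not isinstance(obj, Iterator)


def min_and_max(values):
    """Hypothetical function that needs two passes over its argument."""
    if not looks_reiterable(values):
        raise TypeError("expected a re-iterable, got a one-shot iterator")
    return min(values), max(values)  # two separate passes


ok = min_and_max([3, 1, 2])

try:
    min_and_max(x for x in [3, 1, 2])  # a generator is a one-shot iterator
    rejected = False
except TypeError:
    rejected = True
```

The function fails loudly on a generator instead of silently seeing an empty second pass. Note that this heuristic also rejects objects like the quoted Seq, which are iterable only through __getitem__.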