[Python-ideas] Introduce collections.Reiterable

Sat Sep 21 23:08:06 CEST 2013

On Sep 21, 2013, at 0:21, Neil Girdhar <mistersheik at gmail.com> wrote:

> I'm happy with iterable and not iterator if it comes with a promise.  Then my first ABC is what I probably want.  If not, then I think it's better to do something like
> 
> from abc import ABCMeta
> import collections.abc
>
> class Reiterable(collections.abc.Iterable, metaclass=ABCMeta):
>     @classmethod
>     def __subclasshook__(cls, subclass):
>         # Accept only the standard container ABCs; anything else must be register()ed.
>         if (issubclass(subclass, collections.abc.MappingView)
>             or issubclass(subclass, collections.abc.Sequence)
>             or issubclass(subclass, collections.abc.Set)
>             or issubclass(subclass, collections.abc.Mapping)):
>             return True
>         return NotImplemented

Which leaves out numpy arrays, most sorted list and dict classes from PyPI, ElementTree and similar element/node/etc. types, ScriptingBridge/appscript collections, win32com IWhateverCollections, and all kinds of other types that can be reiterated, which are correctly diagnosed by Iterable and not Iterator.

I haven't tested all of them, so some could fail to register as Iterable (especially given the possibility that Iterable may be incorrect, as mentioned elsewhere on this thread). But getting false negatives on a few types and having to deal with them by fixing a bug is surely better than getting false negatives on all types and having to deal with them by adding new, otherwise-unnecessary code.
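To be concrete about what "Iterable and not Iterator" means here, this is a minimal sketch of the check. The name looks_reiterable is made up for illustration; it is only a heuristic, not a guarantee of reiterability.

```python
from collections.abc import Iterable, Iterator

def looks_reiterable(obj):
    """Heuristic (hypothetical helper): iterable, but not itself an iterator."""
    return isinstance(obj, Iterable) and not isinstance(obj, Iterator)

# Containers and views pass the check.
print(looks_reiterable([1, 2, 3]))           # True
print(looks_reiterable({"a": 1}.keys()))     # True
print(looks_reiterable(range(3)))            # True

# One-shot iterators do not.
print(looks_reiterable(iter([1, 2, 3])))     # False
print(looks_reiterable(x for x in "abc"))    # False
```

Note that a class that iterates only through __getitem__ is not recognized by Iterable at all, so it would be a false negative for this check.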

> Other classes can be added with register.

So anyone who wants to use your module with numpy or appscript or ElementTree has to find all of the iterable types the class exposes (some of which aren't part of the public API--in the case of appscript or win32com they may even be built dynamically as needed) and register all of them?
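A rough sketch of what that looks like in practice, using the same whitelist as the proposed ABC. SortedBag is a hypothetical stand-in for any third-party container; it is not a real PyPI class.

```python
from collections.abc import Iterable, Mapping, MappingView, Sequence, Set

class Reiterable(Iterable):
    # Same whitelist as the proposal quoted above.
    @classmethod
    def __subclasshook__(cls, subclass):
        if issubclass(subclass, (MappingView, Sequence, Set, Mapping)):
            return True
        return NotImplemented

class SortedBag:
    """Hypothetical third-party container: iterable, but none of the ABCs above."""
    def __init__(self, items):
        self._items = sorted(items)

    def __iter__(self):
        # A fresh iterator on every call, so the object can be reiterated.
        return iter(self._items)

bag = SortedBag([3, 1, 2])
before = isinstance(bag, Reiterable)   # rejected, although it reiterates fine
Reiterable.register(SortedBag)         # the step every user has to remember
after = isinstance(bag, Reiterable)
print(before, after)                   # False True
```

The user has to repeat that register call for every such type, including ones they never see by name.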

You're putting the burden in the wrong place. Because you're worried that some class could theoretically be a non-reiterable non-iterator iterable, even though neither you nor anyone else can think of a sensible example of such a thing, you're requiring the user to certify that every single iterable class he uses is not pathological. That's not LBYL, that's perform a comprehensive survey and environmental impact report on the entire region and file papers in triplicate before you leap.

If you're really worried about this unlikely possibility making it hard to debug the use of your code with some as-yet-unknown type, there are easier ways to verify things. For example, if the iterable works the first time but is empty the second, the user has given you a non-reiterable, and you can assert or raise appropriately. That makes the error just as easy to debug as having forgotten to register with Reiterable, and far easier to debug than having mistakenly registered with Reiterable when they shouldn't have. Plus, this lets you test for exactly what you want, not just a rough approximation. You could just as easily verify that the first element of each iteration matches, to ensure that it's not a random-reiterable type like the one Terry discussed, which would ruin your particular two-pass algorithm. Or whatever is appropriate.
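Here is a minimal sketch of that kind of check, using a two-pass variance as a stand-in for "your particular two-pass algorithm". The function name and the choice of exceptions are assumptions made for illustration only.

```python
def two_pass_variance(values):
    """Population variance in two passes; `values` must be reiterable."""
    # Pass 1: count the items and compute the mean.
    count = 0
    total = 0.0
    for x in values:
        count += 1
        total += x
    if count == 0:
        raise ValueError("need at least one value")
    mean = total / count

    # Pass 2: sum the squared deviations from the mean.
    seen = 0
    squares = 0.0
    for x in values:
        seen += 1
        squares += (x - mean) ** 2

    # Verify what we actually care about: the second pass saw the same data.
    if seen == 0:
        raise TypeError("second pass was empty: got a one-shot iterator, "
                        "pass a reiterable such as a list")
    if seen != count:
        raise ValueError("the two passes yielded different numbers of items")
    return squares / count

print(two_pass_variance([1, 2, 3, 4]))    # 1.25

try:
    two_pass_variance(iter([1, 2, 3, 4]))
except TypeError as exc:
    print("rejected:", exc)
```

Passing an iterator fails loudly at the point of use, with no registration step required from anyone.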

> On Sat, Sep 21, 2013 at 3:04 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>> On Sep 20, 2013, at 21:52, Neil Girdhar <mistersheik at gmail.com> wrote:
>> 
>>> We discussed this upthread: I only want "not iterator" if not iterator promises reiterability. Right now, we have what may be a happy accident that can easily be violated by someone else.
>> 
>> And if you define your new ABC, it can be just as easily violated by someone else. In fact, it will be violated in the exact _same_ cases. There's no check you can do besides the reverse of the checks done by iterator.
>> 
>> More importantly, it's not just "a happy accident". I've asked repeatedly if anyone can come up with a single example of a non-iterator, non-reiterable iterable, or even imagine what one would look like, and nobody's come up with one. And it's not like iterators are some new feature nobody's had time to explore yet.
>> 
>> So, in order to solve a problem that doesn't exist, you want to add a new feature that wouldn't solve it any better than what we have today.
>> 
>>> Best,
>>> Neil
>>> 
>>> 
>>> On Sat, Sep 21, 2013 at 12:50 AM, Andrew Barnert <abarnert at yahoo.com> wrote:
>>>> On Sep 20, 2013, at 21:23, Neil Girdhar <mistersheik at gmail.com> wrote:
>>>> 
>>>>> I appreciate the discussion illuminating various aspects of this I hadn't considered. Finally, what I think I want is for
>>>>> * all sequences
>>>>> * all views
>>>>> * numpy arrays
>>>>> to answer yes to reiterable, and
>>>>> * all generators
>>>>> to answer no to reiterable.
>>>> 
>>>> All sequences, views, and numpy arrays answer no to iterator (and so do sets, mappings, etc.), and all generators answer yes (and so do the iterators you get back from calling iter on a sequence, map, filter, your favorite itertools function, etc.)
>>>> 
>>>> So you just want "not iterator". Even Haskell doesn't attempt to provide negative types like that. (And you can very easily show that it's iterator that's the normal type: it's syntactically checkable in various ways--e.g., hasattr(it, '__next__'), but the only positive way to check reiterable is not just semantic, but destructive.)
>>>> 
>>>>> Best, Neil
>>>>> 
>>>>> On Fri, Sep 20, 2013 at 10:12 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>>>>>> Terry Reedy writes:
>>>>>> 
>>>>>>  > Dismissing legal code as 'pathological', as more than one person has,
>>>>>>  > does not cut it as a design principle.
>>>>>> 
>>>>>> But you don't even need to write a class with __getitem__() to get
>>>>>> that behavior.
>>>>>> 
>>>>>> >>> l = [11, 12, 13]
>>>>>> >>> for i in l:
>>>>>> ...  print(i)
>>>>>> ...  if i%2 == 0:
>>>>>> ...   l.remove(i)
>>>>>> ...
>>>>>> 11
>>>>>> 12
>>>>>> >>> l
>>>>>> [11, 13]
>>>>>> >>>
>>>>>> 
>>>>>> Of course the iteration itself is probably buggy (ie, the writer
>>>>>> didn't mean to skip printing '13'), but in general iterables can
>>>>>> change themselves.
>>>>>> 
>>>>>> Neil himself seems to be of two minds about such cases.  On the one
>>>>>> hand, he said the above behavior is built in to list, so it's
>>>>>> acceptable to him.  (I think that's inconsistent: I would say the
>>>>>> property of being completely consumed is built in to iterator, so it
>>>>>> should be acceptable, too.)  On the other hand, he's defined a
>>>>>> reiterable as a collection that when iterated produces the same
>>>>>> objects in the same order.
>>>>>> 
>>>>>> Maybe what we really want is for copy.deepcopy to do the right thing
>>>>>> with iterables.  Then code that doesn't want to consume consumable
>>>>>> iterables can do a deepcopy (including replication of the closed-over
>>>>>> state of __next__() for iterators) before iterating.
>>>>>> 
>>>>>> Or perhaps the right thing is a copy.itercopy that creates a new
>>>>>> composite object as a shallow copy of everything except that it clones
>>>>>> the state of __next__() in case the object was an iterator to start
>>>>>> with.
>>>>>> _______________________________________________
>>>>>> Python-ideas mailing list
>>>>>> Python-ideas at python.org
>>>>>> https://mail.python.org/mailman/listinfo/python-ideas
>>>>>> 
>>>>> 
> 