[Python-ideas] __iter__ implies __contains__?

Terry Reedy tjreedy at udel.edu
Mon Oct 3 05:45:19 CEST 2011


On 10/2/2011 1:28 PM, Guido van Rossum wrote:
> On Sun, Oct 2, 2011 at 10:05 AM, Guido van Rossum<guido at python.org>  wrote:
>> On Sat, Oct 1, 2011 at 10:13 PM, Raymond Hettinger
>> <raymond.hettinger at gmail.com>  wrote:
>>>
>>> On Oct 1, 2011, at 2:13 PM, Antoine Pitrou wrote:
>>>
>>> I honestly didn't know we exposed such semantics, and I'm wondering if
>>> the functionality is worth the astonishment:
>>>
>>> Since both __iter__ and __contains__ are deeply tied to "in-ness",
>>> it isn't really astonishing that they are related.
>>> For many classes, if "any(elem==obj for obj in s)" is True,
>>> then "elem in s" will also be True.
>>> Conversely, it isn't unreasonable to expect this code to succeed:
>>>
>>>     for elem in s:
>>>         assert elem in s
>>>
>>> The decision to make __contains__ work whenever __iter__ is defined
>>> probably goes back to Py2.2.   That seems to have worked out well
>>> for most users, so I don't see a reason to change that now.
>>
>> +1
>
> Correction, I read this the way Raymond meant it, not the way he wrote
> it, and hit Send too quickly. :-(
>
> The problem here seems to be that collections/abc.py defines Iterable
> to have __iter__ but not __contains__, but the Python language defines
> the 'in' operator as trying __contains__ first, and if that is not
> defined, using __iter__.
>
> This is not surprising given Python's history, but it does cause some
> confusion when one compares the ABCs with the actual behavior. I also
> think that the way the ABCs have it makes more sense -- for single-use
> iterables (like files) the default behavior of "in" exhausts the
> iterator which is costly and fairly useless.
>
> Now, should we change "in" to only look for __contains__ and not fall
> back on __iter__? If we were debating Python 3's feature set I would
> probably agree with that, as a clean break with the past and a clear
> future. Since we're debating Python 3.3, however, I think we should
> just lay it to rest and use the fallback solution proposed: define
> __contains__ on files to raise TypeError

That would break legitimate code that uses 'in file'.
The following works as stated:

if 'START\n' in f:
    for line in f:
        ...  # process lines after the START line
else:
    ...  # there are none

There would have to be a deprecation process. But see below.
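
A minimal sketch of why the pattern above works, using io.StringIO as a stand-in for a real file (an assumption made only to keep the example self-contained). Since the object defines no __contains__, 'in' falls back to iteration and stops at the first match, which leaves the iterator positioned just after the START line. When there is no match, the same test consumes everything.

```python
import io

# io.StringIO stands in for an open text file here; like a real file,
# it is its own iterator and does not define __contains__.
f = io.StringIO("header\nSTART\nalpha\nbeta\n")

remaining = []
if 'START\n' in f:              # consumes lines up to and including 'START\n'
    for line in f:              # resumes with the line after START
        remaining.append(line)

# A failed membership test runs the iterator to exhaustion.
g = io.StringIO("one\ntwo\n")
found = 'START\n' in g
leftover = list(g)              # nothing is left to read

print(remaining, found, leftover)
```

The second half illustrates the cost Guido mentions: on a single-use iterable, a miss leaves nothing to iterate over afterwards.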

> and leave the rest alone.
> Maybe make a note for Python 4. Maybe add a recommendation to PEP 8 to
> always implement __contains__ if you implement __iter__.

[Did you mean __next__?]
It seems to me better that functions that need a re-iterable,
non-iterator input should check for the absence of .__next__ to exclude
*all* iterators, including file objects. There is no need to complicate
our nice, simple, minimal iterator protocol.

if hasattr(reiterable, '__next__'):
    raise TypeError("non-iterator required")
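
As a rough sketch, the check above could be wrapped in a small helper and used by a function that makes more than one pass over its input. The names require_reiterable and cross_pairs are hypothetical, chosen only for illustration. Note that the check only rules out iterators; it does not verify that the argument is iterable at all.

```python
import io


def require_reiterable(obj):
    """Return obj unchanged, or raise TypeError if it is an iterator."""
    # Iterators define __next__; containers such as lists and dicts do not.
    if hasattr(obj, '__next__'):
        raise TypeError("non-iterator required")
    return obj


def cross_pairs(items):
    """Pair every item with every item, which needs repeated passes."""
    items = require_reiterable(items)
    return [(a, b) for a in items for b in items]


print(cross_pairs([1, 2]))

# Generators, iter() results, and file-like objects are all rejected.
for bad in (iter([1, 2]), (x for x in [1, 2]), io.StringIO("a\nb\n")):
    try:
        cross_pairs(bad)
    except TypeError as exc:
        print(type(bad).__name__, "rejected:", exc)
```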

> But let's not break existing code that depends on the current behavior
> -- we have better things to do than to break perfectly fine working
> code in a fit of pedantry.

-- 
Terry Jan Reedy



