[Python-ideas] __iter__ implies __contains__?

Guido van Rossum guido at python.org
Mon Oct 3 05:58:48 CEST 2011


On Sun, Oct 2, 2011 at 8:45 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 10/2/2011 1:28 PM, Guido van Rossum wrote:
>> The problem here seems to be that collections/abc.py defines Iterable
>> to have __iter__ but not __contains__, but the Python language defines
>> the 'in' operator as trying __contains__ first, and if that is not
>> defined, using __iter__.
>>
>> This is not surprising given Python's history, but it does cause some
>> confusion when one compares the ABCs with the actual behavior. I also
>> think that the way the ABCs have it makes more sense -- for single-use
>> iterables (like files) the default behavior of "in" exhausts the
>> iterator which is costly and fairly useless.
>>
>> Now, should we change "in" to only look for __contains__ and not fall
>> back on __iter__? If we were debating Python 3's feature set I would
>> probably agree with that, as a clean break with the past and a clear
>> future. Since we're debating Python 3.3, however, I think we should
>> just lay it to rest and use the fallback solution proposed: define
>> __contains__ on files to raise TypeError
>
> That would break legitimate code that uses 'in file'.
> The following works as stated:
>
> if 'START\n' in f:
>     for line in f:
>         ...  # process lines after the START line
> else:
>     ...  # there are none
>
> There would have to be a deprecation process. But see below.

Hm. That code sample looks rather artificial. (Though now that I have
seen it I can't help thinking that it might fit the bill for
somebody... :-)
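
For concreteness, here is a minimal sketch of what that sample relies on. It assumes an in-memory io.StringIO stands in for a real file, and the sample contents are made up for illustration. Because the object has no __contains__, "in" falls back to iterating, which consumes lines up to and including the match:

```python
import io

# Hypothetical file contents; io.StringIO stands in for an open text file.
f = io.StringIO("header\nSTART\nalpha\nbeta\n")

# No __contains__ here, so 'in' iterates and stops at the first match.
found = 'START\n' in f
# Iteration resumes where the membership test stopped.
remaining = list(f)

# When the marker is absent, the test exhausts the whole file.
g = io.StringIO("a\nb\n")
missing = 'START\n' in g
leftover = list(g)
```

After a failed test nothing is left to read, which is the costly case for single-use iterables.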

>> and leave the rest alone.
>> Maybe make a note for Python 4. Maybe add a recommendation to PEP 8 to
>> always implement __contains__ if you implement __iter__.
>
> [Did you mean __next__?]

No, I really meant __iter__. Because in Python 4 I would be okay with
not using a loop as a fallback if __contains__ doesn't exist. So if in
Py3 "x in a" works by using __iter__, you would have to keep it
working in Py4 by defining __contains__.
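
A small sketch of the difference, using hypothetical class names (Countdown and CountdownWithContains are invented for illustration): the first class supports "in" only through the __iter__ fallback, while the second defines __contains__ explicitly and so would not depend on that fallback.

```python
class Countdown:
    """Hypothetical iterable that defines __iter__ only."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # A fresh iterator on every call, so the object is re-iterable.
        return iter(range(self.start, 0, -1))


class CountdownWithContains(Countdown):
    """Same iterable, with an explicit __contains__."""

    def __contains__(self, value):
        # Answer membership directly instead of looping over __iter__.
        return isinstance(value, int) and 0 < value <= self.start


# Works today only because 'in' falls back to __iter__.
fallback_result = 3 in Countdown(5)
# Works through __contains__, with no loop involved.
explicit_result = 3 in CountdownWithContains(5)
```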

And no, I don't expect Py4 within this decade...

> It seems to me better that functions that need a re-iterable non-iterator
> input should check for the absence of .__next__ to exclude *all* iterables,
> including file objects. There is no need to complicate our nice, simple,
> minimal iterator protocol.
>
> if hasattr(reiterable, '__next__'):
>     raise TypeError("non-iterator required")

That's a different issue -- you're talking about preventing bad use of
__iter__ in the calling class. I was talking about supporting "in" by
the defining class.
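
To make the caller-side check concrete, here is a rough sketch of the quoted idea wrapped in a helper. The function name require_reiterable is hypothetical; only the hasattr test comes from the quoted text.

```python
import io


def require_reiterable(obj):
    """Reject iterators (anything with __next__); return obj otherwise."""
    if hasattr(obj, '__next__'):
        raise TypeError("non-iterator required")
    return obj


# A list has __iter__ but no __next__, so it passes.
accepted = require_reiterable([1, 2, 3])

# File-like objects are their own iterators, so they are rejected.
try:
    require_reiterable(io.StringIO("a\nb\n"))
    file_rejected = False
except TypeError:
    file_rejected = True
```

This guards the function that consumes the iterable; it says nothing about whether the class itself should support "in".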

>> But let's not break existing code that depends on the current behavior
>> -- we have better things to do than to break perfectly fine working
>> code in a fit of pedantry.

Still, most people in this thread seem to agree that "x in file" works
by accident, not by design, and is more likely to do harm than good,
and many have in fact proposed various more serious ways of making it
not work in (I presume) Py3.3.

-- 
--Guido van Rossum (python.org/~guido)