[Python-Dev] bool(iter([])) changed between 2.3 and 2.4

Wed Sep 21 01:37:01 CEST 2005

[Guido]
> I just finished debugging some code that broke after upgrading to
> Python 2.4 (from 2.3). Turns out the code was testing list iterators
> for their boolean value (to distinguish them from None). In 2.3, a
> list iterator (like any iterator) is always true. In 2.4, an exhausted
> list iterator is false; probably by virtue of having a __len__()
> method that returns the number of remaining items.
> 
> I realize that this was a deliberate feature, and that it exists in
> 2.4 as well as in 2.4.1 and will in 2.4.2; yet, I'm not sure I *like*
> it. Was this breakage (which is not theoretical!) considered at all?

It was not considered.  AFAICT, 2.3 code that assumed an iterator is
always true was relying on an accidental implementation detail that may
not hold in Jython, PyPy, etc.  Nor does it hold universally for
arbitrary class-based iterators, which may define other methods,
including __nonzero__ or __len__.  The Boolean value of an
iterator is certainly not promised by the iterator protocol as specified
in the docs or the PEP.  The code, bool(it), is not really clear about
its intent and seems a little weird to me.  The reason it wasn't
considered was that it wasn't on the radar screen as even a possible use
case.
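
To make the class-based point concrete, here is a minimal sketch.  Both
classes (PlainIter, SizedIter) are hypothetical names made up for
illustration, and __next__ is the newer spelling of the 2.x next()
method (aliased below).  The only difference between the two is that
one defines __len__, which is enough to change its truth value:

```python
class PlainIter:
    """Hypothetical iterator with no __len__: its truth value is always True."""

    def __init__(self, items):
        self._remaining = list(items)

    def __iter__(self):
        return self

    def __next__(self):
        if not self._remaining:
            raise StopIteration
        return self._remaining.pop(0)

    next = __next__  # older spelling of the same method


class SizedIter(PlainIter):
    """Same iterator, plus __len__ reporting the number of items left."""

    def __len__(self):
        return len(self._remaining)


# Truth testing falls back to __len__ when no explicit truth method exists,
# so an exhausted (or empty) SizedIter is false while a PlainIter is not.
plain_truth = bool(PlainIter([]))
sized_truth_empty = bool(SizedIter([]))
sized_truth_full = bool(SizedIter([1, 2]))
```

Nothing in the iterator protocol distinguishes these two classes, which
is why code relying on bool(it) can behave differently across them.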

On a tangential note, I think in 2.2 or 2.3, we found a number of bugs
related to None testing.  IIRC, the outcome of that conversation was a
specific recommendation to NOT determine Noneness by Boolean tests.
That recommendation ended up making it into PEP 290:

    http://www.python.org/peps/pep-0290.html#testing-for-none
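
A small sketch of the difference, using two hypothetical helper
functions (the names are made up for illustration).  An empty list
stands in for any object that is false without being None:

```python
def count_items_buggy(items=None):
    """Boolean test: an empty container is mistaken for 'no argument'."""
    if items:
        return len(items)
    return None


def count_items(items=None):
    """Identity test: only None itself means 'no argument'."""
    if items is not None:
        return len(items)
    return None


buggy_result = count_items_buggy([])   # empty list treated like None
fixed_result = count_items([])         # empty list handled as a real value
```

The two functions agree for None and for non-empty input; they differ
only when the argument is a false value other than None.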

[Fred]
> think iterators shouldn't have length at all:
> they're *not* containers and shouldn't act that way.

Some iterators can usefully report their length with the invariant:
   len(it) == len(list(it)).
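
As a sketch of what that invariant means in practice, assume a
hypothetical Countdown iterator (not a real library class) whose
__len__ reports the number of items still to come.  The length has to
be read before the iterator is consumed, since list(it) exhausts it:

```python
class Countdown:
    """Hypothetical iterator yielding start, start-1, ..., 1."""

    def __init__(self, start):
        self._n = start

    def __iter__(self):
        return self

    def __next__(self):
        if self._n <= 0:
            raise StopIteration
        value = self._n
        self._n -= 1
        return value

    next = __next__  # older spelling of the same method

    def __len__(self):
        # Number of items remaining, never negative.
        return max(self._n, 0)


def invariant_holds(it):
    """Check len(it) == len(list(it)); note that this consumes the iterator."""
    reported = len(it)       # ask for the remaining length first
    actual = len(list(it))   # then drain the iterator and count
    return reported == actual


it = Countdown(5)
next(it)
next(it)                      # two items consumed, three remain
remaining_before = len(it)
ok = invariant_holds(it)
remaining_after = len(it)     # the check drained the iterator
```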

There are some use cases for having the length when available.  Also,
there has been plenty of interest in being able to tell, when possible,
if an iterator is empty without having to pull an item from it.
AFAICT, the only downside was Guido's bool(it) situation.

FWIW, the origin of the idea came from reading a comp-sci paper about
ways to overcome the limitations of linking operations together using
only iterators (the paper's terminology talked about map/fold
operations).  The issue was that decoupling benefits were partially
offset by the loss of useful information about the input to an operation
(i.e. the supplier may know and the consumer may want to know the input
size, the input type, whether the elements are unique, whether the data
is sorted, its provenance, etc.)

Raymond