[Python-ideas] Membership of infinite iterators

Terry Reedy tjreedy at udel.edu
Mon Oct 16 17:12:15 EDT 2017


On 10/15/2017 9:12 PM, Jason Campbell wrote:
> I recently came across a bug where checking negative membership 
> (__contains__ returns False) of an infinite iterator will freeze the 
> program.
> 
> 
> It may seem a bit obvious, but you can check membership in range for 
> example without iterating over the entire range.

A range is a finite (non-iterator) *iterable*, not an *iterator*.  In 
particular, a range is a finite sequence.  In Python 2, ranges were 
implemented as lists.  Python 3 now uses a more compact implementation. 
Either way, a ranges have a .__contains__ method.  (In Py3 it is O(1) 
instead of O(n)).

> `int(1e100) in range(int(1e101))` on my machine takes about 1.5us
> `int(1e7) in itertools.count()` on my machine takes about 250ms (and 
> gets worse quite quickly).

Non-iterator iterables and iterators are different things. It you 
compared iter(range(n)) with itertools.count(n), you would get similar 
results.

> Any membership test on the infinite iterators that is negative will 
> freeze the program as stated earlier, which is odd to me.

Membership testing with iterators is generally a bad idea.  In general, 
iterators do not have a .__contains__ method, so at best, one exhausts 
the iterator, making it useless.  In particular, generators do not have 
any non-iterator, non-generator methods.  The itertools module has 
iterator classes rather than generator functions because it is coded in 
C, but they are closely equivalent in behavior to the generator 
functions given in the doc.

> itertools.count can use the same math as range

You could use range(start, 10000000000000000000[, step]) instead of 
count(start[, step]), or wrap count in an infinite sequence Count class 
that has all the methods and arithmetic of range.  The __iter__ method 
would return a count iterator.

> itertools.cycle could use membership from the underlying iterable

If the underlying iterable is an iterator, this does not work.  You 
could define a Cycle class that requires that the input have .__contain__.

> itertools.repeat is even easier, just compare to the repeatable element

Again, create a Repeat(ob) class whose .__iter__ returns repeat(ob) and 
that has .__contains__(x) returning 'x == ob'.

> Does anyone else think this would be useful?

itertools is not going to switch from iterators to non-iterator iterables.

-- 
Terry Jan Reedy



More information about the Python-ideas mailing list