[Python-ideas] Membership of infinite iterators
Terry Reedy
tjreedy at udel.edu
Mon Oct 16 17:12:15 EDT 2017
On 10/15/2017 9:12 PM, Jason Campbell wrote:
> I recently came across a bug where checking negative membership
> (__contains__ returns False) of an infinite iterator will freeze the
> program.
>
>
> It may seem a bit obvious, but you can check membership in range for
> example without iterating over the entire range.
A range is a finite (non-iterator) *iterable*, not an *iterator*. In
particular, a range is a finite sequence. In Python 2, ranges were
implemented as lists. Python 3 now uses a more compact implementation.
Either way, a ranges have a .__contains__ method. (In Py3 it is O(1)
instead of O(n)).
> `int(1e100) in range(int(1e101))` on my machine takes about 1.5us
> `int(1e7) in itertools.count()` on my machine takes about 250ms (and
> gets worse quite quickly).
Non-iterator iterables and iterators are different things. It you
compared iter(range(n)) with itertools.count(n), you would get similar
results.
> Any membership test on the infinite iterators that is negative will
> freeze the program as stated earlier, which is odd to me.
Membership testing with iterators is generally a bad idea. In general,
iterators do not have a .__contains__ method, so at best, one exhausts
the iterator, making it useless. In particular, generators do not have
any non-iterator, non-generator methods. The itertools module has
iterator classes rather than generator functions because it is coded in
C, but they are closely equivalent in behavior to the generator
functions given in the doc.
> itertools.count can use the same math as range
You could use range(start, 10000000000000000000[, step]) instead of
count(start[, step]), or wrap count in an infinite sequence Count class
that has all the methods and arithmetic of range. The __iter__ method
would return a count iterator.
> itertools.cycle could use membership from the underlying iterable
If the underlying iterable is an iterator, this does not work. You
could define a Cycle class that requires that the input have .__contain__.
> itertools.repeat is even easier, just compare to the repeatable element
Again, create a Repeat(ob) class whose .__iter__ returns repeat(ob) and
that has .__contains__(x) returning 'x == ob'.
> Does anyone else think this would be useful?
itertools is not going to switch from iterators to non-iterator iterables.
--
Terry Jan Reedy
More information about the Python-ideas
mailing list