[issue10109] itertools.product with infinite iterator cause MemoryError.
Sumudu Fernando
report at bugs.python.org
Wed Jan 18 12:01:34 CET 2012
Sumudu Fernando <sumuduf at gmail.com> added the comment:
I don't agree with the response to this.
It is true that, as implemented (at least in 2.7; I don't have 3.x handy to check), itertools.product requires finite iterables. However, this seems to be simply a consequence of the implementation and not part of the "spirit" of the function, which, as falsetru pointed out, is stated to be "equivalent to nested for-loops in a generator expression".
Indeed, implementing product in Python (in a recursive way) doesn't have this problem.
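A minimal sketch of such a recursive version, for illustration only. The name lazy_product is hypothetical, the code is written for Python 3 (range rather than xrange), and it assumes every iterable after the first can be iterated more than once:

```python
import itertools

def lazy_product(*iterables):
    """Recursive sketch: nothing is consumed until values are requested."""
    if not iterables:
        yield ()          # the product of zero iterables is one empty tuple
        return
    first, rest = iterables[0], iterables[1:]
    for head in first:                     # outermost "for" loop
        for tail in lazy_product(*rest):   # re-iterates the remaining inputs
            yield (head,) + tail

# Creating the generator does no work, even for a huge or infinite input.
big = lazy_product(range(1000000000))
endless = lazy_product(itertools.count(), "ab")
print(list(itertools.islice(endless, 3)))
```

For finite inputs this yields tuples in the same order as itertools.product, but only the first position can usefully be infinite, since the outer loop never advances past an inner loop that does not end.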
Perhaps a more convincing set of test cases to show why this could be considered a problem:
>>> import itertools
>>> itertools.product(xrange(100))
<itertools.product object at 0xb7ed334c>
>>> itertools.product(xrange(1000000))
<itertools.product object at 0xb7ed620c>
>>> itertools.product(xrange(1000000000))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError
Note that I'm not even using an infinite iterable, just a really big one. The issue is that creating the iterator fails with a MemoryError, before I've even asked for any values. Consider the following:
for (i, v) in enumerate(itertools.product(a, b, c)):
    if i < 1000:
        print v
    else:
        break
When a, b, and c are relatively small, finite iterables, this code works fine. However, if *any* of them are too large (or infinite), we see a MemoryError before the loop even starts, even though only 1000 elements are required. I think it's conceivable that we might want something like "a = itertools.cycle(xrange(5))", and even that will break this loop.
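The early failure comes from product reading each argument to the end when the product object is created. A small check of that behaviour on CPython, written for Python 3; the helper tracked and the list consumed are hypothetical names used only for this demonstration:

```python
import itertools

consumed = []

def tracked(n):
    """Yield 0..n-1, recording each value at the moment it is pulled."""
    for i in range(n):
        consumed.append(i)
        yield i

# Only construct the product object; no tuples are requested from it.
p = itertools.product(tracked(5))
print(len(consumed))   # every input value has already been read
```

With an infinite input, that initial read never finishes and memory runs out, which is why the loop body is never reached.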
That said, in all such cases I could think of, we can always either truncate big iterators before passing them to product, or use zip/comprehensions to add their values into the tuple (or some combination of those). So maybe it isn't a huge deal.
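Both workarounds can be sketched as follows (Python 3; the variable names are illustrative, and the cut-off of 3 is an arbitrary choice):

```python
import itertools

a = itertools.cycle(range(5))      # infinite: 0, 1, 2, 3, 4, 0, 1, ...
b = "xy"

# Option 1: truncate the infinite iterable before product sees it.
truncated = list(itertools.product(itertools.islice(a, 3), b))

# Option 2: keep the infinite iterable outside product and prepend
# its value to each tuple with a generator expression.
combined = ((i,) + rest
            for i in itertools.count()
            for rest in itertools.product(b, b))
first_five = list(itertools.islice(combined, 5))

print(truncated)
print(first_five)
```

Option 2 only works when the infinite iterable can sit in the outermost position; the finite inputs are still handed to product as usual.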
I've attached my implementation of product, which deals with infinite iterators by leveraging enumerate and itertools.cycle and is pretty much a direct translation of the "odometer" idea. It doesn't support the "repeat" parameter (but probably could, using itertools.tee). One thing that should be changed: itertools.cycle doesn't need to be called on infinite iterators, but I couldn't figure out how to avoid that. Maybe there is some way to handle it in the C implementation?
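For readers without the attachment, here is one way the odometer idea might look. This is an independent sketch written for Python 3, not the contents of product.py; the name odometer_product and all details are assumptions. Each "wheel" is cycle(enumerate(iterable)), so seeing index 0 again means the wheel has wrapped and the wheel to its left must advance:

```python
import itertools

def odometer_product(*iterables):
    """Lazy product sketch: one cycling (index, value) wheel per input."""
    wheels = [itertools.cycle(enumerate(it)) for it in iterables]
    try:
        current = [next(w) for w in wheels]   # pulls one item per input
    except StopIteration:
        return                                # an input was empty
    while True:
        yield tuple(value for _, value in current)
        # Advance the rightmost wheel, carrying to the left on a wrap.
        for pos in reversed(range(len(wheels))):
            current[pos] = next(wheels[pos])
            if current[pos][0] != 0:
                break                         # no wrap, so stop carrying
        else:
            return                            # every wheel wrapped: done

print(list(itertools.islice(odometer_product("ab", itertools.count()), 3)))
```

As in the description above, cycle still stores every value it has seen, so memory grows with the number of values actually drawn from an infinite input rather than failing up front.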
In summary: the attached implementation of product can accept any mix of infinite / finite iterators, returning a generator intended for partial consumption. The existing itertools.product doesn't work in this case.
----------
nosy: +Sumudu.Fernando
Added file: http://bugs.python.org/file24270/product.py
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10109>
_______________________________________