[Python-ideas] An exhaust() function for iterators
Clay Sweetser
clay.sweetser at gmail.com
Sun Sep 29 06:06:47 CEST 2013
Currently, several strategies exist for exhausting an iterable when
one does not care about what the iterable returns (such as when one
merely wants a side effect of the iteration process).
One can either use an empty for loop:
    for x in side_effect_iterable:
        pass
A throwaway list comprehension:
    [x for x in side_effect_iterable]
A try/except and a while:
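    # iter() makes this work for any iterable, and the next() builtin
    # works on both Python 2.6+ and Python 3 (the .next method is
    # Python 2 only).
    iterator = iter(side_effect_iterable)
    try:
        while True:
            next(iterator)
    except StopIteration:
        pass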
Or a number of other methods.
The question is, which one is the fastest? Which one is the most
memory efficient?
Though these are all obvious methods, none of them is both the
fastest and the most memory efficient (though the for/pass method
comes close).
As it turns out, the fastest and most efficient method available in
the standard library is collections.deque's __init__ and extend
methods.
    from collections import deque
    exhaust_iterable = deque(maxlen=0).extend
    exhaust_iterable(side_effect_iterable)
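
The speed comparison can be checked with a rough timing sketch like the
one below. The helper names (make_iterable, for_pass, throwaway_list,
deque_extend) and the sizes are made up for illustration, and the
numbers will vary by machine and Python version, so treat the output as
indicative only.

```python
import timeit
from collections import deque


def make_iterable(n=10000):
    # Hypothetical stand-in for an iterable consumed only for its side effects.
    return (i for i in range(n))


def for_pass(iterable):
    # Empty for loop.
    for _ in iterable:
        pass


def throwaway_list(iterable):
    # Throwaway list comprehension: builds a full list, then discards it.
    [x for x in iterable]


def deque_extend(iterable):
    # Zero-length deque: items are consumed but never stored.
    deque(maxlen=0).extend(iterable)


timings = {}
for func in (for_pass, throwaway_list, deque_extend):
    # A fresh generator is created on every run, since generators are single-use.
    timings[func.__name__] = timeit.timeit(lambda: func(make_iterable()), number=100)

# Print the strategies from fastest to slowest on this machine.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print("%-15s %.4f s" % (name, seconds))
```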
When a deque object is initialized with a max length of zero, a
special function, consume_iterator, is used instead of the regular
element insertion calls.
This function, found at
http://hg.python.org/cpython/file/tip/Modules/_collectionsmodule.c#l278,
simply advances the iterator until it is exhausted, without storing
any of the items in the deque's internal structure.
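
The behaviour described above can be observed from Python with a small
sketch. The generator side_effect_iterable and the list seen are
illustrative names only; the point is that every side effect runs while
the deque itself stays empty.

```python
from collections import deque

seen = []


def side_effect_iterable():
    # Hypothetical generator whose only purpose is its side effect.
    for i in range(3):
        seen.append(i)
        yield i


sink = deque(maxlen=0)
sink.extend(side_effect_iterable())

print(seen)       # [0, 1, 2] : every item was produced
print(len(sink))  # 0 : nothing was stored
```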
I would like to propose that this function, or one very similar to it,
be added to the standard library, either in the itertools module or
in the builtin namespace.
If nothing else, doing so would at least give a single *obvious* way
to exhaust an iterator, instead of the several miscellaneous methods
available.
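
As a rough idea of what such a function might look like in pure Python,
here is a minimal sketch. The name exhaust comes from the subject line
of this post; the signature and docstring are assumptions, not an
existing standard library API.

```python
from collections import deque


def exhaust(iterable):
    """Consume iterable entirely, discarding every item (hypothetical helper)."""
    # Passing the iterable to a zero-length deque consumes it without storing anything.
    deque(iterable, maxlen=0)


it = iter([1, 2, 3])
exhaust(it)
print(next(it, "empty"))  # prints "empty": the iterator has been fully consumed
```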
--
"Evil begins when you begin to treat people as things." - Terry Pratchett