Consume an iterable
Muhammad Alkarouri
malkarouri at gmail.com
Sat Jan 23 08:14:34 EST 2010
On 23 Jan, 12:45, Peter Otten <__pete... at web.de> wrote:
> Muhammad Alkarouri wrote:
> > Thanks everyone, but on my machine (Python 2.6.1, OS X 10.6) it's
> > not:
>
> > In [1]: from itertools import count, islice
>
> > In [2]: from collections import deque
>
> > In [3]: i1=count()
>
> > In [4]: def consume1(iterator, n):
> >    ...:     deque(islice(iterator, n), maxlen=0)
> >    ...:
> >    ...:
>
> > In [5]: i2=count()
>
> > In [6]: def consume2(iterator, n):
> >    ...:     for _ in islice(iterator, n): pass
> >    ...:
> >    ...:
>
> > In [7]: timeit consume1(i1, 10)
> > 1000000 loops, best of 3: 1.63 us per loop
>
> > In [8]: timeit consume2(i2, 10)
> > 1000000 loops, best of 3: 846 ns per loop
>
> > Can somebody please test whether it is only my machine or is this
> > reproducible?
>
> I can reproduce it. The deque-based approach has a bigger constant overhead
> but better per-item performance. Its asymptotic behaviour is therefore
> better.
>
> $ python consume_timeit.py
> consume_deque
> 10: 1.77500414848
> 100: 3.73333001137
> 1000: 24.7235469818
>
> consume_forloop
> 10: 1.22008490562
> 100: 5.86271500587
> 1000: 52.2449371815
>
> consume_islice
> 10: 0.897439956665
> 100: 1.51542806625
> 1000: 7.70061397552
>
> $ cat consume_timeit.py
> from collections import deque
> from itertools import islice, repeat
>
> def consume_deque(n, items):
>     deque(islice(items, n), maxlen=0)
>
> def consume_forloop(n, items):
>     for _ in islice(items, n):
>         pass
>
> def consume_islice(n, items):
>     next(islice(items, n-1, None), None)
>
> def check(fs):
>     for consume in fs:
>         items = iter(range(10))
>         consume(3, items)
>         rest = list(items)
>         assert rest == range(3, 10), consume.__name__
>
> if __name__ == "__main__":
>     fs = consume_deque, consume_forloop, consume_islice
>     check(fs)
>
>     items = repeat(None)
>
>     from timeit import Timer
>     for consume in fs:
>         print consume.__name__
>         for n in (10, 100, 1000):
>             print "%6d:" % n,
>             print Timer("consume(%s, items)" % n,
>                         "from __main__ import consume, items").timeit()
>         print
> $
>
> With next(islice(...), None) I seem to have found a variant that beats both
> competitors.
>
> Peter
Thanks Peter, I got more or less the same result on my machine (Python
2.6.1, x86_64, OS X 10.6):
~/tmp> python consume_timeit.py
consume_deque
10: 1.3138859272
100: 3.54495286942
1000: 24.9603481293

consume_forloop
10: 0.658113002777
100: 2.85697007179
1000: 24.6610429287

consume_islice
10: 0.637741088867
100: 1.09042882919
1000: 5.44473600388
The next()-based variant performs much better. It is also more direct
for the purposes of consume and (at least to me) easier to understand,
as it doesn't require a specialized data structure which is then never
used as one.
I am thus inclined to report it as a Python documentation enhancement
(bug) request. Any comments?
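For concreteness, here is a minimal sketch of the next(islice(...)) idea.
It is only an illustration, not the wording of any documentation change.
The name consume and the (iterator, n) argument order are my own choice.
The sketch assumes islice(iterator, n, n) in place of the n-1 form timed
above, so that n == 0 also works (islice rejects a negative start). It is
written so that it should run on Python 3 as well:

```python
from itertools import islice

def consume(iterator, n):
    """Advance iterator by n items, discarding them (illustrative sketch)."""
    # islice(iterator, n, n) never yields anything, but asking it for an
    # item makes it skip the first n items of the underlying iterator.
    # The None default keeps next() from raising StopIteration.
    next(islice(iterator, n, n), None)

items = iter(range(10))
consume(items, 3)
rest = list(items)  # whatever is left after skipping three items
```

If the iterator has fewer than n items left, this simply exhausts it
without raising.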
Cheers,
Muhammad