[pypy-issue] Issue #2135: itertools.islice performance issue (pypy/pypy)
Vincent Michel
issues-reply at bitbucket.org
Wed Sep 9 15:10:15 CEST 2015
New issue 2135: itertools.islice performance issue
https://bitbucket.org/pypy/pypy/issues/2135/itertoolsislice-performance-issue
Vincent Michel:
The RPython version of [itertools.slice](https://docs.python.org/2/library/itertools.html#itertools.islice) is about 6 times slower than the pure python version.
Example with a function to get the last 9 digits of the Nth term in the Fibonacci sequence:
```
#!python
from itertools import islice
def fib(a=1, b=1, m=10**9):
while True:
yield b
a, b = b, (a + b) % m
def f1(n):
return next(islice(fib(), n, None))
def f2(n):
return next(x for i, x in enumerate(fib()) if i==n)
assert f1(100) == f2(100)
```
Timing tests (f1 uses islice and f2 uses enumerate):
```
#!shell
$ ./pypy -V
Python 2.7.10 (f3ad1e1e1d62, Aug 28 2015, 10:45:29)
[PyPy 2.6.1 with GCC 4.8.4]
$ ./pypy -m timeit -s "import test_islice" -n 10 'test_islice.f1(1000000)'
10 loops, best of 3: 112 msec per loop
$ ./pypy -m timeit -s "import test_islice" -n 10 'test_islice.f2(1000000)'
10 loops, best of 3: 17 msec per loop
```
Same tests with CPython:
```
#!shell
$ python -V
Python 2.7.6
$ python -m timeit -s "import test_islice" -n 10 'test_islice.f1(1000000)'
10 loops, best of 3: 82.3 msec per loop
$ python -m timeit -s "import test_islice" -n 10 'test_islice.f2(1000000)'
10 loops, best of 3: 117 msec per loop
```
I also tried with other pure python implementations of islice:
[From the official documentation](https://docs.python.org/2/library/itertools.html#itertools.islice)
```
#!python
def islice(iterable, *args):
s = slice(*args)
it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
nexti = next(it)
for i, element in enumerate(iterable):
if i == nexti:
yield element
nexti = next(it)
```
[From older version of PyPy](https://github.com/MichaelBlume/pypy/blob/master/lib_pypy/itertools.py)
```
#!python
class islice(object):
def __init__(self, iterable, *args):
s = slice(*args)
self.start, self.stop, self.step = s.start or 0, s.stop, s.step
if not isinstance(self.start, (int, long)):
raise ValueError("Start argument must be an integer")
if self.stop is not None and not isinstance(self.stop, (int,long)):
raise ValueError("Stop argument must be an integer or None")
if self.step is None:
self.step = 1
if self.start<0 or (self.stop is not None and self.stop<0
) or self.step<=0:
raise ValueError, "indices for islice() must be positive"
self.it = iter(iterable)
self.donext = None
self.cnt = 0
def __iter__(self):
return self
def next(self):
if self.donext is None:
try:
self.donext = self.it.next
except AttributeError:
raise TypeError
nextindex = self.start
if self.stop is not None and nextindex >= self.stop:
raise StopIteration
while self.cnt <= nextindex:
nextitem = self.donext()
self.cnt += 1
self.start += self.step
return nextitem
```
And they both run as fast as f2 (pure python using enumerate).
More information about the pypy-issue
mailing list