[pypy-issue] Issue #2135: itertools.islice performance issue (pypy/pypy)

Vincent Michel issues-reply at bitbucket.org
Wed Sep 9 15:10:15 CEST 2015

New issue 2135: itertools.islice performance issue

Vincent Michel:

The RPython version of [itertools.slice](https://docs.python.org/2/library/itertools.html#itertools.islice) is about 6 times slower than the pure python version.

Example with a function to get the last 9 digits of the Nth term in the Fibonacci sequence:

from itertools import islice

def fib(a=1, b=1, m=10**9):
    while True:
        yield b
        a, b = b, (a + b) % m

def f1(n):
    return next(islice(fib(), n, None))

def f2(n):
    return next(x for i, x in enumerate(fib()) if i==n)

assert f1(100) == f2(100)

Timing tests (f1 uses islice and f2 uses enumerate):

$ ./pypy -V
Python 2.7.10 (f3ad1e1e1d62, Aug 28 2015, 10:45:29)
[PyPy 2.6.1 with GCC 4.8.4]
$ ./pypy -m timeit -s "import test_islice" -n 10 'test_islice.f1(1000000)'
10 loops, best of 3: 112 msec per loop
$ ./pypy -m timeit -s "import test_islice" -n 10 'test_islice.f2(1000000)'
10 loops, best of 3: 17 msec per loop

Same tests with CPython:

$ python -V
Python 2.7.6
$ python -m timeit -s "import test_islice" -n 10 'test_islice.f1(1000000)'
10 loops, best of 3: 82.3 msec per loop
$ python -m timeit -s "import test_islice" -n 10 'test_islice.f2(1000000)'
10 loops, best of 3: 117 msec per loop

I also tried with other pure python implementations of islice:

[From the official documentation](https://docs.python.org/2/library/itertools.html#itertools.islice)
def islice(iterable, *args):
    s = slice(*args)
    it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
    nexti = next(it)
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            nexti = next(it)

[From older version of PyPy](https://github.com/MichaelBlume/pypy/blob/master/lib_pypy/itertools.py)
class islice(object):

    def __init__(self, iterable, *args):
        s = slice(*args)
        self.start, self.stop, self.step = s.start or 0, s.stop, s.step
        if not isinstance(self.start, (int, long)):
           raise ValueError("Start argument must be an integer")
        if self.stop is not None and not isinstance(self.stop, (int,long)):
           raise ValueError("Stop argument must be an integer or None")
        if self.step is None:
            self.step = 1
        if self.start<0 or (self.stop is not None and self.stop<0
           ) or self.step<=0:
            raise ValueError, "indices for islice() must be positive"
        self.it = iter(iterable)
        self.donext = None
        self.cnt = 0

    def __iter__(self):
        return self

    def next(self): 
        if self.donext is None:
                self.donext = self.it.next
            except AttributeError:
                raise TypeError
        nextindex = self.start
        if self.stop is not None and nextindex >= self.stop:
            raise StopIteration
        while self.cnt <= nextindex:
            nextitem = self.donext()
            self.cnt += 1
        self.start += self.step 
        return nextitem

And they both run as fast as f2 (pure python using enumerate).


More information about the pypy-issue mailing list