[pypy-issue] Issue #2975: Joining long bytes or str sequences is 10-40x slower on PyPy2 and PyPy3 (pypy/pypy)

Andrew Stepanov issues-reply at bitbucket.org
Thu Mar 21 03:28:18 EDT 2019


New issue 2975: Joining long bytes or str sequences is 10-40x slower on PyPy2 and PyPy3
https://bitbucket.org/pypy/pypy/issues/2975/joining-long-bytes-or-str-sequences-is-10

Andrew Stepanov:

I've encountered performance degradation on PyPy2 and PyPy3 with the following code:

```python
import time
import sys

sep = '' if sys.argv[1] == "str" else b''
data = ('0' if sys.argv[1] == "str" else b'0') * int(sys.argv[2])


def workload(num_runs):
    for i in range(num_runs):
        sep.join((data, data))

start = time.time()
workload(200000)
print(time.time() - start)
```

When I join bytes or strings in a loop, everything is fine as long as the sequence is short (100 bytes, for example): PyPy is roughly 3x to 4x faster than CPython.

```bash
$ python3.7 join_test.py str 100
0.04081892967224121
```

```bash
$ python3.7 join_test.py bytes 100
0.03854179382324219
```

```bash
$ pypy3 join_test.py bytes 100
0.010421037673950195
```

```bash
$ pypy3 join_test.py str 100
0.014775991439819336
```
But when I try joining long bytes or strings, PyPy suddenly starts to lag behind:

```bash
$ python3.7 join_test.py bytes 100000
1.3616209030151367
```

```bash
$ python3.7 join_test.py str 100000
1.4790830612182617
```

```bash
$ pypy3 join_test.py bytes 100000
27.6971218585968
```

```bash
$ pypy3 join_test.py str 100000
73.05885100364685
``` 
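For reference, the slowdown factors implied by the timings above can be computed directly. The snippet below is only a sketch: the `timings` dictionary and the `ratios` name are illustrative, and the numbers are copied from the runs shown in this report. For the 100000-character case it works out to roughly 20x slower for bytes and roughly 49x slower for str.

```python
# Timings (in seconds) copied from the runs reported above.
# Keys are (kind, size); the dictionary layout is only illustrative.
timings = {
    ("bytes", 100): {"cpython": 0.03854179382324219, "pypy": 0.010421037673950195},
    ("str", 100): {"cpython": 0.04081892967224121, "pypy": 0.014775991439819336},
    ("bytes", 100000): {"cpython": 1.3616209030151367, "pypy": 27.6971218585968},
    ("str", 100000): {"cpython": 1.4790830612182617, "pypy": 73.05885100364685},
}

# Ratio > 1 means PyPy is slower than CPython; ratio < 1 means it is faster.
ratios = {}
for key, t in timings.items():
    ratios[key] = t["pypy"] / t["cpython"]

for (kind, size), ratio in sorted(ratios.items()):
    print("%-5s size=%-7d pypy/cpython = %.2f" % (kind, size, ratio))
```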

I've tried adding the `--jit off` option to the PyPy command line, but it didn't change anything. On a Linux machine, the difference is less pronounced (around 6x slower for the 100 KB version). The same happens on PyPy2. This issue may be related to [this issue](https://bitbucket.org/pypy/pypy/issues/2782/extending-bytearray-is-x30-slower-than). I am using the following versions of PyPy:


```bash
$ pypy -V
Python 2.7.13 (990cef41fe11e5d46b019a46aa956ff46ea1a234, Mar 18 2019, 17:41:49)
[PyPy 7.1.0 with GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)]
```

```bash
$ pypy3 -V
Python 3.5.3 (928a4f70d3de, Feb 08 2019, 10:43:14)
[PyPy 7.0.0 with GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)]
```
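To narrow down where the slowdown starts between the 100-character and 100000-character cases, a size sweep may help. The following is a minimal sketch, not part of the original test script: the `time_join` and `sweep` helpers, the chosen sizes, and the run count are all assumptions made for illustration. It uses the standard `timeit` module and reports microseconds per join, so the same file can be run under CPython and PyPy and the columns compared.

```python
import timeit


def time_join(kind, size, num_runs):
    """Return total seconds for num_runs joins of two size-long operands."""
    sep = '' if kind == "str" else b''
    data = ('0' if kind == "str" else b'0') * size
    # Same operation as in the original script: join a two-element tuple.
    return timeit.timeit(lambda: sep.join((data, data)), number=num_runs)


def sweep(kind, sizes=(100, 1000, 10000, 100000), num_runs=2000):
    """Map each size to the average time per join in microseconds."""
    results = {}
    for size in sizes:
        elapsed = time_join(kind, size, num_runs)
        results[size] = elapsed / num_runs * 1e6
    return results


if __name__ == "__main__":
    for kind in ("str", "bytes"):
        for size, usec in sorted(sweep(kind).items()):
            print("%-5s size=%-7d %.3f us per join" % (kind, size, usec))
```

Running it once with `python3.7` and once with `pypy3` (with and without `--jit off`) should show at which operand size the per-join cost on PyPy overtakes CPython.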



