[pypy-issue] [issue1347] Perf problem with many regular expressions
mmaenpaa
tracker at bugs.pypy.org
Sun Dec 1 23:27:24 CET 2013
mmaenpaa <mika.j.maenpaa at iki.fi> added the comment:
PyPy's JIT seems to have real problems when dealing with a large number of
regular expressions that are used often. CPython and PyPy without the JIT
do not have this problem: their run time appears to scale linearly with the
number of regular expressions.
Steps to reproduce the slowdown when creating and using many regular
expressions at the same time:
1. Create and compile n random regular expressions.
2. Create a predetermined number of random strings.
3. Measure the time it takes to test all generated strings against all
regular expressions with re.match.
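The three steps above can be sketched roughly as follows. This is not the attached test program, only a minimal illustration of the same idea. The helper names (random_pattern, random_string, run_benchmark), the shape of the generated patterns, the string lengths, and the use of the compiled pattern's match method are all assumptions made for the example.

```python
import random
import re
import string
import time


def random_pattern(rng, length=3):
    # Hypothetical pattern shape: a random literal prefix followed by a
    # character class, so every generated pattern is a valid regex.
    prefix = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return prefix + "[a-z]*"


def random_string(rng, length=16):
    # Hypothetical input shape: a fixed-length lowercase ASCII string.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def run_benchmark(n_patterns, n_strings, seed=0):
    rng = random.Random(seed)
    # Step 1: create and compile n random regular expressions.
    patterns = [re.compile(random_pattern(rng)) for _ in range(n_patterns)]
    # Step 2: create a predetermined number of random strings.
    strings = [random_string(rng) for _ in range(n_strings)]
    # Step 3: time matching every string against every pattern.
    matches = 0
    start = time.time()
    for s in strings:
        for p in patterns:
            if p.match(s):
                matches += 1
    elapsed = time.time() - start
    return elapsed, matches


if __name__ == "__main__":
    # Small sizes so the sketch finishes quickly on any interpreter.
    for n in (1, 5, 10, 25):
        elapsed, matches = run_benchmark(n, 2000)
        print("%5d patterns: %.3f s (%d matches)" % (n, elapsed, matches))
```

Running the same file under each interpreter (for example with and without "--jit off" on PyPy) and comparing the timings for growing n would be one way to look for the effect described here.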
When testing with 20000 random strings, PyPy starts to slow down when dealing
with more than 10 regular expressions. With 25 or more regular expressions,
PyPy is actually faster with the JIT disabled.
The attached test program was run with PyPy 2.2.1 and CPython 2.7.3 on 64-bit
Debian Wheezy. The JIT was disabled with the "--jit off" command-line parameter.
$ python run-regexp-bug.py 20000
random strings: 20000
   n  pypy(s)  cpython(s)  pypy-nojit(s)
   1    0.006       0.010          0.020
   5    0.029       0.039          0.065
  10    0.054       0.078          0.125
  25    0.403       0.197          0.298
  50    0.789       0.391          0.616
  75    1.405       0.576          0.944
 100    2.049       0.763          1.193
 200    7.454       1.606          2.459
 300   18.311       2.313          3.816
 400   30.136       3.079          4.944
 500   44.634       3.849          6.282
 600   63.083       4.871          8.198
 700   79.475       5.488          8.837
 800   98.820       6.152         10.070
 900  122.998       7.306         11.248
1000  143.714       8.037         12.766
1100  171.839       8.479         13.758
1200  200.731       9.274         15.087
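
One way to read the table is to divide each timing by n, giving the time per regular expression. If run time scales linearly, that value stays roughly constant as n grows. The sketch below does this for a few rows copied from the table. The names ROWS and per_regex_ms are made up for this example.

```python
# A subset of rows copied from the table: (n, pypy, cpython, pypy-nojit),
# with all timings in seconds.
ROWS = [
    (1, 0.006, 0.010, 0.020),
    (10, 0.054, 0.078, 0.125),
    (100, 2.049, 0.763, 1.193),
    (500, 44.634, 3.849, 6.282),
    (1200, 200.731, 9.274, 15.087),
]


def per_regex_ms(rows):
    # Convert total seconds into milliseconds per regular expression.
    result = []
    for n, pypy, cpython, nojit in rows:
        result.append((n,
                       1000.0 * pypy / n,
                       1000.0 * cpython / n,
                       1000.0 * nojit / n))
    return result


if __name__ == "__main__":
    print("%5s %10s %10s %12s" % ("n", "pypy", "cpython", "pypy-nojit"))
    for n, pypy, cpython, nojit in per_regex_ms(ROWS):
        print("%5d %10.2f %10.2f %12.2f" % (n, pypy, cpython, nojit))
```

For these rows the CPython and no-JIT columns stay within a narrow band, while the PyPy column grows with n, which matches the description of linear scaling for the former and a slowdown for the latter.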
----------
nosy: +mmaenpaa
status: unread -> chatting
________________________________________
PyPy bug tracker <tracker at bugs.pypy.org>
<https://bugs.pypy.org/issue1347>
________________________________________