[Python-Dev] re performance
lukasz at langa.pl
Wed Feb 1 14:42:56 EST 2017
> On Jan 31, 2017, at 11:40 AM, Wang, Peter Xihong <peter.xihong.wang at intel.com> wrote:
> Regarding the performance difference between "re" and "regex" and the related packaging options, we ran a performance comparison using Python 3.6.0 on some micro-benchmarks from the Python Benchmark Suite (https://github.com/python/performance):
> Results in ms, lower is better (running on Ubuntu 15.10). "regex" was installed via pip install regex, with "import re" replaced by "import regex as re".
>
>                        re      regex
> bm_regex_compile.py    229     298
> bm_regex_dna.py        171     267
> bm_regex_effbot.py     2.77    3.04
> bm_regex_v8.py         24.8    14.1
>
> This data shows that "re" outperforms "regex" in 3 of the 4 micro-benchmarks above.
This is very informative, thank you! This clearly shows we should pursue the PyPI route (with a documentation endorsement and possible bundling for 3.7) rather than a full-blown replacement.
However, this benchmark is incomplete in that it only exercises the compatibility mode of `regex`, whereas it's the new mode that yields the biggest performance gains. Benchmarking the other engine as well would show us the full picture. We'd also need to add checks proving that the regular expressions in these benchmarks produce equivalent matches, to be sure we're testing the same thing.
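A rough sketch of what such an equivalence check could look like. I'm assuming here that the compatibility mode and the new mode correspond to the `regex.VERSION0` and `regex.VERSION1` flags; the helper names `all_matches` and `compare_engines` are made up for illustration and are not part of the benchmark suite.

```python
import re

try:
    import regex  # third-party package: pip install regex
except ImportError:
    regex = None  # the sketch still runs, comparing only the stdlib engine


def all_matches(module, pattern, text, flags=0):
    """Collect (span, groups) for every match so results can be compared."""
    return [(m.span(), m.groups())
            for m in module.finditer(pattern, text, flags=flags)]


def compare_engines(pattern, text):
    """Report whether each engine/mode finds the same matches as stdlib re."""
    results = {"re": all_matches(re, pattern, text)}
    if regex is not None:
        # Assumption: VERSION0 is the re-compatible mode, VERSION1 the new one.
        results["regex VERSION0"] = all_matches(
            regex, pattern, text, regex.VERSION0)
        results["regex VERSION1"] = all_matches(
            regex, pattern, text, regex.VERSION1)
    baseline = results["re"]
    return {name: found == baseline for name, found in results.items()}


if __name__ == "__main__":
    for name, same in compare_engines(r"(\d+)-(\d+)", "10-20 and 30-40").items():
        print(f"{name}: {'equivalent' if same else 'DIFFERENT'}")
```

A benchmark would only count as a fair comparison if every pattern it uses comes back equivalent for the mode being timed; any pattern that differs would need to be looked at by hand first.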