[issue2636] Regexp 2.7 (modifications to current re 2.2.2)

John Machin report at bugs.python.org
Wed Aug 12 05:00:22 CEST 2009


John Machin <sjmachin at users.sourceforge.net> added the comment:

What is the expected timing comparison with re? Running the Aug10#3
version on Win XP SP3 with Python 2.6.3, I see regex typically running
at only 20% to 50% of the speed of re in ASCII mode, with
not-very-atypical tests (find all Python identifiers in a line, failing
search for a Python identifier in an 80-byte text). Is the supplied
_regex.pyd from some sort of debug or unoptimised build? Here are some
results:

dos-prompt>\python26\python -mtimeit -s"import re as x;r=x.compile(r'[A-Za-z_][A-Za-z0-9_]+');t='    def __init__(self, arg1, arg2):\n'" "r.findall(t)"
100000 loops, best of 3: 5.32 usec per loop

dos-prompt>\python26\python -mtimeit -s"import regex as x;r=x.compile(r'[A-Za-z_][A-Za-z0-9_]+');t='    def __init__(self, arg1, arg2):\n'" "r.findall(t)"
100000 loops, best of 3: 12.2 usec per loop

dos-prompt>\python26\python -mtimeit -s"import re as x;r=x.compile(r'[A-Za-z_][A-Za-z0-9_]+');t='1234567890'*8" "r.search(t)"
1000000 loops, best of 3: 1.61 usec per loop

dos-prompt>\python26\python -mtimeit -s"import regex as x;r=x.compile(r'[A-Za-z_][A-Za-z0-9_]+');t='1234567890'*8" "r.search(t)"
100000 loops, best of 3: 7.62 usec per loop

Here's the worst case that I've found so far:

dos-prompt>\python26\python -mtimeit -s"import re as x;r=x.compile(r'z{80}');t='z'*79" "r.search(t)"
1000000 loops, best of 3: 1.19 usec per loop

dos-prompt>\python26\python -mtimeit -s"import regex as x;r=x.compile(r'z{80}');t='z'*79" "r.search(t)"
1000 loops, best of 3: 334 usec per loop
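
For anyone who wants to rerun the three comparisons above from one
script rather than from separate command lines, here is a minimal
sketch using the standard timeit module. The names CASES, best_usec and
run are made up for this example, and it assumes the regex module is
importable (it falls back to timing re alone if not). Timings taken
this way include the overhead of a lambda call, so they will not match
the -mtimeit figures exactly; only the ratios are meaningful.

```python
import re
import timeit

try:
    import regex  # assumption: the module under test is installed
except ImportError:
    regex = None

# (label, pattern, text, method) for the three tests in this comment.
CASES = [
    ("findall identifiers", r"[A-Za-z_][A-Za-z0-9_]+",
     "    def __init__(self, arg1, arg2):\n", "findall"),
    ("failing search", r"[A-Za-z_][A-Za-z0-9_]+",
     "1234567890" * 8, "search"),
    ("z{80} on 79 z's", r"z{80}", "z" * 79, "search"),
]


def best_usec(module, pattern, text, method, number=10000, repeat=3):
    """Return the best-of-`repeat` time per call, in microseconds."""
    compiled = module.compile(pattern)
    call = getattr(compiled, method)
    timer = timeit.Timer(lambda: call(text))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


def run(modules, number=10000):
    """Time every case against every module; return (label, name, usec) rows."""
    rows = []
    for label, pattern, text, method in CASES:
        for module in modules:
            usec = best_usec(module, pattern, text, method, number=number)
            rows.append((label, module.__name__, usec))
    return rows


if __name__ == "__main__":
    modules = [re] + ([regex] if regex is not None else [])
    for label, name, usec in run(modules):
        print("%-20s %-6s %8.2f usec per loop" % (label, name, usec))
```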

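See Friedl: "length cognizance". The pattern needs at least 80
characters and the text has only 79, so the search could fail without
scanning at all. Corresponding figures for match() are 1.11 and 8.5.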
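
To illustrate the "length cognizance" idea in isolation, here is a
rough sketch of the check done outside the engine. The helper
search_with_min_length and the constant MIN_LEN are hypothetical names
for this example, and the minimum length is supplied by hand; an engine
with this optimisation would presumably derive it from the compiled
pattern. It is not a description of how either re or regex works
internally.

```python
import re


def search_with_min_length(compiled, text, min_length):
    """Search only if `text` is long enough to hold the shortest match."""
    if len(text) < min_length:
        # No match is possible, so skip the scan entirely.
        return None
    return compiled.search(text)


pattern = re.compile(r"z{80}")
MIN_LEN = 80  # by inspection: z{80} matches exactly 80 characters

short_result = search_with_min_length(pattern, "z" * 79, MIN_LEN)
exact_result = search_with_min_length(pattern, "z" * 80, MIN_LEN)
```

With the 79-character text the helper returns None without calling
search() at all; with 80 characters it falls through to the normal
search.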

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue2636>
_______________________________________

