[Python-Dev] New regex module for 3.2?
MRAB
python at mrabarnett.plus.com
Fri Jul 9 21:35:16 CEST 2010
Collin Winter wrote:
> On Fri, Jul 9, 2010 at 10:28 AM, MRAB <python at mrabarnett.plus.com> wrote:
>> anatoly techtonik wrote:
>>> On Thu, Jul 8, 2010 at 10:52 PM, MRAB <python at mrabarnett.plus.com> wrote:
>>>> Hi all,
>>>>
>>>> I re-implemented the re module, adding new features and speed
>>>> improvements. It's available at:
>>>>
>>>> http://pypi.python.org/pypi/regex
>>>>
>>>> under the name "regex" so that it can be tried alongside "re".
>>>>
>>>> I'd be interested in any comments or feedback. How does it compare with
>>>> "re" in terms of speed on real-world data? The benchmarks suggest it
>>>> should be faster, or at worst comparable.
>>> And where are the benchmarks?
>>> In particular it would be interesting to see it compared both to re
>>> from stdlib and re2 from http://code.google.com/p/re2/
>>>
>> The benchmarks bm_regex_effbot.py and bm_regex_v8.py both perform
>> multiple runs of the tests multiple times, giving just the total times
>> for each set. Here are the averages:
>>
>> Python26
>> BENCHMARK         re          regex       ratio
>> bm_regex_effbot   0.135secs   0.083secs   1.63
>> bm_regex_v8       0.153secs   0.085secs   1.80
>>
>>
>> Python31
>> BENCHMARK         re          regex       ratio
>> bm_regex_effbot   0.138secs   0.083secs   1.66
>> bm_regex_v8       0.170secs   0.091secs   1.87
>
> Out of curiosity, what are the results for the bm_regex_compile benchmark?
>
I concentrated my efforts on matching speed because regexes tend to be
compiled only once and are cached anyway, so I don't think compile time
is as important. The results (ratio is the re time divided by the regex
time, so a value below 1 means regex is slower) are:

Python26
BENCHMARK          re          regex       ratio
bm_regex_compile   0.897secs   2.792secs   0.32

Python31
BENCHMARK          re          regex       ratio
bm_regex_compile   0.902secs   2.731secs   0.33

If anyone can demonstrate that the slower compilation will have a
significant impact in practice then I will, of course, look into it
further.
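
As a rough illustration of the caching point above, here is a minimal
sketch of how one might compare cold compilation, cached compilation and
matching on a single pattern. It is not one of the benchmarks quoted in
this thread: the pattern, the sample text and the helper names
(time_compile, time_match) are assumptions made up for the example, and
it uses only the stdlib "re" module. Replacing the import with
"import regex as re" should give the corresponding figures for the new
module, assuming it offers the same compile() and purge() functions.

```python
import re
import timeit

# Hypothetical pattern and input, chosen only for illustration.
PATTERN = r"(\w+)@(\w+)\.com"
TEXT = "contact: someone@example.com"


def time_compile(pattern, repeats=1000, use_cache=True):
    """Return the total seconds taken by `repeats` calls to re.compile().

    With use_cache=False the module's pattern cache is purged before
    every call, so each call pays the full compilation cost.
    """
    def run():
        if not use_cache:
            re.purge()
        re.compile(pattern)

    return timeit.timeit(run, number=repeats)


def time_match(pattern, text, repeats=1000):
    """Return the total seconds taken by `repeats` searches of `text`."""
    compiled = re.compile(pattern)
    return timeit.timeit(lambda: compiled.search(text), number=repeats)


if __name__ == "__main__":
    print("cold compile: %.6f secs" % time_compile(PATTERN, use_cache=False))
    print("warm compile: %.6f secs" % time_compile(PATTERN, use_cache=True))
    print("match:        %.6f secs" % time_match(PATTERN, TEXT))
```

If the warm-compile time is small next to the cold-compile time, the
cache is absorbing repeated compilations of the same pattern, which is
the case the reply above assumes to be typical. A program that builds
many distinct patterns would behave more like the cold-compile case.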