Faster Regular Expressions
nkipp at vt.edu
Thu Mar 9 14:16:28 EST 2000
"""
Food for thought...
My work depends on fast regular expressions but I also enjoy Python's
ease and speed of development.
I ran the following regular expression speed test (P166 workstation
running Linux); the results are below. The function "fastMatch" can be
six times (6x) faster than "slowMatch".
It seems to me that Tatu Ylonen (apparent author of regexp.c) did his
job well and that the re/mo wrapper in re.py slows everything down.
-- Neill
[Discuss if you like. Flames---as always---to /dev/null. I post here
because these results should go in the searchable archive.]
"""
import re
TEST = '"coconuts NI! coconuts NI! coconuts NI! coconuts NI! coconuts"'
SLOWQUOTE = re.compile(r"\"(?:(?:\\.)|[^\"\\])*\"")
# Grab the underlying matcher directly, bypassing the re.py wrapper;
# it returns a tuple of (start, end) register pairs instead of a
# match object.
FASTQUOTE = re.compile(r"\"(?:(?:\\.)|[^\"\\])*\"").code.match

def slowMatch(pattern, string):
    # Goes through the re.py wrapper: builds a match object, then
    # calls group() on it.
    mo = pattern.match(string)
    return mo.group()

def fastMatch(pattern, string):
    # Calls the raw matcher and slices group 0 out of the string.
    groups = pattern(string)
    start, end = groups[0]
    return string[start:end]

def doit():
    for trial in range(1000):
        x = slowMatch(SLOWQUOTE, TEST)
        x = fastMatch(FASTQUOTE, TEST)
import profile
out = 'tmp.prof'
profile.run('doit()', out)

import pstats
profObj = pstats.Stats(out)
profObj.sort_stats('cumulative').print_stats()
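For readers on current Python, the `.code` attribute is an internal of the 1.5-era re module and is long gone, but the same idea still applies: prebind the compiled pattern's bound `match` method and slice via `span()` to skip a per-call attribute lookup and the `group()` call. A minimal sketch (the pattern and names here are illustrative, not from the original post):

```python
import re

PATTERN = re.compile(r'"(?:\\.|[^"\\])*"')
TEXT = '"coconuts NI! coconuts NI!" trailing text'

# Prebinding the bound method avoids one attribute lookup per call.
match = PATTERN.match

def fast_match(string):
    mo = match(string)
    start, end = mo.span()  # (start, end) of the whole match
    return string[start:end]

print(fast_match(TEXT))
```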
"""
Thu Mar 9 14:01:35 2000 tmp.prof
5003 function calls in 2.510 CPU seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.010 0.010 2.510 2.510 profile:0(doit())
1 0.000 0.000 2.500 2.500 <string>:1(?)
1 0.220 0.220 2.500 2.500 testre.py:34(doit)
1000 0.410 0.000 1.960 0.002 testre.py:25(slowMatch)
1000 0.640 0.001 0.890 0.001 /usr/local/lib/python1.5/re.py:112(match)
1000 0.660 0.001 0.660 0.001 /usr/local/lib/python1.5/re.py:335(group)
1000 0.320 0.000 0.320 0.000 testre.py:29(fastMatch)
1000 0.250 0.000 0.250 0.000 /usr/local/lib/python1.5/re.py:290(__init__)
0 0.000 0.000 profile:0(profiler)
"""