[Tutor] regular expression

Rich Krauter rmkrauter at yahoo.com
Fri Feb 13 09:05:22 EST 2004


> On Thu, 2004-02-12 at 20:50, Conrad Koziol wrote:
> > What is the fastest way to search for a string and then surround

Hi Conrad,

I tried timing the two methods proposed so far. I figured the longer
pattern would take much longer, but on this input it came out only
about 10% slower.

This is what I got on my PC:

1st method (longer regex) - 1000000 times, 74.0188100338 seconds
2nd method (short regex)  - 1000000 times, 67.2792310715 seconds

This is how I got those numbers:

import re
# the timeit module requires Python 2.3 or later
from timeit import Timer

x = '<div> are not allowed, these arent either <br>'
patx = r'(?<!<code>)((?!<.*code>)<[^<>]*>)(?!</code>)'
paty = r'(<[^<>]*>)'
rep = r'<code>\1</code>'

print Timer(stmt='(a,na)=re.subn(patx,rep,x)',
    setup='import re;from __main__ import patx,paty,rep,x').timeit()
print Timer(stmt='(aa,naa)=re.subn(paty,rep,x)',
    setup='import re;from __main__ import patx,paty,rep,x').timeit()
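
A timing comparison only means something if the two patterns do the
same job, so here is a small check of what each one returns. This is a
sketch in Python 3 syntax rather than the 2.3 code above; wrap_tags and
the already-wrapped sample string are made up for illustration and were
not part of the original test.

```python
import re

x = '<div> are not allowed, these arent either <br>'
patx = r'(?<!<code>)((?!<.*code>)<[^<>]*>)(?!</code>)'
paty = r'(<[^<>]*>)'
rep = r'<code>\1</code>'


def wrap_tags(pat, text):
    """Hypothetical helper: return (new_text, substitution_count)."""
    return re.subn(pat, rep, text)


# On the sample string both patterns wrap the same two tags.
result_long = wrap_tags(patx, x)
result_short = wrap_tags(paty, x)
print(result_long)
print(result_short)

# On text that is already wrapped, the patterns behave differently:
# the longer one leaves it alone, the short one wraps every tag again.
already_wrapped = '<code><div></code>'
print(wrap_tags(patx, already_wrapped))
print(wrap_tags(paty, already_wrapped))
```

So on the test string x the two are interchangeable, and the lookarounds
in the longer pattern only matter when the text may already contain
<code> tags.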

Maybe this gives you an idea of how to test the relative speed of
whatever solution you come up with.
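
For anyone on Python 3, the same measurement might look roughly like
the sketch below. It is a port under a few assumptions, not the code
that produced the numbers above: time_pattern is a made-up helper, the
statement is passed to Timer as a callable instead of a string, and the
default repeat count is lowered to 10000 so it finishes quickly.

```python
import re
from timeit import Timer

x = '<div> are not allowed, these arent either <br>'
patx = r'(?<!<code>)((?!<.*code>)<[^<>]*>)(?!</code>)'
paty = r'(<[^<>]*>)'
rep = r'<code>\1</code>'


def time_pattern(pat, number=10000):
    """Hypothetical helper: total seconds for `number` re.subn calls."""
    # Timer accepts a callable; the lambda adds the same small call
    # overhead to both measurements, so the comparison stays fair.
    timer = Timer(lambda: re.subn(pat, rep, x))
    return timer.timeit(number=number)


if __name__ == '__main__':
    for label, pat in (('longer regex', patx), ('short regex', paty)):
        print('%s - %.4f seconds' % (label, time_pattern(pat)))
```

Pass number=1000000 to match the run above. Note that the re module
caches compiled patterns, so after the first call this mostly measures
matching and substitution, not pattern compilation.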

Rich



