Regular expressions vs find?

David C. Ullrich ullrich at math.okstate.edu
Sun Jun 18 17:29:01 EDT 2000


William Dandreta <wjdandreta at worldnet.att.net> wrote in article
<oO935.12054$Xx5.557215 at bgtnsc06-news.ops.worldnet.att.net>...
> Hi David,
> 
> The message is titled Python GREP(... you can take a look if you like. It
> was suggested that reg expr might be faster so give it a try.

	Sure enough, someone said "Also you might have better luck 
with the regular expression stuff (rather than the find command) and 
precompiling the string" and it seems clear from the context that
"better luck" means "faster". I doubt it's so. I could be wrong, it's
happened before, not that I can ever think of any examples.
(Note that comparing string.find() to re.search() is not the same
as comparing string.find() to the system's grep command...)
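
To make the comparison concrete, here is a minimal sketch of the two calls side by side for a plain literal pattern. It assumes a modern Python 3, where find is a method on str rather than a function in the string module; the helper names find_offset and regex_offset are made up for illustration and are not from the original thread. For a literal needle the two should report the same offset, and the only extra care the regex needs is re.escape, so that characters like '.' are not treated as metacharacters.

```python
import re


def find_offset(text, needle):
    """Offset of the first literal match using str.find, or -1 if absent."""
    return text.find(needle)


def regex_offset(text, needle):
    """Offset of the first literal match using a compiled regex, or -1."""
    # re.escape keeps characters such as '.' or '(' literal, so the regex
    # searches for exactly the same thing as str.find does.
    pattern = re.compile(re.escape(needle))
    match = pattern.search(text)
    return match.start() if match else -1


if __name__ == "__main__":
    sample = "ado" * 3 + "g"  # 'adoadoadog'
    print(find_offset(sample, "dog"), regex_offset(sample, "dog"))
```

Note the different "not found" conventions underneath: find returns -1, while search returns None, which is why the sketch normalises both to -1.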

	Ok, try this. I spent a few minutes just now looking up how
a trivial regex works, but I'm not gonna look up the docs on profile.py
today. Instead I made a huge example that I can "time" by counting
"one, two, three...". You may want to start with less ado at
first and then crank the numbers back up if it goes by too fast:

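import string
import re

# Compile the pattern once, outside the timed functions.
r=re.compile('dog')

# One long string that never contains 'dog'...
s='ado'*10000

# ...repeated 1000 times, followed by a single string that finally
# matches at its very end.
l=[]
for j in range(1000): l.append(s[:])
l.append(s+'g')

def testre():
	# Bind the global to a local name for faster lookup in the loop.
	localr = r
	for line in l:
		res = localr.search(line)
		if res:
			print res.start()
			break


def testfind():
	# Same trick: look up string.find once, not on every pass.
	find=string.find
	for line in l:
		res = find(line,'dog')
		if res > -1:
			print res
			break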

I don't _think_ I'm cheating here. On the machine I
tested this on, calling testre() seems to take about
six seconds, while testfind() seems to take less than
three.
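
For anyone who would rather not count "one, two, three", the same experiment can be sketched with a clock. This is only a rough sketch under stated assumptions: it targets a modern Python 3 (str.find as a method, time.perf_counter as the timer), neither of which appears in the post above, and the names build_lines, first_hit_re, first_hit_find and time_call are hypothetical helpers. The numbers it prints will depend on the machine and the Python version, so they say nothing about whether the six-versus-three ratio still holds.

```python
import re
import time


def build_lines(copies=1000, repeats=10000):
    """Many copies of a non-matching string, then one that ends in 'dog'."""
    block = "ado" * repeats
    lines = [block] * copies
    lines.append(block + "g")
    return lines


def first_hit_re(lines, pattern):
    """(line index, offset) of the first regex match, or None."""
    for index, text in enumerate(lines):
        match = pattern.search(text)
        if match:
            return index, match.start()
    return None


def first_hit_find(lines, needle):
    """(line index, offset) of the first str.find hit, or None."""
    for index, text in enumerate(lines):
        offset = text.find(needle)
        if offset > -1:
            return index, offset
    return None


def time_call(func, *args):
    """Run func(*args) once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    lines = build_lines()
    compiled = re.compile("dog")  # compiled outside the timed call
    for label, func, arg in (("re.search", first_hit_re, compiled),
                             ("str.find", first_hit_find, "dog")):
        result, elapsed = time_call(func, lines, arg)
        print("%-10s hit=%s elapsed=%.3fs" % (label, result, elapsed))
```

A single run like this is noisy; repeating each call a few times and taking the minimum would give a steadier comparison.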

DU

> Bill
> 
> David C. Ullrich wrote in message
> <01bfd93c$66c03180$2ace8ad1 at daves-dell>...
> >
> >
> >William Dandreta <wjdandreta at worldnet.att.net> wrote in article
> ><sy535.11434$Xx5.530580 at bgtnsc06-news.ops.worldnet.att.net>...
> >> I recently read a message that suggested that using regular
> >> expressions might be faster than the find function.
> >
> > I would have guessed that if anything find would be faster, although
> >I certainly could be wrong. Did someone say that using a regular
> >expression would be _faster_, or did they actually say it was more
> >_powerful_ or some such?
> >



