re.search much slower than grep on some regular expressions
__peter__ at web.de
Sat Jul 5 07:58:14 CEST 2008
John Nagle wrote:
> Henning_Thornblad wrote:
>> What can be the cause of the large difference between re.search and grep?
>> This script takes about 5 min to run on my computer:
>> #!/usr/bin/env python
>> import re
>> row=""
>> for a in range(156000):
>>     row+="a"
>> print(re.search('[^ "=]*/',row))
>> While doing a simple grep:
>> grep '[^ "=]*/' input (input contains 156.000 a in
>> one row)
>> doesn't even take a second.
>> Is this a bug in python?
>> Henning Thornblad
> You're recompiling the regular expression on each use.
> Use "re.compile" before the loop to do it once.
Now that's premature optimization :-)
Apart from the fact that re.search() is executed only once in the above
script, the re library uses a caching scheme, so even if the re.search()
call were in a loop, the overhead would be a few microseconds for the cache
lookup.
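
If you want to check the caching claim yourself, here is a rough sketch that times re.search() with a pattern string against a precompiled pattern. The short TEXT input and the helper names (search_with_string, search_with_compiled, per_call_seconds) are made up for illustration and are not part of the original script; the actual numbers will vary by machine and Python version.

```python
import re
import timeit

PATTERN = r'[^ "=]*/'      # the pattern from the quoted script
TEXT = "a" * 50 + "/"      # assumption: a short input, so matching itself is cheap

# Compile once up front, as suggested in the quoted advice.
COMPILED = re.compile(PATTERN)


def search_with_string():
    # Passes the pattern string on every call; the re module looks the
    # compiled pattern up in its internal cache instead of recompiling.
    return re.search(PATTERN, TEXT)


def search_with_compiled():
    # Uses the precompiled pattern object directly, skipping the cache lookup.
    return COMPILED.search(TEXT)


def per_call_seconds(func, number=100000):
    # Average wall-clock time of a single call, in seconds.
    return timeit.timeit(func, number=number) / number


if __name__ == "__main__":
    t_string = per_call_seconds(search_with_string)
    t_compiled = per_call_seconds(search_with_compiled)
    print("pattern string:      %.2f us per call" % (t_string * 1e6))
    print("precompiled pattern: %.2f us per call" % (t_compiled * 1e6))
    print("difference:          %.2f us per call" % ((t_string - t_compiled) * 1e6))
```

The difference between the two figures is roughly the per-call cost of the cache lookup. It does not depend on the length of the input, so it cannot account for a run time measured in minutes.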