re.search much slower than grep on some regular expressions

Peter Otten __peter__ at
Sat Jul 5 07:58:14 CEST 2008

John Nagle wrote:

> Henning_Thornblad wrote:
>> What can be the cause of the large difference between re.search and
>> grep?
>> This script takes about 5 min to run on my computer:
>> #!/usr/bin/env python
>> import re
>> row=""
>> for a in range(156000):
>>     row+="a"
>> print re.search('[^ "=]*/',row)
>> While doing a simple grep:
>> grep '[^ "=]*/' input                  (input contains 156.000 a in
>> one row)
>> doesn't even take a second.
>> Is this a bug in python?
>> Thanks...
>> Henning Thornblad
>     You're recompiling the regular expression on each use.
> Use "re.compile" before the loop to do it once.

Now that's premature optimization :-)

Apart from the fact that re.search() is executed only once in the above
script, the re library uses a caching scheme, so that even if the
re.search() call were in a loop, the overhead would be a few microseconds
for the cache lookup.
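
A rough way to check the caching claim yourself is sketched below. It is an
illustration only: the short input string, the function names and the
iteration count are made up for the example, and it is written for a current
Python 3 rather than the Python 2 of the quoted script. It times re.search()
with a pattern string against a precompiled pattern on an input small enough
that matching itself is cheap, so any difference is mostly the per-call cache
lookup.

```python
import re
import timeit

PATTERN = r'[^ "=]*/'
# Hypothetical short input, chosen so the match itself costs very little.
TEXT = "key=/usr/bin"

def search_uncompiled():
    # Passes the pattern string on every call; the re module looks up the
    # already compiled pattern in its internal cache instead of recompiling.
    return re.search(PATTERN, TEXT)

# Compile once up front, as suggested in the quoted advice.
COMPILED = re.compile(PATTERN)

def search_compiled():
    # Uses the precompiled pattern object directly, skipping the cache lookup.
    return COMPILED.search(TEXT)

if __name__ == "__main__":
    n = 100000
    t_uncompiled = timeit.timeit(search_uncompiled, number=n)
    t_compiled = timeit.timeit(search_compiled, number=n)
    # Report the average cost per call in microseconds for both variants.
    print("re.search(pattern, text): %.2f us per call" % (t_uncompiled / n * 1e6))
    print("compiled.search(text):    %.2f us per call" % (t_compiled / n * 1e6))
```

Actual numbers depend on the machine and the Python version. Either way, the
gap per call should be tiny compared with the minutes reported for the
156000-character row, so recompilation does not explain the slowdown.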

