re.search much slower than grep on some regular expressions

Filipe Fernandes fernandes.fd at
Fri Jul 4 22:43:11 CEST 2008

On Fri, Jul 4, 2008 at 8:36 AM, Peter Otten <__peter__ at> wrote:
> Henning_Thornblad wrote:
>> What can be the cause of the large difference between re.search and
>> grep?
> grep uses a smarter algorithm ;)
>> This script takes about 5 min to run on my computer:
>> #!/usr/bin/env python
>> import re
>> row=""
>> for a in range(156000):
>>     row+="a"
>> print re.search('[^ "=]*/',row)
>> While doing a simple grep:
>> grep '[^ "=]*/' input                  (input contains 156.000 a in
>> one row)
>> doesn't even take a second.
>> Is this a bug in python?
> You could call this a performance bug, but it's not common enough in real
> code to get the necessary brain cycles from the core developers.
> So you can either write a patch yourself or use a workaround.
> re.search('[^ "=]*/', row) if "/" in row else None
> might be good enough.

Wow... I'm rather surprised at how slow this is... using re.match
yields much quicker results, but of course it's not quite the same as
re.search.

Incidentally, if you add the '/' to "row" at the end of the string,
re.search returns instantly with a match object.
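
For anyone who wants to poke at this, here is a rough sketch of the cases
mentioned above. It assumes a much shorter row (2000 characters rather than
156000) so that it finishes quickly, and the helper name guarded_search is
made up for illustration; the guard itself is Peter's suggested workaround.

```python
import re

# Same pattern as in the original script, compiled once.
PATTERN = re.compile(r'[^ "=]*/')


def guarded_search(row):
    """Hypothetical helper wrapping Peter's workaround:
    only run the regex when a '/' is present at all."""
    return PATTERN.search(row) if "/" in row else None


# Assumption: a short row keeps the unguarded case fast enough to try out.
row = "a" * 2000

# No '/' anywhere, so the guard skips the regex entirely.
no_slash = guarded_search(row)

# match() only tries the pattern at position 0, so it gives up quickly.
anchored = PATTERN.match(row)

# With a trailing '/', search() succeeds on its first attempt at position 0.
with_slash = PATTERN.search(row + "/")

print(no_slash, anchored, with_slash.span())
```

As far as I can tell, re.match only attempts the pattern at position 0,
while re.search retries it from every starting position in the row, which
is presumably where the time goes when there is no '/' to find.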

@ Peter
I'm not versed enough in regex to tell if this is a bug or not
(although I suspect it is), but why would you say this particular
regex isn't common enough in real code?

