re.search much slower than grep on some regular expressions
Kris Kennaway
kris at FreeBSD.org
Tue Jul 8 09:58:31 EDT 2008
samwyse wrote:
> On Jul 4, 6:43 am, Henning_Thornblad <Henning.Thornb... at gmail.com>
> wrote:
>> What can be the cause of the large difference between re.search and
>> grep?
>
>> While doing a simple grep:
>> grep '[^ "=]*/' input (input contains 156,000 'a' characters on
>> one line)
>> doesn't even take a second.
>>
>> Is this a bug in python?
>
> You might want to look at Plex.
> http://www.cosc.canterbury.ac.nz/greg.ewing/python/Plex/
>
> "Another advantage of Plex is that it compiles all of the regular
> expressions into a single DFA. Once that's done, the input can be
> processed in a time proportional to the number of characters to be
> scanned, and independent of the number or complexity of the regular
> expressions. Python's existing regular expression matchers do not have
> this property."
Very interesting! Thanks very much for the pointer.
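
To make the linear-time point concrete, here is a minimal sketch of a
single-pass scan for the one pattern quoted above, [^ "=]*/. It is not
Plex and not a general DFA compiler, just a hand-written state machine
for this pattern; the names linear_search, EXCLUDED and PATTERN are
hypothetical. It assumes re.search semantics: leftmost match, greedy
character class (note that '/' is itself in the class).

```python
import re

# The pattern from the original question, kept for comparison.
PATTERN = re.compile(r'[^ "=]*/')

# Characters that the class [^ "=] refuses to match.
EXCLUDED = frozenset(' "=')


def linear_search(text):
    """Return the (start, end) span of the leftmost greedy match of
    [^ "=]*/ in text, or None, visiting each character exactly once."""
    run_start = 0      # start of the current run of allowed characters
    last_slash = -1    # index of the last '/' seen in the current run
    for i, ch in enumerate(text):
        if ch in EXCLUDED:
            # The run ends here. If it contained a '/', the match spans
            # from the run's start through its last '/'.
            if last_slash >= 0:
                return (run_start, last_slash + 1)
            run_start = i + 1
        elif ch == '/':
            last_slash = i
    # End of input also ends the run.
    if last_slash >= 0:
        return (run_start, last_slash + 1)
    return None
```

On the input from the original question, linear_search('a' * 156000)
returns None after one pass over the string, whereas a backtracking
matcher may rescan the rest of the line from every starting position.
On short inputs the result can be checked against
PATTERN.search(text).span().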
Kris