Most efficient method to search text?

Michael Hudson mwh at
Fri Oct 18 12:03:10 CEST 2002

bokr at (Bengt Richter) writes:

> On Thu, 17 Oct 2002 11:14:09 GMT, Michael Hudson <mwh at> wrote:
> >Tim Peters < at> writes:
> >
> >> Especially for purposes of building lexers, it might be useful if the re
> >> package could recognize when a DFA approach was sufficient and practical,
> >> and switch to a different scheme entirely then.  Or it might not.  Build
> >> code to try it both ways, and let us know how it turns out ...
> >
> >Indeed, my code canes re when the wordlist gets long.  Here's some
>
> If this is easy to add to your test harness, I'd be interested to see what
> this search does in comparison, with the longer word lists (it's probably
> faster than what I posted elsewhere, depending on relative lengths
> of word lists and strings, and internal vs external looping and allocation).

It's quick:

>>> robin.do_comp(1000)
compile...       3.42289197445
compile2...      2.49848008156
compile3...      0.696313977242
compile4...      4.04265594482
compile_re...    0.627331018448
compile_bengt... 0.00175499916077

test...          1.39854204655
test2...         2.93543899059
test3...         3.2231388092
test4...         2.15867292881
test_re...       8.38554108143
test_bengt...    0.437232971191

I'm sure Tim once said something along the lines of "Python doesn't
give much advice for getting good performance, beyond a not-so-subtle
hint to exploit dicts for all they're worth" but I can't find it now.
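
To illustrate the "exploit dicts" hint, here is a minimal sketch of a
dict-based word search. It is not the code behind compile_bengt/test_bengt,
which is not shown in this message; compile_dict and search_dict are
hypothetical names, and the sketch assumes the words to find are whole,
whitespace-separated tokens rather than arbitrary substrings:

```python
def compile_dict(words):
    """Build the lookup table: one dict key per word (a set would also do)."""
    return dict.fromkeys(words, True)

def search_dict(table, text):
    """Return the whitespace-separated tokens of text found in the table."""
    # Each membership test is a hash lookup, so the cost per token does
    # not grow with the length of the word list.
    return [tok for tok in text.split() if tok in table]

table = compile_dict(["spam", "eggs", "ham"])
print(search_dict(table, "green eggs and ham and more eggs"))
# ['eggs', 'ham', 'eggs']
```

Building the table is a single pass over the word list, and the search never
scans the list itself, which is why this style of approach tends to hold up as
the word list grows.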

hmm, look at that sig...

  Premature optimization is the root of all evil.
       -- Donald E. Knuth, Structured Programming with goto Statements
