Most efficient method to search text?
mwh at python.net
Thu Oct 17 13:14:09 CEST 2002
Tim Peters <tim.one at comcast.net> writes:
> Especially for purposes of building lexers, it might be useful if the re
> package could recognize when a DFA approach was sufficient and practical,
> and switch to a different scheme entirely then. Or it might not. Build
> code to try it both ways, and let us know how it turns out ...
Indeed, my code canes re when the wordlist gets long. Here are some
numbers (do_comp(n) builds a list of n words, each made of ten random
ASCII characters, and searches the Python standard library for them,
first with my code and then with an re-based approach):
>>> import robin
So roughly linear behaviour from re, n-independent behaviour from my
code. Isn't it nice when things do what you expect?
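The n-independence falls out of the trie/DFA idea Tim alluded to: at
each text position you walk a trie of the wordlist, so per-position
work is bounded by the longest word rather than by how many words
there are. A minimal sketch of that approach (a nested-dict trie,
not the actual posted code):

```python
def build_trie(words):
    """Nested-dict trie; the key None marks a complete word."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node[None] = w
    return root

def find_all(text, trie):
    """Walk the trie from each text position, recording every word
    that ends there.  Work per position is bounded by the longest
    word in the trie, independent of the number of words."""
    hits = []
    for i in range(len(text)):
        node = trie
        for ch in text[i:]:
            node = node.get(ch)
            if node is None:
                break
            if None in node:
                hits.append((i, node[None]))
    return hits
```

(A real lexer-grade version would add Aho-Corasick failure links to
avoid restarting from every position, but the scaling argument is
the same.)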
On the other hand, the "compiler" code I posted yesterday consumes
vast gobs of memory -- trying to compile a list of 10000 random
words set the OOM killer into action. A version using lists seems
slightly better in this respect, though slower in execution.
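The dict-vs-list memory trade-off mentioned above can be sketched
like this (again an illustrative guess at the representations, not
the posted code): one dict per trie node carries a large fixed
overhead, while a flat pair of parallel lists keeps each state as a
small list of (char, next_state) pairs, at the cost of a linear scan
per transition instead of a hash lookup:

```python
def list_trie(words):
    """Flat trie: state i has transitions[i], a list of
    (char, next_state) pairs, and accepting[i], a flag.  Scanning a
    small list per transition is slower than a dict lookup, but the
    per-state memory overhead is much lower."""
    transitions = [[]]
    accepting = [False]
    for w in words:
        state = 0
        for ch in w:
            for c, nxt in transitions[state]:
                if c == ch:
                    state = nxt
                    break
            else:
                # no existing edge for ch: allocate a fresh state
                transitions.append([])
                accepting.append(False)
                transitions[state].append((ch, len(transitions) - 1))
                state = len(transitions) - 1
        accepting[state] = True
    return transitions, accepting

def matches(word, trie):
    """True iff word is exactly one of the compiled words."""
    transitions, accepting = trie
    state = 0
    for ch in word:
        for c, nxt in transitions[state]:
            if c == ch:
                state = nxt
                break
        else:
            return False
    return accepting[state]
```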
Have you considered downgrading your arrogance to a reasonable level?
-- Erik Naggum, comp.lang.lisp, to yet another C++-using troll